What's this HTML5 thing?

Although it's not so new anymore, I still see "HTML5" being used as a buzzword - though, fortunately, by now it's becoming a bit more of a standard. So, what is it, and how does it differ from regular old HTML?

What most people have been using for a while now is actually HTML 4.01; the other common one is XHTML, which is written much the same as HTML, but is based on another language called XML. For a long time, XHTML's stricter structure caused it to be heralded as a replacement for HTML. It was to become the top structural language of the internet, and in some places, it was even taught as HTML (whether to help with the transition or simply by mistake; they are that similar). Because of that, HTML was left at 4.01, and efforts were focused instead on the creation of XHTML 2.

XHTML 2 was never finished, though, despite its working group being chartered back in 2007. So much effort was put into covering so many things and making it so great that it became a very different beast, and the group working on it ultimately closed at the end of 2010 after the World Wide Web Consortium (W3C) decided in 2009 that the progress on HTML5 was looking a lot better. This doesn't mean that HTML 4.01 or XHTML are dead, of course - both are still frequently used.

So then, what is the difference between HTML 4.01 and HTML5? Semantics, mostly. A few elements were retired, a few have been repurposed, and a number of new elements have been added. Most of these new elements may not seem like anything special at first; why use an "article" or a "header" when a div will do the same thing, but with the bonus of working correctly in older browsers (most notably, Internet Explorer older than version 9 must be told what to do with some of the new HTML elements)? Semantics. Just as browsers and search engines (SEO, anyone?) are supposed to recognize that h1 is more important than h2 or p, they can also differentiate (to a degree) between a header, a footer, a paragraph, a quote, etc. These new elements can also help the developer figure out more quickly and accurately what part of a page they are looking at and working on, simply by being able to see "header," "footer," "article," and "aside," instead of line after line of nested divs with similar ids and 20 different classes.
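
To make that comparison concrete, here is a rough sketch of the same page skeleton written both ways. The ids, class names, and text are made up for illustration. The conditional comment in the head is one common way to tell Internet Explorer 8 and older about the new elements; treat it as an example of the workaround, not the only way to handle it.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Semantic elements sketch</title>

  <!-- IE 8 and older won't style elements they don't recognize.
       Creating each one once from script makes them stylable. -->
  <!--[if lt IE 9]>
  <script>
    document.createElement('header');
    document.createElement('article');
    document.createElement('aside');
    document.createElement('footer');
  </script>
  <![endif]-->

  <style>
    /* Older browsers don't know these are block-level by default. */
    header, article, aside, footer { display: block; }
  </style>
</head>
<body>

  <!-- The HTML 4.01 habit: generic divs, told apart only by id and class -->
  <div id="header"><h1>My Site</h1></div>
  <div class="post main-content">
    <h2>A post title</h2>
    <p>Some text.</p>
  </div>
  <div class="sidebar">Related links</div>
  <div id="footer">Contact info</div>

  <!-- The HTML5 version: same structure, but the tag names say what each part is -->
  <header><h1>My Site</h1></header>
  <article>
    <h2>A post title</h2>
    <p>Some text.</p>
  </article>
  <aside>Related links</aside>
  <footer>Contact info</footer>

</body>
</html>
```

Both halves render about the same; the difference is that the second half can be read at a glance, by a person or by a machine, without decoding anyone's naming scheme.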

Not all of them are just semantic tweaks, though. One element in particular, called "canvas," has been getting a lot of attention. Canvas has a certain similarity to img, in that it exists on the page, but it requires information of some sort from somewhere else to serve any purpose. For img, you must call in an image. For canvas, you must call in directions from JavaScript. With a mere JS document, you can turn a few typed characters into a presentation, a cartoon, a game, and more. It is because of canvas alone that HTML5 has been hailed as a Flash killer.
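
As a small illustration of that img comparison, here is a minimal sketch of a canvas being given its directions by JavaScript. The id "demo", the sizes, the colors, and the text are all arbitrary choices for this example; the drawing calls (getContext, fillRect, arc, fillText) are part of the standard 2D canvas API.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Canvas sketch</title>
</head>
<body>
  <!-- On its own, canvas is just a blank rectangle.
       The text inside only shows in browsers without canvas support. -->
  <canvas id="demo" width="300" height="150">
    Your browser does not support canvas.
  </canvas>

  <script>
    // Look up the canvas element on the page.
    var canvas = document.getElementById('demo');

    // Only draw if the browser actually supports canvas.
    if (canvas.getContext) {
      // Ask for a 2D drawing context; all drawing goes through it.
      var ctx = canvas.getContext('2d');

      // Sky-blue background covering the whole canvas.
      ctx.fillStyle = '#87ceeb';
      ctx.fillRect(0, 0, canvas.width, canvas.height);

      // A yellow circle for a sun.
      ctx.fillStyle = '#ffd700';
      ctx.beginPath();
      ctx.arc(240, 50, 30, 0, Math.PI * 2);
      ctx.fill();

      // Text drawn as pixels on the canvas, not as an HTML element.
      ctx.fillStyle = '#000000';
      ctx.font = '16px sans-serif';
      ctx.fillText('Hello, canvas!', 20, 130);
    }
  </script>
</body>
</html>
```

A presentation or a game is the same idea taken further: the script redraws the canvas over and over, changing what it draws each time.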

My personal favorite, though, is the doctype. (This may be partly due to my inexperience working with canvas.) In an HTML or XHTML document, the doctype tells the browser what version it is - and thus, how to treat the HTML and CSS for that page. Without a doctype, everything in your HTML document will still display, but it may not display correctly. Doctypes have long been a point of minor-but-constant annoyance though, because using one meant tracking down the correct string of assorted characters and definitions that was too long and random-looking to memorize (XHTML was particularly bad about it), then copy-and-pasting it into your file. Now, though, HTML5 allows us to use a short, neat little piece (<!DOCTYPE html>), and you can move on with your life. The bonus: you don't have to be using HTML5 to use that doctype. The simple fact that you have one is enough to tell browsers not to try displaying your site as if it was built in the '90s.
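
For a sense of just how much shorter it is, here is a minimal page using the HTML5 doctype, with two of the older ones quoted in a comment for comparison. The older ones shown are the Strict variants of HTML 4.01 and XHTML 1.0; there were several other flavors (Transitional, Frameset), each with its own string. The page content itself is a placeholder.

```html
<!DOCTYPE html>
<!--
  What the line above replaces.

  HTML 4.01 Strict:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">

  XHTML 1.0 Strict:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
-->
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Doctype sketch</title>
</head>
<body>
  <p>Nothing on this page is specific to HTML5 except the first line.</p>
</body>
</html>
```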

What it all boils down to: HTML5 is not magic, and it's still just HTML. It does have some pretty nifty details, though, and is absolutely worth trying.