Friday, June 6, 2008

The Next-Gen Web: HTML5 - Will We Ever See A Real Standard?

Cheers to TechCrunch for this insight on the state of technology in building the next web. Pretty techie stuff, but it illustrates well how difficult standardization is to achieve... and this is only for APIs. Add in the human interpretation required for semantics to work and it's a pretty long road ahead...

----------

Last week we looked at how some browsers and plug-ins were adopting storage-related APIs that are a part of the new HTML5 draft specification. While Gears, Opera and Webkit have implemented structured storage APIs, the remainder of the HTML5 spec currently remains mostly unimplemented and also in a state of flux. HTML5 is a super-sized effort to bring all the browsers under a single, standard markup language and set of APIs - but with Microsoft, Adobe and others racing ahead with their own next-gen web technologies, will we ever see a real HTML5 standard?

Learning From History

In terms of scope and effort, HTML5 has an earlier historical analogy in the HTML 3.0 spec. Back in April of 1995, the HTML 3.0 spec was drafted as a backwards-compatible way of adding new features (such as tables) to HTML 2.0. The W3C had only just formed, and HTML 3.0 was one of the first specs to be produced by the new working group. At the time the browser wars were just around the corner, as Navigator had been out for only five months and had already built up 80% market share. Microsoft had taken notice and were rushing out Internet Explorer 1.0, which would be released a few short months later.

As remains the case today, in 1995 each browser supported a different set of markup. With their new 1.1 release, Netscape had raced ahead and implemented tables, floating images, and other navigational elements (such as visited links). IE 1 was a complete hack of a browser that took the approach of rendering at all costs: if it couldn’t work out what the user had intended with the HTML, it would make its best guess and present something anyway. This resulted in issues such as being able to mix tags (eg.

Header

) which allowed developers to be lazier as IE would compensate for mistakes.

With the market share of Internet Explorer steadily rising, and with frequent point releases and updates from both Netscape and Microsoft, the two browsers steadily diverged as the market segmented into two firm camps. The HTML specification effort, which had previously taken the form of RFCs, was supposed to re-unite the browsers and formalize new features that browsers had already introduced. There was often significant tension amongst contributors to the spec about which browser, Netscape or Explorer, had the better implementation of each new feature. For example, Netscape and Explorer took very different approaches to image maps that were not compatible with one another. Microsoft were also responsible for making up random HTML tags, such as and to define static areas of a page (which would later become the very unfriendly frameset tags thanks to Netscape).

The problem was not that these new features were already out in the wild, but that there were two fiercely competitive products each implementing their own version of the web in order to either protect their market share or to gain control of more of it. Eventually both Netscape and Microsoft gave up on implementing a proper HTML 3.0 spec, for example from Netscape:

Netscape remains committed to supporting HTML 3.0. To that end, we’ve gone ahead and implemented several of the more stable proposals, in expectation that they will be approved. We believe that Netscape Navigator 2.0 supports more of the HTML 3.0 specifications than any other commercial client.

In addition, we’ve also added several new areas of HTML functionality to Netscape Navigator that are not currently in the HTML 3.0 specification. We think they belong there, and as part of the standards process, we are proposing them for inclusion

and Microsoft were left playing catch-up in terms of supporting HTML:

Netscape has enjoyed a virtual monopoly of the browser market (about 90% according to some estimates), and this has allowed it to consolidate its position still further by introducing unofficial or ‘extended’ HTML tags. As a result, the Web is littered with pages that only work effectively if viewed in Navigator. By the time other browsers catch up, Netscape has made even more additions.

but that didn’t last long and Microsoft tired of playing that game. Further releases didn’t even mention HTML anymore and instead talked about a web built on Microsoft technology:

Microsoft Internet Explorer 3.0 is the first Internet client to integrate ActiveX™ technologies, which enable developers to create highly interactive applications and content for the Internet. These technologies allow a World Wide Web site to be as rich and interactive as an action game, a multimedia encyclopedia or a productivity application. For the first time, a Web site will be limited only by its author’s imagination, not by the limitations of the technology.

In one very quick year the browser wars had progressed from fighting over HTML tag support to fighting over the formats and languages that would produce richer client-side applications. The battle between JavaScript (the Netscape proprietary client-side scripting language) and ActiveX (the Microsoft proprietary object container) was just around the corner with the release of Internet Explorer 3.0 in August of 1996.

The rest of the story, in which Microsoft wins the browser war and, more importantly, how they won it, is common history. The web had fractured in a big way, with repercussions that would last for over a decade as thousands of developer hours went to waste producing cross-browser hacks and libraries. Despite Microsoft gaining dominance in the browser market and promoting multiple tiers of proprietary technology for building web applications, somehow simple HTML, JavaScript and CSS eventually won out, and Web 2.0 wasn’t built on ActiveX.

Fast Forward Ten Years

While Netscape has disappeared and been replaced with Firefox, the battle for the web today is not only between browsers but also one between new web platforms and technologies. The market share of Internet Explorer has by some estimates been notched down to 78% (from a high in 2004 of 95%), with Firefox at 16% and Safari, Opera and others making up the remaining 6%. HTML 4.01 was published in December of 1999 and went on to become an ISO standard as the major browsers built in support for the spec. HTML 4.01 still remains the most widely and best supported HTML standard, but the problems today have migrated to other parts of the web technology stack, specifically with CSS and DOM access.

In what is now referred to as Web 2.0, thousands of rich web applications have been developed using HTML, CSS, JavaScript and XML - more commonly referred to as Ajax (ironically the A and X parts of Ajax started as a proprietary add-on to Internet Explorer in the form of XMLHttpRequest). Ajax applications quickly reached the limits of what could be done with current technologies, but they had narrowed the gap between desktop and web applications. A number of vendor-backed web client platforms such as Flash from Adobe and Silverlight from Microsoft have been released as a layer above the browser, presenting developers with a very rich desktop-like development environment for web applications. These new platforms work by extending existing browsers through plugins, and while these commercial solutions have already launched, there is currently no suitable open source and open standards based alternative that extends beyond Ajax.

Frustrated by the lack of progress with HTML5 at the W3C, a group of browser developers split off and formed WHATWG to further develop the specification. The primary mission of HTML5 was to recognize that the web had changed since the original HTML specs, as web applications were now capable of presenting very complex user interfaces and could make use of more advanced system functions (for the interface, Silverlight uses XAML while Flex/Flash uses MXML). The spec began as Web Applications 1.0, which was an umbrella term to describe not only the new HTML5 spec but other associated specifications such as CSS2, DOM5, ECMAv4 and new API calls (such as local browser storage).

The WHATWG working group spec was eventually (after 4 years) folded back into the W3C, and Microsoft joined the effort again. In the interim, developers searching for a rich web app platform beyond Ajax had little option other than to join either the Microsoft or Adobe universe. Progress on implementing the HTML5 spec was still very slow, until Google Gears. Google recognised the threat of a Microsoft or Adobe dominated web and stepped in by creating Gears, its way of hurrying up implementation of HTML5 features in browsers, and has backed it at each step by having its own applications such as Gmail and Reader support the new API calls.

Apple is another company that is fully backing the open, HTML5 alternative for rich internet applications. It was only a few years ago that a visitor to the Apple homepage would find a page dominated by Flash and PDF files. Today Apple have their own open-standards based browser in Safari and back the Webkit open source project. They have also backed up their support for the free and open alternative by re-engineering their websites and applications to use Ajax over proprietary alternatives such as Flash.

We are back in 1996 again and HTML5 is the new HTML 3.0, but instead of two major browser manufacturers, today there are numerous parties with an interest in determining what the new web API and virtual machine will look like. In the 1990s version of events, the open standards eventually won out - which both Microsoft and Adobe have recognized, as they have released source code and API details for some parts of their platforms.

Web history teaches us that there is usually a single winner, as all users steadily migrate to the single winning solution which imposes itself as a standard (recall that many of today’s ‘standards’ began life as proprietary technologies). There is a big difference though between a standard such as the Windows operating system, and an open standard such as HTML5 - and a repeat dose of the former is the biggest threat that companies such as Google and Apple currently face.

You can read the previous Next-Gen Web post about local browser storage here
