On the Web standards process

The debate over net neutrality is usually portrayed as taking place in national and supranational governments — the US government, the European Union.

A far more important battle on much the same terms is going on behind the scenes, though. Due to lack of attention, the big business side is slowly winning.

The battle is over the specifications and standards for the technologies that make the World Wide Web work, in particular HTML, the markup language in which web pages are written, and the programming interfaces associated with it. Since the late 2000s, the job of designing HTML and adding new features to it has fallen to the Web Hypertext Application Technology Working Group (WHATWG), a ‘community of people interested in evolving the Web’ founded by Apple, Mozilla, and Opera in 2004. These people are the parliament and government that make the laws that decide how web browsers should work.

The WHATWG’s charter lays down a nice hierarchy of whose needs standards-makers should consider when laying down the law: “In case of conflict, consider users over authors over implementors over specifiers over theoretical purity.” That means that people who use the web, in theory, have the most control over how it works, which sounds very fair and democratic.

Unfortunately, nothing could be further from the truth. In reality, a few large companies have great, perhaps total, control over the making of web standards, and their interests have become so closely aligned that dissent has become meaningless.

For a start, even though they’re supposedly the most important group in this ‘priority of constituencies’, most everyday web users have little interest in the arcane technical debates that go on behind the scenes of the web standards process, so their input into the process is effectively nil.

That leaves the real top spot to web authors, who have to use the technologies specified to publish their sites. The use of the word ‘authors’ calls to mind the earliest days of the web, when most websites were made by individuals, usually writing (‘authoring’) pages in HTML code entirely by hand and putting them online under their own address. That homegrown, do-it-yourself web has now faded into obscurity. Today ‘authors’ means, essentially, mega-corporations like Facebook, Google, Netflix, Amazon, and Twitter. What these web authors want is diametrically opposed to the needs of users. Independent web authors tend to have pretty much the same interests as web users (they are, after all, web users themselves) but their voices are drowned out by the weight of the giant business interests who really end up running the show. The big corporations want more ways to track and silo their users, and more ways to create their own personal fiefdoms and prevent a resurgence of the independent web.

Next in the theoretical food chain come ‘implementors’: the browser makers like Microsoft, Apple, Google, Opera, and Mozilla. Notice first that one of the biggest ‘authors’ is also one of the biggest ‘implementors’: Google. Google also has effective control over at least Mozilla, which is essentially financed through the royalties it receives in exchange for making Google the default search engine in Firefox, and over Opera, whose browser now uses code maintained by Google as its engine. The other browser makers (except Microsoft, which has Bing) are also vulnerable to the possibility of Google withdrawing these search royalties, though it has not yet made any move in that direction.

Specifiers tend to like theoretical purity but the current generation is, at least, very diligent about sacrificing it when needed for the other constituencies.

Unfortunately that is outweighed by the fact that the real priority of the main constituencies is the reverse of the one laid down as a ‘design principle’ of the web. The WHATWG’s leadership (unsurprisingly dominated by browser vendors) has constructed a very effective series of barriers to anyone from the supposedly-more-important constituencies contributing to the furtherance of the web platform. If you are an ordinary writer or publisher on the web and want to change something about how the World Wide Web works, here is how they will stop you from doing it:

First they will tell you that you are not following the process properly, and refuse to consider your proposal until you phrase your request in exactly the right way. If you succeed in doing that, they will state that your proposal is not part of their vision for the web platform, or that your problem is not their problem. If you try to overcome that by showing them a large group of people (users and independent authors) who agree with your proposal, you will finally be told that there is no way that browser makers could be convinced to implement your proposal, and that it would not be in their interests to add it as a feature no matter how many people have requested it.

That is as far as I have ever managed to get through the process. I’m not aware of anyone having got any further, and more to the point, I’m not aware of any significant new HTML feature which came from anyone other than a specifier, a browser vendor, or a large corporation. There are independent authors trying to make the case for new Web platform features: the mailing list archives are full of their proposals being made and then immediately shot down.

The most insidious example of a new feature pushed by big business is Encrypted Media Extensions, a euphemistic name for what has become more accurately known as HTML DRM — which forces browser makers to put a secret closed-source module into their browsers if they want to allow their users to use popular online video services. The major player in pushing for EME was Netflix, which wanted to be able to stream movies and prevent them being pirated. The BBC, too, invested licence fee money in encouraging and contributing to the development of the DRM features to prevent downloading of iPlayer content. Google also supported the proposal — although it has yet to apply DRM to YouTube videos, preferring to combat downloading in other ways. More than any other vendor, Google had the ability to kill this feature off for the good of the web by refusing to implement it, yet chose to support it — indeed they actively contributed to its development.

The public backlash was huge, led by the Electronic Frontier Foundation (the heavyweight US campaign group for information freedom and privacy rights) and the Free Software Foundation. DRM is the definition of a user-hostile feature: its only purpose is to prevent people doing what they want with their own computers, in case the things they want to do interfere with the interests of big business. Any perfectly legal activities — and there are plenty — which are also prevented by DRM technology are collateral damage. With DRM, you no longer have control over your own computer.

To be fair to the WHATWG, they did not develop EME: they left that to the more openly corrupt World Wide Web Consortium (W3C), founded by Sir Tim Berners-Lee. Unlike the WHATWG, the W3C opens its membership only to corporations, not to individuals, and it conducts much of its work in secret. Together, the W3C and the WHATWG have an effective duopoly over the politics of the open web. The WHATWG is the slightly more open organization, but is nonetheless spineless and toothless against powerful moneyed interests. The W3C openly discriminates against independent websites and protects the interests of big business, with sidelines in arcane academic research of little practical use (it develops the Semantic Web, which fifteen years ago was supposed to be the next big thing but largely failed in its goal of making data open and machine-readable; today it’s mainly used as a full employment measure for computer science professors) and in the relatively benign job of specifying CSS, the language that makes it possible to add different colours, fonts, and layouts to HTML pages.

What this uneasy alliance of the supposedly ‘open’ web with powerful business interests will lead to in the future is unclear. Thus far EME is about the only development which is actually the polar opposite of what users wanted. It’s not inconceivable, however, that it could go deeper. Both Google and Facebook depend on advertising for their revenue. Today, adverts on web pages are not distinguished from any other kind of content, in the sense that there’s no way for a computer program to reliably tell from the code what parts of a page are adverts and what parts are regular content. Only the human eye can tell, once the page is displayed on screen. Ad blockers usually work by picking out specific large websites and advertising networks and specifically targeting their code, which is easily identifiable. That’s why they don’t usually block anything on small websites which sell their own ads instead of using e.g. Google ads.
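To make that concrete, here is a minimal sketch in Python of the blocklist approach described above. It is an illustration under stated assumptions, not how any particular ad blocker is implemented: the hostnames, the `BLOCKED_HOSTS` set, and the functions `is_blocked` and `filter_resources` are all made up for this example. Real blockers rely on much larger, community-maintained filter lists and more elaborate matching rules.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist of known advertising-network hosts.
# These hostnames are invented for illustration only.
BLOCKED_HOSTS = {"ads.bignetwork.example", "tracker.bignetwork.example"}


def is_blocked(url: str, blocked_hosts: set[str] = BLOCKED_HOSTS) -> bool:
    """Return True if the URL's host is a blocked host or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == b or host.endswith("." + b) for b in blocked_hosts)


def filter_resources(urls: list[str]) -> list[str]:
    """Keep only the resources that are not served from a blocked host."""
    return [u for u in urls if not is_blocked(u)]


# Resources requested by a hypothetical small website's page.
page_resources = [
    "https://smallsite.example/article.html",
    "https://smallsite.example/banner-we-sold-ourselves.png",
    "https://ads.bignetwork.example/serve.js",
]

print(filter_resources(page_resources))
```

The script from the big network is dropped because its host is on the list, while the banner the small site sold and hosts itself passes straight through: nothing in its address, or in the page's code, marks it as an advert.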

If a feature were added to HTML that let adverts be specifically marked up as such in the code, it could be used to make a ‘perfect’ ad blocker which would make online advertising worthless. That’s the reason which has thus far been given for not adding such a feature, even though it would bring a huge benefit to screen readers and other assistive technologies for the disabled. But equally, a feature like that could be set up to prevent ad blockers, or any other user code, removing or hiding those ads. More insidiously, it could be used to prevent privacy shield software from disabling tracking code and other ways ad networks follow you around the web to build up profiles of your behaviour. This corporate surveillance is, thanks to the power of intelligence agencies (largely still in place, and even extended, since the Snowden leaks in 2013), effectively an extension of government surveillance.

Again, so far, there’s been no definite indication that that’s where things are headed. But the existence of DRM for video in HTML already provides one stepping-stone in that direction — all that would be needed would be to extend that DRM to text and images as well. A possible further overture in this direction is WebExtensions, an effort to make browser plugins more compatible across different browsers by making them use the same programming interfaces — interfaces whose definition is essentially controlled by Google, which originally developed the standard.

What the W3C and WHATWG choose not to work on is just as worrying as what they do work on. Since the technology world was rocked by the Snowden revelations in 2013, many individuals and smaller firms (as well as some larger ones like Apple which don’t depend on advertising revenue to make money) have been concerned about their privacy online. Individual website owners have been switching their websites over to use encrypted connections, which were previously only used by banks, online shops, and other sites handling sensitive financial information. Some have created new ways to combat surveillance, like WhatsApp’s end-to-end encryption of messages. Yet the W3C and WHATWG have made no major changes to the web platform to prevent tracking users across websites.

This control of large corporations over the web is a comparatively recent development. In 2001, for instance, Microsoft — whose control over the web was essentially absolute, since its Internet Explorer browser was then used by over 90% of surfers — tried to push a proprietary system called smart tags as a de facto web standard, which would have allowed Microsoft to change the content of any page on the Web. Web authors balked. The largest web publishing companies were still, at that time, quite small: the social media revolution had yet to happen; online video was still limited to tiny QuickTime or RealPlayer (!) clips; there was still no market share majority for any search engine (indeed, Yahoo was still beating Google until 2003). Microsoft backed down before the technology even shipped to the public. They had moved in too soon to conquer what was still a new medium.

Since 2001, though, things have changed. The browser space has opened up, and no browser manufacturer now holds the kind of total monopoly that Microsoft had in the early 2000s. Things appear more open, yet the emergence of a set of powerful aligned influences (Apple, Netflix, Amazon, and Google sharing the interests of Hollywood executives in protecting online video from piracy, whatever the consequences for user freedom; Google, Facebook, and Twitter in building user profiles for advertising, whatever the consequences for user privacy) has caused the decline of the independent web and an effective stranglehold not only over the present state of the web, but over its future too.

What can we do about this?