the missed opportunity of acid 3

Ian Hickson released his Acid 3 test a little while ago, and though there have been a couple of changes recently, it’s pretty much settled down in its final form. There’s been a lot of discussion of how different browsers do on the test, and it’s clear that some projects are really buckling down to get to 100 and the pixel-perfect rendering. You might ask why Mozilla’s not racking up daily gains, especially if you’re following the relevant bugs and seeing that people have produced patches for some issues that are covered by Acid 3.

The most obvious reason is Firefox 3. We’re in the end-game of building what I really do believe is the best browser the web has ever known, and we expect to be putting it in the hands of more than 170 million users in a pretty short period of time. We’re still taking fixes for important issues, but virtually none of the issues on the Acid 3 list are important enough for us to take at this stage. We don’t want to be rushing fixes in, or rushing out a release, only to find that we’ve broken important sites or regressed previous standards support, or worse introduced a security problem. Every API that’s exposed to content needs to be tested for compliance and security and reliability, and we already have some tough rows to hoe with respect to conflicts with existing content as it is. We think these remaining late-stage patches are worth the test burden, often because they help make the web platform much more powerful, and reflect real-web compatibility and capability issues. Acid 3′s contents, sadly, are not as often of that nature.

Ian’s Acid 3, unlike its predecessors, is not about establishing a baseline of useful web capabilities. It’s quite explicitly about making browser developers jump — Ian specifically sought out tests that were broken in WebKit, Opera, and Gecko, perhaps out of a twisted attempt at fairness. But the Acid tests shouldn’t be fair to browsers, they should be fair to the web; they should be based on how good the web will be as a platform if all browsers conform, not about how far any given browser has to stretch to get there.

The selection of specifications is also troubling. First, there is a limitation that the specification had to be suitably finished by 2004, meaning that only standards that were finalized during the darkest period of web stagnation were eligible: standards that predate people actively reviving the web platform, through work like the WHATWG’s <canvas> specification. While this did protect us from tests requiring the worst of SVG’s excesses (1.2, with support for such key graphical capabilities as file upload and sockets, was promoted to CR in 2007), it also means that it includes @font-face, a specification which was so poorly thought of that it was removed in 2.1. I can think of no reason to place such time-based restrictions on specification selection other than perhaps to ensure that there has been time for errata to surface — but in the case of CSS, those errata are directly excluded! Similarly, the WHATWG’s work to finally develop an interoperable specification for such widely-used things as .innerHtml were excluded. (If the test is about “fairness”, I could see not wanting to target specifications that were so new that people just hadn’t had time to catch up to them, but again I think that such a test should be built on more long-term criteria than lining up the starting blocks for a developer sprint. We could be using Acid tests as an “experience index” for the web, if you will.)

I also believe that the tests should focus on areas where the missing features can’t be easily worked around; SMIL’s capabilities are available to interested authors via an easy-to-use library, and if Hixie could stomach digging around in the SVG specification I wish he’d spent his time on things like filters or even colour profiles, which lacks are much harder to work around.

People who misread my disappointment or our lack of last minute Acid-boosting patches as a lack of commitment to standards would do well to study our history as a project. For more than a decade, we’ve been dedicated to support of standards, and improvement of those standards, even though it has often been painful to do so. In 1998 we threw away our entire layout engine in order to rebuild on one that provided, we believed, a better basis for the new standards of the day: DOM, CSS, etc. We’ve changed our implementations of <canvas>, of globallocalStorage, of JavaScript — all to track changes in standards, and to avoid conflicting with them. Last night, we disabled cross-site XHR because we aren’t certain what way the spec was going to go, and if, e.g., we ship something that doesn’t send cookies, we would constrain where the spec could go by building developer expectations about that behaviour. (These developer expectations about what sorts of requests can be triggered from cross-domain content are basically the entire reason that we need special work for cross-site XHR mashup technology.) We will fix standards compliance bugs; it’s what we do. But we won’t fix them all with the same priority, and I hope that we won’t prioritize Acid 3 fixes artificially highly, because I think that would be a disservice to web developers and users. Where Acid 3 happens to test something that we believe is important to fix, we will of course pursue it: surrogate pair handling or some of the selector bugs seem like good candidates.

Acid 3 could have had a tremendous positive effect on the web, representing the next target for the web platform, and helping developers prioritize work in such a way as to maximize the aggregate capabilities of the web. Instead, it feels like a puzzle game, and I can easily imagine the developers of the web’s proprietary competitors chuckling about the hundreds of developer-hours that have gone into adding another way to iterate over nodes, or twiddling internal APIs to special case a testing font. I don’t think it’s worthless, but I think it could have been a lot more, especially with someone as talented and terrifyingly detail-oriented as Ian at the helm.

Maybe next time.

[Update: the WebKit checkin to special-case Ahem wasn't one that used a private OS X API, it was one that used an internal CoreGraphics API on Windows; I should have been reading more closely, apologies to the WebKit folks.]