it’s full of bits

Deb’s excellent post about Firefox 3’s bookmarking system hit Digg today, on our shared server, which reminded me that I needed to install some WordPress caching software.

big network spike around 9AM Eastern

No sweat; wp-super-cache, I thank you.

year of the Gecko

Stuart put up a great post today describing the results of our intensive focus on memory use in Firefox 3 (and followed up, after many requests from commenters on his blog and elsewhere, with a graph including Safari and Opera). The memory gains are great, and they cover all sorts of improvements: leak fixes, allocator changes, new facilities to eliminate classes of troublesome entrainment, and better cache management.

It’s a time-honoured programming tradeoff that using more space speeds you up, but that’s not what happened here: our memory-reduction regimen actually made us faster in a lot of cases by making us more cache-friendly and by side-effects like using a better allocator. And we didn’t stop there, dropping the hammer on major performance gains in rendering and JavaScript as well, and leaving us as of today right at the top of tests like Apple’s SunSpider.

Productivity and feature wins in Firefox-the-application are really coming together as well, with the AwesomeBar leading many people’s lists of favourite new features. It really has changed the way I use the web, and I feel like everything I’ve ever seen is right at my fingertips. Add to that the great strides in OS integration and theming for Mac and Linux and it really is shaping up to be the best browser the web has ever known.

I’m obviously excited; this feels like exactly the right sort of everything-coming-together that should be in the air on the cusp of the 10th anniversary of the original source release. It hasn’t been an easy ride, especially pre-Firefox, and nobody on the project takes our success so far for granted — which makes it all the more satisfying to see years of investment pay off in a fantastic product.

Other people are excited too, from users and journalists to extension developers and companies looking to add web tech to their products. In the mobile arena especially we’re seeing a ton of excitement about the gains in speed and size. A lot of people aren’t yet used to thinking of Mozilla as a source of mobile-grade technology, but they weren’t used to thinking of us as a major browser force either. It’s fun to break the model.

Fast, small, cross-platform, industry-leading stability, solid OS integration, excellent standards support, excellent web compatibility, great security, ridiculously extensible, a productive app platform, accessible, localized to heck and back, open source from top to bottom: it’s a great time to be building on top of Gecko, and Firefox 3 is just the beginning. Wait until you see what we have in store for the next release…

why update add-ons now?

With Firefox 3 still a couple of months away, it would seem reasonable to wonder why we’re encouraging add-on developers to get their add-ons updated for Firefox 3 already. For most add-on developers, it will indeed be a pretty quick process to update to the new chrome layout, a new API or two, and test it out, but we want people to start on that process now nonetheless. There are two reasons for this, in my mind:

  • The kinds of people who test our betas and give us great feedback are the kinds of people who have a bunch of extensions installed, and not having their favourite extensions work makes it much less pleasant for them to do in-depth testing.
  • If there is a hard problem found when updating an add-on, we want to know about things we can do on the Firefox side to make it easier, in time for those changes to safely get into the release stream. Waiting until the Firefox RCs are out would mean that we have a lot, lot less room to maneuver when it comes to resolving any problems found.

So please, take a moment to start updating your add-on this weekend, and let us know if you need help. Operators, in the Special Forces sense, are standing by.

leaking, growing, and measuring

(This post started small, but got bigger as I noticed more things that aren’t necessarily as obvious to my readers as they are to me, with respect to our process and software. So it grew over time, oh ha ha! It’s almost 1AM, so I will not be editing it further this evening! I might post a summarized version at some point in the future, or I might not.

And then I edited it because Dave pointed out that it sounded like I was saying that other browsers necessarily suffered similar fragmentation woes, which wasn’t my intent. Indeed, the main point of the post is that there can be many possible causes for a given symptom, and that the popular theories (e.g. “massive memory leaks”) may not prove correct.)

I’m going to share some non-news with you: Firefox has memory leaks. I would be shocked to discover that there were any major browser that did not have memory leaks, in fact. Developers in complex systems, be they browsers or video games or operating systems, fight constantly against bad memory behaviours that can cause leaks, excess usage, or in the worst cases even security problems.

(As an aside, it’s still quite, quite common to read articles which reference this long-in-the-tooth post from Ben as the “Mozilla development team” denying that there are leaks in Firefox. You would have a hard time getting any developer to say that there are no leaks in Firefox, and indeed the post in question says in its second sentence that Firefox has leaks. You do not need a secret nerd decoder ring here to interpret the text, just basic literacy. Also, it’s no secret that Ben hasn’t been active in Firefox development for quite some time, so for people to point at an article that’s thinking hard about what it would like for its second birthday, rather than actually contacting any of the rather visible and accommodating developers of today — well, it just feels kinda sloppy to me.)

So, Firefox has leaks, and Firefox uses a lot of memory in some cases. A student of logical fallacy will no doubt have no difficulty setting development priorities: to reduce the amount of memory used by Firefox, fix all the leaks. In this case, though, a student of Mencken can happily triumph over the student of fallacy, for even with multifarious leak fixes we would still see cases where Firefox’s “used memory” was quite a bit higher than leaks could account for.

Let me now take you on a journey of discovery. Measuring leaks — contra identifying their root causes or fixing them — is actually quite simple: you count the total amount of memory that you ask the operating system for (usually via an API called malloc), you subtract the amount of memory that you tell the operating system you’re done with (usually via free), and if the number isn’t zero when your program exits, you have a leak. We have a ton of tools for reporting on such leaks, and we monitor them very closely. So when we see that memory usage can go up by 100MB, but there are only a few kilobytes leaked, we get to scratching our heads.
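For the code-minded, here is a minimal sketch of that bookkeeping. It is a toy ledger in Python, not one of our actual leak tools: the `AllocationLedger` class and its method names are made up for illustration, and a real tool would hook the allocator itself rather than ask callers to report in.

```python
class AllocationLedger:
    """Toy running total: bytes requested minus bytes released."""

    def __init__(self):
        self.live = {}        # handle -> size of each outstanding allocation
        self.next_handle = 1

    def malloc(self, size):
        # Record a request for `size` bytes and hand back a handle for it.
        handle = self.next_handle
        self.next_handle += 1
        self.live[handle] = size
        return handle

    def free(self, handle):
        # Record that the allocation behind `handle` has been given back.
        del self.live[handle]

    def outstanding_bytes(self):
        # Anything still listed here at exit was never freed: a leak.
        return sum(self.live.values())


ledger = AllocationLedger()
a = ledger.malloc(4096)
b = ledger.malloc(1024)
c = ledger.malloc(256)
ledger.free(a)
ledger.free(c)

leaked_at_exit = ledger.outstanding_bytes()
print(leaked_at_exit)  # 1024: the allocation behind `b` was never freed
```

Note what the ledger cannot tell you: it knows how many bytes are outstanding, but nothing about where they sit in memory.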

Schrep, our intrepid VP of Engineering and sommelier, was doing just this sort of head-scratching recently, after he measured some surprising memory behaviour:

  • Start browser.
  • Measure memory usage (“Point 1”).
  • Load a URL that in turn opens many windows. Wait for them to finish loading.
  • Measure memory usage (“Point 2”).
  • Close them all down, and go back to the blank start page.
  • Measure memory usage again (“Point 3”).
  • Force the caches to clear, to eliminate them from the experiment.
  • Measure memory usage again (“Point 4”).

You might expect that the measurements at points 1 and 4 would be the same, or at least quite close (accounting for buffers that are lazily allocated on first use, for example). You might, then, share the surprise in what Schrep found:

Point 1 | Point 2 | Point 3 | Point 4
35MB | 118MB | 94MB | 88MB

(You can and should, if you care about such things, read the whole thread for more details about how things were measured, and Schrep’s configuration. It also shows the measured sizes for a number of browsers after this test as well as at startup with some representative applications loaded. You may find the results surprising! Go ahead, I’ll wait here!)

So what does cause memory usage to rise that way, if we’re not leaking supertankers worth of memory? Some more investigation ruled out significant contribution from the various caches that Firefox maintains for performance, and discovered that heap fragmentation is likely to be a very significant contributor to the “long-term growth” effects that people observe and complain about. Heap fragmentation is a desperately nerdy thing, and you can read Stuart’s detailed post if you want to see pretty pictures, but if you’ve ever opened a carefully packed piece of equipment and then tried to put it all back in the box, you’ve experienced something somewhat similar; if you take things out and put them back in different orders, it’s hard to get everything to fit together as nicely, and some space gets wasted.
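If the box analogy doesn’t do it for you, here is a toy model of the effect in Python. It is a sketch under loud assumptions, not a description of how Firefox’s allocator actually works: the 4096-byte page, the 64-byte block, and the rule that a page can only go back to the operating system once every block on it is free are all simplifications picked for illustration.

```python
PAGE_SIZE = 4096                             # assumed page size, in bytes
BLOCK_SIZE = 64                              # assumed size of every allocation
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE    # 64 blocks fit on each page


def pages_in_use(live_blocks):
    """Count pages that still hold at least one live block.

    In this toy model only a completely empty page can be handed back.
    """
    return len({index // BLOCKS_PER_PAGE for index in live_blocks})


# Allocate 6400 blocks back to back: they fill exactly 100 pages.
live = set(range(6400))
peak_pages = pages_in_use(live)

# Free everything except one block on each page (the "long-lived" objects).
live = {index for index in live if index % BLOCKS_PER_PAGE == 0}

live_bytes = len(live) * BLOCK_SIZE              # what malloc/free counting sees
held_bytes = pages_in_use(live) * PAGE_SIZE      # what the process still holds

print(peak_pages, live_bytes, held_bytes)  # 100 6400 409600
```

Almost everything has been freed, and no leak is involved, yet the process in this model is still sitting on every page it ever touched, because each page has one straggler pinning it in place.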

The original design for Gecko placed an extremely high premium on memory efficiency. The layout code is littered with places where people did extra work in order to save a few kilobytes here or there, or to shave a few bytes off a structure. If you compute the classic malloc/free running total I mentioned above, I think you’ll find that Gecko typically uses a lot less memory than competitors. But, as I hope I’ve made at least somewhat clear here, there’s more to managing the memory impact of an application than simply balancing the checkbook and keeping your structures lean. When and how you allocate memory can be as-or-more important in determining the application’s “total memory footprint” than the things that are simple to theorize about. And making sure that you’re measuring the same things that users are seeing is key to focusing work on things that will be the maximum benefit to them, in the shortest time. We’re working now on ways to reduce the effects of heap fragmentation, just as we’ve invested in fixing leaks and improving our tools for understanding memory consumption and effects, and the outlook is quite promising.

The real punch line of this for Firefox users is that Firefox 3 will continue to improve memory behaviour over long-term usage, and you’ll soon be able to try it out for yourself with the upcoming Firefox 3 beta. Beta 1 won’t have the benefits of the work on fragmentation reduction, but many testers are already reporting dramatically improved memory consumption as well as significant performance gains. We’re never satisfied with the performance of Firefox, just as we always seek to make it more secure, more pleasant to use, and nicer to smell.

relevance, your honour?

The search engine business is a tough one. People are generally pretty bad at knowing how to phrase queries to give them what they want, to say nothing of dealing with spelling mistakes and synonyms and stemming, and you have to do all that work basically instantaneously. The relevance of search results might be the only thing more important than performance in determining if users will stick with your particular product, or make the trivial switch to another one.

So I was pretty surprised to discover how, er, idiosyncratic the search results were on Live Search for what I — perhaps naively — think of as a pretty straightforward query.

When searching for “Firefox”, the user might want to find the home page for the product, or a description of the history of the project, or maybe even a review of the software. Both Yahoo and Google give you some mix of that, with what seem to me to be pretty reasonable orderings of results.

The Live Search results are a little more difficult for me to understand, since they have the Silverlight developer FAQ as the first result, then an article about cross-site scripting, then an article about ASP.NET, and then the Wikipedia page about Firefox. You have to go to the 8th entry to get the product’s home page, well below the fold on my machine at least. I’ve saved off the results, in case you disbelieve me, or for some reason can’t reproduce them yourself.

Maybe Live Search users really are a different breed, if that’s what they would be most likely to want when searching for Firefox; a ballsy market-differentiation move by Microsoft, if so.

(Canadians don’t call their judges “Your Honour”, and Americans don’t spell honour that way, so the title of this post is a somewhat impossible reference, but I figure you’ll let that slide.)

justin timberlake is a web data ninja

Someone made the mistake of asking, and I couldn’t find the old email I wrote on the topic, so I’m going to inflict upon you how I think that the whole dealing-with-web-data should probably break down in terms of information flow. My thinking here is heavily influenced by a very popular treatise on problem deconstruction, of course.

Step the first: you cut a hole in the page.

To work with page data, we need to find the data. We can do that using heuristics (like various webmail systems do to identify dates for calendar integration, or the auto-linkifying of URLs that is so common), using explicit metadata like microformats, or even letting the user select something on which to focus our data-detection powers.

Step the second: you put the data in the box

Once we’ve found and analyzed the data of the moment, we probably want to bin it (in the statistics sense, not the adorable British accent sense) into a broad classification like “date”, “place”, “person”, “photo”, “event”, etc.

Step the third: you give him or her the box

Once we’ve determined the type and value of the datum in question (I like to use words like “datum” to cover up my insecurity about a lack of academic credentials), we can then present it to the user so that they can send it to a web service, poke a helper app, turn it into HTML on the clipboard for them to paste in their blog, or annotate the page in their Places store with the juicy tidbits. The work on improved content handling in Firefox 3 will give us some useful primitives here, I have reason to hope.
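The three steps can be sketched in a few lines of Python. This is only an illustration of the information flow, not anything that exists in Firefox: the `DETECTORS` table, the two regular expressions, the `find_data` and `present` helpers, and the action labels are all hypothetical, and real heuristics (or microformats parsing) would be considerably smarter.

```python
import re

# Hypothetical detectors: broad type name -> pattern that finds it in page text.
DETECTORS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "url": re.compile(r"https?://[^\s<>\"]+"),
}

# Hypothetical things we might offer the user for each type of datum.
ACTIONS = {
    "date": "Add to calendar",
    "url": "Open link",
}


def find_data(text):
    """Steps one and two: find candidate data and bin each into a broad type."""
    found = []
    for kind, pattern in DETECTORS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), kind, match.group()))
    found.sort()  # report data in the order it appears on the page
    return [(kind, value) for _, kind, value in found]


def present(items):
    """Step three: hand the typed data to the user as a list of choices."""
    return [f"{ACTIONS[kind]}: {value}" for kind, value in items]


page_text = "Party on 2007-03-14, details at http://example.com/party today."
items = find_data(page_text)
choices = present(items)
for choice in choices:
    print(choice)
```

Keeping the type vocabulary small and separate from the detection step is the point: a heuristic, a microformat parser, and a user selection can all feed the same box.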

Epilogue

Every now and then, someone asks me “what are microformats? why do people use them? do they smell nice?” My first instinct is to say “well, google can tell you that” but it turns out that it’s really pretty likely that you’ll end up on microformats.org, where they will tell you this:

Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.

There is then a link to “learn more” about microformats, by which I think a reasonable person might assume that they mean “learn anything”, because that description is sort of equivalent to describing Firefox as “a piece of software that is built from C++ and JavaScript”.

But then, I don’t think that talking about microformats specifically is really the right way forward, and I think that the microformats dudes would agree that microformats are a means to an end.

AMO and the quality bar

addons.mozilla.org has long occupied a special place in the Firefox software ecosystem. It’s the only site in the installation whitelist by default, the default server contacted for update information about add-ons, and where we send users who are looking for hot add-on leads.

That unique position means that there is a lot of value for some add-on developers in being hosted on AMO. Such hosting involves a review process, which I think both reviewers and developers alike would agree is one of the most frustrating parts of the whole system. The intent of the review process is entirely on the side of the angels: help make sure that add-ons are good for users.

The devil, of course, is in the details here. At times, the review bar has been placed entirely too high, in my opinion: otherwise-fine add-on updates rejected because they cause a strict warning to appear in the JS console, for example. In other cases, we’ve had add-ons approved which send some data to a central server, but don’t have a privacy policy listed. The most common and burdensome cases of this latter example tend to be associated with “toolbar-building” services: the ostensible authors of the resultant toolbars typically know very little about what’s being collected or how it’s being managed, which makes for a predictably unsatisfying conversation with reviewers.

(There are other elements of the review process that are inconsistent and difficult, mostly related to needing to reject items for errors in things that the add-on authors can change after the fact without review, but which can’t be helpfully fixed by the reviewers. These are the “easy” implementation artifacts, though, and not really the topic of this post.)

The trade-offs here are painful: adding a standard of “usefulness” or “implementation quality” to the checklist will not only dramatically slow the review process and require more specialized skills among our reviewers, but will also increase the variability between different reviewers’ decisions. Those are all things that I don’t think we can afford to make worse, and both the history and special position of AMO make me tend towards a much more laissez-faire position: if the description accurately describes what the user will get when they install it, especially as far as the collection and management of private information is concerned, then I think we should let the user make the decision about whether they consider the functionality useful. Some popular add-ons duplicate functionality that is already present in the browser, such as preference settings, adding only an alternate means of accessing it, for example, so requiring “significant new functionality” seems to work against the interests of a fair number of users.

At the same time, of course, I think it’s quite desirable to be able to point users at a more “filtered” view of the enormous add-ons space hosted on AMO. We currently have one such view, the recommended list, but that’s not really much of a solution to the broader problem. (It doesn’t try to be, really.)

A minimum rating threshold would be one way to narrow the default search results returned to a user, though it depends on the reliability and resilience of a rating system. Our current one isn’t sufficient to prevent the sort of gaming and distortion that would plague us in such a world, but that’s not to say that a sufficiently robust one couldn’t be developed. (Not “perfectly robust”, mind; just enough to keep the damage well below the gain.)

A simpler system would simply provide a single piece of metadata that could be set by reviewers or administrators using their judgment and likely via some multi-reviewer discussion. This wouldn’t scale as well as the universal rating by users, but would be more resistant to gaming and abuse (and easier to track and remedy if such nefariousness is detected).

This post is already too long, but you can read and write more about various possibilities for rating and approval schemes in the Remora Idea Dump. We’re thinking about and working on ways to help users find good add-ons, in a way that scales across our community, and I suspect it’s something that we’ll be working to improve for some time!

fertilizer

Mitchell posted earlier about my new focus: our developer ecosystem, and helping people produce great new tools and experiences on top of Firefox and the web both. It’s work that lets me combine technology, communication, and helping people solve their problems, and if I end up being even a fifth as good at it as I am excited about it — well, I’ll be really good, that’s what!

One important part of Mozilla’s support for developers in their work with Firefox and the web is the Mozilla Developer Center, and I’ll be working with Deb and Eric to help MDC grow and thrive. In just over a year, MDC has developed a strong community of contributors and a great base of documentation, so I consider my job here to be helping Deb execute, and staying out of her way. (She is modest about it, and truly MDC is a fantastic example of the leverage that our community represents — and I include web developers in that community, very much — but Deb’s work to catalyze and guide and generally be MDC’s “guiding star” is not to be underestimated.) There are things to be fixed and problems to be solved, to be sure, and anyone who’s worked with me before knows that I can’t help but try to help when that’s the case, but the course we’re already on is very promising.

(As an aside of sorts, the recent newsgroup re-re-organization is a problem to which I owe a karmic debt, and I’ll post about that here and there this week, hopefully today.)

A bigger part of what I’m going to be working on, though, is what my favourite MBA calls “the extensions space” (my favourite trapeze artist would call it “the extensions piece”, I think). Working tirelessly, though again with an energetic and powerful community, Mike Morgan has been driving addons.mozilla.org through growing pains and scaling demands — popular stuff is hard! — and policy grey areas and likely some fire-breathing sharks or something too. He thinks deeply about the risks and hard decisions that we face as we try to make extensions — or, more broadly, a personalized web experience — attractive and appropriate for a broader portion of our users, and the users we don’t yet have. Working out a strategy for how to fit extensions into our product plans, how to help extension developers be even more productive and successful and happy, and how to maximally leverage the power of our platform, community, and brand to the benefit of the Web at large is an enormous and, I admit, somewhat daunting challenge. I look forward to drawing on my Mozilla knowledge, impeccable taste, and, especially, the experience and wisdom of people like morgamic to improve this part of our world materially. And I look forward to doing it very soon: while there are definitely long-term projects that deserve our attention, I’m starting to believe that there are some small (hopefully!) but significant changes that can make a positive change in the rather near future.

I’m trying to avoid letting “write a thorough and Frank-worthy post” be the enemy of “write a useful and, you know, posted post”, or something like that, so I think I’ll stop here. I want to thank everyone who has already sent me their (varied, and thought-provoking) thoughts on what’s good and bad today in our world of extensions, and apologize pre-emptively for what will no doubt be rather tardy replies. I have a lot to absorb here, and nobody is bothering to ask easy questions.

echo reply

By now, everyone and their brother has reblogged Darin’s post about experimental support for <a ping>. And, as I think most people predicted, there was an outcry about privacy concerns and about support for non-standard HTML extensions. Others have written lots about what the actual effect on the privacy landscape is (IMO, a slight improvement), so I won’t rehash that, and my feelings on the “divine right” of any one standards-for-a-living body to define the future of the web are pretty well-known among those who care, so you also won’t have to endure that.

What I’m concerned about is that developers involved in this process were, in the words of at least one of them, “surprised” that there was controversy over implementation of this feature. I agree that, at least so far, the controversy seems to be based mostly on an incomplete understanding of how things are actually tracked on the web today. But there’s a difference between not thinking that the objections are valid and being surprised that people have a reaction to the proposal. The latter worries me a bit, because the emotional and social context in which we operate is pretty important to our success. We ignore that at our own peril, I think, though there would certainly also be peril in swaying with every wind. I guess this is why philosopher kings make the big bucks.

Also, somewhere between the initial bug filing, the trunk landing, the request that it go into the Firefox 2 branch, and Darin’s blog post, the original intent of this work seems to have become obscured, at least in our messaging: this is an experimental implementation to be used to gather feedback from implementors, web authors, users, and the rest of our huge world.

(Aside to the Slashdot submitter: when you link to a blog post that explicitly describes the feature and mentions that people might be nervous due to privacy fears, you might not want to say that it was “quietly” done. This was one of the louder landings for a change of its scale, IMO — which is as it should have been, also IMO.)

high fidelity

(I can only barely forgive myself for that title. I hope you can manage as well.)

After my previous post about Fidelity and Firefox, Rafael pointed me at another article about Fidelity’s adoption of Firefox. A gem from that one, emphasis mine:

Recently the center began testing the open-source Firefox browser, an alternative to Microsoft’s dominant Internet Explorer. Charlie Brenner, a Fidelity senior vice president in charge of the center, says the idea came from engineers in his department who were using it at home and liked Firefox’s advanced features, such as the ability to open new browser windows in tabs rather than in a whole separate browser, and its promise of being more secure from hacker attacks than Explorer.

Someone else agrees with, or is perhaps experiencing, my current theory on enterprises and our software: we’re better off trying to get to enterprises via users, and not the other way around. Dunno if the same logic holds for other disruptive software, especially our open source cousins, but I think that the following three-step plan is probably as useful as many wordier ones that are getting funding and publicity today:

  1. Make it easy for users to try and love your software where they can most comfortably do so (e.g., at home).
  2. Make them wish they could have it elsewhere (e.g., at work).
  3. Help them sell it to the people who can make that wish come true.

I could easily write paragraphs upon paragraphs about each of those bullet points, talking about things like minimizing change cost and playing to the unique scaling strengths of open source communities, but you can all probably imagine what it’d look like. And I don’t have to type or edit your imaginings, so we all win.

Of course, I am not a millionaire entrepreneur success story, teenage software genius, proven technology futurist, or even venture-funded experimenter, so it’s quite likely that you can get better advice elsewhere.
