leaking, growing, and measuring

(This post started small, but got bigger as I noticed more things that aren’t necessarily as obvious to my readers as they are to me, with respect to our process and software. So it grew over time, oh ha ha! It’s almost 1AM, so I will not be editing it further this evening! I might post a summarized version at some point in the future, or I might not.

And then I edited it because Dave pointed out that it sounded like I was saying that other browsers necessarily suffered similar fragmentation woes, which wasn’t my intent. Indeed, the main point of the post is that there can be many possible causes for a given symptom, and that the popular theories (e.g. “massive memory leaks”) may not prove correct.)

I’m going to share some non-news with you: Firefox has memory leaks. I would be shocked to discover that there were any major browser that did not have memory leaks, in fact. Developers in complex systems, be they browsers or video games or operating systems, fight constantly against bad memory behaviours that can cause leaks, excess usage, or in the worst cases even security problems.

(As an aside, it’s still quite, quite common to read articles which reference this long-in-the-tooth post from Ben as the “Mozilla development team” denying that there are leaks in Firefox. You would have a hard time getting any developer to say that there are no leaks in Firefox, and indeed the post in question says in its second sentence that Firefox has leaks. You do not need a secret nerd decoder ring here to interpret the text, just basic literacy. Also, it’s no secret that Ben hasn’t been active in Firefox development for quite some time, so for people to point at an article that’s thinking hard about what it would like for its second birthday, rather than actually contacting any of the rather visible and accommodating developers of today — well, it just feels kinda sloppy to me.)

So, Firefox has leaks, and Firefox uses a lot of memory in some cases. A student of logical fallacy will no doubt have no difficulty setting development priorities: to reduce the amount of memory used by Firefox, fix all the leaks. In this case, though, a student of Mencken can happily triumph over the student of fallacy, for even with multifarious leak fixes we would still see cases where Firefox’s “used memory” was quite a bit higher than leaks could account for.

Let me now take you on a journey of discovery. Measuring leaks — contra identifying their root causes or fixing them — is actually quite simple: you count the total amount of memory that you ask the operating system for (usually via an API called malloc), you subtract the amount of memory that you tell the operating system you’re done with (usually via free), and if the number isn’t zero when your program exits, you have a leak. We have a ton of tools for reporting on such leaks, and we monitor them very closely. So when we see that memory usage can go up by 100MB, but there are only a few kilobytes leaked, we get to scratching our heads.
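To make that balance-sheet accounting concrete, here’s a toy sketch in Python. (The real tools hook the C allocator itself; `CountingHeap` and its method names are invented purely for illustration.)

```python
# Toy model of the malloc/free "balance sheet": count what you hand out,
# subtract what comes back, and anything still outstanding at exit is a leak.
class CountingHeap:
    def __init__(self):
        self.live = {}       # allocation id -> size in bytes
        self.next_id = 0

    def malloc(self, size):
        self.next_id += 1
        self.live[self.next_id] = size
        return self.next_id

    def free(self, alloc_id):
        del self.live[alloc_id]

    def leaked_bytes(self):
        # Whatever is still "live" at program exit counts as leaked.
        return sum(self.live.values())

heap = CountingHeap()
a = heap.malloc(100)
b = heap.malloc(24)
heap.free(a)                 # b is never freed...
print(heap.leaked_bytes())   # ...so 24 bytes show up as leaked
```

Note that this bookkeeping says nothing about *where* the leak came from or how to fix it; it only tells you the size of the discrepancy, which is why a few leaked kilobytes can’t explain 100MB of growth.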

Schrep, our intrepid VP of Engineering and sommelier, was doing just this sort of head-scratching recently, after he measured some surprising memory behaviour:

  • Start browser.
  • Measure memory usage (“Point 1”).
  • Load a URL that in turn opens many windows. Wait for them to finish loading.
  • Measure memory usage (“Point 2”).
  • Close them all down, and go back to the blank start page.
  • Measure memory usage again (“Point 3”).
  • Force the caches to clear, to eliminate them from the experiment.
  • Measure memory usage again (“Point 4”).

You might expect that the measurements at points 1 and 4 would be the same, or at least quite close (accounting for buffers that are lazily allocated on first use, for example). You might, then, share the surprise in what Schrep found:

Point 1    Point 2    Point 3    Point 4
35MB       118MB      94MB       88MB

(You can and should, if you care about such things, read the whole thread for more details about how things were measured, and Schrep’s configuration. It also shows the measured sizes for a number of browsers after this test as well as at startup with some representative applications loaded. You may find the results surprising! Go ahead, I’ll wait here!)

So what does cause memory usage to rise that way, if we’re not leaking supertankers’ worth of memory? Some more investigation ruled out any significant contribution from the various caches that Firefox maintains for performance, and discovered that heap fragmentation is likely to be a very significant contributor to the “long-term growth” effects that people observe and complain about. Heap fragmentation is a desperately nerdy thing, and you can read Stuart’s detailed post if you want to see pretty pictures, but if you’ve ever opened a carefully packed piece of equipment and then tried to put it all back in the box, you’ve experienced something somewhat similar: if you take things out and put them back in different orders, it’s hard to get everything to fit together as nicely, and some space gets wasted.
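The box-packing effect is easy to simulate. Here’s a toy free-list in Python (entirely hypothetical, one “unit” per cell) in which half the heap ends up free, yet no single hole can satisfy a two-unit request:

```python
# Toy heap of 8 equal cells: allocate all of them, then free every other
# one. Half the heap is now free, but only in 1-cell holes.
heap = [f"obj{i}" for i in range(8)]   # fully allocated
for i in range(0, 8, 2):
    heap[i] = None                     # free alternating cells

free_total = heap.count(None)          # total free cells: 4

# Find the largest contiguous run of free cells.
largest = run = 0
for cell in heap:
    run = run + 1 if cell is None else 0
    largest = max(largest, run)

print(free_total, largest)  # 4 cells free, but the biggest hole is 1
```

A two-unit allocation would fail here despite four free units, so the heap has to grow; that reserved-but-unusable space is exactly the kind of thing that shows up as “growth” in measurements like the ones above.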

The original design for Gecko placed an extremely high premium on memory efficiency. The layout code is littered with places where people did extra work in order to save a few kilobytes here or there, or to shave a few bytes off a structure. If you compute the classic malloc/free running total I mentioned above, I think you’ll find that Gecko typically uses a lot less memory than its competitors. But, as I hope I’ve made at least somewhat clear here, there’s more to managing the memory impact of an application than simply balancing the checkbook and keeping your structures lean. When and how you allocate memory can matter as much as, or more than, the things that are simple to theorize about in determining the application’s “total memory footprint”. And making sure that you’re measuring the same things that users are seeing is key to focusing work on the things that will deliver the maximum benefit to them, in the shortest time. We’re working now on ways to reduce the effects of heap fragmentation, just as we’ve invested in fixing leaks and improving our tools for understanding memory consumption and effects, and the outlook is quite promising.

The real punch line of this for Firefox users is that Firefox 3 will continue to improve memory behaviour over long-term usage, and you’ll soon be able to try it out for yourself with the upcoming Firefox 3 beta. Beta 1 won’t have the benefits of the work on fragmentation reduction, but many testers are already reporting dramatically improved memory consumption as well as significant performance gains. We’re never satisfied with the performance of Firefox, just as we always seek to make it more secure, more pleasant to use, and nicer to smell.

12 comments to “leaking, growing, and measuring”

  1. entered 13 November 2007 @ 2:48 am

    I don’t think such tests applied to other browsers are particularly scientific without knowing what those browsers are doing on purpose. For example, WebKit is deliberately aggressive with its memory consumption on machines with sufficient RAM to handle it. Our caches scale with available physical RAM.

WebKit also uses TCMalloc, which doesn’t necessarily give memory back to the system immediately when we free it. Therefore a certain amount of growth (even possibly continued growth) is expected for WebKit. This isn’t to say that memory leaks and fragmentation problems don’t exist, but one should be wary of drawing conclusions about what might be going on in each browser engine.

  2. pd
    entered 13 November 2007 @ 4:57 am

    What exactly did you hope to achieve from this post? All it seems is another denial from a core developer and/or another round of excuses.

    Is it possible that when you use Firefox you do not experience any of the crippling memory leaks average users have experienced?

    Not every user has the wealth of knowledge (and apparent arrogance at least in tone) you do, but it doesn’t take a genius to realise that people are experiencing unreasonable performance from Firefox with no evidence of long term strategy and/or tactics for resolving these problems.

So in your one test, in your specific circumstances, Firefox is slightly less bloated in its memory usage than a couple of other browsers (I notice you didn’t compare to Opera). Is that good enough? Is that enough to give users a reason to recommend the browser to the existing IE masses and therefore improve the stagnant market share and help influence MS to stop screwing people like they are doing with ECMAScript 4?

I haven’t recommended Firefox to anybody since around the time of your referenced Goodger post without crossed fingers behind my back. There have been several rounds of the blame game (extensions, for one), defensive posts like yours and umpteen posts from Mitchell about everything but this very core issue.

    At least Stuart’s posts actually present a meaningful examination of the issues. That’s what I was expecting when I read this post but instead there’s more echoes from the two year old Goodger post.

    Perhaps instead of nursing her half-million wad of cash, Mitchell should show some leadership for once and actually launch a campaign – backed by cash – to reduce the memory footprint of Firefox by a meaningful level.

    Instead it seems that an arbitrary direction change towards mobile devices (what happened to Minimo?) is forcing this latest round of discussion on memory usage.

  3. entered 13 November 2007 @ 7:35 am
  4. entered 13 November 2007 @ 8:40 am

    Dave: you’re totally right, and I didn’t mean to imply that Safari or IE saw growth for the same reasons that we’re finding in Firefox — and I don’t really have cycles to do the analysis on Safari, perhaps obviously! I am indeed all about the fact that reasoning from memory symptom to memory cause is fraught with peril, and I’ve updated the post to make that reading much less likely. The multi-browser chart was actually a holdover from a previous draft that was themed very differently, and I should probably have just started from a blank slate rather than juggling it at midnight. I hope the edits make it clearer, thanks for calling that out.

    pd: you appear to have some anger issues, but I’m not sure how you think it’s defensive: I tried to be quite clear about what we’re doing to improve Firefox’s memory footprint, because in addition to (the proportionately minor) leaks we’ve fixed, we’re also seeing that we cause some pretty ugly behaviour with fragmentation currently. This isn’t blaming Microsoft or Apple for the allocator or anything; we’re responsible for our application’s memory behaviour, and we’re working to fix it (by changing how we allocate, or using a different allocator that has better tradeoffs for us, for example). Many people have expressed concern about Firefox’s memory use over long browsing sessions, and I thought that they would be interested to see inside some of that work. I’m sorry if it wasn’t outright self-immolation, I guess.

  5. Doug
    entered 13 November 2007 @ 8:43 am

    pd – I suggest you learn some basic reading comprehension…

  6. entered 13 November 2007 @ 9:28 am

    I think there are a few misconceptions out there:

1. People think extensions are leak-proof, when it’s been shown that extensions out there do leak memory. When you add stuff to the browser, you add the potential for problems. IMHO there’s no question Firebug slows things down a little at times (one would expect it to), but IMHO that’s a worthwhile payoff for the functionality it provides. People need to understand extensions a little better.

    2. “Open Source” is associated heavily with Linux, which will run on your wrist watch. People assume it’s all the same in design. Firefox isn’t designed for your wrist watch, it’s currently designed for your computer, and soon your mobile device.

  7. entered 13 November 2007 @ 12:29 pm

    Aww, I thought this was gonna be a baby post! :)

  8. entered 13 November 2007 @ 3:56 pm

    [...] Want to know what makes me tick? Blog posts like this and this. I love this kind of performance monitoring and optimisation work. Sometimes I wish this was what my job was all about. [...]

  9. entered 13 November 2007 @ 6:47 pm

pd: Reread the article. Notice how he states that, contrary to reports that FF Devs are denying mem leaks, they acknowledge them? Notice where he points to an ancient post from Ben about it? That’s not denial.

    Further, the ENTIRE POST is about WHY it turns out Firefox gobbles RAM. In no way is it a denial. It’s an admission of “guilt” then a statement about how a huge goal is to fix it.

    So, again, where do you see a denial?

  10. pd
    entered 15 November 2007 @ 10:17 am

    Just basic literacy? How about reading the THIRD sentence of Ben’s post – the one that says “What I think many people are talking about however with Firefox 1.5 is not really a memory leak at all. It is in fact a feature.”

    Yes, you read that correctly: “It’s not a bug, it’s a feature”.

    Perhaps I’m not the one who needs the “basic reading comprehension” skills?

Or perhaps my “basic comprehension skills” are not so bad considering the post to which I commented has now been edited to clarify its direction?

    Regarding Ben Goodger’s involvement. The last I heard he was physically working at Google but still doing Firefox work. I haven’t seen an announcement to the contrary and I read Planet Mozilla every day. Perhaps I’m meant to interpret the absence of his posts on the Planet Mozilla feed as an ‘announcement’ but unlike the helpful “Planet Mozilla Addition” posts, there don’t seem to be any “Planet Mozilla Removals” posts.

I think this highlights a few issues. One is alluded to in Mike’s post with his use of quotes around “the Mozilla development team”. In short, I think there is probably a lot of angst in the community (that’s the people who have the time to be involved, not just Joe Bloggs who wants a browser that works) because there is no real collective policy or voice heard from this arbitrary group of hard-working people – ‘the Mozilla development team’. Certainly Mitchell Baker’s posts never tend to refer to what ‘the Mozilla development team’ is thinking or doing. AFAIK there is no single source of information that communicates the direction and impressions of the low-level core developers. Instead anyone interested in questioning ‘the Mozilla development team’ is forced to run the flame gauntlet by commenting on blog posts like this.

    Now the last thing I would encourage is a situation where developers are bogged down in committees or discouraged from blogging at all. That’s not my point. There must be a happy medium. Perhaps a section on the devmo wiki regularly updated? Topics could include:

    • What state we feel Mozilla’s memory footprint is currently at, and what we are doing in this area

    • What direction the core developers are heading in with regard to features vs stability/performance, etc. Key metrics like the current number of talkback crash bugs and where that sits on a long term graph would be very interesting.

    It’s great that anyone can reach some of the key developers via blog comments and it’s also a flawed and disjointed means of communication.

I apologise if my comment was overtly flame-like. I was angry when I wrote it, which is not a great state to be in when posting comments. On the other hand, anger is an emotion towards Firefox that clearly needs communicating; an emotion that the average punter out there who does not read Planet Mozilla would never likely communicate.

    I think there must be a better way.

    P.S. It’s a bit tough to write a balanced comment when there’s no preview/edit facility available with most blogware :)

  11. entered 15 November 2007 @ 5:14 pm

    pd: I think it’d be great to have such a page or regular report circulated, and if you would like to help gather such disparate threads from things like the leak and performance SWAT meetings — public and open to all, including on-topic agenda items via the wiki — I’ll definitely help make sure that it finds a good home and gets visibility. I’d have made this offer via email, but, yeah. Can you point at examples of other projects that produce the sort of coherent descriptions of targets and so forth? I would love to have some good models to learn from.

I think the “flame gauntlet” in your case comes most significantly from the rudeness and personal attacks, rather than from “questioning the development team” — there are routinely quite productive and polite conversations about performance (targets, techniques, results, etc.) in bugs and mailing lists and IRC. I admit that I find it difficult to convince myself that engaging in discussion with you will be productive, given the constant references to irrelevant details and the hostility. I’m persevering in this case in hope that the glimmers of constructive suggestion will manifest themselves in something that feels more helpful and less like self-indulgent venting.

    You should feel free to post a more balanced comment elsewhere on your own blog, or via email to the dev.performance group, etc. and just track back to here; I follow such things. Please don’t let the small text box compel you to flame, by any means.

    We generally don’t announce when people stop working on the project, though people are welcome to do it for themselves of course. I think it’s pretty widely known by people who follow the development and community that Ben hasn’t been active in Firefox in really any form in quite some time. Maybe you weren’t aware of that, but now you are, so the system sorta, kinda works. (Eventually.) If you don’t want to take my word for it, you could ask him yourself, naturally.

  12. Sean
    entered 17 November 2007 @ 2:43 am

Heh, I wrote about this some years ago. Any application that allocates and frees data as frequently, and in such large amounts, as a browser does is going to end up with serious memory allocation problems, unless something specific is done to counteract it.

    Take a look at the mobile world, especially the game console world. They rarely if ever use generic memory allocation routines like malloc() or free(). Every structure is packed, not necessarily for space, but to avoid having structures with pointers that go out to “random” bits of memory spread all over the place. Contiguous blocks of memory are used to store all the related data for any part of the application. Fragmentation is a lot rarer when the application need only release a single block of memory instead of dozens of blocks spread all over the heap.

This is also a reason that higher-level languages that don’t force programmers to work with raw memory addresses can, despite what a lot of the bitter old crustie programmers think, actually improve performance and memory consumption. A JVM with a good compacting GC avoids the memory fragmentation issue (too bad Java in general promotes the allocation of billions of little chunks of data almost as much as it promotes the use of billions of threads), which when combined with a good AOT/JIT Java bytecode->machine code compiler can easily give you the raw performance of C combined with memory and cache behavior that isn’t really feasible to get in languages requiring manual memory management (or even assisted manual memory management, a la C++).

    Mozilla certainly isn’t going to be rewritten in a language that has automatic memory management (even if it was, there’s no guarantee the common implementations of that language use good automatic memory management – many VMs still use conservative collectors like the Boehm GC), and completely restructuring every data structure to compact allocations would be almost as hard as switching language. Mozilla should instead identify the biggest offenders, and make use of alternative allocators that compact memory.

For example, each page might get a single large block for storing all information about links (I imagine there are a lot of trees and lists of small allocated objects handling that). Instead of just calling malloc()/new to allocate those objects, the objects instead get allocated with a custom allocator that finds a free entry in that block and returns that address. If the block is outgrown, a second block could be allocated. Sure, if blocks are designed to hold 50 link objects, smaller/simpler pages will have a lot of wasted space (where “a lot” is probably well under 1Kb), but when that page is released from memory, that block of links is freed without leaving a bunch of tiny little holes all over the place. Heck, the block itself will probably just be added to a free list to use for the next page load that requests a block for storing links.
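A minimal sketch of that per-page block scheme, in Python for brevity (all names here are hypothetical, and a real implementation would manage raw memory in C):

```python
class Arena:
    """A fixed-size block that hands out slots for one page's objects."""
    def __init__(self, slots=50):
        self.cells = [None] * slots
        self.free_slots = list(range(slots))

    def alloc(self, obj):
        if not self.free_slots:
            # A real allocator would chain a second block here.
            raise MemoryError("arena full")
        i = self.free_slots.pop()
        self.cells[i] = obj
        return i

    def release_all(self):
        # One "free" for the whole page's objects: no per-object holes.
        self.cells = [None] * len(self.cells)
        self.free_slots = list(range(len(self.cells)))

links = Arena(slots=50)
ids = [links.alloc(("link", n)) for n in range(3)]
links.release_all()              # page closed: whole block reclaimed
print(len(links.free_slots))     # all 50 slots free again
```

Closing the page releases every link object in one step, so the general-purpose heap never sees dozens of tiny frees scattered across it.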