fsyncers and curveballs

[Update: Hi, interwebs! Had to block my blog for a little bit to deploy the full power of wp-super-cache, everything should be fine now.]

A couple of articles have now been written about the unpleasant behaviour that people encounter with Firefox 3 in certain Linux configurations, related to the flushing of I/O and system lag that can see if there is a lot of other disk activity at the same time. There are a lot of moving parts to this issue, and so it’s not surprising that there’s a fair bit of misunderstanding, though some of it seems less well-meaning that others, which makes me a bit sad.

What’s Firefox doing?

Firefox uses a database called sqlite as the underpinning for many kinds of data storage in Firefox 3, including the browsing history and bookmarks data used to provide the Awesomebar’s awesomeness. sqlite is an excellent piece of software, written by people who take both data integrity and performance very seriously, which makes it a great place to put this sort of data. Lots of people use sqlite these days, and we’re proud to be founding members of the consortium that helps support sqlite development.

Databases, perhaps obviously, usually have complex file formats, and require that different parts of their files agree about things like how many records there are, whether a transaction completed successfully, or how indexes match up with the data to which they refer. This makes them more sensitive to data corruption than some simpler formats, like a basic text file. If you get a chunk of a text file corrupted, you can probably edit around it and salvage the rest of the file, but if you get a chunk of a database file corrupted you can often effectively lose all of the data that’s held there. This is one of the tradeoffs for being able to have efficient access to large sets of data, and it’s common to virtually all databases.

One of the things that sqlite does to ensure that the database is not corrupted in the case of a crash is call a function called fsync. fsync tells the operating system to ensure that this file has been safely written to disk, and waits until that’s complete. This provides what’s known as a “barrier”, and makes sure that we don’t get mismatched parts of transactions (groups of related database operations) when we look at the file after a crash. This is very effective: even if the operating system itself crashes or the computer loses power suddenly, we won’t see the database corrupted.

We don’t want to lose the user’s data, because that makes users sad, and we like to make users happy. So up through Firefox RC1, we set sqlite to its recommended setting of “FULL” synchronization. As release-end-game luck would have it, that made some users sad, because they would find Firefox and sometimes other parts of their system would pause unpleasantly, and it was tracked down to Firefox calling fsync. The bug in question is here, but I caution you to not read it piecemeal; there’s a lot of intertwined conversation there, and some comments are not as correct as they sound.

(There have been other bugs along the way that could cause this, ranging from performance problems with certain sqlite versions to bad interactions with the data set used for malware protection, but those were put behind us by RC1. This bug is the one that has people working weekends at this point.)

Why does that hurt so much?

On some rather common Linux configurations, especially using the ext3 filesystem in the “data=ordered” mode, calling fsync doesn’t just flush out the data for the file it’s called on, but rather on all the buffered data for that filesystem. Because writing to disk is so much slower than writing to memory, operating systems can buffer a lot of data, especially if you’re doing something that involves a lot of I/O, like unpacking a zip file or compiling software. It gets written out in the background, giving you vastly, vastly improved performance. It’s no exaggeration to say that without this sort of buffering your computer would be entirely unusable.

I think you can see where this is going: if there’s a lot of data waiting to be written to disk, and Firefox’s (sqlite’s) request to flush the data for one file actually sends all that data out, we could be waiting for a while. Worse, all the other applications that are writing data may end up waiting for it to complete as well. In artificial, but not entirely impossible, test conditions, those delays can be 30 seconds or more. That experience, to coin a phrase, kinda sucks. Does it suck as much as file corruption wiping out your bookmarks after your computer (not Firefox) crashes? As you might imagine, opinions vary.

This problem with ext3 is well-known to Linux kernel developers, and there’s great work underway as part of the “ext4″ project to remedy it. Other filesystems (like reiser4, I have heard) have similar problems, and I presume that their developers are also working on resolving them.

Why doesn’t other (non-sqlite) software do this?

Actually, a lot of software that’s concerned with data integrity does this, including the editors emacs and vim, and mail clients like mutt and Evolution, as well as bigger databases like MySQL and Postgres. In some cases, those programs are in fact adding more calls to fsync to protect user data better.

In fact, so many programs use fsync to ensure data integrity, and actually writing to disk is so expensive, that some operating systems make fsync not be a “real” fsync: the data is scheduled for (hopefully) immediate write-out, but the call doesn’t wait until it’s all the way to the disk, so it’s not really an effective barrier. This may be permitted by various standards like POSIX, but it’s certainly surprising for programs like Firefox that use it to protect against data corruption in the case of a crash!

Here’s what Apple has to say about fsync, sqlite, and data corruption:

fsync on Mac OS X: Since on Mac OS X the fsync command does not make the guarantee that bytes are written, SQLite sends a F_FULLFSYNC request to the kernel to ensures that the bytes are actually written through to the drive platter. This causes the kernel to flush all buffers to the drives and causes the drives to flush their track caches. Without this, there is a significantly large window of time within which data will reside in volatile memory—and in the event of system failure you risk data corruption.

Exciting!

If this is an operating system bug, why is Firefox being patched?

Because we want to make users happy. Whether Linux should have better fsync behaviour or not isn’t really going to matter to our users — we want to support Linux, which means Linux-as-she-is-shipped, not Linux-as-we-would-like-her-to-be. That means that we need to deal with X servers, with font-selection systems, and with filesystem behaviour as we find it, because that’s where our Linux users are. (It’s not like Windows and OS X don’t have their own annoying things to work around either, though they don’t seem to have this specific one!)

So is it always going to be like this?

No! (Yay!)

In the immediate term, there is a patch that controls how aggressively we sync, and defaults to a slightly less-aggressive state that is equivalently safe on modern operating systems. That patch will be in either Firefox 3.0.1 at the latest, and we’ve been in contact with Linux distributors to make sure they’re aware that the patch is fine to take in their builds — desirable, even. It might be in Firefox 3.0 proper, depending on what happens with an RC2, but either way the vast majority of affected users (Linux users, who usually get their Firefoxes from their distributors) will get the fix right away. This patch also lets users, who might decide that their systems are stable enough and their backups good enough that they don’t need the extra protection, turn off the data-integrity fsyncs almost entirely.

In the medium term, we’re going to be making our database use more asynchronous in Firefox, and batching transactions for things like history. You’ll likely see those effects in the major version of Firefox that follows 3 (might be called 3.1, might be called 3.5, might be called Firefox Magenta, who knows?). This will keep Firefox from pausing in these states, but may not entirely keep the fsync calls from affecting other applications on the system. The sqlite developers are also looking at adding support for a newer, Linux-only API that doesn’t have the system-wide effects as fsync, which could help as well.

Longer term, as I mentioned above, the Linux filesystem situation will improve in this respect, and the work is well-underway. It’s probably a year at least before most Linux users are running systems with fixed fsync behaviour, but at that point application developers won’t have to worry about their data-integrity needs causing pain for the whole system. I, for one, am looking forward to it.

I’d like to thank the sqlite developers for their help analyzing the effects of different fsync patterns, various Linux kernel developers (especially my former colleague Andreas Dilger), and the users who have helped test different settings and I/O loads. We’re going to ship a better Firefox 3 on Linux because of it, and we have even more ways to improve that experience on all operating systems in the future. I would not like to thank the people who have substituted vitriol and dogma for analysis and understanding of the release cycle, but we all get frustrated sometimes. I hope they enjoy Firefox 3 too.

high fidelity

(I can only barely forgive myself for that title. I hope you can manage as well.)

After my previous post about Fidelity and Firefox, Rafael pointed me at another article about Fidelity’s adoption of Firefox. A gem from that one, emphasis mine:

Recently the center began testing the open-source Firefox browser, an alternative to Microsoft’s dominant Internet Explorer. Charlie Brenner, a Fidelity senior vice president in charge of the center, says the idea came from engineers in his department who were using it at home and liked Firefox’s advanced features, such as the ability to open new browser windows in tabs rather than in a whole separate browser, and its promise of being more secure from hacker attacks than Explorer.

Someone else agrees with, or is perhaps experiencing, my current theory on enterprises and our software: we’re better off trying to get to enterprises via users, and not the other way around. Dunno if the same logic holds for other disruptive software, especially our open source cousins, but I think that the following three-step plan is probably as useful as many wordier ones that are getting funding and publicity today:

  1. Make it easy for users to try and love your software where they can most comfortably do so (e.g., at home).
  2. Make it them wish they could have it elsewhere (e.g., at work).
  3. Help them sell it to the people who can make that wish come true.

I could easily write paragraphs upon paragraphs about each of those bullet points, talking about things like minimizing change cost and playing to the unique scaling strengths of open source communities, but you can all probably imagine what it’d look like. And I don’t have to type or edit your imaginings, so we all win.

Of course, I am not a millionaire entrepreneur success story, teenage software genius, proven technology futurist, or even venture-funded experimenter, so it’s quite likely that you can get better advice elsewhere.

halos and security holism

A nice article about Fidelity and open source has two things that I find especially nice, in this one paragraph alone:

The Mozilla Firefox browser was an eye-opener, added Mike Askew, who also works in the technology center. A head-to-head comparison of Firefox and Internet Explorer showed that both had about the same level of security vulnerability, but ”the time needed to fix vulnerabilities in Firefox was much less,” Askew said. That experience led Fidelity to look at open source more intently.

First, I do quite like to hear that our success is making people look at other open source offerings more seriously. It’s not a primary goal for the project, but it’s one of the nice unintended consequences that we get as a bonus.

Second, I like to see people evaluating security characteristics of software in a more nuanced way than simple advisory or vulnerability count. Not all bugs are equal (as is perhaps obvious now, in the throes of the WMF vulnerability, though that’s not an IE bug), and even with severity weighting you are still faced with what are likely even more important questions. Chief among them might well be “how long am I likely to be exposed once a bug is found, or publicized?” If you believe that history is a useful, if imperfect, guide, then something like this vulnerability-window study might be of interest. If not, then you’ll have to do more research, which I very much hope you’ll publish.

capitalizing

He resists falling in to the trap of predicting Portland means 2006 will be “the year of Linux desktop,” but is confident it can capitalize on the buzz that Mozilla’s Firefox has created around open source software on the desktop. Firefox has gained 11.51 per cent of the browser market in the year since its release.

I will be very interested, as Mozilla’s representative to the Portland summit, to follow this effort. I don’t think that most of the people in that 11.51% (I love the precision there!) use Firefox because it’s open source, or perhaps even know that it is. Well, I’m being pretty generous here. I’d be surprised if more than 0.51% used Firefox because it was open source, and I’d be very pleasantly surprised to discover that more than a few percent knew that it was, and what that meant.

I do hope that a growing understanding of the value — to more than just the Mozilla project — of the Firefox brand will help alleviate some long-standing issues here, but even more I hope that the “rest” of the open source desktop can learn from what we’ve done well and poorly, and use that to inform their own path. That’s not a guarantee of success for anyone, to be sure, but it seems like something that would be of interest to those projects. (I have a bit of trivia about that very interest from the Summit, but that’s a whole other story.)

As an aside perhaps of interest to nobody, I think that the “open source desktop” is much much more interesting these days than the “Linux desktop”, with the possible exception of OLPC, and that it’s a lot easier to switch the OS after you switch the parts that touch the users. (The flowers, in many cases, remain standing.)

truth in advertising

As I mentioned before, I’m in Portland this week meeting with a bunch of Linux desktop people to talk about barriers to improvement and adoption of Linux desktop software.

One of the things that has come of out of this meeting already is a commitment from jdub and the other freedesktop.org guys to supporting more than just X or Linux platforms for their unifying standards and technology. Apparently their mission statement‘s relentless focus on X is a historical accident, and not to be taken seriously. Good news indeed, I think, because it means that projects like Mozilla and OpenOffice could get involved more, and maybe benefit from the work beyond the cairo stuff.