A while ago I made a prototype of a simple, high-performance keyword search for Bugzilla, using Python and Redis. It was pretty promising, able to do all/any queries against the summaries of all open Firefox bugs in a handful of milliseconds. The Bugzilla maintainers decided to use the database’s full-text engine instead, but I was nonetheless pretty pleased with it.
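For the curious, here is a minimal sketch of the all/any idea in Python. It uses plain in-memory sets as a stand-in for Redis sets (where SINTER and SUNION would do the same job), the names `KeywordIndex`, `search_all` and `search_any` are made up for illustration, and it is not the prototype's actual code.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class KeywordIndex:
    """Inverted index: each word maps to the set of bug ids whose summary contains it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, bug_id, summary):
        for word in tokenize(summary):
            self.postings[word].add(bug_id)

    def search_all(self, query):
        """Bugs whose summary contains every query word (set intersection)."""
        words = tokenize(query)
        if not words:
            return set()
        return set.intersection(*(self.postings.get(w, set()) for w in words))

    def search_any(self, query):
        """Bugs whose summary contains at least one query word (set union)."""
        words = tokenize(query)
        return set().union(*(self.postings.get(w, set()) for w in words))


if __name__ == "__main__":
    index = KeywordIndex()
    index.add(1, "Crash on startup")
    index.add(2, "Startup is slow")
    index.add(3, "Crash in printing")
    print(sorted(index.search_all("crash startup")))
    print(sorted(index.search_any("crash startup")))
```

Both query types reduce to one set operation per query, which is why this kind of lookup stays cheap even with a lot of summaries.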
A month or two ago, I started rewriting it atop node.js and gave it a simple websocket-based interface. (It took a while, due to The Unpleasantness rather than the difficulty of the task itself.) It was blazingly fast, easily able to keep up with real-time matching of entered keywords. The queries on the server side were finished in a dozen milliseconds or less, and the remainder of my 200 millisecond request budget was spent on data transfer back to the client. I had to limit the returned result set to the most recent 100 items to keep the transfer time down. I will probably look at compression in JS to improve things there. (Some requests take as much as half a second if I get unlucky on the network and result-set size.)
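Capping the result set is simple enough to sketch. This is a Python illustration rather than the server's JS, and it assumes a hypothetical `last_changed` mapping from bug id to a modification timestamp; the real server may well order results differently.

```python
import heapq


def most_recent(matches, last_changed, limit=100):
    """Return at most `limit` bug ids from `matches`, newest first.

    `last_changed` is an assumed mapping of bug id -> timestamp
    (any comparable value works).
    """
    return heapq.nlargest(limit, matches, key=lambda bug_id: last_changed[bug_id])


if __name__ == "__main__":
    timestamps = {101: 10, 102: 30, 103: 20}
    print(most_recent({101, 102, 103}, timestamps, limit=2))
```

Using `heapq.nlargest` avoids sorting the whole match set when only the top 100 are going over the wire.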
Building the index takes about 20 seconds on its wimpy VM, though retrieving all the summaries from Bugzilla takes a fair bit longer. (It currently indexes open bugs in Firefox, Fennec, Core, Toolkit, NSPR and NSS.) I could rebuild on a timer and have it be current within 10 minutes, but that would be pretty wasteful: the rate of relevant change is on the order of a few dozen each minute, and updating that many summaries is literally microseconds.
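To show why a single-summary update is so cheap compared with a rebuild, here is a rough Python sketch. It assumes the index also remembers which words each bug contributed; `IncrementalIndex` and `words_by_bug` are illustrative names, not the real implementation. Only the words that changed get touched.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class IncrementalIndex:
    """Inverted index that can re-index one bug without a full rebuild."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of bug ids
        self.words_by_bug = {}            # bug id -> words currently indexed

    def _drop(self, word, bug_id):
        self.postings[word].discard(bug_id)
        if not self.postings[word]:
            del self.postings[word]  # keep the index free of empty entries

    def update(self, bug_id, summary):
        """Index a new or changed summary, touching only the words that differ."""
        new_words = tokenize(summary)
        old_words = self.words_by_bug.get(bug_id, set())
        for word in old_words - new_words:
            self._drop(word, bug_id)
        for word in new_words - old_words:
            self.postings[word].add(bug_id)
        self.words_by_bug[bug_id] = new_words

    def remove(self, bug_id):
        """Forget a bug entirely, e.g. once it is no longer open."""
        for word in self.words_by_bug.pop(bug_id, set()):
            self._drop(word, bug_id)


if __name__ == "__main__":
    index = IncrementalIndex()
    index.update(7, "Crash on startup")
    index.update(7, "Hang on startup")  # summary edited
    print(sorted(index.postings))
```

With Redis sets the same diff would map onto a handful of SADD and SREM calls per changed bug.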
Enter Pulse, Christian Legnitto’s Mozilla-wide message broker. Wired up to Pulse, my index is cheaply kept up to date every 5 minutes, and once we deploy the Bugzilla extension, the index will be updated within fractions of a second.
To support non-websocket browsers, which is currently all of them, I switched to using socket.io for the transport. The flashsocket transport adds about 100 milliseconds to the round-trip, which is unfortunate but here we are.
In all, it’s about 500 lines of JS on the server side, plus another ~150 lines for the scripts that do the loading, and I think it’s a great example of what tools like Pulse and the Bugzilla REST API are going to make possible in 2011. It’s also an example of how holy-crikey excellent Node and Redis are together.
(There’s an installation running here, which might randomly break as I hack on it.)
Update, 2 minutes after posting:
23 Jan 00:16:00 - searching for sex
23 Jan 00:16:00 - undefined: search 1 -> 1 in 0/1 ms