One aspect of our product performance that I’m especially interested in knowing more about is responsiveness: when the user does something, how long does it take for us to act on that stimulus? (We might not complete it instantly, as when loading a new page, but we should endeavour to give feedback pretty much instantly.) One thing that sometimes interferes with our interface responsiveness is spending too long away from the main event loop, perhaps in layout or some other intensive computation. This leads to us not noticing (and thus, obviously, not acting on) new events from the user such as mouse clicks or keypresses.

I’ve been talking for a while about wanting to measure our UI-thread event service latency as a proxy for that kind of responsiveness, and so this morning I finally just sat down and did it while I was waiting for Tyla to get out of bed. The patch (link fixed now, sorry) is pretty small, though it’s also rough: no control of buffer size or minimum reporting latency without recompiling, that sort of thing.

I plugged the samples from a short run into Numbers — which then beachballed a little while generating the graph, naturally — and this is the result:

[Figure: Tbeachball graph]

The taller portion at the right is from me going to an SVG statistics widget and dragging the sample-picker around like a bandit. On my laptop, we got a little above the 200 ms rule-of-thumb perception threshold for interactivity, and indeed I did start to notice that it was lagging a little behind my motion. I had other tabs open and stuff going on on my computer, etc., so it’s not Very Good Science, but it does give me hope that something like this monitoring will prove a decent proxy for one aspect of responsiveness. (Note that I just drop all samples that are less than 10 milliseconds, so these result sets will tend to give a negative impression of the application, by focusing only on events that actually take meaningful time to process.)
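To make the idea concrete, here is a minimal Python sketch of this kind of measurement. It is not the patch itself: the names `LatencyMonitor`, `post`, and `run_event_loop`, and the buffer size, are hypothetical and chosen only for illustration. The one detail taken from the description above is the 10 ms cutoff below which samples are dropped.

```python
import queue
import time
from collections import deque

MIN_REPORT_MS = 10.0  # samples below this are dropped, as described above
BUFFER_SIZE = 1024    # hypothetical capacity; oldest samples are discarded


class LatencyMonitor:
    """Records how long each event waited before the loop serviced it."""

    def __init__(self, min_report_ms=MIN_REPORT_MS, buffer_size=BUFFER_SIZE):
        self.min_report_ms = min_report_ms
        self.samples = deque(maxlen=buffer_size)

    def record(self, posted_at, serviced_at):
        """Store the latency in ms if it meets the reporting minimum."""
        latency_ms = (serviced_at - posted_at) * 1000.0
        if latency_ms >= self.min_report_ms:
            self.samples.append(latency_ms)
        return latency_ms


def post(events, handler):
    """Enqueue a handler, stamped with the time it was posted."""
    events.put((time.monotonic(), handler))


def run_event_loop(events, monitor):
    """Service queued events in order until a None sentinel is seen."""
    while True:
        item = events.get()
        if item is None:
            break
        posted_at, handler = item
        monitor.record(posted_at, time.monotonic())
        handler()


if __name__ == "__main__":
    events = queue.Queue()
    monitor = LatencyMonitor()
    post(events, lambda: time.sleep(0.25))  # stands in for a long layout
    post(events, lambda: None)              # a click stuck behind it
    events.put(None)
    run_event_loop(events, monitor)
    print(list(monitor.samples))
```

The second event is posted before the loop starts but cannot be serviced until the slow handler returns, so its recorded latency reflects the time spent away from the loop, which is the quantity being graphed here.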

2 comments to “Tbeachball”

  1. Fredrik
    entered 25 August 2007 @ 12:51 pm

    Patch link doesn’t work as I’m typing this (404).

    Could a test like this be automated fully in software?

    One can, of course, build a little USB-controlled hardware box, based on a microcontroller board, that hooks up to the PS/2 ports, sends a series of mouse/keyboard events, and measures “Tbeachball” based on that. This would predictably repeat the actual hardware input events that human interaction would yield. (Not sure if there is testing equipment that does this already.)

  2. Boris
    entered 25 August 2007 @ 11:58 pm

    It’d be really nice to get data for this while running Tp!

  3. entered 28 August 2007 @ 3:19 am

    This is really great data to have!

    > we got a little above the 200 ms rule-of-thumb perception threshold for interactivity

    This threshold is actually often called Tp (time for perception), but I like Tbeachball a lot more. 200 ms is actually the high end; the range is 50-200 ms.

    “The reason for the range is not only variance in individual humans; it also varies with conditions. For example, the perceptual processor is faster (shorter cycle time) for more intense stimuli, and slower for weak stimuli.” Page 9, http://ocw.mit.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-831Fall-2004/0A79F491-80BA-4E19-885C-1E7E481FA2A3/0/L4.pdf

    So when looking at this data, we should consider the range. 200 ms is probably good for a 16×16 icon appearing in the location bar, but we might want to shoot for 50 ms for more drastic changes, like a dialog box appearing.

  4. entered 28 August 2007 @ 4:56 am

    Would having a multithreaded GUI improve user-perceived performance? I find sites that use Java impossible to use: the whole GUI locks up, and I can’t even switch tabs while I’m waiting for the page to load.

    I think that would be a huge gain. What do they call it? Low-hanging fruit?

    Also, my project: http://teethgrinder.co.uk/open-flash-chart/ shows nice graphs in web pages :-) Yeah, I know it is the evil flash. But if IE ever gets SVG I’ll port it over.