It was a superb idea. But not even a whole year after its launch, Mozilla engineers are already noting the design may be obsolete, even outmoded, and maybe in retrospect not even particularly well-executed.
In typically introspective Mozilla fashion, JIT compiler engineer Dave Anderson came out with it last week on his blog for the organization, fittingly entitled “Mystery Bail Theater”: TraceMonkey is very good, except for those times when it’s very not.
“TraceMonkey is pretty powerful. It carefully observes loops and converts them to super-fast assembly,” Anderson wrote. “That’s great and all, but there’s a problem: sometimes tracing doesn’t work. Loops can throw curveballs that cause tracing to stop. Especially with recursion, or lots of nesting, it can be very difficult to build good traces on complex code.”
A mockup of the latest UI changes planned for Firefox 4.0, which now include a relocated Home button and a new Bookmarks button. [Courtesy Mozilla]
The absolute timetable for the production of Firefox 4.0 isn’t known. That’s not because Mozilla keeps it under wraps; it’s because it’s not really determined yet. The latter part of the year is the cloudy window we’re seeing now. The spotlight feature of 4.0, up to this point, is said to be its completely remodeled front-end, borrowing more minimalist ideas from Google Chrome, Apple Safari, and now Opera. But now that speed is a principal issue in browser users’ criteria, the next point-release of Firefox will need to at least make up the performance ground that now separates it from all other alternative (non-IE) browsers in the field.
Setting an absolute date may depend on how long it takes for the Firefox code base to recover, if you will, from an extensive surgical procedure: quite literally, a graft. Users who’ve been wondering why Mozilla doesn’t simply adopt WebKit as its browser engine will be interested to learn it’s actually working to embrace a piece of it, not conceptually but literally.
On paper, the approach taken by Nanojit, the compiler back end that TraceMonkey relies on, should work faster than that of WebKit, the open source browser engine used by Safari and adopted by Chrome. It doesn’t, for reasons Mozilla’s Anderson attributes to the growing number of exceptions: code that simply can’t be compiled down.
An example of just one kind of fallback exception that happens frequently was described in a recent blog post by Mozilla contributor David Mandelin: “Tracing works by generating type-specialized native code for program paths. So if a program has 1000 paths in its hottest loop, TraceMonkey would have to generate 1000 paths to run it natively with tracing. But that would use up way too much memory for code, so instead TraceMonkey stops after a certain number of paths and falls back to the interpreter.”
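The path limit Mandelin describes can be pictured with a toy model. The sketch below is not TraceMonkey code: the class name, the budget of paths, and the idea of identifying a path by the types of its operands are all assumptions made for illustration.

```python
# Toy model of a tracing JIT's per-loop path budget.
# Every name and number here is hypothetical; this is not TraceMonkey's code.

MAX_PATHS = 4  # assumed budget; the article does not state the real limit


class LoopTracer:
    def __init__(self, max_paths=MAX_PATHS):
        self.max_paths = max_paths
        self.compiled = set()      # type signatures that already have a trace
        self.native_runs = 0       # iterations served by a compiled trace
        self.interpreter_runs = 0  # iterations run by the interpreter

    def run_iteration(self, *operands):
        # A "path" is identified here by the tuple of operand type names.
        signature = tuple(type(x).__name__ for x in operands)
        if signature in self.compiled:
            # Known path: reuse the type-specialized trace.
            self.native_runs += 1
        else:
            # Unknown path: this iteration runs in the interpreter.
            self.interpreter_runs += 1
            if len(self.compiled) < self.max_paths:
                # Budget left: record a trace for next time.
                self.compiled.add(signature)
            # Otherwise the budget is spent and this path stays interpreted.
        return signature
```

With a budget of two paths, a loop that alternates among three or more type combinations keeps landing in the interpreter for the extra ones, which is the fallback Mandelin is talking about.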
By contrast, Anderson explained, WebKit has a component called Nitro that compiles not individual paths recorded by tracing, but big chunks of code: entire methods. Its process is called inline threading.
In recent Betanews tests using, ironically, code whose corrections were suggested by Opera Software, we got a glimpse of the strengths and weaknesses of both approaches: When optimizing the very same loop a hundred thousand, or a million, times, Firefox’s Nanojit methodology executes the resulting code much quicker than any other browser, including the new Opera 10.5 and Chrome 5 dev builds. That’s why, in this one test battery of ours, Firefox 3.7 Alpha 2 scores a 4.34 versus the WebKit/Safari daily build’s score of 3.91, and all other competitors slower from there.
Repeating the same stuff over and over is one case where TraceMonkey, to use Anderson’s analogy, applies rocket boosters. But real-world code doesn’t repeat stuff a million times. In those cases, TraceMonkey’s boosters never come on. Nitro ends up being more efficient — it has optimizations for certain types of methods planned well in advance, whereas Nanojit starts from scratch.
The architectural solution, which is being improved upon even as I write this, is being called JaegerMonkey. Using this architecture, developers are swapping out the TraceMonkey component for the inline threading component of Nitro (with permission), which will optimize the JS code the same way WebKit does. Then they will bolt TraceMonkey back onto the Nitro graft, to trace and improve the exceptions that Nitro cannot optimize. As Mandelin reports, Mozilla contributors such as Anderson and Luke Wagner have been making adjustments and improvements to Nitro, to make it fit better in the Mozilla scheme.
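The layering described here can be sketched as a simple decision about which engine runs each pass through a loop. This is only a rough picture under stated assumptions: the function names, the hotness threshold, and the "traceable" flag are invented for illustration and do not come from the JaegerMonkey source.

```python
# Toy model of a two-tier design: a whole-method compiler as the baseline,
# with a tracer layered on top for loops that are hot and traceable.
# All names and thresholds are hypothetical.

HOT_LOOP_THRESHOLD = 3  # assumed number of iterations before tracing kicks in


def choose_tier(iteration, traceable):
    """Return which engine would run a given loop iteration (1-based)."""
    if traceable and iteration > HOT_LOOP_THRESHOLD:
        return "trace"       # hot, well-behaved loop: specialized trace
    return "method-jit"      # everything else: whole-method compiled code


def run_loop(iterations, traceable):
    """Count how many iterations each tier would handle."""
    counts = {"method-jit": 0, "trace": 0}
    for i in range(1, iterations + 1):
        counts[choose_tier(i, traceable)] += 1
    return counts
```

In this picture, a loop that cannot be traced never drops to an interpreter; it simply stays on the whole-method tier, which is the point of grafting the two together.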
“We’ve barely started and the results are already really promising,” reports Anderson. “Running SunSpider on my machine, the whole-method JIT is 30% faster than the interpreter on x86, and 45% faster on x64. This is with barely any optimization work!”
Sadly, the odd man out appears to be Nanojit, one of the components that was responsible for the huge speed gains in Firefox 3.5. The work of Mozilla contributor Nicholas Nethercote, Nanojit was being redeveloped for a new Firefox version as late as last month, with an eye toward optimizing the assembly-like code base that it generates.
As notes from Mozilla planning meetings indicate, developers considered using Nethercote’s optimized Nanojit, but finally opted against it. “The main problem is that there is not enough control over the results to get the best code. In particular, there are tricks for calling stub functions (functions that implement JS ops that are not inlined) very efficiently that Nanojit doesn’t currently support. We think there will be other tricks with manual register allocation and such that are also not currently supported. We don’t want to gate this work on Nanojit development or junk Nanojit up with features that will be non-useful for its current applications. Also, the compilation time is much longer for LIR than for using an assembler.”
The road from here to there
To help Firefox make the quantum leap from its current state to a level competitive with the others, version 3.7 will serve as a kind of waypoint. Already, we’re seeing some other performance features being tried in the new Alpha 2 and the latest daily development builds of Alpha 3.
The latest builds now feature Mozilla’s first round of minimalistic tweaks to the skin, which include getting rid of one button in the controls. Since there’s no reason why the Refresh and Stop buttons would be used together, they’re now the same button, with the icon changing as necessary (Stop as the page is loading, Refresh once it’s finished).
Last November, Firefox 3.7 alpha code was commandeered for an unofficial test of rendering using Direct2D, the Windows graphics library that replaces the decades-old GDI. Internet Explorer 9 is slated to use Direct2D, and the early IE9 tests we saw last year (the only ones revealed thus far) showed tremendous gains in the graphics department. But the Mozilla engineer’s test was somewhat inconclusive, resulting in a build that actually appeared to slow down page load time when rendering with Direct2D instead of GDI.
Comparative Web page load times for Firefox 3.7 Alpha 2 rendering with GDI graphics library, and Firefox 3.7 Alpha 3 rendering with Direct2D graphics library.
Now with Direct2D support having been officially welded into the latest Firefox 3.7 alphas for the first time, the results are a little more like what folks expected: In a test (originally suggested to us by Microsoft) of page load times for 26 leading Web sites, a daily Alpha 3 build with Direct2D rendering turned on did load pages, on average, 5% faster than the latest general Alpha 2 with Direct2D turned off. But as the graph shows, test results per page could vary wildly, and more often than not, Alpha 3 was much faster than that.
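An average that sits below what most pages showed usually means a few outliers are dragging it down. The sketch below makes that arithmetic concrete with made-up numbers; they are not the Betanews measurements, and the function name is hypothetical.

```python
from statistics import mean, median

# Made-up per-page results, in percent faster (negative means slower).
# These are NOT the Betanews measurements; they only illustrate the arithmetic.
speedups = [12, 15, 10, 14, -30, 11, 3]


def summarize(values):
    """Compare the average gain with the typical (median) gain."""
    avg = mean(values)
    return {
        "mean": avg,
        "median": median(values),
        "pages_above_mean": sum(1 for v in values if v > avg),
    }


summary = summarize(speedups)
```

In this invented sample, one page that loads much slower pulls the average down to 5 percent even though five of the seven pages beat that figure, which is the kind of pattern the graph suggests.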
From what I can see with my own two eyes, Direct2D rendering on Firefox 3.7 (and the early IE9, for that matter) is crisper, cleaner, and snappier than with GDI. As this feature becomes more prevalent in browsers everywhere, in turn, it will become relevant to more users. So at some time in the near future, it will become necessary to include graphics library rendering time as an element of Betanews’ “comprehensive” browser tests, if we truly wish to go on using that word to describe them.
It’s usually Microsoft that keeps telling us that true browser performance is something users can see and feel. Coming from the manufacturer of the perennial slowest browser in the field, that must be a heartfelt statement. In the next few weeks, Microsoft will be making its case for browser performance, something it hasn’t been able to lay claim to for some time. Until that company gives folks reason to go back to being disappointed, IE9 will be perceived as gaining on Firefox.
Which means Mozilla can no longer afford to evolve on its own schedule; in this competitive environment (which is what it said it wanted), the schedule is set for it. Firefox 3.7 needs to put all these improvements together and demonstrate it can continue to revolutionize browser architecture, even if it doesn’t quite catch up with Safari or the others just yet. But then, it needs to knock Firefox 4.0 into the stratosphere, because in a market that’s becoming more knowledgeable about its environment by the day, consumers don’t place bets on the also-ran for very long.