All About Performance

and other stuff by Taras Glek

Is Planet Mozilla Obsolete for Technical Content?

Good Old Days

I have been working remotely at Mozilla for over 6 years. I like working remotely, but it poses some challenges. Early on I discovered that if I only show up at the HQ a couple times a year, most people will treat me as a stranger. That got old fast.

The problem is that it takes a lot of time to get everybody up to speed on who you are (defined by what you work on). This means one’s work social circle is limited to people with whom one has frequent bugzilla/irc interactions + random people who took the time to get to know a random coworker. One can imagine that introverts are not inclined to waste too much energy meeting new people.

The solution was simple: blog a lot. After a couple years of blogging I just had to say “I’m Taras” and a good proportion of the people would connect my face to (obscure static analysis at first) work they read about on planet. This cut down my introduction overhead significantly. Planet Mozilla had a lot of blogs syndicated to it when I joined. I had a huge audience to introduce my work to.

In addition to creating awareness of my work, blogging about tough problems would occasionally result in helpful comments. People provided tips on static analysis, Windows APIs and even ran scary privileged software I wrote to help me gather data. Due to the disproportionate value of helpful comments (e.g. saving days to weeks of work), I concluded that it’s worth spending a couple hours per blog post. Most blog comments might be garbage, but they are easy to ignore. Before I implemented telemetry, I was able to find performance extremes solely from blog feedback. Unlike privacy-sensitive telemetry data, blog comments came with email addresses and eager volunteers on the other end. I value comments a lot; it makes me sad when good bloggers disable comments.

To me Planet Mozilla was a great way to keep up with Mozilla technical affairs. We have a lot of smart people working on interesting problems at Mozilla. As a result of past planet experience, I ask every new person who joins the Performance team to get their blog syndicated to planet ASAP. Increasingly that feels like an unproductive suggestion.

Present

I do not have any data on this. However, my feeling is that the volume of blog traffic on planet grew from barely manageable in the early days to too much. Good technical content never constituted more than 10% of the planet posts. However, as absolute blog traffic grew, it became harder to spot the good stuff. In addition to a lot of content being non-technical, in the last few years people started discussing their feelings about others and things got ugly.

I’m pretty sure the result is that there are fewer technical people reading planet than before (due to poor signal/noise ratio). Lack of audience means less incentive to blog (that and the fact that some bloggers are part of the audience that gave up on planet).

So what are we to do? Is planet obsolete for good technical content? Is there a new reddit/hackernews/twitter self-moderating solution for dealing with signal problems? Surely setting up a new planet is no longer considered state of the art for this.

I am sad to see a public resource like the planet get too big to remain useful with no clear successor.

ps. Sorry for adding to the non-technical noise.

Snappy #50

Graphics

In some cases Direct2D-accelerated drawing is slower than the non-accelerated path. Jeff Muizelaar fixed a severe gradient ‘hang’ in bug 823147.

Avi Halachmi diagnosed a significant menu performance issue in bug 832641; this was promptly fixed by Matt Woodrow.

Misc Pauses

Vladan Djeric blogged about top main-thread SQL issues contributed by addons. Vladan also produced a chromehang report for the last 2 months.

Ehsan Akhgari fixed a chromehang caused by leftover debug code: bug 830765.

Justin Lebar fixed an issue where telemetry memory reporting code was accidentally triggering expensive ‘release memory to OS’ operations: bug 789975.

Shutdown

Sometimes Firefox takes a long time to shut down. We also have a timer that regularly triggers cycle collection. Olli Pettay disabled this timer during shutdown in bug 822849.

Snappy #48: Now With Faster Shutdown

Huge Shutdown Improvement

A couple of weeks’ worth of telemetry data confirmed that Olli Pettay sped up shutdown by an epic >=30%: bug 818739, telemetry link.

Memory Management

Olli and Andrew McCreight continued with reducing CC pauses:

  • bug 820378: Delay CC if we’re in the middle of a GC, to allow async CC prep
  • bug 827471: Remove more wrapped JS from the CC graph
  • bug 705371: Remove pointless JSContexts from the CC graph
  • bug 785493: Reduce size of steady state cycle collector graph by about 80%
  • bug 821371: Include prep work in cycle collector pause time telemetry

Misc

Vladan landed bug 807021. Firefox should now handle DOM Local Storage writes without janking.

Startup

David Teller made search service metadata loading/migration async: bug 760036. David also made session-store loading async: bug 532150.

Aaron Klotz landed a telemetry probe to measure how often the ‘Firefox is running but not responding’ dialog is encountered on attempted startup: bug 815418. This will help us decide on whether (or when) to add functionality to kill unresponsive Firefox instances.

Snappy: 2012 Summary

2012 was an exciting year for Snappy. Turning ‘make it go faster’ into a set of measurements and corresponding bugs to fix was hard. We learned a lot.

I’d like to summarize some of the most memorable Snappy accomplishments.

Short version: Firefox is much more responsive now.

Making Pages Load Faster

I am not a web developer. I often learn about modern web dev tricks/trends by noticing how they impact overall Firefox performance. I prefer learning about perf topics from well-written blog posts. Bryan of Google page speed team blogged on optimizing pageload speeds on mobile. The advice is good, but I have two minor warnings about it.

The suggestion to use requestAnimationFrame to delay loading resources is a good one. There is a gotcha: requestAnimationFrame fires as the browser prepares to paint, so anything expensive in the handler will delay your first page draw. It’s an ok place to start network requests, etc. If you need to do something expensive, use a chained requestAnimationFrame, i.e. request a second frame from inside the first handler and do the work there. Firefox recently started using a similar trick to display the UI faster in bug 715402.
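A chained requestAnimationFrame can be sketched roughly as follows. This is only an illustration, not the code from bug 715402: the names afterFirstPaint and FrameScheduler are made up for this example, and the scheduler is passed in as a parameter so the sketch can also run outside a browser.

```typescript
// Hypothetical type: anything that behaves like requestAnimationFrame.
type FrameScheduler = (callback: (timestamp: number) => void) => unknown;

// Defer `work` until the second frame callback, so the first paint
// is not held up by it. Names here are illustrative assumptions.
function afterFirstPaint(work: () => void, schedule: FrameScheduler): void {
  schedule(() => {
    // First callback: runs as the browser prepares to paint.
    // Keep this cheap; only request another frame.
    schedule(() => {
      // Second callback: runs as the browser prepares the next frame,
      // i.e. after the earlier frame had its chance to be drawn.
      work();
    });
  });
}
```

In a page this would be called as afterFirstPaint(doExpensiveSetup, (cb) => requestAnimationFrame(cb)), where doExpensiveSetup stands in for whatever heavy work you want to keep out of the first draw.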

The suggestion to split up stylesheets is also good, but risky. I’ve seen this before, but I did not understand why websites sprinkled <link rel="stylesheet"> throughout the page bodies. This can significantly degrade pageload times by causing redundant page restyles and reflows. Reflows can take hundreds of milliseconds on slow mobile devices, so doing them multiple times is bad. Make sure to run your pages through a profiler (or time the difference between relevant requestAnimationFrame callbacks). I saw a bad case of this in bug 718864.
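Timing the difference between requestAnimationFrame callbacks could look something like the sketch below. It is a rough measuring aid rather than a replacement for a profiler. The names measureFrameGaps and TimestampedScheduler are invented for this example, and the scheduler is a parameter so the code can be tried without a browser.

```typescript
// Hypothetical type: a requestAnimationFrame-like function that passes
// a timestamp in milliseconds to its callback.
type TimestampedScheduler = (callback: (timestamp: number) => void) => unknown;

// Collect the time between consecutive frame callbacks. An unusually
// large gap hints that something (e.g. a restyle + reflow) blocked the
// main thread between those two frames.
function measureFrameGaps(
  gapCount: number,
  schedule: TimestampedScheduler,
  onDone: (gapsMs: number[]) => void
): void {
  const gaps: number[] = [];
  let previous: number | undefined;

  const tick = (timestamp: number): void => {
    if (previous !== undefined) {
      gaps.push(timestamp - previous);
    }
    previous = timestamp;
    if (gaps.length < gapCount) {
      schedule(tick); // keep sampling on the next frame
    } else {
      onDone(gaps);
    }
  };

  schedule(tick);
}
```

In a page one would pass (cb) => requestAnimationFrame(cb) as the scheduler and log the resulting gaps while the stylesheets load; comparing runs with the stylesheets in the head versus scattered through the body should show whether the extra reflows cost anything measurable.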

Snappy #45: The View From Home

I’m out until January. However, I set up a new blog, so why not test it with a Snappy update.

Benoit Girard sped up shutdown with:

  • not forcing startup cache flushes on shutdown: bug 816656. This speeds up exiting the browser soon after startup.
  • bug 818296: [Shutdown] js::NukeCrossCompartmentWrappers takes up to 300ms on shutdown. Avoid doing it for optimized shutdown. This may significantly reduce our shutdown times. We are waiting on more telemetry data to confirm.

Aaron Klotz made startup slightly faster by speeding up reading of some urlclassifier files in bug 810101.

Vladimir Vukicevic landed bug 731974 which results in smoother browser animations and significantly improves the quality of tab-strip animations.

Hello Octopress

I’m off work until January. I took this opportunity to partake in chores such as fixing a toilet and switching away from wordpress.

After suffering wordpress for half a decade I finally switched to a combination of Octopress + Disqus.

It took:

  • a few hours of tweaking the combination of exitwp.py, html2text.py to convert my blog without busting links, images
  • a few hours of deciphering octopress/github documentation to set up a website
  • an hour to figure out where images should live (in source/assets)

Thanks to everyone that suggested Octopress.

Goodbye word-style wordpress bitchwork.

Coping With Flash Hangs

Blocking calls into the Flash plugin can temporarily hang Firefox. This is a problem because sometimes the user would be happy to kill the plugin to access their webpage and at other times waiting is the only way to get certain flash apps/games to load. If you suffer from flash-related hangs see Aaron’s blog post for some builds to try. He is working on a new feature to provide an option to kill hanging flash instances.

Snappy #44: Fixing Tab Switching in Vancouver

I joined our GFX+Layout teams for a workweek in Vancouver. Since profiling is most effective on slow machines, I brought along my trusty Acer Aspire 722 (slow 1.3GHz CPU + fast GPU) as my primary laptop. This hardware is great because the combination of a weak CPU + decent GPU means that if we accelerate things right the browser can perform quite well and if we don’t, things get really slow. (An analogous situation exists when fast CPUs are matched with slow GPUs.)

In the beginning of the week I quickly demoed menu lag and slow gmail tab switching (811472). Later in the week we looked at problematic Facebook tab switch times (811474) and Australis (see Matt’s post) performance. By the end of the week tab switching improved by over 2x for both facebook and gmail. I don’t have exact figures because while we can measure general tab switch trends via telemetry, there isn’t a convenient way to do it on individual browsers yet. Help wanted: it would be great if someone could do up a barebones addon to monitor tab switching in bug 812381; we’ll fill in the rest.

Jeff Muizelaar started out by speeding up checkbox drawing in bug 809603. Matt Woodrow sped up gmail by tweaking how we use layers in bugs 811927, 811570.

Matt made sure that we no longer draw layers with opacity of 0 in bug 811831. Turns out rendering lots of invisible text can be expensive.

Workweeks are more about communication than getting code landed, so it is impressive that Jeff, Matt and their reviewers managed to diagnose, fix, review and land such significant optimizations in a couple of days. My laptop of pain feels much faster already.

In the coming weeks expect to see smoother tab switching, smoother animations, lower profiling overhead as we work through issues discussed during the workweek.