I’m starting to type this up as EdgeConf draws to a close. I spoke on the performance panel, with Shane O’Sullivan, Rowan Beentje and Pavel Feldman, moderated by Matt Delaney, and tried to bring a platform perspective to the affair. I found the panel very interesting, and it reminded me how little I know about high-level web development. Similarly, I think it also highlighted how little consideration there usually is for the platform when developing for the web. On the whole, I think that’s a good thing (platform details shouldn’t be important, and they have a habit of changing), but a little platform knowledge can help you structure things in a way that will be fast today, and as long as it isn’t too much of a departure from your design, it doesn’t hurt to think about it. At one point in the panel, I listed a few things that are particularly slow from a platform perspective today. While none of these are intractable problems, they may not be fixed in the near future, and feedback indicated that they aren’t all common knowledge. So what follows are a few things to avoid, and a few things to do, that will help make your pages scroll smoothly on both desktop and mobile. I’m going to try not to write lies, but I hope that if I get anything slightly or totally wrong, people will correct me in the comments and I can update the post accordingly 🙂
When I mentioned this at the conference, I prefaced it with a quick explanation of how rendering a web page works. It’s probably worth reiterating this. After network and such have happened and the DOM tree has been created, this tree gets translated into what we call the frame tree. This tree is similar to the DOM tree, but it’s structured in a way that better represents how the page will be drawn. This tree is then iterated over and the size and position of these frames are calculated. The act of calculating these positions and sizes is referred to as reflow. Once reflow is done, we translate the frame tree into a display list (other engines may skip this step, but it’s unimportant), then we draw the display list into layers. Where possible, we keep layers around and only redraw parts that have changed/newly become visible.
Avoid reflow
Reflow is actually quite fast, or at least it can be, but it often forces things to be redrawn (and drawing is often slow). Reflow happens when the size or position of things changes in such a way that the dependent positions and sizes of other elements need to be recalculated. Reflow usually isn’t something that will happen to the entire page at once, but incautious structuring of the page can result in this. There are quite a few things you can do to help avoid large reflows: set widths and heights to absolute values where possible, don’t reposition or resize things, and don’t unnecessarily change the style of things. Obviously these things can’t always be avoided, but it’s worth thinking about whether there are other ways to achieve the result you want that don’t force reflow. If the positions of things must change, consider using a CSS translate transform, for example – transforms don’t usually cause reflow.
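As a concrete sketch (the class names here are purely illustrative), moving an element with a translate transform instead of changing its offset avoids triggering reflow:

```css
/* Changing 'left' on a positioned element triggers reflow, and
   possibly reflow of other elements whose layout depends on it: */
.slide-reflow {
  position: absolute;
  left: 200px; /* updating this value causes reflow */
}

/* A translate transform moves the element visually without
   recalculating layout, so no reflow occurs: */
.slide-transform {
  transform: translateX(200px);
}
```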
Avoid drawing
This sounds silly, but you should make the browser do as little drawing as is absolutely necessary. Most of the time, drawing will happen on reflow, when new content appears on the screen and when styles change. Some practical advice: avoid making DOM changes near the root of the tree, avoid changing the size of things and avoid changing text (text drawing is especially slow). While repositioning doesn’t always force redrawing, you can avoid it by repositioning using CSS translate transforms instead of the top/left/bottom/right style properties. Especially avoid causing redraws while the user is scrolling. Browsers try their hardest to keep up the refresh rate while scrolling, but there are limits on memory bandwidth (especially on mobile), so every little helps.
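To sketch what this means in practice (selectors are hypothetical), animating a transform lets the browser reuse the already-drawn content rather than redrawing the element on every frame:

```css
/* Animating 'top' forces reflow and redrawing on each frame: */
.panel-slow {
  transition: top 0.3s ease-out;
}

/* Animating a transform can be handled by the compositor,
   with no redrawing of the panel's contents: */
.panel-fast {
  transition: transform 0.3s ease-out;
}
.panel-fast.open {
  transform: translateY(-100px);
}
```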
Thinking of things that are slow to draw: radial gradients are very slow. This is really just a bug that we should fix, but if you must use CSS radial gradients, try not to change them, and avoid putting them in the background of elements that change frequently.
Avoid unnecessary layers
One of the reasons scrolling can be fast at all on mobile is that we reduce the page to a series of layers, and we keep redrawing on those layers down to a minimum. When we need to redraw the page, we just composite the layers that have already been drawn. While the GPU is pretty great at this, there are limits. Specifically, there is a limit to the number of pixels that can be drawn to the screen in a certain time (the fill-rate) – when you draw to the same pixel multiple times, this is called overdraw, and it counts towards the fill-rate. Having lots of overlapping layers often causes lots of overdraw, and can cause a frame’s maximum fill-rate to be exceeded.
This is all well and good, but how does one avoid layers at a high level? It’s worth being vaguely aware of what causes stacking contexts to be created. While layers usually don’t exactly correspond to stacking contexts, trying to reduce stacking contexts will often end up reducing the number of resulting layers, and is a reasonable exercise. Even simpler, anything with position: fixed, background-attachment: fixed or any kind of CSS transform will likely end up with its own layer, and anything with its own layer will likely force a layer for anything below it and anything above it. So avoid these where they aren’t necessary.
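For reference, these are the sorts of declarations that will likely give an element its own layer – a sketch with made-up selectors, not an exhaustive list:

```css
/* Each of these will likely force a layer of its own, and may
   force extra layers for content above and below it: */
.toolbar { position: fixed; }
.banner  { background-attachment: fixed; }
.spinner { transform: rotate(45deg); }
```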
What can we do at the platform level to mitigate this? Firefox already culls areas of a layer that are made invisible by occluding layers (at least to some extent), but this won’t work if any of the layers have transforms, or aren’t opaque. We could be smarter with culling for opaque, transformed layers, and we could likely do a better job of determining when a layer is opaque. I’m pretty sure we could be smarter about the culling we already do, too.
Avoid blending
Another thing that slows down drawing is blending. This is when the visual result of an operation relies on what’s already there. It requires the GPU (or CPU) to read what’s already in the buffer and perform a calculation on the result, which is of course slower than just writing directly to the buffer. Blending also doesn’t interact well with the deferred-rendering GPUs that are popular on mobile.
This alone isn’t so bad, but combining it with text rendering is. If you have text that isn’t on a static, opaque background, that text will be rendered twice (on desktop, at least). First we render it on white, then on black, and we use those two buffers to maintain sub-pixel anti-aliasing as the background varies. This is much slower than normal, and it also uses more memory. On mobile, we store opaque layers in 16-bit colour, but translucent layers are stored in 32-bit colour, doubling the memory requirement of a non-opaque layer.
We could be smarter about this. At the very least, we could use multi-texturing and store non-opaque layers as 16-bit colour plus an 8-bit alpha channel, cutting the memory requirement by a quarter and likely making them faster to draw. Even then, this will still be more expensive than just drawing an opaque layer, so when possible, make sure any text is on top of a static, opaque background.
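As an illustrative sketch (class names are made up), the difference is between text over a translucent background and text over an opaque one:

```css
/* Translucent background: the text takes the slow, double-pass
   rendering path, and the layer needs 32-bit colour: */
.caption-slow {
  background: rgba(0, 0, 0, 0.5);
  color: white;
}

/* Static, opaque background: the text can take the fast path,
   and the layer can be stored in 16-bit colour on mobile: */
.caption-fast {
  background: rgb(0, 0, 0);
  color: white;
}
```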
Avoid overflow scrolling
The way we make scrolling fast on mobile – and I believe the way it’s fast in other browsers too – is that we render a much larger area than is visible on the screen, and we do that asynchronously to the user scrolling. This works because the relationship between drawing time and drawing size is not linear (on the whole, the more you draw, the cheaper it is per pixel). We only do this for the content document, however (not strictly true – I think there are situations where whole-page scrollable elements that aren’t the body can take advantage of this, but it’s best not to rely on that). This means that any scrollable element other than the body can’t take advantage of this, and will redraw synchronously with scrolling. For small, simple elements, this doesn’t tend to be a problem, but if your entire page is in an iframe that covers most or all of the viewport, scrolling performance will likely suffer.
On desktop, currently, drawing is synchronous and we don’t buffer the area around the page like we do on mobile, so this advice doesn’t apply there. But on mobile, do your best to avoid using iframes or putting overflow scrolling on elements other than the body. If you’re using overflow to achieve a two-panel layout, or something like that, consider using position: fixed and margins instead. If both panels must scroll, consider making the larger panel the body and using overflow scrolling in the smaller one.
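A minimal sketch of that two-panel layout (class names are made up): the sidebar is position: fixed with its own small overflow area, while the main content scrolls with the body and so keeps the fast asynchronous scrolling path:

```css
/* Fixed sidebar; synchronous scrolling here is acceptable
   because the panel is small and simple: */
.sidebar {
  position: fixed;
  top: 0;
  bottom: 0;
  left: 0;
  width: 250px;
  overflow-y: auto;
}

/* The main content is offset with a margin and scrolls as part
   of the body, so it benefits from asynchronous scrolling: */
.content {
  margin-left: 250px;
}
```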
I hope we’ll do something clever to fix this sometime – it’s been at the back of my mind for quite a while – but I don’t think scrolling on sub-elements of the page can ever really be as good as scrolling the body without considerable memory cost.
Take advantage of the platform
This post sounds all doom and gloom, but I’m purposefully highlighting what we aren’t yet good at. There are a lot of things we are good at (or reasonable, at least), and having a fast page need not necessarily be viewed as lots of things to avoid, so much as lots of things to do.
Although computing power continues to increase, the trend now is to bolt on more cores and more hardware threads, and the speed increase of individual cores tends to be more modest. This affects how we improve performance at the application level. Performance increases, more often than not, are about being smarter about when we do work, and to do things concurrently, more than just finding faster algorithms and micro-optimisation.
This relates to the asynchronous scrolling mentioned above, where we do the same amount of work, but at a more opportune time, and in a way that better takes advantage of the resources available. There are other optimisations that are similar with regards to video decoding/drawing, CSS animations/transitions and WebGL buffer swapping. A frequently occurring question at EdgeConf was whether it would be sensible to add ‘hints’, or expose more internals to web developers so that they can instrument pages to provide the best performance. On the whole, hints are a bad idea, as they expose platform details that are liable to change or be obsoleted, but I think a lot of control is already given by current standards.
I hope some of this is useful to someone. I’ll try to write similar posts if I find out more, or there are significant platform changes in the future. I deliberately haven’t mentioned profiling tools, as there are people far more qualified to write about them than I am. That said, there’s a wiki page about the built-in Firefox profiler, some nice documentation on Opera’s debugging tools and Chrome’s tools look really great too.
15 Replies to “Tips for smooth scrolling web pages (EdgeConf follow-up)”
I guess drawing the text on to both a white and black background is for colour emoji alpha recovery support? If text was only monochrome I would have drawn white text on a black background in grey scale and used that as the alpha channel.
It’s for sub-pixel anti-aliasing (so you’re right about the colour) – we don’t do it if only greyscale anti-aliasing is being used, or on mobile. I think/hope.
So are you saying we should use absolute values in our stylesheets for the layout of our Web pages? I’ve always been taught that it’s good practice to use ’em’ values for padding and box widths and whatnot. Is it better to use ‘px’?
em values are absolute too, as long as you don’t change the font size (at which point, they’d need to be recalculated and would cause a reflow). I’m more saying to avoid using percentages, or not specifying width/height at all. The more widths and heights you can explicitly set, the less work that needs to be done during reflow. I would have thought pixels would be cheaper than ems, but not by any appreciable amount, and probably only on an initial layout.
Hey, great post and great discussion – I was at Edge Conf and found the performance discussion the most interesting myself. We did a lot of work around using 3D transforms within mobile browsers for our mobile app, utilising hardware acceleration to try to get as close to native performance as possible. We released our mobile UI “Glide” (http://github.com/zestia/glide/) – you can see the demo at glide.zestia.com. Our framework doesn’t yet support Firefox, since we’re wrapping our app in PhoneGap and using the native browsers on Android and iOS, which are both WebKit-based. The results are great on the iPhone, but we still struggle to achieve page transitions or scrolling that doesn’t flicker on Android. I assume the flickering has something to do with the topics you discussed above, reflow and painting. I’m still trying to pin this down, but wonder if it’s simply that Android devices (namely the latest Nexus) aren’t powerful enough? Even the iPhone 3GS has smoother performance.
Sounds cool – is there a demo I can view online? Not being familiar with WebKit code, I can only conjecture as to what the problem is. The flickering sounds like a bug – there are many bugs to do with invalidations in a lot of mobile browsers, Android especially. These are bugs where parts of the screen that should redraw don’t (and vice versa – parts of the screen that shouldn’t redraw do, wasting time and hurting performance). Android is slowly moving to Chrome as the default browser, which is nice, but doesn’t help apps that want to use web components for interaction. I don’t know what their plans are with regards to this, but there are people here looking into making Gecko an embeddable widget, and we have web-app support on Android, so I’d suggest having a look at us 🙂
With regards to the class you add the transform to affecting the performance, this could either be a bug (we’ve had bugs with nested transforms and with indirect descendants of transformed frames at various points, for example), or it could be that the outer element that you’re applying the transform to has a size or position that when changed, causes reflow. If you disable overflow on body, you might find that the performance increases, as the browser is no longer obligated to calculate the size of the document to present scrollbars (Shane mentioned this during the panel). The timeout thing I’m unsure of – my best guess would be that the timeout is getting run at a time where the work that happens at the very beginning of the transition is either less expensive, or happens earlier, reducing hitching at the start of the animation.
Thanks for your comments. We have a small demo at http://glide.zestia.com if you want to check it out.
Cool, I didn’t know about Firefox web app support – I’ll take a look! It might help with smoothing out browser inconsistencies across Android versions/devices.
Your comment about the transform performance is interesting. We’re using box flex, which I assume is now reflowing when the transition finishes, which could be the cause of the flickering. The overflow on the body looks worth experimenting with – thanks for the tips! It would be great if there were more resources out there about the finer details of browser rendering engines. The depth of knowledge at Edge Conf was great to see.
It would seem rem should be a unit that browser makers would like. From the above comments I would think it advantageous, since rem always relates to the base font size and doesn’t inherit text size changes from parent elements. Yet its browser support seems really poor. Currently I can only reliably use rem in Firefox. Effectively, that means I can’t use it.
Aside from text sizes (fonts), rems also fix borders, margins, and padding to a fixed value. It would seem that should help with rendering by cutting down on the calculations involved in redraws and repaints. But WebKit and Presto (and of course IE) don’t seem in any hurry to adopt them.
For a developer, it makes things much easier for layout whenever there’s a constant to use in measurements across the page. I’ve tried getting away from pixels altogether (especially with all the different pixel densities now available in devices, they’re just a bad idea for layout/sizing). Rems fix all that at a document-wide level. Ems are great, but having them relative to the size of the parent makes them an unreliable/variable measurement unit. This may have some utility in special circumstances, but in general it is not good as a document-wide reference measurement.
Rems are good for sizing to be done in a way that allows for proper rendering of element sizes across all platforms in a more consistent way. Users control their base font sizes. Rems honor those choices and therefore are better at rendering text and other content at sizes the user has shown a preference for. And, they maintain a consistent relationship to the base font across the entire document which allows for consistently sized margins and paragraph text.
I think this should be mandatory reading for anyone who calls themselves a ‘web designer’, especially any who largely deal with static layouts. As you point out, a web page isn’t static and it’s easy to forget that updates take CPU cycles.
Really good write-up, thank you!
Hi, awesome information, thanks a lot for sharing 🙂
Just to clarify. I have 2 situations.
First let’s say I have a div in position:absolute top:0 bottom:0 left:0 right:0 overflow:auto (it makes this div able to adjust itself to any parent size in width AND height, fully liquid). Also note that this div could be near the root of the DOM. Good or bad idea?
Second, I made a jsfiddle, it would have been impossible to explain 🙂 This situation uses kind of the same technique but only with top bottom and right. Again it’s a liquid/responsive situation.
http://jsfiddle.net/jkneb/judNR/ And again, good or bad idea?
1- This isn’t necessarily bad – position:absolute elements will end up using placeholder frames, so its reflow shouldn’t affect the rest of the document (ancestrally) above it. On the other hand, it may not get its own layer, in which case changing the width and height will cause whatever’s underneath it to also get re-rendered whenever its size changes. Constantly adjusting the width and height of a div is not so great either, as that will reflow that div and probably cause the entire div to be re-rendered. If it’s only small, this isn’t too big a deal, but does it really need to change width/height to do what it’s doing?
2- This will be ok as it’s a solid colour, but if it had text in it, or a background, this would likely cause the entire bar to be redrawn on each change. It really depends on what your hard limits are here – if you can get away with it, doing the same thing but using a CSS scale transform instead of left/top/right/bottom will be faster, especially on mobile. If you had a 9-patch texture, or something similar, there’s really not much else you could do bar using canvas or WebGL, but that would obviously be overkill 🙂
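To illustrate the scale suggestion (purely a sketch with made-up class names), a bar resized via a scale transform instead of a width change avoids both reflow and redrawing:

```css
.bar {
  transform-origin: left center;
  transition: transform 0.2s linear;
}
/* Half width, achieved without reflowing or redrawing: */
.bar.half {
  transform: scaleX(0.5);
}
```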
Hope that helps!
Ok, thanks a lot! Hard to decide then. Of course, in case number 2 the pink column will host a lot of elements, as it would be the “website sidebar”. Did you try an “off canvas” approach? It seems they use it in Foundation http://foundation.zurb.com/off-canvas.php