OffscreenCanvas update

Hold up, a blog post before a year’s up? I’d best slow down, don’t want to over-strain myself 🙂 So, a year ago, OffscreenCanvas was starting to become usable but was missing some key features, such as asynchronous updates and text-related functions. I’m pleased to say that, at least for Linux, it’s been complete for quite a while now! It’s still going to be a while, I think, before this is a truly usable feature in every browser. Gecko support is still forthcoming, support for non-Linux WebKit is still off by default and I find it can be a little unstable in Chrome… But the potential is huge, and there are now double the number of independent, mostly-complete implementations that prove it’s a workable concept.

Something I find I’m guilty of, and I think that a lot of systems programmers tend to be guilty of, is working on a feature but not using that feature. With that in mind, I’ve been spending some time in the last couple of weeks to try and bring together demos and information on the various features that the WebKit team at Igalia has been working on. With that in mind, I’ve written a little OffscreenCanvas demo. It should work in any browser, but is a bit pointless if you don’t have OffscreenCanvas, so maybe spin up Chrome or a canary build of Epiphany.

OffscreenCanvas fractal renderer demo, running in GNOME Web Canary

Those of us old-skool computer types probably remember running fractal renderers back on their old home computers, whatever they may have been (PC for me, but I’ve seen similar demos on Amigas, C64s, Amstrad CPCs, etc.) They would take minutes to render a whole screen. Of course, with today’s computing power, they are much faster to render, but they still aren’t cheap by any stretch of the imagination. We’re talking 100s of millions of operations to render a full-HD frame. Running on the CPU on a single thread, this is still something that isn’t really real-time, at least implemented naively in JavaScript. This makes it a nice demonstration of what OffscreenCanvas, and really, Worker threads allow you to do without too much fuss.

The demo, for which you can look at my awful code, splits that rendering into 64 tiles and gives each tile to the first available Worker in a pool of rendering threads (different parts of the fractal are much more expensive to render than others, so it makes sense to use a work queue, rather than just shoot them all off distributed evenly amongst however many Workers you’re using). Toggle one of the animation options (palette cycling looks nice) and you’ll get a frame-rate counter in the top-right, where you can see the impact on performance that adding Workers can have. In Chrome, I can hit 60fps on this 40-core Xeon machine, rendering at 1080p. Just using a single worker, I barely reach 1fps (my frame-rates aren’t quite as good in WebKit, I expect because of some extra copying – there are some low-hanging fruit around OffscreenCanvas/ImageBitmap and serialisation when it comes to optimisation). If you don’t have an OffscreenCanvas-capable browser (or a monster PC), I’ve recorded a little demonstration too.

The important thing in this demo is not so much that we can render fractals fast (this is probably much, much faster to do using WebGL and shaders), but how easy it is to massively speed up a naive implementation with relatively little thought. Google Maps is great, but even on this machine I can get it to occasionally chug and hitch – OffscreenCanvas would allow this to be entirely fluid with no hitches. This becomes even more important on less powerful machines. It’s a neat technology and one I’m pleased to have had the opportunity to work on. I look forward to seeing it used in the wild in the future.

OffscreenCanvas, jobs, life

Hoo boy, it’s been a long time since I last blogged… About 2 and a half years! So, what’s been happening in that time? This will be a long one, so if you’re only interested in a part of it (and who could blame you), I’ve titled each section.

Leaving Impossible

Well, unfortunately my work with Impossible ended, as we essentially ran out of funding. That’s really a shame, we worked on some really cool, open-source stuff, and we’ve definitely seen similar innovations in the field since we stopped working on it. We took a short break (during which we also, unsuccessfully, searched for further funding), after which Rob started working on a cool, related project of his own that you should check out, and I, being a bit less brave, starting seeking out a new job. I did consider becoming a full-time musician, but business wasn’t picking up as quickly as I’d hoped it might in that down-time, and with hindsight, I’m glad I didn’t (Covid-19 and all).

I interviewed with a few places, which was certainly an eye-opening experience. The last ‘real’ job interview I did was for Mozilla in 2011, which consisted mainly of talking with engineers that worked there, and working through a few whiteboard problems. Being a young, eager coder at the time, this didn’t really phase me back then. Turns out either the questions have evolved or I’m just not quite as sharp as I used to be in that very particular environment. The one interview I had that involved whiteboard coding was a very mixed bag. It seemed a mix of two types of questions; those that are easy to answer (but unless you’re in the habit of writing very quickly on a whiteboard, slow to write down) and those that were pretty impossible to answer without specific preparation. Perhaps this was the fault of recruiters, but you might hope that interviews would be catered somewhat to the person you’re interviewing, or the work they might actually be doing, neither of which seemed to be the case? Unsurprisingly, I didn’t get past that interview, but in retrospect I’m also glad I didn’t. Igalia’s interview process was much more humane, and involved mostly discussions about actual work I’ve done, hypothetical situations and ethics. They were very long discussions, mind, but I’m very glad that they were happy to hire me, and that I didn’t entertain different possibilities. If you aren’t already familiar with Igalia, I’d highly recommend having a read about them/us. I’ve been there a year now, and the feeling is quite similar to when I first joined Mozilla, but I believe with Igalia’s structure, this is likely to stay a happier and safer environment. Not that I mean to knock Mozilla, especially now, but anyone that has worked there will likely admit that along with the giddy highs, there are also some unfortunate lows.

Igalia

I joined Igalia as part of the team that works on WebKit, and that’s what I’ve been doing since. It almost makes perfect sense in a way. Surprisingly, although I’ve spent overwhelmingly more time on Gecko, I did actually work with WebKit first while at OpenedHand, and for a short period at Intel. While celebrating my first commit to WebKit, I did actually discover it wasn’t my first commit at all, but I’d contributed a small embedding-related fix-up in 2008. So it’s nice to have come full-circle! My first work at Igalia was fixing up some patches that Žan DoberÅ¡ek had prototyped to allow direct display of YUV video data via pixel shaders. Later on, I was also pleased to extend that work somewhat by fixing some vc3 driver bugs and GStreamer bugs, to allow for hardware decoding of YUV video on Raspberry Pi 3b (this, I believe, is all upstream at this point). WebKit Gtk and WPE WebKit may be the only Linux browser backends that leverage this pipeline, allowing for 1080p30 video playback on a Pi3b. There are other issues making this less useful than you might think, but either way, it’s a nice first achievement.

OffscreenCanvas

After that introduction, I was pointed at what could be fairly described as my main project, OffscreenCanvas. This was also a continuation of Žan’s work (he’s prolific!), though there has been significant original work since. This might be the part of this post that people find most interesting or relevant, but having not blogged in over 2 years, I can’t be blamed for waffling just a little. OffscreenCanvas is a relatively new web standard that allows the use of canvas API disconnected from the DOM, and within Workers. It also makes some provisions for asynchronously updated rendering, allowing canvas updates in Workers to bypass the main thread entirely and thus not be blocked by long-running processes on that thread. The most obvious use-case for this, and I think the most practical, is essentially non-blocking rendering of generated content. This is extremely handy for maps, for example. There are some other nice use-cases for this as well – you can, for example, show loading indicators that don’t stop animating while performing complex DOM manipulation, or procedurally generate textures for games, asynchronously. Any situation where you might want to do some long-running image processing without blocking the main thread (image editing also springs to mind).

Currently, the only complete implementation is within Blink. Gecko has a partial implementation that only supports WebGL contexts (and last time I tried, crashed the browser on creation…), but as far as I know, that’s it. I’ve been working on this, with encouragement and cooperation from Apple, on and off for the past year. In fact, as of August 12th, it’s even partially usable, though there is still a fair bit missing. I’ve been concentrating on the 2d context use-case, as I think it’s by far the most useful part of the standard. It’s at the point where it’s mostly usable, minus text rendering and minus some edge-case colour parsing. Asynchronous updates are also not yet supported, though I believe that’s fairly close for Linux. OffscreenCanvas is enabled with experimental features, for those that want to try it out.

My next goal, after asynchronous updates on Linux, is to enable WebGL context support. I believe these aren’t particularly tough goals, given where it is now, so hopefully they’ll happen by the end of the year. Text rendering is a much harder problem, but I hope that between us at Igalia and the excellent engineers at Apple, we can come up with a plan for it. The difficulty is that both styling and font loading/caching were written with the assumption that they’d run on just one thread, and that that thread would be the main thread. A very reasonable assumption in a pre-Worker and pre-many-core-CPU world of course, but increasingly less so now, and very awkward for this particular piece of work. Hopefully we’ll persevere though, this is a pretty cool technology, and I’d love to contribute to it being feasible to use widely, and lessen the gap between native and the web.

And that’s it from me. Lots of non-work related stuff has happened in the time since I last posted, but I’m keeping this post tech-related. If you want to hear more of my nonsense, I tend to post on Twitter a bit more often these days. See you in another couple of years 🙂

My Impossible Story

Keeping up my bi-yearly blogging cadence, I thought it might be fun to write about what I’ve been doing since I left Mozilla. It’s also a convenient time, as it coincides with our work being open-sourced and made public (and of course, developed in public, because otherwise what’s the point, right?) Somewhat ironically, I’ve been working on another machine-learning project, though I’m loathe to call it that, as it uses no neural networks so far, and most people I’ve encountered consider those to be synonymous. I did also go on a month’s holiday to the home of bluegrass music, but that’s a story for another post. I’m getting ahead of myself here.

Some time in March I met up with some old colleagues/friends and of course we all got to chatting about what we’re working on at the moment. As it happened, Rob had just started working at a company run by a friend of our shared former boss, Matthew Allum. What he was working on sounded like it would be a lot of fun, and I had to admit that I was a little jealous of the opportunity… But it so happened that they were looking to hire, and I was starting to get itchy feet, so I got to talk to Kwame Ferreira and one thing lead to another.

I started working for Impossible Labs in July, on an R&D project called ‘glimpse’. The remit for this work hasn’t always been entirely clear, but the pitch was that we’d be working on augmented reality technology to aid social interaction. There was also this video:

How could I resist?

What this has meant in real terms is that we’ve been researching and implementing a skeletal tracking system (think motion capture without any special markers/suits/equipment). We’ve studied Microsoft’s freely-available research on the skeletal tracking system for the Kinect, and filling in some of the gaps, implemented something that is probably very similar. We’ve not had much time yet, but it does work and you can download it and try it out now if you’re an adventurous Linux user. You’ll have to wait a bit longer if you’re less adventurous or you want to see it running on a phone.

I’ve worked mainly on implementing the tools and code to train and use the model we use to interpret body images and infer joint positions. My prior experience on the DeepSpeech team at Mozilla was invaluable to this. It gave me the prerequisite knowledge and vocabulary to be able to understand the various papers around the topic, and to realistically implement them. Funnily, I initially tried using TensorFlow for training, with the mind that it’d help us to easily train on GPUs. It turns out re-implementing it in native C was literally 1000x faster and allowed us to realistically complete training on a single (powerful) machine, in just a couple of days.

My take-away for this is that TensorFlow isn’t necessarily the tool for all machine-learning tasks, and also to make sure you analyse the graphs that it produces thoroughly and make sure you don’t have any obvious bottlenecks. A lot of TensorFlow nodes do not have GPU implementations, for example, and it’s very easy to absolutely kill performance by requiring frequent data transfers to happen between CPU and GPU. It’s also worth noting that a large graph has a huge amount of overhead that will be unrelated to the actual operations you’re trying to run. I’m no TensorFlow expert, but it’s definitely a particular tool for a particular job and it’s worth being careful. Experts can feel free to look at our repository history and tell me all the stupid mistakes I was making before we rewrote it 🙂

So what’s it like working at Impossible on a day-to-day basis? I think a picture says a thousand words, so here’s a picture of our studio:

Though I’ve taken this from the Impossible website, this is seriously what it looks like. There is actually a piano there, and it’s in tune and everything. There are guitars. We have a cat. There’s a tree. A kitchen. The roof is glass. As amazing as Mozilla (and many of the larger tech companies) offices are, this is really something else. I can’t overstate how refreshing an environment this is to be in, and how that impacts both your state of mind and your work. Corporations take note, I’ll take sunlight and life over snacks and a ball-pit any day of the week.

I miss my 3-day work-week sometimes. I do have less time for music than I had, and it’s a little harder to fit everything in. But what I’ve gained in exchange is a passion for my work again. This is code I’m pretty proud of, and that I think is interesting. I’m excited to see where it goes, and to get it into people’s hands. I’m hoping that other people will see what I see in it, if not now, sometime in the near future. Wish us luck!

Goodbye Mozilla

Today is effectively my last day at Mozilla, before I start at Impossible on Monday. I’ve been here for 6 years and a bit and it’s been quite an experience. I think it’s worth reflecting on, so here we go; Fair warning, if you have no interest in me or Mozilla, this is going to make pretty boring reading.

I started on June 6th 2011, several months before the (then new, since moved) London office opened. Although my skills lay (lie?) in user interface implementation, I was hired mainly for my graphics and systems knowledge. Mozilla was in the region of 500 or so employees then I think, and it was an interesting time. I’d been working on the code-base for several years prior at Intel, on a headless backend that we used to build a Clutter-based browser for Moblin netbooks. I wasn’t completely unfamiliar with the code-base, but it still took a long time to get to grips with. We’re talking several million lines of code with several years of legacy, in a language I still consider myself to be pretty novice at (C++).

I started on the mobile platform team, and I would consider this to be my most enjoyable time at the company. The mobile platform team was a multi-discipline team that did general low-level platform work for the mobile (Android and Meego) browser. When we started, the browser was based on XUL and was multi-process. Mobile was often the breeding ground for new technologies that would later go on to desktop. It wasn’t long before we started developing a new browser based on a native Android UI, removing XUL and relegating Gecko to page rendering. At the time this felt like a disappointing move. The reason the XUL-based browser wasn’t quite satisfactory was mainly due to performance issues, and as a platform guy, I wanted to see those issues fixed, rather than worked around. In retrospect, this was absolutely the right decision and lead to what I’d still consider to be one of Android’s best browsers.

Despite performance issues being one of the major driving forces for making this move, we did a lot of platform work at the time too. As well as being multi-process, the XUL browser had a compositor system for rendering the page, but this wasn’t easily portable. We ended up rewriting this, first almost entirely in Java (which was interesting), then with the rendering part of the compositor in native code. The input handling remained in Java for several years (pretty much until FirefoxOS, where we rewrote that part in native code, then later, switched Android over).

Most of my work during this period was based around improving performance (both perceived and real) and fluidity of the browser. Benoit Girard had written an excellent tiled rendering framework that I polished and got working with mobile. On top of that, I worked on progressive rendering and low precision rendering, which combined are probably the largest body of original work I’ve contributed to the Mozilla code-base. Neither of them are really active in the code-base at the moment, which shows how good a job I didn’t do maintaining them, I suppose.

Although most of my work was graphics-focused on the platform team, I also got to to do some layout work. I worked on some over-invalidation issues before Matt Woodrow’s DLBI work landed (which nullified that, but I think that work existed in at least one release). I also worked a lot on fixed position elements staying fixed to the correct positions during scrolling and zooming, another piece of work I was quite proud of (and probably my second-biggest contribution). There was also the opportunity for some UI work, when it intersected with platform. I implemented Firefox for Android’s dynamic toolbar, and made sure it interacted well with fixed position elements (some of this work has unfortunately been undone with the move from the partially Java-based input manager to the native one). During this period, I was also regularly attending and presenting at FOSDEM.

I would consider my time on the mobile platform team a pretty happy and productive time. Unfortunately for me, those of us with graphics specialities on the mobile platform team were taken off that team and put on the graphics team. I think this was the start in a steady decline in my engagement with the company. At the time this move was made, Mozilla was apparently trying to consolidate teams around products, and this was the exact opposite happening. The move was never really explained to me and I know I wasn’t the only one that wasn’t happy about it. The graphics team was very different to the mobile platform team and I don’t feel I fit in as well. It felt more boisterous and less democratic than the mobile platform team, and as someone that generally shies away from arguments and just wants to get work done, it was hard not to feel sidelined slightly. I was also quite disappointed that people didn’t seem particular familiar with the graphics work I had already been doing and that I was tasked, at least initially, with working on some very different (and very boring) desktop Linux work, rather than my speciality of mobile.

I think my time on the graphics team was pretty unproductive, with the exception of the work I did on b2g, improving tiled rendering and getting graphics memory-mapped tiles working. This was particularly hard as the interface was basically undocumented, and its implementation details could vary wildly depending on the graphics driver. Though I made a huge contribution to this work, you won’t see me credited in the tree unfortunately. I’m still a little bit sore about that. It wasn’t long after this that I requested to move to the FirefoxOS systems front-end team. I’d been doing some work there already and I’d long wanted to go back to doing UI. It felt like I either needed a dramatic change or I needed to leave. I’m glad I didn’t leave at this point.

Working on FirefoxOS was a blast. We had lots of new, very talented people, a clear and worthwhile mission, and a new code-base to work with. I worked mainly on the home-screen, first with performance improvements, then with added features (app-grouping being the major one), then with a hugely controversial and probably mismanaged (on my part, not my manager – who was excellent) rewrite. The rewrite was good and fixed many of the performance problems of what it was replacing, but unfortunately also removed features, at least initially. Turns out people really liked the app-grouping feature.

I really enjoyed my time working on FirefoxOS, and getting a nice clean break from platform work, but it was always bitter-sweet. Everyone working on the project was very enthusiastic to see it through and do a good job, but it never felt like upper management’s focus was in the correct place. We spent far too much time kowtowing to the desires of phone carriers and trying to copy Android and not nearly enough time on basic features and polish. Up until around v2.0 and maybe even 2.2, the experience of using FirefoxOS was very rough. Unfortunately, as soon as it started to show some promise and as soon as we had freedom from carriers to actually do what we set out to do in the first place, the project was cancelled, in favour of the whole Connected Devices IoT debacle.

If there was anything that killed morale for me more than my unfortunate time on the graphics team, and more than having FirefoxOS prematurely cancelled, it would have to be the Connected Devices experience. I appreciate it as an opportunity to work on random semi-interesting things for a year or so, and to get some entrepreneurship training, but the mismanagement of that whole situation was pretty epic. To take a group of hundreds of UI-focused engineers and tell them that, with very little help, they should organised themselves into small teams and create IoT products still strikes me as an idea so crazy that it definitely won’t work. Certainly not the way we did it anyway. The idea, I think, was that we’d be running several internal start-ups and we’d hopefully get some marketable products out of it. What business a not-for-profit company, based primarily on doing open-source, web-based engineering has making physical, commercial products is questionable, but it failed long before that could be considered.

The process involved coming up with an idea, presenting it and getting approval to run with it. You would then repeat this approval process at various stages during development. It was, however, very hard to get approval for enough resources (both time and people) to finesse an idea long enough to make it obviously a good or bad idea. That aside, I found it very demoralising to not have the opportunity to write code that people could use. I did manage it a few times, in spite of what was happening, but none of this work I would consider myself particularly proud of. Lots of very talented people left during this period, and then at the end of it, everyone else was laid off. Not a good time.

Luckily for me and the team I was on, we were moved under the umbrella of Emerging Technologies before the lay-offs happened, and this also allowed us to refocus away from trying to make an under-featured and pointless shopping-list assistant and back onto the underlying speech-recognition technology. This brings us almost to present day now.

The DeepSpeech speech recognition project is an extremely worthwhile project, with a clear mission, great promise and interesting underlying technology. So why would I leave? Well, I’ve practically ended up on this team by a series of accidents and random happenstance. It’s been very interesting so far, I’ve learnt a lot and I think I’ve made a reasonable contribution to the code-base. I also rewrote python_speech_features in C for a pretty large performance boost, which I’m pretty pleased with. But at the end of the day, it doesn’t feel like this team will miss me. I too often spend my time finding work to do, and to be honest, I’m just not interested enough in the subject matter to make that work long-term. Most of my time on this project has been spent pushing to open it up and make it more transparent to people outside of the company. I’ve added model exporting, better default behaviour, a client library, a native client, Python bindings (+ example client) and most recently, Node.js bindings (+ example client). We’re starting to get noticed and starting to get external contributions, but I worry that we still aren’t transparent enough and still aren’t truly treating this as the open-source project it is and should be. I hope the team can push further towards this direction without me. I think it’ll be one to watch.

Next week, I start working at a new job doing a new thing. It’s odd to say goodbye to Mozilla after 6 years. It’s not easy, but many of my peers and colleagues have already made the jump, so it feels like the right time. One of the big reasons I’m moving, and moving to Impossible specifically, is that I want to get back to doing impressive work again. This is the largest regret I have about my time at Mozilla. I used to blog regularly when I worked at OpenedHand and Intel, because I was excited about the work we were doing and I thought it was impressive. This wasn’t just youthful exuberance (he says, realising how ridiculous that sounds at 32), I still consider much of the work we did to be impressive, even now. I want to be doing things like that again, and it feels like Impossible is a great opportunity to make that happen. Wish me luck!