Why 'gestures' suck

I've not blogged in a while, and though I've said I'd try to make my blog less of a platform for public bitching and whining, I figure it's Christmas, I should get to do what I want. So this is a blog post on why all 'gestures' in applications suck, 'gestures' are always a bad idea and if you're implementing 'gestures' in your application, you're doing it wrong. Of course, this is all my personal opinion and I've done only the most cursory amount of HCI study, so take it with a pitcher of salt.

Great user-interfaces are made great by working on a user's familiarities. This makes a lot of sense. If someone designs an icon to represent an action, they find the nearest every-day analogy that has a clear and identifiable visual, and base it off of that. Mail icons involve envelopes, print icons involve printers, search icons involve magnifying glasses (ok, that last one relies pretty heavily on cultural knowledge which is probably questionable nowadays, but bear with me). This should follow on to all aspects of HCI. People will find things easier if they can apply a skill they already have, or they can relate it to something they're already familiar with.

Touch-screens are becoming a much more common input-device these days, and they're one I've been interested in for a very, very long time. Now that they're becoming more common, more people are trying to retro-fit their applications to work better with this new interface. And this seems to be where 'gestures' come in. People see pinch-to-zoom, or dragging on the iPhone/Pad/Pod (and I'm just going to reference those since, as far as I'm concerned, they're the only devices that have gotten touch-interaction close to being right), and they seem to think "Hey, that's cool, I should put those actions in my application!" STOP.

I have a newsflash - and I'm sure this is just pointless ranting for a lot of people, but I'll say it anyway - pinch-to-zoom and dragging are not 'gestures'. They are physical manipulations that have a logical result. You don't 'execute a pinch-to-zoom gesture' when you zoom in on a web-page or photo on an iPad. You put two fingers on the screen and you move them closer or further apart, because it makes physical sense. When you put your finger on the surface, it responds instantly and with minimal latency - it immediately establishes that placing your finger on this surface attaches your finger to that point on the surface. From there, pinch-to-zoom makes perfect sense and follows logically. These aren't 'gestures', these are direct and logical manipulations of a surface. And that works. Having instant and reliable response to an action is a very powerful device.
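To make the 'attached to the surface' idea concrete, here's a minimal sketch of pinch-to-zoom treated as direct manipulation. It's my own illustration rather than how any particular toolkit does it: the function name `pinch_transform` and the tuple-based touch points are assumptions, the view is assumed to start untransformed, and rotation is ignored. The point is that the zoom is computed from where the fingers are right now, on every touch event, rather than being triggered once a 'gesture' has been recognised.

```python
import math


def pinch_transform(start_a, start_b, now_a, now_b):
    """Return (scale, offset) for the mapping screen = content * scale + offset.

    start_a/start_b are where the two fingers first touched, now_a/now_b are
    where they are at this instant. Hypothetical helper: assumes the view
    started at scale 1 with no offset, and does not handle rotation.
    """
    start_dist = math.dist(start_a, start_b)
    if start_dist == 0:
        raise ValueError("fingers must start at two distinct points")

    # The content scales by exactly as much as the fingers moved apart.
    scale = math.dist(now_a, now_b) / start_dist

    # The content point that was midway between the fingers must stay
    # midway between them, so solve now_mid = start_mid * scale + offset.
    start_mid = ((start_a[0] + start_b[0]) / 2, (start_a[1] + start_b[1]) / 2)
    now_mid = ((now_a[0] + now_b[0]) / 2, (now_a[1] + now_b[1]) / 2)
    offset = (now_mid[0] - start_mid[0] * scale,
              now_mid[1] - start_mid[1] * scale)
    return scale, offset


def to_screen(point, scale, offset):
    """Map a content point to the screen with the transform above."""
    return (point[0] * scale + offset[0], point[1] * scale + offset[1])
```

If you call this on every move event and redraw straight away, whatever was under each finger when it touched down is still under it afterwards (as long as the fingers move along the line between them), which is exactly the property that makes it feel physical.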

If you're a gestures fan, you may now be thinking "Well, the difference is academic, surely?" and I would disagree very strongly with that. A gesture, by definition, is when you make a movement to express an idea. With a gesture, it's ok that you would do one thing, and then, afterwards, something happens. With a gesture, it's ok that whatever gesture you make, what follows may not be directly linked with that gesture. And this is often the feeling you get when you use an application that has 'gestures'. You make a gesture, and then, after the application has considered things, it does something. There is no guarantee that what you do will have an instant and well-defined reaction. And as long as we continue to call these actions 'gestures', this will always be ok, because this is the definition of a gesture. A gesture does not imply any kind of reaction, or make any implications about latency or reliability.

I bring this up now, as my Android phone (see, I'm not an Apple fanboy!) recently updated to the latest Android market, and this is a damn good example of bad HCI (and bad several other things too, but I want to focus my bitching). For those that have the application, open it up and check this out - There's a carousel at the top of the application. You can drag this to scroll it, and when you release, it sort-of maintains your momentum and sets it spinning. Except there's a problem (which is why I said sort-of) - When I drag it, there's no relation between where my finger is and what's under my finger. I'm not physically dragging the carousel, I'm performing a 'drag gesture'. Similarly, when I perform a quick drag gesture and I let go, there's a small pause, and then the carousel starts spinning with the momentum I gave it - except it isn't the momentum I gave it, it's a similar, but not quite right, momentum. The list at the bottom of the application is better (due to it being a stock scrolling widget I imagine), though not much, because they seem to do blocking I/O while you're dragging, breaking the direct relation between your physical interaction and the on-screen response.
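For contrast, here's roughly what I'd expect a carousel like that to do instead. This is only a sketch under my own assumptions, not anything taken from the Market app or the Android APIs: the `Carousel` class, its method names and the `friction` value are all made up for illustration. While the finger is down the content moves by exactly as much as the finger did, and on release it carries on at the speed the finger was actually moving, with no pause in between.

```python
import math


class Carousel:
    """Hypothetical 1-D carousel that tracks the finger 1:1, then coasts."""

    def __init__(self, friction=3.0):
        self.offset = 0.0         # scroll position, in pixels
        self.velocity = 0.0       # pixels per second
        self.friction = friction  # assumed decay rate, per second
        self._last = None         # (x, t) of the latest touch sample

    def touch_down(self, x, t):
        # Grabbing the surface stops any coasting immediately.
        self.velocity = 0.0
        self._last = (x, t)

    def touch_move(self, x, t):
        last_x, last_t = self._last
        dx = x - last_x
        self.offset += dx  # content moves exactly as far as the finger
        if t > last_t:
            self.velocity = dx / (t - last_t)  # remember the real finger speed
        self._last = (x, t)

    def touch_up(self):
        # Keep the measured velocity; the next step() continues from it.
        self._last = None

    def step(self, dt):
        """Advance the coasting animation by dt seconds."""
        if self._last is not None:
            return  # the finger is in charge while it is down
        self.offset += self.velocity * dt
        self.velocity *= math.exp(-self.friction * dt)
```

A real implementation would probably average the velocity over the last few samples rather than trusting only the final one, but the principle is the same: nothing blocks, nothing waits for the movement to 'finish', and the momentum is the momentum you gave it.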

I don't mean to pick on Android Market especially, as it's something you can see in touch-based interfaces all over the place (Android is bad, but feature-phones are often far worse). But in my eyes, this sort of thing shouldn't be acceptable. Apple proved that it isn't that hard several years ago now - it's not an innovation anymore, someone's gone and done it - we can just copy them!

So, if you have an application that you expect to work on a touch-screen, or you're planning on writing one, think first, "What physical analogy am I making here?" What common familiarity are you taking advantage of? And if your application involves taking advantage of the fact that most people are used to manipulating things with their hands, then do try to realise just how important making the feedback instant, reliable and logical is. Then realise that you must NOT call these physical interactions 'gestures'.

Dylan McCall says:

Thank you, Chris. You have described my feelings about gestures, too. And you have done it far better than I could.

Android is a strange case. Some apps do this beautifully (including many of Google's own), while others are just beyond idiotic. One of my rules is that I will delete, without prejudice, any app that executes a slide movement only after I have moved my finger from one side of the screen to another and lifted it; as if that vast motion is a single event. How a developer could possibly see that as a good idea (or something that isn't a problem) is incredible.
Alas, this has left me without an IRC client on my phone, but I will cope.
As for the Android market: there are /so/ many things wrong with that one carousel…

That's why I was really happy that it seems X's touch input work is moving away from global gestures, towards touch events being interpreted (almost) entirely within each client. There may be hope for us.

I think we can do a lot to facilitate natural touch input that doesn't use gestures, so we still get the shiny APIs without the kludges. For example, a basic verlet cage simulation could be applied towards the classic photo gallery demo where everything is implicitly scalable, rotatable and movable — with momentum and limits — all under the same simple principle. Imagine if everything shared the same simulation code, and different objects across the environment just needed unique weights and friction to behave appropriately…
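As a rough illustration of that 'same simulation, different weights and friction' idea, here is a minimal sketch. It is not a full verlet cage, just a single point advanced with position Verlet integration, and the names `Body` and `verlet_step` and all the numbers are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Body:
    """Hypothetical draggable object: only mass and friction differ."""
    x: float
    prev_x: float
    mass: float = 1.0
    friction: float = 0.1  # fraction of velocity lost on each step


def verlet_step(body, force, dt):
    """Advance one body by one time step using position Verlet.

    Velocity is never stored; it is implied by the difference between
    the current and the previous position.
    """
    implied_motion = (body.x - body.prev_x) * (1.0 - body.friction)
    new_x = body.x + implied_motion + (force / body.mass) * dt * dt
    body.prev_x = body.x
    body.x = new_x
```

Every object would go through the same `verlet_step`, so a heavy photo and a light one respond to the same push differently without any object-specific code.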

Having said that, even Apple uses gestures, as I understand it. They just have a really fancy API for them. They have all those patents about them, after all. If we did all of this procedurally (and used nicely placed on-screen controls for things that don't work procedurally, which is a nice indication that they Don't Make Sense) we don't need to think about patents ;)

Adam Williamson says:

I like most of this post, but you damage it by starting off from a really bad example. Icons are horrible UI and should pretty much die and never return. The problem is that they're never, ever, ever universal, and they're misunderstood almost more often than they're understood. You might associate an envelope with mail...but then again, you might also associate it with addresses, and think it's the address book. You might see the address book icon and figure it's a notepad. A printer is pretty much impossible to draw in a reasonable icon size in such a way that anyone actually looks at it and goes 'oh hey, that's a printer'; no-one does that. We all just learned in the normal stupid painful way human brains learn that this particular, more or less arbitrary, grey squiggle is the 'print' icon. It's fucking stupid.

Remember a few years back when Nokia phones had completely icon-based menus? Noticed what all Nokia phones have now? Labels on their bloody icons (except those four on the home screen which people have mostly worked out by now). Notice even the iPhone's 'icon-based' menu interface has a label on every icon - because you will never be able to just figure out what each of those icons actually is. After a while you can launch the app you want without reading off the label but only because you've learned where it is, and your brain has finally remembered to associate this particular squiggle with this particular action; not because of any innate link between the two, but just because you went from that icon to that action enough times for the link to be established.

No, I think probably the best thing Microsoft did for a while was to realize, with the WP7 interface, that icons are stupid and should die. I don't like a lot of other stuff about WP7, but they were bang on with that.

Someone posted a tip to Planet GNOME a while back on how to disable icons just about everywhere in GNOME, and just use text labels instead; I've been doing that on my systems ever since. Saves space and makes it a hell of a lot easier to use, well, just about every app.

Other than that, good article =)

Ortwin says:

I agree with the general point. However, webOS actually manages to get some gestures right. The system wide "back", "forward" and application switching gestures are well done. Other gestures are a mixture of gesture and physical interaction, like opening the launcher drawer or bringing up the quick launch wave. Without this stuff, the beautiful multitasking UI wouldn't be possible. It's much more fun to navigate than the constant tap-tap-tap of the iPhone or most other smartphone interfaces.

Tom Ate says:

Fuck off asshole go do a MSc in Interface design gestures are a really smart way to control electronic devices.

Ben Shneiderman says:

Useful distinction... I think you make a good point... and of course I like it as it extends the direct manipulation notion nicely.
-- Ben Shneiderman

Crill Byorgson says:

Please post more fried chicken stories. If you don't have any more, then fried turkey stories will be OK thank you.

r0s says:

From the view of an Android user I think the scrolling is more than OK, I like it, I don't see any problem with it.

lamapper says:

I am not a UI expert, but even I see the intelligence behind having the movement on the screen stop when your finger stops and the idiocy of having the spinning not start until after your finger has completed the motion.

I do like the idea that Dylan mentioned about giving things "weights" and having them interact based on the movement and those weights...seems so intuitive and obvious. 

I enjoyed your post!

June says:

Tom Ate - I suggest that you do a GCSE in English!

Mike Dobson says:

Personally I think gestures can be useful, sometimes funny, sometimes sad.  They don't have much place in controlling electronic devices because electronic devices are inanimate things with no feelings.

How can an inanimate thing understand gestures? Such a stupid concept, and someone, usually a wanker type, is going to refer to it as a 'technology'.

Nostrum says:

People who pick their noses and eat the material without first washing their hands are filthy.

Calum says:

"Notice even the iPhone's 'icon-based' menu interface has a label on every icon - because you will never be able to just figure out what each of those icons actually is."

That's not really the whole reason.  For memorability, studies usually show that the combination of icon+label works better than either icons alone OR text alone.  So the combination of both isn't a cop-out, it's genuinely the best solution.
