2007-01-22

Gesture Description Languages for the Wii, iPhone

With the recent introduction of the Nintendo Wii and the great iPhone demo at Macworld, I became really curious about how these new gesture-based UIs can be programmed.

Event-driven UI programming is fairly straightforward. When the user clicks a button, moves a joystick, or presses the pen down on the surface, your component or event handler receives a message, and its type and parameters usually describe the user's actual intention quite directly.
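To make the classic model concrete, here is a minimal sketch of an event dispatcher. The names (`Event`, `Dispatcher`, `button_click`, `button_id`) are made up for illustration and do not come from any particular UI toolkit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    """A UI message: a type plus the parameters that describe it."""
    kind: str
    params: Dict[str, object] = field(default_factory=dict)


class Dispatcher:
    """Routes each event to the handlers registered for its kind."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = {}

    def on(self, kind: str, handler: Callable[[Event], None]) -> None:
        # Register a handler for one event kind (hypothetical API).
        self._handlers.setdefault(kind, []).append(handler)

    def emit(self, event: Event) -> int:
        # Call every handler for this kind; return how many were called.
        handlers = self._handlers.get(event.kind, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


if __name__ == "__main__":
    ui = Dispatcher()
    ui.on("button_click", lambda e: print("clicked:", e.params["button_id"]))
    ui.emit(Event("button_click", {"button_id": "ok"}))
```

The point is that the event kind alone already tells the handler what the user meant, which is exactly what raw gesture data does not do.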

I'm sure it's much more difficult with both devices. The Wii controller has several sensors to capture your arm gestures in 3D space and time, and the iPhone's multi-touch interface must have a much more varied event model than a plain pen-based touch interface.

At the sensor level, there's nothing particularly complex: these sensors produce some sort of digital reading that's ready to be analyzed. However, there must be some sort of abstraction layer that is easier to consume than the raw sensor readings. After all, these input devices (the Wiimote and the iPhone multi-touch digitizer) are meant to capture gestures, not just a particular change on a sensor. The Wii game developer wants to know whether the player performed a proper throw in the baseball game, with the proper aim and force, and the iPhone developer wants to know whether the user "pinched" an image on the screen to make it appear smaller.
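As a rough illustration of the gap between raw readings and a gesture, here is a sketch that turns two pairs of touch coordinates into a "pinch" decision. This is my own guess at the idea, not Apple's API: the function names, the coordinate format, and the 10% tolerance are all assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def pinch_scale(start: Tuple[Point, Point], end: Tuple[Point, Point]) -> float:
    """Ratio of the finger distance at the end to the distance at the start."""
    d_start = math.dist(start[0], start[1])
    d_end = math.dist(end[0], end[1])
    if d_start == 0:
        raise ValueError("the two starting touches are at the same point")
    return d_end / d_start


def classify_pinch(scale: float, tolerance: float = 0.1) -> str:
    """Name the gesture, ignoring changes smaller than the tolerance."""
    if scale < 1 - tolerance:
        return "pinch_in"   # fingers moved together: shrink the image
    if scale > 1 + tolerance:
        return "pinch_out"  # fingers moved apart: enlarge the image
    return "none"


if __name__ == "__main__":
    # Two fingers start 100 units apart and end 50 units apart.
    scale = pinch_scale(((0, 0), (100, 0)), ((25, 0), (75, 0)))
    print(scale, classify_pinch(scale))
```

Even this toy version needs a tolerance so that jittery fingers don't register as a pinch, which hints at how much tuning a real implementation would need.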

Relying on individual sensor readings would be horrific in terms of development time. On the other hand, providing only a handful of simplified events, like "THROW_TO_SCREEN" on the Wii or "LOWER_CORNER_PINCHED" on the iPhone, would limit what developers can do to take advantage of the new controls.

So I guess Nintendo and Apple have some sort of abstract gesture descriptions that can be mapped to sensor readings, combining the output of the individual sensors (Wiimote held upwards, facing the screen), the changes over time (controller swung within a 2-second window), and a threshold to accommodate small differences (for example, a kid with smaller hands will produce a smaller motion on the controller).
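Here is how I imagine such a description might look in code. This is purely speculative: the field names, units, and numbers are invented for illustration, and nothing here reflects what Nintendo or Apple actually ship.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One hypothetical controller reading."""
    t: float          # time in seconds
    pitch_deg: float  # 90 = pointing straight up, 0 = level
    accel_g: float    # magnitude of acceleration, in g


@dataclass(frozen=True)
class GestureDescription:
    """A declarative gesture: start pose, time window, force, tolerance."""
    name: str
    min_start_pitch_deg: float  # "held upwards" at the start
    max_duration_s: float       # whole motion must fit in this window
    min_peak_accel_g: float     # how hard the swing must be
    tolerance: float            # fraction by which limits are relaxed


def matches(desc: GestureDescription, samples: Sequence[Sample]) -> bool:
    """Check a recorded motion against the description."""
    if not samples:
        return False
    relax = 1 - desc.tolerance
    if samples[0].pitch_deg < desc.min_start_pitch_deg * relax:
        return False  # controller was not held upwards at the start
    if samples[-1].t - samples[0].t > desc.max_duration_s:
        return False  # motion took too long
    peak = max(s.accel_g for s in samples)
    return peak >= desc.min_peak_accel_g * relax


if __name__ == "__main__":
    throw = GestureDescription("throw", 60.0, 2.0, 2.0, tolerance=0.25)
    kid_swing = [Sample(0.0, 70.0, 1.0), Sample(0.3, 20.0, 1.6),
                 Sample(0.6, -10.0, 1.1)]
    print(matches(throw, kid_swing))
```

With a tolerance of 0.25, a weaker swing that peaks at 1.6 g still counts as a throw; with a tolerance of 0 it would be rejected. A real system would presumably match the shape of the whole motion rather than just its start and peak.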

I have no doubt that these gesture-driven user interfaces work well in this context. In fact, I absolutely love playing with the Nintendo Wii, and I will likely not be able to resist the iPhone when it comes out. As a developer, though, I wonder how these new, more complex user input paradigms are exposed to developers.