Tuesday, January 22, 2008

Flexible Gesture Recognition for Immersive Virtual Environments - Deller, Ebert, et al

Summary
Deller et al. create a framework for interaction with a data glove, analogous to the LADDER framework for geometric sketches. They use a P5 data glove for their system, though it is adaptable to other hardware. The data glove provides hand position and orientation information as well as finger flexion, and it has several buttons for additional input. Gestures are defined as a sequence of postures and orientations rather than as motions over time. Postures rely mainly on the flexion of the fingers, though orientation may be important as well; posture information therefore contains both flexion and orientation, along with a relevance value for orientation. New postures are defined by example: the user simply moves their hand into the correct position to define the posture. Alternately, variations of the posture can be input to create an average posture.

Recognition is divided into two phases: data acquisition and gesture management. Because the data glove is very noisy, the data must be filtered to obtain adequate values. First a deadband filter is applied and extreme changes are rejected; then a dynamic average is taken to smooth the data. Next, matching posture candidates are found in the posture library, and if the posture is held briefly, a PostureChanged event is created containing both the previous and current posture as well as position and orientation. GloveMove and ButtonPressed events are created when the glove position changes enough or a button is pressed. Gesture management matches posture data to stored postures by treating the flexion values as a five-dimensional vector and finding the closest stored posture. If the posture is close enough to the stored one and the orientation requirement is satisfied, it is assigned to that posture class. Gestures are defined as a sequence of one or more postures, and the sequence of past postures is matched against possible gestures.

The gesture system was demonstrated using a virtual desktop. Users naturally interacted with the environment, grasping objects by making a fist or pointing at objects to examine them more closely.
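Since the summary walks through the filtering and posture-matching steps in prose, a minimal sketch may help make them concrete. This is not the authors' code: the `Posture` class, the thresholds, the `filter_sample` helper, and the `orientation_error` callback are all assumptions, and the paper's orientation relevance value is reduced here to a simple boolean flag.

```python
import math

# Assumed thresholds, not values from the paper.
FLEXION_THRESHOLD = 0.15      # max allowed distance to a stored posture
ORIENTATION_THRESHOLD = 30.0  # max orientation deviation in degrees


class Posture:
    def __init__(self, name, flexion, orientation, orientation_relevant):
        self.name = name
        self.flexion = flexion                # five finger-flexion values in [0, 1]
        self.orientation = orientation        # reference hand orientation
        self.orientation_relevant = orientation_relevant  # must orientation match?


def filter_sample(previous, raw, deadband=0.02, alpha=0.3):
    """Rough stand-in for the deadband filter plus dynamic average the paper
    describes: ignore changes inside a small deadband, otherwise blend the new
    reading with the previous value."""
    filtered = []
    for prev, value in zip(previous, raw):
        if abs(value - prev) < deadband:
            filtered.append(prev)                           # jitter: keep old value
        else:
            filtered.append(prev + alpha * (value - prev))  # smooth larger changes
    return filtered


def flexion_distance(a, b):
    """Euclidean distance between two five-dimensional flexion vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_posture(flexion, orientation, library, orientation_error):
    """Return the closest stored posture, or None if nothing matches well enough.

    `orientation_error(a, b)` is assumed to return an angular deviation in degrees.
    """
    best, best_dist = None, float("inf")
    for posture in library:
        dist = flexion_distance(flexion, posture.flexion)
        if dist < best_dist:
            best, best_dist = posture, dist

    if best is None or best_dist > FLEXION_THRESHOLD:
        return None
    if best.orientation_relevant and \
            orientation_error(orientation, best.orientation) > ORIENTATION_THRESHOLD:
        return None
    return best
```

In this reading, gesture management would then simply compare the sequence of classified postures against each stored gesture's posture sequence, which is why the per-posture matching above carries most of the recognition work.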

Discussion
Though the recognition approach seems relatively simple, the authors do not test its accuracy extensively. Also, their demonstration uses only a handful of postures, all of which seem fairly distinct, making posture recognition easy. It would be more interesting to see how accurate posture recognition is for a more expansive posture set, such as the sign language application the authors mention. A more robust posture recognizer may be required in the face of a greater number of possibly ambiguous postures.

Reference
Deller, M., Ebert, A., et al. (2006). Flexible Gesture Recognition for Immersive Virtual Environments. In Proceedings of the Tenth International Conference on Information Visualization (IV 2006).

3 comments:

D said...

To me this doesn't sound like LADDER. LADDER recognized stuff accurately at the low level, and then put primitives together to form complex shapes. To me, this seems like an implementation of $1 for the glove, possibly a Rubine-type algorithm.

I agree with your analysis of the experimentation. I would like to see them actually test their system and give numbers. I also wonder how the system would do with fluid, moving gestures. Probably not good, since it's built around 'postures'.

Brandon said...

"It would be more interesting to see how accurate posture recognition is for a more expansive posture data set" - It would be interesting to know how accurate the system is already.

As far as "expansive" posture sets go, I don't imagine there can be too many postures since they used a P5 glove (only 5 sensors).

Grandmaster Mash said...

I'd also like to just see the postures they were working with. That would at least show me roughly how simple the set of postures is, and which postures the author deems "natural".