Saturday, May 3, 2008

Simultaneous Gesture Segmentation and Recognition based on Forward Spotting Accumulative HMMs

Song and Kim modify the standard HMM approach by dividing the observation sequence into blocks with a sliding window. Each block of a gesture is used to train the corresponding HMM. Each HMM then recognizes partial, accumulative segments of the gesture within the block (scoring [o1], then [o1, o2], and so on), and a gesture is recognized by majority voting over the block. They determined that the optimal window size was 3. After a gesture is selected from the set of gesture HMMs, its likelihood is compared either to a manually set threshold or to the output of an HMM trained on non-gestures. If the gesture HMM's probability exceeds the threshold (or the non-gesture HMM's output), the segment is classified as a gesture. Testing demonstrates that the non-gesture HMM spots gestures more accurately than manual thresholding.
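The spotting loop might be sketched as follows. This is a toy illustration, not the authors' implementation: the scoring functions stand in for trained HMM forward scores, and the model names and prototypes are invented for the example.

```python
import numpy as np

WINDOW = 3  # the paper found a sliding window of size 3 optimal


def spot_gesture(obs, gesture_models, non_gesture_model):
    """Forward-spotting sketch: score accumulative sub-segments of the
    current window with each gesture model, pick a candidate by majority
    vote, then accept it only if it beats the non-gesture model.

    `gesture_models` maps a label to a scoring function returning a
    log-likelihood for a sub-segment (a hypothetical stand-in for a
    trained HMM's forward score)."""
    window = obs[-WINDOW:]
    votes = []
    # Score the accumulative segments [o1], [o1, o2], [o1, o2, o3], ...
    for t in range(1, len(window) + 1):
        segment = window[:t]
        scores = {g: model(segment) for g, model in gesture_models.items()}
        votes.append(max(scores, key=scores.get))
    # Majority vote over the accumulative scores selects the candidate
    winner = max(set(votes), key=votes.count)
    # Accept only if the candidate beats the non-gesture (garbage) model
    if gesture_models[winner](window) > non_gesture_model(window):
        return winner
    return None


# Toy stand-ins: score by how close the segment mean is to a prototype
models = {
    "raise_left":  lambda seg: -abs(np.mean(seg) - 1.0),
    "raise_right": lambda seg: -abs(np.mean(seg) + 1.0),
}
non_gesture = lambda seg: -0.5  # fixed garbage score
```

With these stand-ins, an observation window near the `raise_left` prototype is accepted, while one near zero loses to the non-gesture model and is rejected as a non-gesture.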

Discussion
The gestures used in the experiment are very simple, mostly lifting one arm or the other. Template matching could probably achieve similar results while being much less complex to implement.

Reference
Jinyoung Song and Daijin Kim, "Simultaneous Gesture Segmentation and Recognition based on Forward Spotting Accumulative HMMs," Proc. ICPR, pp. 1231-1235, 2006.
