
A paper from Tsinghua HCI Group wins CHI 2023 Honorable Mention Award

May 12, 2023 10:09

A paper from the Tsinghua HCI Group, "Enabling Voice-Accompanying Hand-to-Face Gesture Recognition with Cross-Device Sensing," recently won a CHI 2023 Honorable Mention Award.

Voice interaction is a natural and always-available interaction modality on wearable devices such as headphones and smart watches. However, because modality information is only implicit in speech, and natural language understanding technology is still limited, modality control in voice interaction (such as distinguishing the wake-up state) remains a challenging problem. Users have to repeat wake-up words to switch modes or target devices, which adds extra burden to the interaction.



In this paper, the authors investigated voice-accompanying hand-to-face (VAHF) gestures for voice interaction. They targeted hand-to-face gestures because such gestures relate closely to speech and yield distinctive acoustic features (e.g., impeding voice propagation). They conducted a user study to explore the design space of VAHF gestures: they first gathered candidate gestures and then applied a structural analysis along several dimensions (e.g., contact position and contact type), arriving at a set of 8 VAHF gestures with good usability and minimal mutual confusion.


To facilitate VAHF gesture recognition, they proposed a novel cross-device sensing method that fuses heterogeneous channels of data (vocal, ultrasound, and IMU) from commodity devices (earbuds, watches, and rings). Their recognition model achieved an accuracy of 97.3% for recognizing 3 gestures and 91.5% for recognizing 8 VAHF gestures, demonstrating the method's high applicability.
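To give a rough intuition for cross-device fusion, the sketch below concatenates per-channel feature vectors (vocal, ultrasound, IMU) into one fused vector and classifies it with a simple nearest-centroid rule. This is an illustration only: the feature dimensions, the toy data, and the classifier are all assumptions for this sketch, not the authors' actual recognition model.

```python
import math
import random

def fuse_channels(vocal, ultrasound, imu):
    """Concatenate per-channel feature vectors into one fused vector."""
    return list(vocal) + list(ultrasound) + list(imu)

def fit_centroids(samples, labels):
    """Compute one mean feature vector (centroid) per gesture label."""
    centroids = {}
    for label in set(labels):
        vecs = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = [sum(col) / len(col) for col in zip(*vecs)]
    return centroids

def predict(centroids, fused):
    """Predict the gesture whose centroid is nearest to the fused vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], fused))

# Toy training data: 3 well-separated gesture classes. The per-channel
# feature dimensions (8 / 4 / 6) are arbitrary assumptions.
rng = random.Random(0)
samples, labels = [], []
for gesture in range(3):
    for _ in range(10):
        vocal = [rng.gauss(gesture, 0.1) for _ in range(8)]   # acoustic features
        ultra = [rng.gauss(gesture, 0.1) for _ in range(4)]   # ultrasound features
        imu = [rng.gauss(gesture, 0.1) for _ in range(6)]     # motion features
        samples.append(fuse_channels(vocal, ultra, imu))
        labels.append(gesture)

centroids = fit_centroids(samples, labels)
query = fuse_channels([1.0] * 8, [1.0] * 4, [1.0] * 6)  # resembles gesture 1
result = predict(centroids, query)
```

In the actual system, each channel would come from a different commodity device (earbud microphone, watch or ring IMU, ultrasound sensing), and the recognition model would be far richer than a nearest-centroid rule; the point of the sketch is only that heterogeneous channels can be fused into a single feature representation before classification.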

Finally, they discussed interaction designs enabled by VAHF gestures, including more flexible triggering and interruption of voice interaction, shortcut-key binding, and directional binding of visual information. They hope their work will promote more intelligent voice interaction that uses parallel information, such as gestures and body movements, as additional input channels.

The authors of the paper are Zisu Li, Chen Liang, Yuntao Wang, Yue Qin, Chun Yu, Yukang Yan, Mingming Fan, and Yuanchun Shi. To learn more about their work, please visit the lab's webpage (https://pi.cs.tsinghua.edu.cn/).