Cornell University Innovates with Wristband Utilizing Echoes and AI for Precise Hand Position Tracking in VR and Beyond
Cornell researchers have developed a wristband device that continuously detects hand positioning – as well as objects the hand interacts with – using inaudible soundwaves interpreted by AI.
Potential applications include tracking hand positions for virtual reality (VR) systems, controlling smartphones and other devices with hand gestures, and understanding a user’s activities; for example, a cooking app could narrate a recipe as the user chops, measures and stirs. The technology is small enough to fit onto a commercial smartwatch and lasts all day on a standard smartwatch battery.
EchoWrist is among the newest low-power, body pose-tracking technologies from the Smart Computer Interfaces for Future Interactions (SciFi) Lab. Cheng Zhang, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science, directs the lab.
“The hand is fundamentally important – whatever you do almost always involves hands,” Zhang said. “This device offers a solution that can continuously track your hand pose cheaply and also very accurately.”
Chi-Jung Lee and Ruidong Zhang, both doctoral students in the field of information science and co-first authors, will present the study, “EchoWrist: Continuous Hand Pose Tracking and Hand-Object Interaction Recognition Using Low-Power Active Acoustic Sensing On a Wristband,” at the Association for Computing Machinery CHI conference on Human Factors in Computing Systems (CHI’24), May 11-16.
EchoWrist also lets users control devices with gestures and give presentations. “We can enrich our interaction with a smartwatch or even other devices by allowing one-handed interaction – we could also remotely control our smartphone,” said Lee. “I can just use one-handed gestures to control my slides.”
This is the first time the lab has extended its tech beyond the body, said Ruidong Zhang. “EchoWrist not only tracks the hand itself, but also objects and the surrounding environment.”
The device uses two tiny speakers mounted on the top and underside of a wristband to bounce inaudible sound off the hand and any hand-held objects. Two nearby microphones pick up the echoes, which are interpreted by a microcontroller. A battery smaller than a quarter powers the device.
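The article does not describe the signal processing inside EchoWrist, so the following is only a generic illustration of active acoustic sensing, not the team's method. It simulates one emitted near-ultrasonic chirp and one attenuated echo, finds the echo's delay by cross-correlation, and converts that delay to a distance. The sampling rate, chirp frequencies, attenuation and all function names are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature
SAMPLE_RATE_HZ = 50_000     # hypothetical sampling rate, not taken from the article


def make_chirp(n_samples, f_start_hz, f_end_hz):
    """Return a linear frequency sweep of n_samples samples."""
    duration_s = n_samples / SAMPLE_RATE_HZ
    sweep_rate = (f_end_hz - f_start_hz) / duration_s
    chirp = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE_HZ
        phase = 2 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        chirp.append(math.sin(phase))
    return chirp


def simulate_echo(chirp, delay_samples, total_samples, attenuation=0.3):
    """Return a microphone signal holding one delayed, weakened copy of the chirp."""
    received = [0.0] * total_samples
    for i, value in enumerate(chirp):
        received[delay_samples + i] += attenuation * value
    return received


def estimate_delay(chirp, received):
    """Return the lag (in samples) where the cross-correlation is largest."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(received) - len(chirp) + 1):
        score = sum(c * received[lag + i] for i, c in enumerate(chirp))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_distance_m(delay_samples):
    """Convert a round-trip delay in samples to a one-way distance in meters."""
    return SPEED_OF_SOUND_M_S * (delay_samples / SAMPLE_RATE_HZ) / 2


if __name__ == "__main__":
    chirp = make_chirp(100, 20_000, 24_000)  # sweep above typical hearing range
    received = simulate_echo(chirp, delay_samples=29, total_samples=200)
    lag = estimate_delay(chirp, received)
    print(lag, delay_to_distance_m(lag))
```

In this toy setup, a 29-sample delay corresponds to a reflecting surface roughly 10 centimeters away. A real hand produces many overlapping echoes rather than one clean copy, which is presumably why a learned model is a better fit than a simple peak search.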
The team developed a type of artificial intelligence model inspired by neurons in the brain, called a neural network, to interpret a user’s hand posture based on the resulting echoes. To train the neural network, they compared echo profiles and videos of users making various gestures and reconstructed the positions of 20 hand joints based on the sound signals.
With help from 12 volunteers, the researchers tested how well EchoWrist detects objects such as a cup, chopsticks, water bottle, pot, pan and kettle, and actions like drinking, stirring, peeling, twisting, chopping and pouring. Overall, the device recognized these objects and actions with 97.6% accuracy. This capability makes it possible for users to follow interactive recipes that track the cook’s progress and read out the next step – so cooks can avoid getting their screens dirty.
Compared with FingerTrak, a previous hand-tracking technology from the SciFi Lab that used cameras, EchoWrist is much smaller and consumes significantly less energy.
“An important added benefit of acoustic tracking is that it really enhances users’ privacy while providing a similar level of performance as camera tracking,” said co-author François Guimbretière, professor of information science in Cornell Bowers CIS and the multicollege Department of Design Tech.
The technology could be used to reproduce hand movements for VR applications. Existing VR and augmented reality systems accomplish this task using cameras mounted on the headset, but this approach uses a lot of power and can’t track the hands once they leave the headset’s limited field of view.
“One of the most exciting applications this technology would enable is to allow AI to understand human activities by tracking and interpreting the hand poses in everyday activities,” Cheng Zhang said.
Researchers noted, however, that EchoWrist still struggled to distinguish between objects with highly similar shapes, such as a fork and a spoon. The team is confident that object recognition will improve as they refine the technology. With further optimization, they believe EchoWrist could easily be integrated into an existing off-the-shelf smartwatch.
Additional authors on the paper include Devansh Agarwal, an Ignite Fellow at Cornell’s Center for Technology Licensing; Tianhong Catherine Yu, Ke Li and Mose Sakashita, all doctoral students in the field of information science; undergraduates Vipin Gunda ’25, Oliver Lopez ’24 and James Kim ’25; Sicheng Yin, an undergraduate at the University of Edinburgh; and Boao Dong ’23, M.Eng. ’23.
Funding for the project is from the National Science Foundation.