Lorena Anderson, UC Merced
A computer science lab focused on making human-computer interaction easier for people of all abilities has developed a digital lip-reader complete with its own repair system so the software can continue learning from its user.
LipType, a new invention from professor Ahmed Sabbir Arif and his lab, the Human-Computer Interaction Group, lets people send texts or emails on their computers and mobile devices and have contact-free interactions with public devices such as ATMs or other kiosks, without speaking aloud.
There are other lip-readers, but they are not widely used because they are slow and often faulty, Arif said.
“There are a lot of errors in talk-to-text, especially in noisy places, or for people with speech impairments or those who aren’t native speakers,” he said. “But LipType works for anyone. People might need to send a private message while in a public space, or in a meeting, and with LipType, they could just ‘say’ the words without making a sound.”
He and his students added filters for different lighting conditions and a mistake-corrector based on different language models. In their tests, LipType was significantly faster and made fewer errors than other lip-readers.
To go along with the software testing, Arif’s lab conducted a social study to see whether people liked the technology and would use it. They reached out to students and people in the community, including people with disabilities, and conducted an online survey. People who tried it in various tests overwhelmingly said they would use it.
“People with impairments are often concerned about standing out,” Arif said. “This is one way to increase access to mobile and other devices for them in a way that they won’t draw attention to themselves.”
The social study found that people are willing to use silent speech in public places, even when it is not as accurate as other methods. They feel the software preserves their privacy and security and allows them to do what they need to do without disturbing others around them.
Computer Science and Engineering graduate student Laxmi Pandey, who works with Arif, said she is excited about the results of the tests.
“LipType performed 58 percent faster and 53 percent more accurately than other state-of-the-art models in various real-world settings, including poor lighting and busy market situations,” Pandey said. “The success of LipType makes me believe that it can revolutionize our interaction with computer systems and with other human beings as well.”
She and Arif have written papers on the social study and LipType. Both have been accepted for publication and presentation at the 2021 Conference on Human Factors in Computing Systems of the Association for Computing Machinery’s Special Interest Group on Computer–Human Interaction (ACM SIGCHI), the premier international conference on human-computer interaction.
LipType has a number of other potential applications as well.
“We were thinking about surveillance situations,” Arif said. “This could be very useful for law enforcement. LipType could be used for closed captioning. We’re also looking at interfaces for cars. We have a design-for-all philosophy.”