Tuesday, June 25, 2024

They’re helping their brother use his voice through AI | CBC News

On his YouTube channel, Anand Munje interviews entrepreneurs, academics and artists.

His cerebral palsy makes it hard for him to speak, but he can communicate with his guests thanks to AIHEARU, software developed by his brothers Arun and Amit Munje.

It’s pronounced “I hear you,” and it takes Anand’s words and turns them into text on screen during his show Anand’s World.

The brothers integrated AIHEARU into Zoom’s video conferencing, and three dots appear on screen as the software processes Anand’s voice and then displays it as text.

“Talking to people with my own voice without the help of Amit is new to me,” he said in the episode where he interviewed his brothers. Amit has often acted as Anand’s interpreter.

For his channel, Anand prepares and practices his questions. This allows him to fine-tune the vocabulary the software uses to interpret his voice.

CBC News met Arun at home in the Ottawa suburb of Kanata for a video interview with Anand in India.

In a departure from how CBC usually conducts interviews, Anand received some questions about a day ahead so he could prepare answers in the same way he does for the show. 

“The best part is, it can recognize my voice,” he said in one of his prepared answers. “I hope the app can be used by millions of speech-impaired people. I wish others can be independent to talk.” 

We also tried the “open mode” to cover some questions that he wasn’t given in advance.

“I am very nervous,” he said through the software in open mode. “This is the first time that I’m using open mode.”

Arun, who previously worked at a teleconferencing startup and founded AIHEARU as a side project, explained that “open mode” attempts to match Anand’s words against a massive data set of words.

“This is complete open vocabulary. So that’s why there might be some errors or more errors showing up,” Arun said. “It doesn’t know the context [of] what we are talking about.”

Anand Munje speaks using the AIHEARU interface integrated with Zoom. The software displays his words and shows an ellipsis as it processes his speech. (Zoom)

Open mode did make some mistakes and Arun had to help interpret, a sign of work to come. For example, “it’s 10 p.m.” became “it’s a poem.” 

Anand admits he sometimes finds managing the vocabulary of the software a challenge.

But while there may be challenges with more spontaneous lengthy conversations, Anand is able to use it for regular activities like managing smart home devices.

Arun said artificial intelligence allows them to build and refine the model that recognizes Anand’s speech faster, rather than having to “hand-feed” his vocal patterns into the program.

He said the motivation was to help his brother express himself, but he hopes AIHEARU will be able to help people with different speech impairments or even thick accents.

“What we did was get Anand to use it so we get first-hand experience of all the problems and how we can overcome them,” Arun said.

‘Lead a more fulfilled life’

Claire Davies says people with speech impairment need more options to be able to express themselves individually and participate in society.

“The more that’s out there, the more we can enable other people to interact with people who have speech impairments,” said Davies, who runs the Building and Designing Assistive Technology Lab at Queen’s University. 

“That allows them to get jobs, it allows them to interact in social environments, it allows them to work with their co-workers and it enables them to actually lead a more fulfilled life.”

Davies said it’s especially valuable that the technology is coming from the family’s lived experience and their understanding of how Anand expresses himself.

“They know best how to interpret that information and that enables the software to learn more effectively about what’s actually being said,” she said. 

She said their work could also benefit other users.

Claire Davies, associate professor at Queen's University, works in the Building and Designing Assistive Technology Lab at the Department of Mechanical and Materials Engineering.
Claire Davies, associate professor at Queen’s University, says people with speech impairment need more options to be able to express themselves. (Submitted/Claire Davies)

In open mode, Anand was able to respond to a question about how he feels about his brothers’ hard work.

“I am very grateful and very proud of my brother[s],” Anand said through the app. 

Arun was momentarily taken aback that the software produced that result.

“I myself get impressed when I see it catch [phrases] in that open mode because it doesn’t have anything fed into it.”

Arun said they’re testing AIHEARU with a handful of other people with speech impairment through a charity, but each user needs to be set up manually.

“The next step would be to make it easily self-used by anyone, so they can … download and use it.”
