MIT researchers one step closer to device that interprets tone of conversations

This week, researchers from MIT will present their research into a wearable computer that can predict the emotional tone of a conversation, using artificial intelligence. The prospective device could help people with social anxiety and Asperger syndrome.

Device could help people with social anxiety and Asperger syndrome

MIT researchers used a Samsung Simband in a study looking at the viability of an artificially intelligent, wearable system that can predict the tone of conversations. (Jason Dorfman, MIT CSAIL.)

This week the popular TV sitcom The Big Bang Theory toyed with an interesting idea — an emotion-detection device to help the socially inept characters. 

But emotion-detection technology isn't just a sitcom punchline.

This week, at the AAAI Conference on Artificial Intelligence in San Francisco, researchers from the Massachusetts Institute of Technology (MIT) will present their research into a wearable computer that uses artificial intelligence to predict the emotional tone of a conversation.

Now, you might wonder why you'd want a wearable device to detect the tone of a conversation. Researchers said this system could be useful for people with social anxiety or Asperger syndrome, some of whom have real difficulty with social interaction or picking up on certain non-verbal cues.

The idea is that this device could function as a sort of social coach that could make everyday interactions less challenging for some people.

CBC Radio technology columnist Dan Misener explains how it works.

What does it measure?

The device — which you wear on your wrist like a watch — tracks your conversations and your daily interactions with other people, then classifies those conversations as positive, negative or neutral. So you can look back at the day's interactions and gauge which ones went well and which ones didn't.

It works by measuring a number of different signals. First, it listens to your conversations: the sound of your voice and the sound of the person with whom you're interacting. By analyzing the audio, it can pick up on the tone, pitch and energy of whoever's speaking.

It can also automatically create a written transcript of the conversation, so it can analyze the actual words that are spoken for clues in someone's vocabulary.

And finally, because it's a wearable device, sensors can record different physiological signals, such as movement, heart rate, blood pressure, blood flow and skin temperature.

A graph showing real-time emotion detection. (Jason Dorfman, MIT CSAIL.)

So it's measuring what's said, how it's said and the user's physiological responses.

All of that data is fed into a neural network that's been trained to identify certain cues. For instance, long pauses and monotone voices are usually associated with sad stories, whereas more energetic and varied speech patterns are associated with happy stories.
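To make those cues concrete, here is a minimal sketch of how pauses and a monotone voice might be turned into numbers before they reach a classifier. It is not the MIT team's feature extraction, which the article does not describe. The function name, the input lists and the silence threshold are all assumptions made for illustration.

```python
from statistics import pstdev
from typing import Dict, List


def speech_cues(
    frame_energy: List[float],
    pitch_hz: List[float],
    silence_threshold: float = 0.01,  # assumed cutoff, not from the study
) -> Dict[str, float]:
    """Summarize two illustrative cues from short audio frames.

    frame_energy: hypothetical loudness value for each audio frame.
    pitch_hz: hypothetical pitch estimate for each voiced frame.
    """
    if not frame_energy or not pitch_hz:
        raise ValueError("need at least one energy frame and one pitch value")

    # Share of frames quiet enough to count as a pause.
    silent_frames = sum(e < silence_threshold for e in frame_energy)
    pause_fraction = silent_frames / len(frame_energy)

    # A small spread in pitch suggests a flatter, more monotone voice.
    pitch_spread_hz = pstdev(pitch_hz)

    return {"pause_fraction": pause_fraction, "pitch_spread_hz": pitch_spread_hz}
```

In this sketch, a recording with many silent frames and almost no pitch variation would score high on pauses and low on pitch spread, which is the pattern the article links to sad stories.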

If you're wearing the device, you can see the overall sentiment of the conversation as a whole or analyze it in five-second increments, so you can see how the tone of a conversation changes.
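The five-second view described above can be sketched in a few lines. This is an illustration only: it assumes the system produces one tone score per second between -1 (negative) and 1 (positive), and it uses a made-up threshold of 0.2 to separate neutral from the other two labels. Neither assumption comes from the study.

```python
from typing import List

WINDOW_SECONDS = 5  # the five-second increments mentioned in the article


def label_score(score: float, threshold: float = 0.2) -> str:
    """Map a hypothetical tone score in [-1, 1] to one of three labels."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def window_labels(scores_per_second: List[float]) -> List[str]:
    """Label each five-second stretch by its average score."""
    labels = []
    for start in range(0, len(scores_per_second), WINDOW_SECONDS):
        window = scores_per_second[start:start + WINDOW_SECONDS]
        labels.append(label_score(sum(window) / len(window)))
    return labels


def overall_label(scores_per_second: List[float]) -> str:
    """Label the whole conversation by its average score."""
    if not scores_per_second:
        return "neutral"
    return label_score(sum(scores_per_second) / len(scores_per_second))


# Made-up scores: five upbeat seconds, five flat ones, five downbeat ones.
scores = [0.5] * 5 + [0.0] * 5 + [-0.6] * 5
print(window_labels(scores))
print(overall_label(scores))
```

With these made-up scores, the three windows come out positive, neutral and negative, while the conversation as a whole averages out to neutral. That gap is why the increment-by-increment view can tell you more than a single overall rating.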

How accurate is this piece of wearable tech?

In the study, the system was 83 per cent accurate, according to Mohammad Ghassemi, a PhD candidate at MIT, and one of the researchers who worked on this project.

And accuracy is very important here because the cost of miscategorizing a conversation can be very high, Ghassemi said. 

"Let's say it tells you incorrectly, 'that interaction was horrible' or 'you messed up' [or] 'that was very awkward,' when in fact it wasn't. That could be really, really horrible for a person who has social anxiety or Asperger," Ghassemi said.

The wearable computer was 83 per cent accurate in the MIT study, according to researcher Mohammad Ghassemi. (Jason Dorfman/MIT CSAIL)

If you're relying on a piece of technology to pick up on social cues that you yourself can't, or if you're using it as a kind of social coach to improve your everyday interactions, you really want to be able to trust the system, especially in high-stakes situations such as a job interview.
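For readers curious about what an accuracy figure measures, here is a minimal sketch. It compares a system's labels with the true ones and separately counts the mistake Ghassemi worries about most: conversations flagged as negative that were not. The labels below are toy data invented for this example, not results from the MIT study, and the function names are hypothetical.

```python
from typing import List


def accuracy(true_labels: List[str], predicted: List[str]) -> float:
    """Fraction of conversations whose predicted label matches the true one."""
    if len(true_labels) != len(predicted):
        raise ValueError("label lists must be the same length")
    if not true_labels:
        raise ValueError("need at least one conversation")
    correct = sum(t == p for t, p in zip(true_labels, predicted))
    return correct / len(true_labels)


def wrongly_flagged_negative(true_labels: List[str], predicted: List[str]) -> int:
    """Count conversations labelled negative that were not actually negative."""
    return sum(
        p == "negative" and t != "negative"
        for t, p in zip(true_labels, predicted)
    )


# Toy data, invented for illustration only.
truth = ["positive", "neutral", "negative", "positive", "neutral"]
guess = ["positive", "negative", "negative", "positive", "neutral"]

print(accuracy(truth, guess))
print(wrongly_flagged_negative(truth, guess))
```

In this toy case the system gets four of five conversations right, an accuracy of 0.8, but its one error is exactly the damaging kind: a neutral chat reported as a bad one. A single accuracy number does not show which kind of mistake is being made.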

What are the privacy implications?

This is an area that gets very thorny, very quickly. Remember, this device only works when it can listen in on conversations, record the audio and then analyze those recordings. That raises some important questions. 

For instance, if you and I are having a conversation out on the street, and I'm using a device like this, am I obliged to tell you about it? Do I need your permission? And how would you react if you knew I was measuring the emotional value of our conversation?

"My own take on it is that you would not take a system like this and record people's interactions without their consent," Ghassemi said. "How you navigate these thorny waters is an important question. We've been focussed a lot more on the technology aspect of it."

When can we expect them to hit the stores?

We've seen emotion-sensing wearables on TV. But just to be clear, the system Ghassemi worked on at MIT is a research project, not a product. That said, it could be someday.

Ghassemi told me he could take the algorithm they've developed, package it as a smartphone app and release it tomorrow.

But given where the state of the art is right now, he's not sure that would be responsible. If you're going to release something like this to the general public, he said, it should be trained on the largest datasets possible. Not to mention the importance of addressing some of those important questions around recording and consent.

Given that, Ghassemi said he expects highly accurate emotion-sensing wearables could be widely available to the general public within the next five to 10 years.


Dan Misener

CBC Radio technology columnist

Dan Misener is a technology journalist for CBC Radio. Find him on Twitter @misener.