Vocal Emotion Detector

Two high school juniors from Oregon have won $100,000 in the 2010 Siemens Competition in Math, Science, and Technology. The Disabilityscoop website reported on the students, Akash Krishnan and Matthew Fernandez, and their computer analysis program, which helps distinguish among five emotions in a recording of the human voice.

Their project, “The Recognition of Emotion in Human Speech,” is a much updated version of the Voice Stress Analysis gauge (VSA), which was commercially developed in 1970 to detect deception from deviations from a speaker's baseline voice stress level. The theory behind VSA is that physiological stress induced by lying causes inaudible fluctuations in the larynx. These fluctuations are detected by a voice analyzer and then processed by a computer.

Unlike the VSA, the students’ program does not try to detect deception. Instead, it relies on an emotional speech database of 18,215 files to distinguish among five emotions, which they labeled as angry, sad (empathetic), happy, neutral, and rest.

In an NPR radio interview, Matthew Fernandez said that their computer program uses the frequencies and energies in a voice to try to recognize which emotion is being expressed. “We train our system using a bunch of audio (57 different audio features to be exact) that [has] already been defined as either actors speaking angrily or happily. And then when we get a new signal, we can compare it best with what we know about each of the energies and frequencies of the new signal.”
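To make the idea of training on labeled audio and then comparing a new signal more concrete, here is a minimal Python sketch. It is not the students' actual algorithm, which this article does not describe. It assumes each recording has already been reduced to a short list of numbers (here just two made-up features, a mean pitch and a mean energy, instead of 57), and it uses a simple nearest-centroid comparison as a stand-in technique. The function names and sample values are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """Average the feature vectors of each label.

    samples: list of (label, feature_vector) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums = {}
    counts = defaultdict(int)
    for label, features in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in vector]
        for label, vector in sums.items()
    }


def classify(centroids, features):
    """Return the label whose mean vector is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Made-up training data: [mean pitch in Hz, mean energy], for illustration only.
training = [
    ("angry", [220.0, 0.9]),
    ("angry", [240.0, 0.8]),
    ("sad", [140.0, 0.2]),
    ("sad", [150.0, 0.3]),
]

centroids = train_centroids(training)
print(classify(centroids, [230.0, 0.7]))  # a new, unlabeled recording
```

A real system would need to extract its features from the audio itself and would likely scale them so that no single feature, such as pitch in this toy example, dominates the distance.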

Krishnan and Fernandez achieved 90 percent accuracy in detecting happy and sad voices and 60 percent accuracy for their technology as a whole. Previous research in this field had reached only 41 percent accuracy.

The teens were inspired by a scene in the movie I, Robot in which the robot recognizes that its “user” is afraid and that it can protect him. Akash and Matthew see many future applications for their program, such as in call centers, virtual computer games, and possibly lie detection.

What the high school students want most is to help children with autism who struggle to process the emotions of others. They would like to create a wristwatch device that would display a happy, sad, or angry face to show the emotions being expressed by the people speaking around the wearer. This could assist children with autism in interacting with their peers.

If you would like to hear more about the students' story, check out the entire radio interview:
