Stephen Shankland, a technology journalist, wrote an August 2010 article for CNet.com regarding the development of emotion recognition technology, which would enable computers to detect what the user is feeling in real time.
The World Wide Web Consortium (the organization which standardizes many Web technologies) is attempting to standardize and formalize emotional states using a vocabulary that computers can handle. This vocabulary is called Emotion Markup Language (EmotionML) and is “designed to provide a more sophisticated alternative to smiley faces and other emoticons…to improve communications between people and computers.”
The engineers behind EmotionML argue that the technology they are developing would remove ambiguity from the ways people interact online. Shankland quotes Marc Schroeder, editor of the EmotionML standard, who offers several examples of situations where this technology would be useful: “avatar faces could depict what the person behind them is feeling, the play intensity of computer games could be adjusted based on the player’s reactions, and customer service representatives would be able to tell when their customer is upset.”
However, even Schroeder admits that there are shortcomings in emotion recognition technology, including erroneous readings, which could do more harm than good. This technology is designed to “improve communications,” yet there seems to be the possibility of a large margin of error. Even humans have a difficult time deciphering one another’s emotions, so how can someone program a computer to recognize them accurately? This sort of technology could probably be developed to read human emotions accurately a good percentage of the time, but when a machine is programmed to do something even humans have trouble with, it’s hard not to question whether the technology is worth its margin of error. Do you think the general public would benefit from using EmotionML?
Furthermore, the engineers who are building the technology into their systems are going by the very general categories of happy, angry, and sad. Research has shown that there are seven basic universal emotions: anger, disgust, contempt, fear, happiness, sadness, and surprise. These emotions are expressed by people all around the world in the same way, so a computer could quite easily be programmed to detect them. However, there are other emotions that are not expressed the same way across the board: guilt, pride, love, shame, etc. As a result, the engineers behind EmotionML can’t even agree on one vocabulary to use for representing emotions. Instead, EmotionML is being designed to provide a set of ‘recommended vocabularies,’ and the user must state which vocabulary set they would like to use.
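To make the "state your vocabulary" idea concrete, here is a minimal Python sketch. It is not EmotionML syntax and not how the standard defines its vocabularies; the names `VOCABULARIES` and `annotate` and the two vocabulary keys are made up for illustration, and the word lists are simply the categories mentioned above.

```python
# Hypothetical vocabularies for illustration only; these are NOT the
# actual EmotionML definitions, just the categories named in this post.
VOCABULARIES = {
    "general": {"happy", "angry", "sad"},
    "basic-seven": {
        "anger", "disgust", "contempt", "fear",
        "happiness", "sadness", "surprise",
    },
}


def annotate(label: str, vocabulary: str) -> dict:
    """Return an annotation only if the label belongs to the declared vocabulary."""
    if vocabulary not in VOCABULARIES:
        raise KeyError(f"unknown vocabulary: {vocabulary}")
    if label not in VOCABULARIES[vocabulary]:
        raise ValueError(f"{label!r} is not in vocabulary {vocabulary!r}")
    return {"vocabulary": vocabulary, "label": label}


print(annotate("contempt", "basic-seven"))
# {'vocabulary': 'basic-seven', 'label': 'contempt'}
```

Note that a label like "guilt" would be rejected under either vocabulary here, which is exactly the kind of gap that makes a single agreed-upon vocabulary hard to settle on.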
However, if this technology could be implemented in an effective way, it could be helpful in some situations. It would be much easier to convey emotions while instant messaging, for example, since it can be difficult at times to differentiate a joke from a serious statement. Many companies have live customer service online, and it would be so much easier for the customer to convey just how upset they may be. The world of video games would change drastically if games could be programmed to adjust based on the player’s reactions. Furthermore, this technology could be utilized to make law enforcement more effective when they are interviewing a suspect.
Is it really a good idea for the person behind the screen to know every single emotion you are feeling? What about online advertisements? Could a company read your every emotion, and thus cater their advertising to your mood? It would feel a bit like “Big Brother” watching over each computer user, and most people seem to prefer more privacy during online interactions. Do you think the benefits outweigh these negatives?
I think the technology holds promise, but like many new “shiny” things, it’s likely to be misused or overused. I liken it to the “lens-flare” effect, which was popularized almost to the point of extinction back in the early 1990s, when it was used in virtually every piece of marketing or computer-generated graphics and video.
I think the appropriate application for this technology in its early stages is in the entertainment arena. For example, if a video game misinterprets your mood and alters your gameplay negatively, who cares? Just reload or wait 20 seconds for it to make another adjustment. It may get your expression of emotion wrong now and then, but a growing look of anger or frustration that only makes sense if the adjustment was wrong should get easier and easier for a computer to register correctly. All that would be required would be an intelligent algorithm that says “hey, this emotion doesn’t fit with the adjustment we made. Let’s take it as this user hitting ctrl-z and start rolling back the changes we’ve made.” This also provides an easy market in which to test and hone different “emotion interfaces” and see what’s effective with users and what isn’t, in much the same way that real-time 3D acceleration methods were developed, tested, adapted, and finally widely accepted or discarded.
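Here is a rough Python sketch of that “ctrl-z” idea, just to show how little logic it needs. Everything in it is an assumption for illustration: the `DifficultyTuner` class, the `NEGATIVE` label set, and the idea that some emotion detector hands the game a plain text label. A real system would also have to decide how long after an adjustment a reaction still counts.

```python
from dataclasses import dataclass, field

# Hypothetical labels; a real system would get these from its emotion detector.
NEGATIVE = {"anger", "frustration"}


@dataclass
class DifficultyTuner:
    """Adjusts game difficulty and undoes adjustments that upset the player."""

    difficulty: int = 5
    history: list = field(default_factory=list)  # stack of earlier settings

    def adjust(self, delta: int) -> None:
        """Apply an adjustment and remember the previous setting."""
        self.history.append(self.difficulty)
        self.difficulty += delta

    def observe(self, emotion: str) -> None:
        """Treat a negative reaction as the player hitting undo."""
        if emotion in NEGATIVE and self.history:
            self.difficulty = self.history.pop()


tuner = DifficultyTuner()
tuner.adjust(+2)         # game guesses the player is bored and gets harder
tuner.observe("anger")   # the guess was wrong, so roll the change back
print(tuner.difficulty)  # 5
```

Keeping the earlier settings on a stack means repeated bad reactions keep unwinding the changes one step at a time, which matches the “start rolling back the changes we’ve made” behavior described above.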
The only other appropriate early-adoption arena I can see is one where it would be used simply to augment or back up well-tested existing techniques, such as the judgment of a truth wizard or a highly skilled psychologist or interviewer. If the system can reliably return the same or very similar results as a highly skilled human being, then maybe you’re ready to take it to the next level, but until then, its use remains a liability.
The last place I would want to see it is in customer service, where a customer service representative is probably going to detect the customer’s anger or frustration level more quickly from the customer’s words and tone of voice, unless it would open that kind of work up to autistic people or others who have difficulty reading emotions.
It really should target very high-stakes work (international espionage level), entertainment, or disabilities first and foremost. Only after it’s been well tested and valid uses have been found in those areas should it begin to make its way elsewhere. I think any other application in its infancy is likely to be misguided.