OnQ Blog

Humans can barely understand emojis. Will machines do any better?

Sep 18, 2015

Qualcomm products mentioned within this post are offered by Qualcomm Technologies, Inc. and/or its subsidiaries.

Tyler Schnoebelen is the Founder and Chief Analyst at Idibon, a company specializing in cloud-based natural language processing. Tyler has ten years of experience in UX design and research in Silicon Valley and holds a Ph.D. from Stanford, where he studied endangered languages and emoticons. He’s been featured in The New York Times Magazine, The Boston Globe, The Atlantic, and NPR. The views expressed are the author’s own, and do not necessarily represent the views of Qualcomm.

The human skull has 14 facial bones and 35 muscles wrapping around these bones. That anatomy works together to form everything from grimaces, to grins, to mouths agape. Beyond the face, there are all kinds of cues that you can use to understand someone: voice contours, body language, and eye contact, to name a few.

All this context disappears when we switch to text. Emojis and emoticons help fill in the gap. They let us express a stance; for instance, “Ok” can connote “I’m a little bothered,” but “Ok :)” means the situation really is okay. As a special bonus, in addition to some 130 available facial expressions, emojis let us style ourselves into sleepy pandas, sparkle tigers, and thousands of otherwise-impossible contortions.

While plasticity is part of what makes emojis fun to use, it’s also what can make them complex to understand. But, as more communication migrates to digital avenues—think about how often you text versus how often you make a phone call—deciphering our 21st-century shorthand is becoming essential.

Unstructured text isn’t going anywhere; in fact, its use is increasing at an alarming rate. Every three months, the total word count of the world’s SMS messages exceeds that of every book ever published. Emojis are exploding, too: They already had millions of users in Japan when Apple added its emoji keyboard to iOS in 2011. Instagram reports that in 2012, about 20 percent of its posts contained emojis; by early this year, that number had grown to 40 percent.

Such rampant interplay of words and symbols compounds the complexity of our language. To illustrate, I conducted a case study in preparation for this article, looking at how emojis were being used to talk about phones on social media. I found, for instance, that the skull shows up in messages about phones about 11 times more often than it does in general use. What’s more, how people use it is revealing. They reach for the skull when they’re talking about charging problems, phones being broken, moms taking their phones, or their phones being “dry” (no messages). In other words, skulls help people communicate the metaphor of “without connectivity, I’m dead.”
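Where does a figure like “11 times more” come from? It’s a lift ratio: the emoji’s relative frequency in phone-related messages divided by its relative frequency overall. Here’s a minimal sketch in Python; the counts are invented purely to illustrate the arithmetic, not Idibon’s actual data:

    # Lift of an emoji within a topic: how much more often it appears in
    # topic-related messages than in the corpus at large.
    def lift(topic_hits, topic_total, overall_hits, overall_total):
        """Relative frequency in the topic / relative frequency overall."""
        return (topic_hits / topic_total) / (overall_hits / overall_total)

    # Hypothetical counts, chosen only to show the computation: the skull
    # appears in 220 of 10,000 phone-related messages, but in only
    # 2,000 of 1,000,000 messages overall.
    print(lift(220, 10_000, 2_000, 1_000_000))  # -> 11.0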

This trend reveals a core truth: Humans are adept at taking the resources they have at hand and fitting them into the kinds of meanings they want to convey. The folded hands emoji, for one, was originally designed to symbolize please and thank you, but its uses vary widely. For some it means “I’m praying,” and for others it’s a “high five.” Many people use the syringe to talk about donating blood or getting shots, but others use it to indicate “blood brothers” or to talk about tattoos. And lucky bamboo, a common Japanese symbol, gets reinterpreted as a middle finger (though not by the Japanese).

When writers put two or more emojis together, the order of the characters is also very important. A cloud in front of a car, for instance, might mean “driving into the wind.” But that’s rare; much more frequent is a cloud behind a car, meaning “a fast car.” Thankfully, while you could create more complex sequences, very few people write more than a few emojis in a row. Emoji-only stunt articles and tweets are rare exceptions, the best-known of which is Emoji Dick, an emoji retelling of Herman Melville’s Moby-Dick.
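Because order carries meaning, software that studies these sequences has to preserve it rather than treating a message as an unordered bag of symbols. Here’s a sketch of one way to extract ordered emoji pairs; the character ranges are a deliberately simplified stand-in for a full Unicode emoji list:

    import re
    from collections import Counter

    # Simplified pattern covering two common emoji blocks; a real system
    # would match the full Unicode emoji property set instead.
    EMOJI = re.compile(r'[\U0001F300-\U0001F6FF\u2600-\u27BF]')

    def emoji_bigrams(message):
        """Adjacent emoji pairs from a message, order preserved."""
        symbols = EMOJI.findall(message)
        return list(zip(symbols, symbols[1:]))

    messages = ["so fast \U0001F697\u2601",        # car then cloud: "a fast car"
                "rough commute \u2601\U0001F697"]  # cloud then car: "into the wind"
    counts = Counter(p for m in messages for p in emoji_bigrams(m))
    print(counts.most_common())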

As our usage of emoji gains in complexity and frequency, it becomes important that computers understand them, too. Case in point: Big companies want to keep track of customer chatter on social media, in chat rooms, and on comment boards. There are automated tools that do this, but few of them are able to deal with emoji. A lot gets lost if you gloss over them; for example, “I got a new phone [grin]” lets you know the writer is happy, while “I got a new phone [tearing up]” indicates that they may be sad about the loss of their old phone. The incentive to develop tools that can decipher these messages is clear: Knowing what a customer means, what her complaints or praises might be, and how she’s feeling is an essential part of the customer-service experience—emoji or no emoji.
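As a toy illustration of the difference this makes, here is a sketch of a sentiment check in which emoji are first-class tokens rather than stripped characters. The two-entry lexicon is invented for the example; real systems learn such weights from labeled data:

    # Tiny sentiment scorer that keeps emoji instead of discarding them.
    # This two-emoji lexicon is invented purely for illustration.
    EMOJI_POLARITY = {"\U0001F601": +1.0,   # grinning face
                      "\U0001F622": -1.0}   # crying face

    def polarity(message):
        """Sum the polarity of any known emoji; 0.0 means no signal found."""
        return sum(EMOJI_POLARITY.get(ch, 0.0) for ch in message)

    print(polarity("I got a new phone \U0001F601"))  # ->  1.0: happy
    print(polarity("I got a new phone \U0001F622"))  # -> -1.0: maybe sad

Strip the emoji first and both messages score an identical 0.0, which is exactly the information loss described above.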

Advanced processing schemes, such as machine learning, will be a key part of interpreting and translating emojis in context. Not only will that help companies monitor their reputations, but it will also facilitate our own emoji usage in everyday life. Instagram’s development team is working on tools to define and rank emoji in order to identify trends. And Minuum is using natural language processing in its smart emoji keyboard for Android, which suggests emoji based on the message you’re typing or responding to.
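The sketch below is not Minuum’s actual approach, just the simplest possible version of the idea: map the words of a draft message to candidate emoji. The hand-written table stands in for associations a real keyboard would learn from corpus statistics:

    # Toy emoji suggester: trigger words in the draft map to candidate emoji.
    # The table is hand-written here; a production keyboard would learn
    # these associations from large volumes of real messages.
    SUGGESTIONS = {"phone": ["\U0001F4F1"],                 # mobile phone
                   "dead": ["\U0001F480"],                  # skull
                   "charger": ["\U0001F50B", "\U0001F50C"]} # battery, plug

    def suggest(draft):
        """Return emoji whose trigger words appear in the draft message."""
        return [e for word in draft.lower().split()
                for e in SUGGESTIONS.get(word, [])]

    print(suggest("my phone is dead"))  # -> ['📱', '💀']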

Eventually, we will reach a kind of universal emoji understanding. For either a human or a computer, parsing the symbols isn’t all that different from parsing the ambiguity inherent in any language. Imagine trying to speak with zero ambiguity—or worse, trying to speak with someone who was unable to understand anything about your sentences other than the literal meaning of your words (“there’s beer in the fridge” becomes an existential statement, as opposed to an invitation). Human communication is rarely perfect, but we usually manage to fit everything together. Emoji are just another piece of the puzzle.