The poster had no idea his tweet was one of many scrutinized by an
analytics firm, whose algorithm took his mocking message seriously
and decided it was negative toward gay marriage.
In the race for the White House in 2016, election campaigns rely on
such research to help them tailor advertising and other outreach to
particular groups of voters. A candidate's ability to micro-target
likely voters with ads on issues they care about is crucial in a
modern American political campaign.
Understanding how voters talk about issues on Facebook and Twitter
is key to this effort. But increasingly, data gatherers find
themselves tripped up by basic social media conventions like sarcasm
and mockery. (Graphic: http://reut.rs/1Dh9fgF)
“They are by far the biggest stumbling blocks to trying to
understand true sentiment in social media,” said Michael Meyers,
managing partner of TargetPoint Consulting in Alexandria, Virginia.
His firm, like many, is trying to tweak algorithms in-house now
before the campaign season kicks into high gear.
Consider phrases like “sure it is,” “boo-hoo,” or “I’m shocked.”
Given the proper context, most people can accurately size up their
sincerity. But not a computer algorithm.
The campaign season is too young for there to be any reliable data
yet on the value of analyzing tweets and Facebook posts. Still, data
firms say the analyses will be one important resource, along with
polling and voter records, to help campaigns spend money more
efficiently.
A voter whose online posts indicate unwavering and exclusive support
for Republican White House contender Jeb Bush, for example,
shouldn't get online advertising for his rival Marco Rubio.
"STAY CLASSY"
Using an inaccurate data point generated by sarcasm hurts models and
eats up data scientists’ time as they try to figure out where they
went wrong and build fixes.
Sometimes the fault lies in certain word combinations, as HaystaqDNA
chief executive Ken Strasma, who built software models to target
voters for both of Barack Obama's presidential campaigns, recently
found out.
Haystaq, a predictive analysis firm, examined tweets containing the
word “classy” and found 72 percent of them used it in a
positive way. But when used near the name of Republican presidential
candidate Donald Trump, around three quarters of tweets citing
"classy" were negative.
Haystaq carried out the Trump exercise to teach its computers how a
word could transform to its opposite meaning when used with a
particular additional word.
As part of its fine-tuning process, the firm also checked for the
combination "Trump" and "stay classy," an iconic phrase from the
2004 hit movie Anchorman that almost always aims to mock. Every one
of those tweets was negative, showing the added firepower of certain
expressions. Haystaq has adjusted its algorithms to compensate.
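The adjustment Haystaq describes can be pictured as a lookup in which a word's usual polarity is overridden when a particular companion word appears in the same tweet. The Python sketch below is only an illustration of that idea, not Haystaq's software: the lexicon, the override table, and the function names are all hypothetical, and the scores are invented. It also treats every match as negative, whereas the firm found only about three quarters of "classy" tweets near Trump's name were.

```python
import re

# Hypothetical base lexicon: +1 positive, 0 neutral, -1 negative.
BASE_POLARITY = {"classy": 1, "hair": 0}

# Hypothetical overrides: (phrase, companion word) -> polarity to use instead.
CONTEXT_OVERRIDES = {
    ("stay classy", "trump"): -1,
    ("classy", "trump"): -1,
    ("hair", "trump"): -1,
}


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def score_tweet(text):
    """Return a polarity score, letting context overrides win over the base lexicon."""
    tokens = tokenize(text)
    padded = " " + " ".join(tokens) + " "
    for (phrase, companion), polarity in CONTEXT_OVERRIDES.items():
        # Pad with spaces so only whole words and whole phrases match.
        if f" {phrase} " in padded and f" {companion} " in padded:
            return polarity
    return sum(BASE_POLARITY.get(token, 0) for token in tokens)


print(score_tweet("What a classy hotel"))   # base lexicon applies
print(score_tweet("Stay classy, Trump"))    # override applies
```

A production system would more likely learn such pairings from labeled tweets than hard-code them; the table here just makes the flip visible.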
At Two.42.solutions, chief technology officer Mohammad Hamid had to
teach his computers that when the normally neutral term “hair” is
mentioned in a tweet about the well-coiffed Trump, that’s usually
sarcastic.
He pointed to a June tweet from filmmaker Albert Brooks: “Donald
Trump announces this morning that he will run for president. His
hair will announce on Friday.”
FLIPPANT TWITTER
Another word pairing that Hamid said often indicates sarcasm in an
otherwise neutral or positive tweet: “Hillary,” meaning Democratic
contender Hillary Clinton, and “Benghazi,” the Libyan city where a
U.S. diplomatic mission was attacked in 2012 while she was secretary
of state.
In April, for example, one tweeter posted that he had received an
email saying Clinton was running for president. "Now the email is
gone and I can't find it. #Benghazi," he tweeted, in an apparent nod
to the strongly held belief by some conservatives that Clinton
orchestrated a cover-up over the Benghazi attack and withheld
information that should have been made public, a claim she and
others have dismissed as a conspiracy theory.
Sarcasm clues can also come from the identity of the tweeter, Hamid
said. For example, a seemingly pro-Republican tweet or Facebook post
coming from someone who follows mostly Democrats would trigger an
alert that the evaluation of the tweet might not be accurate.
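One way to picture the identity check Hamid describes is as a comparison between the apparent lean of a post and the party of most accounts its author follows. The Python sketch below rests on assumptions that are not in the reporting: the party labels, the 60 percent cutoff for "mostly," and the function name are all hypothetical.

```python
from collections import Counter


def needs_review(post_lean, followed_parties, threshold=0.6):
    """Return True when a post's apparent lean conflicts with the party
    that most of the author's followed accounts belong to.

    post_lean: label the sentiment model assigned, e.g. "R" or "D".
    followed_parties: one party label per followed political account.
    threshold: assumed share of follows that counts as "mostly".
    """
    if not followed_parties:
        return False  # nothing to compare against
    party, count = Counter(followed_parties).most_common(1)[0]
    if count / len(followed_parties) < threshold:
        return False  # no clear majority among the accounts followed
    return party != post_lean


# A seemingly pro-Republican post from someone following mostly Democrats.
print(needs_review("R", ["D", "D", "D", "R"]))
```

The check only raises an alert for a second look; it does not decide by itself that the post is sarcastic.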
Almost all analysts caution against putting too much stock in social
media, particularly Twitter, given that only one fifth of U.S. adults
use it, according to the Pew Research Center. But those who do tend
to strike a sarcastic tone.
“There’s something about that 140 characters (limit) that encourages
people to be more flippant,” Meyers of TargetPoint said.
In the case of the tweet from @xTomatoez, human backstops at Two.42
caught the misclassification of a message that made fun of
traditional values more than it decried gay marriage.
The algorithm marked the tweet for review by human eyes because it
contained what researchers call a double negative, or two negative
words in one tweet, an automatic red flag indicating possible
accuracy challenges. In this case, the telltale words were “tough”
and “redneck,” said researcher Abby Vandenbosch.
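The red flag described above amounts to counting negative words and sending a tweet to a human reviewer once it contains two or more. The Python sketch below is a rough illustration rather than Two.42's code: only “tough” and “redneck” come from the researcher's account, while the other lexicon entries, the function names, and the sample sentence are made up for the example.

```python
import re

# "tough" and "redneck" are the words cited in this case;
# the remaining entries are hypothetical placeholders.
NEGATIVE_WORDS = frozenset({"tough", "redneck", "awful", "hate"})


def negative_hits(text):
    """Return the negative-lexicon words found in the text, in order."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t in NEGATIVE_WORDS]


def flag_for_review(text, min_hits=2):
    """Flag a tweet for human review when it has at least min_hits negative words."""
    return len(negative_hits(text)) >= min_hits


# Invented sample text, not the actual tweet.
sample = "Tough talk from a proud redneck"
print(negative_hits(sample), flag_for_review(sample))
```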
“Why would anyone want to analyze my tweets?” asked the poster, also
known as Eddie Adlman, when a reporter contacted him. When told he
was too mocking for classification by machine, he responded with a
complicated emoticon for “shrug.”
Enough to confuse many humans, and perhaps an algorithm or two as
well.
(Reporting by Sarah McBride, editing by Ross Colvin)
[© 2015 Thomson Reuters. All rights
reserved.]