When the AI goes haywire, bring on the humans
October 13, 2022
By Paresh Dave
OAKLAND, Calif. (Reuters) - Used by
two-thirds of the world's 100 biggest banks to aid lending decisions,
credit scoring giant Fair Isaac Corp and its artificial intelligence
software can wreak havoc if something goes wrong.
That crisis nearly came to pass early in the pandemic. As FICO recounted
to Reuters, the Bozeman, Montana, company's AI tools for helping banks
identify credit and debit card fraud concluded that a surge in online
shopping meant fraudsters must have been busier than usual.
The AI software told banks to deny millions of legitimate purchases, at
a time when consumers had been scrambling for toilet paper and other
essentials.
But consumers ultimately faced few denials, according to FICO. The
company said a global group of 20 analysts who constantly monitor its
systems recommended temporary adjustments that avoided a blockade on
spending. The team is automatically alerted to unusual buying activity
that could confuse the AI, relied on by 9,000 financial institutions
overall to detect fraud across 2 billion cards.
Such corporate teams, part of the emerging job specialty of machine
learning operations (MLOps), are unusual. In separate surveys last year,
FICO and the consultancy McKinsey & Co found that most organizations
surveyed are not regularly monitoring AI-based programs after launching
them.
The problem is that errors can abound when real-world circumstances
deviate, or in tech parlance "drift," from the examples used to train
AI, according to scientists managing these systems. In FICO's case, it
said its software expected more in-person than virtual shopping, and the
flipped ratio led to a greater share of transactions flagged as
problematic.
Seasonal variations, data-quality changes or momentous events - such as
the pandemic - all can lead to a string of bad AI predictions.
Imagine a system recommending swimsuits to summer shoppers, not
realizing that COVID lockdowns had made sweatpants more suitable. Or a
facial recognition system failing because mask-wearing had become
widespread.
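In practice, watching for drift often means comparing the data a model
sees in production with the data it was trained on. The Python sketch
below illustrates one common statistic for that comparison, the
population stability index; the card-not-present feature, the 0.25
alert threshold and the toy data are illustrative assumptions, not a
description of FICO's system.

    import numpy as np

    def population_stability_index(baseline, live, bins=10):
        # PSI compares two distributions of the same feature; values
        # above roughly 0.25 are a common red flag for drift.
        edges = np.histogram_bin_edges(baseline, bins=bins)
        base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
        live_pct = np.histogram(live, bins=edges)[0] / len(live)
        base_pct = np.clip(base_pct, 1e-6, None)  # avoid log(0) in empty bins
        live_pct = np.clip(live_pct, 1e-6, None)
        return float(np.sum((live_pct - base_pct) * np.log(live_pct / base_pct)))

    # Toy data: the share of card-not-present purchases jumps from about
    # 20% of transactions in the training era to 60% during the pandemic.
    rng = np.random.default_rng(0)
    baseline = rng.binomial(1, 0.2, 50_000)
    live = rng.binomial(1, 0.6, 50_000)
    psi = population_stability_index(baseline, live)
    if psi > 0.25:
        print(f"PSI={psi:.2f}: input distribution has drifted, alert the team")

On the toy data the index lands far above the alert level - the kind of
signal that would page a monitoring team like FICO's.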
The pandemic must have been a "wake-up call" for anyone not closely
monitoring AI systems because it induced countless behavioral shifts,
said Aleksander Madry, director of the Center for Deployable Machine
Learning at Massachusetts Institute of Technology.
Coping with drift is a huge problem for organizations leveraging AI, he
said. "That's what really stops us currently from this dream of AI
revolutionizing everything."
Adding to the urgency for users to address the issue, the European Union
plans to pass a new AI law as soon as next year requiring some
monitoring. The White House this month in new AI guidelines also called
for monitoring to ensure system "performance does not fall below an
acceptable level over time."
Being slow to notice issues can be costly. Unity Software Inc, whose ad
software helps video games attract players, in May estimated that it
would lose $110 million in sales this year, or about 8% of total
expected revenue, after customers pulled back when its AI tool that
determines whom to show ads to stopped working as well as it once did.
The company said its AI system's training on corrupted data was also
to blame.
Unity, based in San Francisco, declined to comment beyond earnings-call
statements. Executives there said Unity was deploying alerting and
recovery tools to catch problems faster and acknowledged expansion and
new features had taken precedence over monitoring.
Photo: Shoppers practice social distancing as they wait with their
carts to enter a Target store in Manhattan during the outbreak of the
coronavirus disease (COVID-19) in New York City, New York, U.S.,
April 1, 2020. REUTERS/Brendan McDermid/File Photo
Real estate marketplace Zillow Group Inc last November announced a
$304 million writedown on homes it bought - based on a
price-forecasting algorithm - for amounts higher than they could be
resold for. The Seattle company said the AI could not keep pace with
rapid and unprecedented market swings and exited the buying-selling
business.
NEW MARKET
AI can go awry in many ways. Most well known is that training data
skewed along race or other lines can prompt unfairly biased
predictions. Many companies now vet data beforehand to prevent this,
according to the surveys and industry experts. By comparison, few
companies consider the danger of a well-performing model that later
breaks, those sources say.
"It's a pressing problem," said Sara Hooker, head of research lab
Cohere For AI. "How do you update models that become stale as the
world changes around it?"
Several startups and cloud computing giants in the past couple of
years have started selling software to analyze performance, set
alarms and introduce fixes that together intend to help teams keep
tabs on AI. IDC, a global market researcher, estimates that spending
on tools for AI operations will reach at least $2 billion in 2026, up
from $408 million last year.
Venture capital investment in AI development and operations
companies rose last year to nearly $13 billion, and $6 billion has
poured in so far this year, according to data from PitchBook, a
Seattle company tracking financings.
Arize AI, which raised $38 million from investors last month,
enables monitoring for customers including Uber, Chick-fil-A and
Procter & Gamble. Chief Product Officer Aparna Dhinakaran said she
struggled at a previous employer to quickly spot AI predictions
turning poor and friends elsewhere told her about their own delays.
"The world of today is you don't know there's an issue until a
business impact two months down the road," she said.
FRAUD SCORES
Some AI users have built their own monitoring capabilities, and that
is what FICO said saved it at the start of the pandemic.
Alarms were triggered as more purchases occurred online - what the
industry calls "card not present." Historically, more of this
spending tends to be fraudulent, and the surge pushed transactions
higher on FICO's 1-to-999 scale (the higher the score, the more
likely the transaction is fraudulent), said Scott Zoldi, chief
analytics officer at FICO.
Zoldi said consumer habits were changing too fast to rewrite the AI
system. So FICO advised U.S. clients to review and reject only
transactions scored above 900, up from 850, he said. It spared
clients from reviewing 67% of legitimate transactions above the old
threshold, and allowed them instead to focus on truly problematic
cases.
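The mechanics of that adjustment are straightforward to sketch. The
snippet below, with invented transactions and helper names rather than
FICO's actual code, shows how raising the review cutoff on a 1-to-999
fraud score from 850 to 900 shrinks the set of transactions sent to
human reviewers.

    from dataclasses import dataclass

    @dataclass
    class Transaction:
        txn_id: str
        fraud_score: int  # 1-to-999 scale; higher means more likely fraud

    # During the pandemic surge, the review cutoff rises from 850 to 900
    # so analysts only see the highest-risk transactions.
    REVIEW_THRESHOLD = 900

    def needs_review(txn, threshold=REVIEW_THRESHOLD):
        return txn.fraud_score > threshold

    txns = [Transaction("a1", 820), Transaction("a2", 870), Transaction("a3", 940)]
    flagged = [t.txn_id for t in txns if needs_review(t)]
    print(flagged)  # ['a3'] -- "a2" would also have been flagged at the
                    # old 850 cutoff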
Clients went on to detect 25% more of total U.S. fraud during the
first six months of the pandemic than would have been expected and
60% more in the United Kingdom, Zoldi said.
"You are not responsible with AI unless you are monitoring," he
said.
(Reporting by Paresh Dave; Editing by Kenneth Li and Claudia
Parsons)
© 2022 Thomson Reuters. All rights reserved.
This material may not be published, broadcast, rewritten or
redistributed. Thomson Reuters is solely responsible for this content.