Academic institutions and healthcare companies are picking sides
between two cloud computing offerings, Google Genomics and Amazon
Web Services, spurring the two companies to one-up each other as
they win high-profile genomics business, according to interviews
with researchers, industry consultants and analysts.
Growth in cloud genomics is being propelled by, among other forces, the push for
personalized medicine, which aims to base treatments on a patient's
DNA profile. Making that a reality will require enormous quantities
of data to reveal how particular genetic profiles respond to
different treatments.
Already, universities and drug manufacturers are embarking on
projects to sequence the genomes of hundreds of thousands of people.
The human genome is the full complement of DNA, or genetic material,
a copy of which is found in nearly every cell of the body.
Clients view Google and Amazon as better than their own computers at
storing genomics data, keeping it secure, controlling costs and
allowing it to be easily shared.
The cloud companies are going beyond storage to offer analytical
functions that let scientists make sense of DNA data. Microsoft Corp
and International Business Machines are also competing for a slice
of the market. The "cloud" refers to data or software that
physically resides in a server and is accessible via the internet,
which allows users to access it without downloading it to their own
computer.
Now an estimated $100 million to $300 million business globally, the
cloud genomics market is expected to grow to $1 billion by 2018,
said research analyst Daniel Ives of investment bank FBR Capital. By
that time, the entire cloud market should have $50 billion to $75
billion in annual revenue, up from about $30 billion now.
"The cloud is the entire future of this field," Craig Venter, who
led a private effort to sequence the human genome in the 1990s, said
in an interview. His new company, San Diego-based Human Longevity
Inc, recently tried to import genomic data from servers at the J.
Craig Venter Institute in Rockville, Maryland.
The transmission was so slow, scientists had to resort to sending
disks and thumb drives by FedEx and human messengers, or
"sneakernet," he said. The company now uses Amazon Web Services.
So does a collaboration between Regeneron Pharmaceuticals Inc and
Pennsylvania-based Geisinger Health Systems to sequence 250,000
genomes. Raw DNA data is uploaded to Amazon's cloud, where software
from privately held DNAnexus assembles the millions of chunks into
the full, 3-billion-letter-long genome.
DNAnexus's algorithms then determine where an individual genome
differs from the "reference" human genome, the company’s chief
scientist Dr. David Shaywitz said, in hopes of identifying new drug
targets.
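As a rough illustration of what comparing an individual genome with a reference involves, the toy Python sketch below lists the positions where two already-aligned sequences differ. It is not DNAnexus's algorithm: the function name and the 12-letter sequences are invented for illustration, and real pipelines must also handle alignment, insertions, deletions and sequencing errors.

```python
def find_substitutions(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are already aligned and the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Hypothetical 12-letter stretch; a real genome has about 3 billion letters.
reference = "ACGTTGCAAGCT"
sample = "ACGTAGCAAGTT"
variants = find_substitutions(reference, sample)
print(variants)
```

Each tuple gives a zero-based position followed by the reference letter and the letter found in the sample.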
HOSTING FOR FREE
Showing how important Google and Amazon consider this business, and how
they hope to use existing customers to lure future ones, each is
hosting well-known genomics datasets for free.
Neither company discloses the amount of genomics data it holds, but
based on interviews with analysts and genomic scientists, as well as
the companies' own announcements of what customers they’ve won,
Amazon Web Services may be bigger.
Data from the "1000 Genomes Project," an international
public-private effort that identified genetic variations found in at
least 1 percent of humans, reside at both Amazon and Google "without
charge," said Kathy Cravedi of the U.S. National Institutes of
Health (NIH), one of the project's sponsors.
Other paying clients with a more specific focus are picking sides.
Google, for instance, won a project from the Autism Speaks
foundation to collect and analyze the genomes of 10,000 affected
children and their parents for clues to the genetic basis of autism.
Another customer is Tute Genomics, whose database of 8.5 billion
human DNA variants can be searched for how frequently any given
variant appears, what traits it's associated with and how people
with a certain variant respond to particular drugs.
Amazon is hosting the Multiple Myeloma Research Foundation’s project
to collect complete-genome sequences and other data from 1,000
patients to identify new drug targets. It also won the Alzheimer's
Disease Sequencing Project, which has similar aims.
Amazon charges about $4 to $5 a month to store one full human
genome, and Google about $3 to $5 a month. The companies also charge
for data transfers or computing time, as when scientists run
analytical software on stored data.
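To give a sense of scale, the short Python sketch below applies the quoted per-genome storage prices to a 250,000-genome project, the size of the Regeneron-Geisinger effort. It is back-of-the-envelope arithmetic only: it assumes the quoted rates apply uniformly with no volume discounts, and it leaves out the data transfer and computing charges. The function name is made up for this example.

```python
def monthly_storage_cost(genomes, low_per_genome, high_per_genome):
    """Return the (low, high) monthly storage bill in dollars."""
    return genomes * low_per_genome, genomes * high_per_genome


# 250,000 genomes at the quoted Amazon range of $4 to $5 per genome per month.
low, high = monthly_storage_cost(250_000, 4, 5)
print(f"${low:,} to ${high:,} a month")
```

Under those assumptions, storage alone would run to roughly $1 million or more a month for a project of that size.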
Amazon's database-analysis tool, Redshift, costs 25 cents an hour or
$1,000 per terabyte per year, the company said. A terabyte is 1
trillion bytes, or 1,000 gigabytes, about enough to hold 300 hours
of high-quality video.
GENETIC GOLD
Another part of the cloud services' pitch to would-be customers is
that their analytic tools can fish out genetic gold - a drug target,
say, or a DNA variant that strongly predicts disease risk - from a
sea of data. Any discoveries made through such searches belong to
the owners of the data.
"On the local university server it might take months to run a
computationally-intense" analysis, said Alzheimer’s project leader
Dr. Gerard Schellenberg of the University of Pennsylvania. "On
Amazon, it's, 'how fast do you need it done?', and they do it."
Another selling point is security. Universities are "generally
pretty porous," said Ryan Permeh, chief scientist at cybersecurity
company Cylance Inc, of Irvine, California, and the security of
federal government computers is "not at the top of the class."
While academic and pharmaceutical research projects are the biggest
customers for genomics cloud services, they will be overtaken by
clinical applications in the next 10 years, said Google Genomics
director of engineering David Glazer.
Individual doctors will regularly access a cloud service to
understand how a patient's genetic profile affects his risk of
various diseases or his likely response to medication.
"We are at that transition point now," Glazer said.
Matt Wood, general manager for Data Science at Amazon Web Services,
sees cloud demand in genomics now as "a perfect storm," as the
amount of data being created, the need for collaboration and the
move of genomics into clinical care accelerate.
Experts on DNA and data say without access to the cloud, modern
genomics would grind to a halt.
Bioinformatics expert Dr. Atul Butte of the University of
California, San Francisco, said that now, when researchers at
different universities are jointly working on NIH and other genomic
data, they don't have to figure out how to make their computers talk
to each other. In March, NIH cleared the way for major research on
the cloud when it began allowing scientists to upload important
genomic data.
"My response was, it's about time," Butte said.
(This story corrects name of MMRF, adding "Research," in the 19th
paragraph)
(Reporting by Sharon Begley and Caroline Humer; Editing by Michele
Gershberg and John Pickering)
[© 2015 Thomson Reuters. All rights reserved.] This material may not
be published, broadcast, rewritten or redistributed.