AI dataset licensing companies form trade group
[June 26, 2024] By
Katie Paul
NEW YORK (Reuters) - Seven content-licensing sellers of music, image,
video and other datasets for use in training artificial intelligence
systems have formed the sector's first trade group, they said on
Wednesday.
The Dataset Providers Alliance (DPA) will advocate for "ethical data
sourcing" in the training of AI systems, including rights for people
depicted in datasets and the protection of content owners' intellectual
property rights, the companies said in a statement.
Founding members include U.S. music dataset company Rightsify, image
licensing service vAIsual, Japanese stock photo provider Pixta and
Germany-based data marketplace Datarade.
The emergence of generative AI technologies that can mimic human
creativity in recent years has triggered an outcry from content creators
and a string of copyright lawsuits against tech companies like Google,
Meta and ChatGPT maker OpenAI, which is backed by Microsoft.
Developers have been training models by feeding them vast quantities of
content, much of it scraped from the internet for free without the
consent of those who created the works or own rights to them.
Tech companies, which claim the usage is legal, are also quietly paying
for access to private collections of content both to fulfill needs for
particular types of data and to hedge against legal and regulatory
risks.
The prospect that demand for licensed data will grow if copyright owners
prevail in their legal fights has prompted the emergence of a nascent
industry of companies that package content and sell access to it for use
by AI systems.
As a result, groups have been formed to establish ethical standards for
that trade, like Fairly Trained, a non-profit founded this year which
certifies models that have not used copyrighted materials without a
license.
AI (Artificial Intelligence) letters and robot hand miniature in
this illustration taken, June 23, 2023. REUTERS/Dado Ruvic/Illustration/File
Photo
The DPA targets the content of those transactions, requiring, for
example, that its members agree not to sell text data obtained by
crawling the web or audio that features people's voices without
their explicit consent.
A heavy focus will be to push for legislation like the NO FAKES Act,
a U.S. bill introduced last year to create penalties for generating
unauthorized digital replicas of people's voices or likenesses, said
Alex Bestall, CEO of Rightsify and its licensing subsidiary GCX, who
led the founding of the group.
"Advocacy will be a big part of it because everyone's taken their
positions on AI and copyright, but a lot of these battles are yet to
be solved and it's going to take a while for them to be," said
Bestall.
The DPA also will press for more training data transparency
requirements like those in the European Union's AI Act and a similar
U.S. bill introduced in April, the Generative AI Copyright
Disclosure Act, he added.
The group plans to publish a white paper outlining its positions in
July, he said.
(Reporting by Katie Paul; Editing by Richard Chang)
[© 2024 Thomson Reuters. All rights
reserved.]
This material may not be published,
broadcast, rewritten or redistributed.
Thomson Reuters is solely responsible for this content.