The two new benchmarks added by MLCommons measure the speed at
which AI chips and systems can generate responses from powerful
AI models packed with data. The results roughly demonstrate how
quickly an AI application such as ChatGPT can deliver a response
to a user query.
One of the new benchmarks measures the speed of a
question-and-answer scenario for large language models. It is
based on Llama 2, a model with 70 billion parameters developed
by Meta Platforms.
The second new benchmark, added to MLCommons' MLPerf suite of
benchmarking tools, measures text-to-image generation based on
Stability AI's Stable Diffusion XL model.
Servers powered by Nvidia's H100 chips, built by the likes of
Alphabet's Google, Supermicro and Nvidia itself, handily won
both new benchmarks on raw performance. Several server builders
also submitted designs based on Nvidia's less powerful L40S
chip.
Server builder Krai submitted a design for the image generation
benchmark using a Qualcomm AI chip that draws significantly less
power than Nvidia's cutting-edge processors.
Intel also submitted a design based on its Gaudi2 accelerator
chips. The company described the results as "solid."
Raw performance is not the only measure that is critical when
deploying AI applications. Advanced AI chips suck up enormous
amounts of energy, and one of the most significant challenges
for AI companies is deploying chips that deliver an optimal
amount of performance for a minimal amount of energy.
MLCommons has a separate benchmark category for measuring power
consumption.
(Reporting by Max A. Cherney in San Francisco; Editing by Jamie
Freed)
© 2024 Thomson Reuters. All rights reserved. This material may
not be published, broadcast, rewritten or redistributed.