Extreme Co-Design for Efficient Tokenomics and AI at Scale

NVIDIA
Published February 12, 2026, 1:49
As AI enters the era of real-time reasoning, the key metric for deploying AI at scale is now cost per token — how much it costs to generate intelligence.

Reasoning models, many of them built on mixture-of-experts (MoE) architectures, generate massive volumes of tokens to deliver higher-quality results, placing pressure on the entire system, from compute and memory to networking, storage, and software.

Featuring insights from NVIDIA, Signal65, Microsoft Azure, and CoreWeave, this discussion explains why extreme co-design, meaning optimizing the full stack as a unified system, is essential to lowering cost per token and maximizing AI ROI. End-to-end system design, it argues, is the most powerful lever for scaling efficient AI.

Learn more: blogs.nvidia.com/blog/inferenc...