Introduction to Primus

Published November 10, 2025, 18:01
This talk highlights AMD's leadership in high-performance computing and large-scale AI training. Zhenyu introduces Primus, a modular training stack that accelerates time-to-market: Primus-LM (comprehensive parallelism with compute–communication overlap), Primus-Turbo (general-purpose acceleration with ROCm software on AMD GPUs), and Primus-SaFE (three-phase stability architecture: preflight, inflight, postflight). Benchmarks include a 2T-parameter MoE on 1,024 GPUs with DeepEP, end-to-end performance breakdowns, and DeepSeek V3 weak scaling on torchtitan. He also discusses the AMD Instella model family trained at scale on AMD GPUs and concludes with a full out-of-the-box lifecycle for training developers.

Find the resources you need to develop using AMD products: amd.com/en/developer.html

***

© 2025 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, ROCm, and AMD Instinct and combinations thereof are trademarks of Advanced Micro Devices, Inc.