AMD Developer Central · 16.8K
Published November 10, 2025, 18:01
This talk highlights AMD leadership in high-performance computing and large-scale AI training. Zhenyu introduces Primus, a modular training stack that accelerates time-to-market: Primus-LM (comprehensive parallelism with compute–communication overlap), Primus-Turbo (general-purpose acceleration with ROCm software on AMD GPUs), and Primus-SaFE (three-phase stability architecture: preflight, inflight, postflight). Benchmarks include a 2T-parameter MoE on 1,024 GPUs with DeepEP, end-to-end performance breakdowns, and DeepSeek V3 weak scaling on torchtitan. He also discusses the AMD Instella model family trained at scale on AMD GPUs and concludes with a full out-of-box lifecycle for training developers.
Find the resources you need to develop using AMD products: amd.com/en/developer.html
***
© 2025 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, ROCm, and AMD Instinct and combinations thereof are trademarks of Advanced Micro Devices, Inc.