AMD Developer Central (23.6K)
Published June 30, 2025, 14:00
Join Lianmin Zheng, Member of Technical Staff at xAI and leader of the SGLang project, as he speaks at Advancing AI for a second year. This talk gives a high-level overview of SGLang, a fast inference engine for large language models and vision-language models, and its use in large-scale production environments on AMD GPUs. Learn about the latest advancements in prefill-decode disaggregation and expert parallelism, and how these techniques can significantly improve inference performance and efficiency.
Discover how SGLang supports major open-weight models like DeepSeek, Llama, and Qwen, and how it integrates with reinforcement learning workflows. Zheng shares real-world insights from xAI's collaboration with AMD, including the implementation of day-0 support for DeepSeek V3/R1 and the first open-source implementation of large-scale expert parallelism.
Key takeaways include:
• Efficient design and implementation of prefill-decode disaggregation
• Strategies for large-scale expert parallelism
• Practical insights into deploying SGLang on over 100 GPUs
• Collaboration highlights with AMD for optimized performance
Learn how to deploy DeepSeek-R1 with SGLang: rocm.docs.amd.com/projects/ai-...
Learn how DeepSeek-V3 is optimized on AMD Instinct™ accelerators: amd.com/en/developer/resources...
Find the resources you need to develop using AMD products: amd.com/en/developer.html
***
© 2025 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, ROCm, and AMD Instinct and combinations thereof are trademarks of Advanced Micro Devices, Inc.