AMD Developer Central · 29.9K
Published November 10, 2025, 18:00
This talk introduces SGLang, a high-performance serving framework for large language models (LLMs) and vision-language models (VLMs), and reviews key advancements achieved in 2025. Yineng Zhang covers optimizations for DeepSeek V3 that improve throughput and latency, large-scale production deployments, and the integration of reinforcement learning to adapt serving policies under real workloads. The session details training acceleration via speculative decoding, hierarchical KV caching for memory efficiency at scale, and deterministic inference for reproducibility and compliance. He also highlights day-0 support for new model families, robust model deployment orchestration, and distributed inference on AMD platforms to unlock cost-effective performance.
Find the resources you need to develop using AMD products: amd.com/en/developer.html
***
© 2025 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, EPYC, ROCm, and AMD Instinct and combinations thereof are trademarks of Advanced Micro Devices, Inc.