Google Developers · 2.62M
Published September 30, 2025, 16:10
Unlock the full potential of your large language models with Tunix, an innovative open-source JAX-based library for post-training. This video explains the two-stage LLM training process, focusing on how Tunix excels in the post-training phase to instill strong reasoning capabilities. See a practical example of using Tunix with reinforcement learning to improve math problem-solving, leveraging its efficiency on accelerators like Google TPUs. Improve your LLM performance with this powerful tool.
Resources:
GitHub for Tunix → goo.gle/4854A9X
Tunix GRPO example → goo.gle/46M9UwF
Additional examples → goo.gle/4nCfIjE
DeepSeekMath (GRPO) paper → goo.gle/3IA5ukt
Chapters:
0:00 - Introduction to Tunix
0:17 - Understanding LLM training stages
0:35 - Tunix: A JAX-based LLM post-training library
0:50 - Exploring Tunix's capabilities and supported models
1:05 - Reinforcement learning for LLMs overview
1:25 - RLVR for math reasoning demo (GSM8K dataset)
1:50 - Setting up and training with GRPO
2:05 - Tunix performance results and benefits
2:20 - Getting involved with Tunix
Subscribe to Google for Developers → goo.gle/developers
Speaker: Wei Wei
Products Mentioned: Google AI
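Sketch of the RLVR and GRPO idea from the chapters above: the snippet below illustrates a verifiable reward for math answers and the group-relative advantage described in the DeepSeekMath paper. It is plain Python and does not use the Tunix API. The function names, the last-number answer extraction, the sample completions, and the `eps` value are all illustrative assumptions, not taken from the video or the library.

```python
import re
from statistics import mean, pstdev


def extract_final_number(text):
    """Return the last number found in text as a float, or None (assumed heuristic)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None


def verifiable_reward(completion, reference_answer):
    """Score 1.0 when the final number matches the reference answer, else 0.0."""
    predicted = extract_final_number(completion)
    if predicted is None:
        return 0.0
    return 1.0 if predicted == float(reference_answer) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, several sampled completions (hypothetical examples).
completions = [
    "She sells 16 - 3 - 4 = 9 eggs, so she makes 9 * 2 = 18 dollars.",
    "The answer is 20.",
    "9 eggs at $2 each gives 18",
    "I am not sure.",
]
rewards = [verifiable_reward(c, "18") for c in completions]
advantages = group_relative_advantages(rewards)

for completion, reward, advantage in zip(completions, rewards, advantages):
    print(f"{reward:.1f} {advantage:+.3f} {completion}")
```

Completions that beat the group average get a positive advantage and the rest get a negative one, so no separate value model is needed to provide a baseline. A real training loop would feed these advantages into a policy-gradient update; see the Tunix GRPO example linked above for the library's own setup.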