How to fine-tune LLMs with Tunix

Google Developers

Published September 30, 2025, 16:10

Tunix is an open-source, JAX-based library for LLM post-training. This video explains the two-stage LLM training process (pre-training, then post-training) and shows how Tunix is used in the post-training stage to strengthen reasoning capabilities. It walks through a practical example that uses Tunix with reinforcement learning to improve math problem-solving, taking advantage of its efficiency on accelerators such as Google TPUs.

Resources:
GitHub for Tunix → goo.gle/4854A9X
Tunix GRPO example → goo.gle/46M9UwF
Additional examples → goo.gle/4nCfIjE
DeepSeekMath (GRPO) paper → goo.gle/3IA5ukt

Chapters:
0:00 - Introduction to Tunix
0:17 - Understanding LLM training stages
0:35 - Tunix: A JAX-based LLM post-training library
0:50 - Exploring Tunix's capabilities and supported models
1:05 - Reinforcement learning for LLMs overview
1:25 - RLVR for math reasoning demo (GSM8K dataset)
1:50 - Setting up and training with GRPO
2:05 - Tunix performance results and benefits
2:20 - Getting involved with Tunix
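
For readers who want a feel for the RLVR and GRPO chapters before watching, here is a minimal sketch in plain Python. It does not use the Tunix API; the function names (extract_final_number, verifiable_reward, group_relative_advantages) and the sample completions are hypothetical and exist only for illustration. It assumes a GSM8K-style reference answer that ends in "#### <number>", a 0/1 reward for matching that number, and the group-relative advantage from the DeepSeekMath paper, where each sampled completion's reward is normalized by the mean and standard deviation of its group.

```python
import re
import statistics


def extract_final_number(text):
    """Return the last number found in `text` (commas stripped), or None."""
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", text)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def verifiable_reward(completion, reference_answer):
    """Hypothetical 0/1 reward: 1.0 if the final numbers match, else 0.0."""
    predicted = extract_final_number(completion)
    target = extract_final_number(reference_answer)
    if predicted is None or target is None:
        return 0.0
    return 1.0 if float(predicted) == float(target) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up completions sampled for a single prompt (illustration only).
reference = "#### 18"
completions = [
    "16 - 3 - 4 = 9 eggs, and 9 * 2 = 18, so she makes $18 per day.",
    "The answer is 20.",
    "9 * 2 = 18",
    "I am not sure.",
]

rewards = [verifiable_reward(c, reference) for c in completions]
advantages = group_relative_advantages(rewards)
# rewards == [1.0, 0.0, 1.0, 0.0]; correct completions get positive advantages.
```

Because the advantage is computed relative to the other completions in the same group, this formulation needs no separate value model, which is the design choice that distinguishes GRPO from PPO in the DeepSeekMath paper.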

Subscribe to Google for Developers → goo.gle/developers

Speaker: Wei Wei
Products Mentioned: Google AI
