Google Developers, 2.61 million subscribers
Published March 6, 2026, 20:47
Don't let device failures or power outages ruin your training runs. In this tutorial, Yufeng Guo demonstrates how to use Keras with the Orbax checkpointing library. Learn how to implement a custom checkpoint manager and Keras callbacks to ensure your model state is always safely stored.
0:00 Introduction to Orbax & Keras Integration
0:39 Exploring Keras Checkpointing
1:11 Why Extend Keras for Multi-Host Environments?
1:48 What is Orbax?
2:29 Building Utility Classes: KerasOrbaxCheckpointManager & OrbaxCheckpointCallback
2:57 Deep Dive into KerasOrbaxCheckpointManager
3:45 Coding the Get, Save, and Restore State Functions
4:37 Implementing the OrbaxCheckpointCallback
5:12 Protecting Against Device Failures & Preemption
5:31 Implementation Details & Model.fit Integration
6:07 Checkpointing in Action: File Directory Walkthrough
6:56 Summary & Final Tips
Resources:
Orbax checkpointing in Keras - Developer guide → goo.gle/40T2LI8
ModelCheckpoint - Keras 3 API documentation → goo.gle/3PkAlEq
Subscribe to Google for Developers → goo.gle/developers
Speaker: Yufeng Guo
Products Mentioned: Google AI
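
The checkpoint manager and callback pattern named in the description can be illustrated with a minimal, standard-library-only sketch. It does not use Orbax or Keras and it ignores the multi-host concerns the video covers. `SimpleCheckpointManager`, `CheckpointCallback`, `ToyModel`, and `run_demo` are hypothetical names chosen for illustration, not the `KerasOrbaxCheckpointManager` or `OrbaxCheckpointCallback` classes built in the video. The callback is a plain class that mirrors the `on_epoch_end(epoch, logs)` hook of a Keras callback; in real code it would presumably subclass `keras.callbacks.Callback`.

```python
import json
import os
import tempfile


class SimpleCheckpointManager:
    """Hypothetical stand-in for a checkpoint manager (NOT the Orbax API)."""

    def __init__(self, directory, max_to_keep=2):
        self.directory = directory
        self.max_to_keep = max_to_keep  # assumed to be >= 1
        os.makedirs(directory, exist_ok=True)

    def _path(self, step):
        return os.path.join(self.directory, f"ckpt_{step:08d}.json")

    def all_steps(self):
        """Return the saved step numbers in ascending order."""
        steps = []
        for name in os.listdir(self.directory):
            if name.startswith("ckpt_") and name.endswith(".json"):
                steps.append(int(name[len("ckpt_"):-len(".json")]))
        return sorted(steps)

    def save(self, step, state):
        # Write to a temp file, then rename, so an interruption mid-write
        # never leaves a half-written file under the final checkpoint name.
        fd, tmp = tempfile.mkstemp(dir=self.directory, suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp, self._path(step))
        # Delete the oldest checkpoints beyond max_to_keep.
        for old in self.all_steps()[:-self.max_to_keep]:
            os.remove(self._path(old))

    def restore_latest(self):
        """Return (step, state) for the newest checkpoint, or (None, None)."""
        steps = self.all_steps()
        if not steps:
            return None, None
        with open(self._path(steps[-1])) as f:
            return steps[-1], json.load(f)


class CheckpointCallback:
    """Plain class with the same hook shape as a Keras on_epoch_end."""

    def __init__(self, manager, model):
        self.manager = manager
        self.model = model

    def on_epoch_end(self, epoch, logs=None):
        state = {"weights": self.model.get_weights(), "logs": logs or {}}
        self.manager.save(epoch, state)


class ToyModel:
    """Hypothetical model exposing get_weights/set_weights like a Keras model."""

    def __init__(self):
        self.weights = [0.0, 0.0]

    def get_weights(self):
        return list(self.weights)

    def set_weights(self, weights):
        self.weights = list(weights)


def run_demo(directory):
    manager = SimpleCheckpointManager(directory, max_to_keep=2)
    model = ToyModel()
    callback = CheckpointCallback(manager, model)
    for epoch in range(4):
        # Pretend one epoch of training nudges every weight by 1.0.
        model.set_weights([w + 1.0 for w in model.get_weights()])
        callback.on_epoch_end(epoch, logs={"loss": 1.0 / (epoch + 1)})
    # Simulate a restart: a fresh model picks up the latest saved state.
    fresh = ToyModel()
    step, state = manager.restore_latest()
    fresh.set_weights(state["weights"])
    return step, fresh.get_weights(), manager.all_steps()
```

Calling `run_demo` on an empty directory trains for four pretend epochs, keeps only the two newest checkpoints, and restores the last one into a fresh model. A real setup would hand the state to Orbax and pass the callback to `model.fit`, as the linked developer guide describes.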