Accelerate AI inference workloads with Google Cloud TPUs and GPUs

Published July 1, 2024, 15:39
Deploying AI models at scale demands high-performance inference capabilities. Google Cloud offers a range of Cloud Tensor Processing Unit (TPU) and NVIDIA-powered graphics processing unit (GPU) VMs. This session guides you through the key considerations for choosing between TPUs and GPUs for your inference needs. Explore the strengths of each accelerator for workloads such as large language models and generative AI models, discover how to deploy and optimize your inference pipeline on Google Cloud using TPUs or GPUs, and understand the cost implications along with cost-optimization strategies.

Speakers: Alexander Spiridonov, Omer Hasan, Uğur Arpaci, Kirat Pandya

Watch more:
All sessions from Google Cloud Next → goo.gle/next24

#GoogleCloudNext


Event: Google Cloud Next 2024