Accelerating AI inference workloads

Published Apr 30, 2024, 16:00
Deploying AI models at scale demands high-performance inference capabilities. Google Cloud offers a range of Cloud Tensor Processing Units (TPUs) and NVIDIA-powered graphics processing unit (GPU) VMs. Join Debi Cabrera as she sits down with Alex Spiridonov, Group Product Manager, to discuss the key considerations for choosing between TPUs and GPUs for your inference needs. Watch to understand the cost implications, learn how to deploy and optimize your inference pipeline on Google Cloud, and more!
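One throughput optimization discussed in this space, and central to engines like JetStream, is batching incoming requests so each accelerator call processes many inputs at once. The sketch below is purely illustrative: the function names and batch size are assumptions, not the JetStream API, and the model is a stand-in for a real TPU/GPU forward pass.

```python
# Hypothetical sketch of request batching for inference serving.
# Names here (fake_model, serve) are illustrative, not from JetStream.

from collections import deque

MAX_BATCH = 4  # assumed batch-size limit for one accelerator call


def fake_model(batch):
    # Stand-in for a real batched TPU/GPU forward pass.
    return [x * 2 for x in batch]


def serve(requests, max_batch=MAX_BATCH):
    """Drain a request queue in fixed-size batches, amortizing
    per-call overhead across many inputs."""
    queue = deque(requests)
    results = []
    while queue:
        batch = [queue.popleft() for _ in range(min(max_batch, len(queue)))]
        results.extend(fake_model(batch))
    return results


print(serve([1, 2, 3, 4, 5]))  # → [2, 4, 6, 8, 10] in two model calls, not five
```

Batching trades a little per-request latency for much higher accelerator utilization, which is one lever behind the cost/efficiency trade-offs covered in the session.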

Chapters:
0:00 - Meet Alex
2:52 - Balancing cost and efficiency
5:51 - TPU vs GPU for AI models
8:21 - Getting started with Google Cloud TPUs and GPUs
10:05 - Common challenges in inference optimization
12:10 - Available resources for AI inference workloads
13:13 - Wrap up

Resources:
Watch the full session here → goo.gle/3JC32qx
Check out Alex’s blog post → goo.gle/3wa2DZb
JetStream GitHub → goo.gle/49SoSRj
MaxDiffusion GitHub → goo.gle/4aQ1g11
MaxText GitHub → goo.gle/49SoYZb

Watch more Cloud Next 2024 → goo.gle/Next-24
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech

#GoogleCloudNext #GoogleGemini

Event: Google Cloud Next 2024
Speakers: Debi Cabrera, Alex Spiridonov
Products Mentioned: Cloud TPUs, Cloud GPUs