Use GPUs in Cloud Run

375
4
Опубликовано 3 октября 2024, 16:00
Sign up for the preview → goo.gle/3NnobXv
GPU best practices → goo.gle/4elRpBE
Run LLM inference on Cloud Run GPUs with Ollama → goo.gle/3BwN6F1

Cloud Run, known for its scalability, now incorporates GPUs, ushering in a new era for machine learning inference. Join Googlers Martin Omander and Wietse Venema as they provide a practical demonstration of deploying Google's Gemma 2, an open-source large language model, through Ollama on Cloud Run.

Chapters:
0:00 - Intro
0:22 - Google Vertex AI vs GPUs with Cloud Run
1:12 - AI app architecture
2:04 - [Demo] Deploying Ollama API
3:26 - [Demo] Testing the deployment
5:28 - [Demo] Build & deploy the front end
6:02 - How do GPUs scale on Cloud Run?
6:34 - Where are Gemma 2 model files stored?
7:12 - Getting started with GPUs in Cloud Run

More Resources:
Cloud Run pricing → goo.gle/3BeMhAD

Watch more Serverless Expeditions → goo.gle/ServerlessExpeditions
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech

#ServerlessExpeditions #GoogleCloud

Speaker: Martin Omander, Wietse Venema
Products Mentioned: Cloud Run, Gemma
автотехномузыкадетское