GKE Gemma 2 deployment with Hugging Face

Published November 15, 2024, 17:00
Tutorial: Serve Gemma on GKE with TGI → goo.gle/4fFKt2Q
Learn more about TGI (text generation inference) from Hugging Face → goo.gle/4e7qusz
Hugging Face Deep Learning containers for Google Cloud → goo.gle/3BPaYUM

Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open LLMs. Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Watch along as Googlers Wietse Venema and Mofi Rahman demonstrate how to deploy Gemma 2 with 27 billion parameters on Google Kubernetes Engine using Hugging Face TGI.
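The deployment described above can be sketched as a Kubernetes manifest. This is an illustrative outline only: the container image tag, model ID, GPU count, and secret name below are assumptions for the sketch, not values confirmed here — see the tutorial at goo.gle/4fFKt2Q for the exact configuration.

```yaml
# Sketch: GKE Deployment serving Gemma 2 27B with the Hugging Face TGI
# Deep Learning Container. Image tag, secret name, and resource sizing
# are assumptions for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tgi-gemma-2-27b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tgi-gemma
  template:
    metadata:
      labels:
        app: tgi-gemma
    spec:
      containers:
      - name: tgi
        # Hugging Face TGI container image (illustrative path/tag)
        image: us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121
        env:
        - name: MODEL_ID
          value: google/gemma-2-27b-it   # assumed instruction-tuned variant
        - name: HF_TOKEN                 # Gemma is gated; a Hugging Face token is required
          valueFrom:
            secretKeyRef:
              name: hf-secret            # assumed Secret holding the token
              key: hf_api_token
        resources:
          limits:
            nvidia.com/gpu: 2            # assumed GPU count for the 27B model
        ports:
        - containerPort: 8080
```

After `kubectl apply`, requests would typically reach TGI's HTTP API on port 8080 through a Service in front of this Deployment.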

Watch more Google Cloud: Building with Hugging Face → goo.gle/BuildWithHuggingFace
Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech

#GoogleCloud #HuggingFace

Speakers: Wietse Venema, Mofi Rahman
Products Mentioned: Gemma, Hugging Face Deep Learning containers, Google Kubernetes Engine