WebLLM: A high-performance in-browser LLM Inference engine

Published November 21, 2024, 18:00
In this talk, Charlie Ruan from MLC focuses on WebLLM, a high-performance in-browser LLM inference engine. WebLLM lets you build AI-enabled web apps that are fast (native GPU acceleration via WebGPU), private (100% client-side computation), and convenient (zero environment setup). For developers, WebLLM offers an OpenAI-API-style interface for standardized integration, supports chat applications and efficient structured JSON generation, and provides built-in support for Web Workers and Service Workers to separate backend execution from the UI flow. The talk covers WebLLM's key features, its overall architecture, and how developers can build AI-enabled web applications with it.
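The OpenAI-API-style interface mentioned above can be sketched roughly as follows. This is a minimal sketch, not official sample code: it assumes the `@mlc-ai/web-llm` package, a WebGPU-capable browser, and OpenAI-compatible field names for the chat request; the model ID and exact engine options may differ.

```javascript
// Sketch: an OpenAI-style chat completion request for WebLLM
// (assumption: field names follow the OpenAI chat completions API,
// as the talk describes).
function buildChatRequest(userPrompt) {
  return {
    messages: [
      { role: "system", content: "Reply with JSON only." },
      { role: "user", content: userPrompt },
    ],
    // Structured JSON generation, one of the features mentioned in the talk:
    response_format: { type: "json_object" },
  };
}

// In the browser (WebGPU required), this request would be passed to the
// engine, for example:
//   import { CreateMLCEngine } from "@mlc-ai/web-llm";
//   const engine = await CreateMLCEngine("<model-id>"); // downloads & compiles client-side
//   const reply = await engine.chat.completions.create(buildChatRequest("List three colors."));
//   console.log(reply.choices[0].message.content);
console.log(buildChatRequest("List three colors.").messages.length); // 2
```

Because all computation happens client-side, the first call pays a one-time model download and compile cost; running the engine inside a Web Worker or Service Worker (as the talk notes) keeps that work off the UI thread.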

Try WebLLM → goo.gle/3YluAr9

See more Web AI talks → goo.gle/web-ai
Subscribe to Chrome for Developers → goo.gle/ChromeDevs

Speaker: Charlie Ruan
Products mentioned: AI for the web, Google Chrome Browser, Chrome Browser Automation, Chrome Extensions, Chrome, Chrome Web Platform, Web AI, Web apps, Web Assembly (Wasm), Web Platform in Chrome, WebAssembly for Chrome, WebGPU, CodeGemma, Gemma 2, Gemma, RecurrentGemma, Generative AI, AI, Google AI, Google AI Edge, Responsible AI, Kaggle Models, LiteRT, TensorFlow, Hugging Face Models