Automate Product Listings with Gemini + Vision Agents

Google Developers · 2.62 million

Published March 27, 2026, 16:07

*Build a real-time voice agent with Gemini 3.1 Flash Live and Stream's Vision Agents SDK.*

Stefan Blos, Senior Developer Advocate at Stream, walks through what's possible with early access to the Gemini 3.1 Flash Live model: object detection, AI image polish with Nano Banana, web search, and a guided multi-step workflow, all driven by a single voice conversation.

*What's covered:* Setting up the Vision Agents SDK with the Gemini plugin, defining tools for image generation and product search, building a video processor to analyze live frames, orchestrating multi-step agent workflows with instruction following, and connecting everything to a Next.js frontend via WebSocket events.
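To make the tool-definition and multi-step workflow ideas concrete, here is a minimal sketch in plain Python. It does not use the Vision Agents SDK or the Gemini API: `ToolRegistry`, `polish_image`, `search_product`, and `run_listing_workflow` are hypothetical names invented for illustration, and the tool bodies are placeholders rather than real model or search calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolRegistry:
    """Hypothetical registry mapping tool names to plain Python callables."""

    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def register(self, name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
        """Return a decorator that stores the function under the given name."""

        def decorator(fn: Callable[..., str]) -> Callable[..., str]:
            self.tools[name] = fn
            return fn

        return decorator

    def call(self, name: str, **kwargs: str) -> str:
        """Look up a tool by name and invoke it with keyword arguments."""
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)


registry = ToolRegistry()


@registry.register("polish_image")
def polish_image(image_id: str) -> str:
    # Placeholder: a real tool would send the frame to an image model.
    return f"polished:{image_id}"


@registry.register("search_product")
def search_product(query: str) -> str:
    # Placeholder: a real tool would run a web search for the product.
    return f"results-for:{query}"


def run_listing_workflow(image_id: str, product_name: str) -> List[str]:
    """Run the steps in a fixed order, the way a guided workflow would."""
    return [
        registry.call("polish_image", image_id=image_id),
        registry.call("search_product", query=product_name),
    ]


if __name__ == "__main__":
    print(run_listing_workflow("frame-001", "ceramic mug"))
```

In an actual agent the model would decide when to call each tool based on the conversation; the fixed order here only illustrates how named tools and a step sequence fit together.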

Grab your Gemini API key at Google AI Studio and explore the Vision Agents SDK from Stream to get started.

*Resources:*
✅ Gemini Hacker Starter Repo → goo.gle/4m1Aj0O
✅ GitHub examples → goo.gle/4lVwavg
✅ Stream SDK → goo.gle/4dMimkz

What are you building with Gemini Live? Drop it in the comments.

Subscribe to Google for Developers → goo.gle/developers

Speaker: Stefan Blos at Stream
Products Mentioned: Google AI, Gemini
