Building real-time voice applications with Live API

Published August 5, 2025, 23:00
Shrestha Basu Mallick, one of the product leads for the Gemini API, joins host Logan Kilpatrick for a deep dive into the Gemini Live API, Google’s real-time, multimodal interface for developers. Learn how native audio, alongside new capabilities like proactive audio and async function calling, unlocks the unique power of audio as an interface.

0:00 - Intro
1:18 - Live API overview
3:36 - Why audio is a special modality
5:07 - Speed vs. precision in audio
6:17 - Controllable and promptable TTS
8:31 - What developers are building with the Live API
11:14 - URL context and async calling features
15:02 - Proactive audio and affective dialog
16:55 - Addressing developer feedback
21:54 - Live API roadmap
23:49 - The role of long context
24:57 - What’s next for the Live API
26:41 - State of the AI audio market
30:10 - Advice for developers getting started with the Live API
31:16 - Live API demo
38:10 - Demo wrap up and closing

Listen to this podcast:
Apple Podcasts → goo.gle/3Bm7QzQ
Spotify → goo.gle/3ZL3ADl

Watch more Release Notes → goo.gle/4njokfg
Subscribe to Google for Developers → goo.gle/developers

Speakers: Logan Kilpatrick, Shrestha Basu Mallick
Products Mentioned: Google AI