Google Cloud Platform · 1.32M
Published February 2, 2026, 17:00
Evaluating agents with ADK code lab → goo.gle/3NRVhSB
Evaluating single LLM outputs with Vertex AI evaluation code lab → goo.gle/4jYfYZ0
Large language models (LLMs) are great, but sometimes they go rogue. How can developers gain real confidence in their AI systems in production? Join Aja and Jason as they demonstrate how to implement an 'accuracy pipeline' that uses LLMs as the grading rubric and treats each evaluation prompt like a new unit test.
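The idea of treating an evaluation prompt like a unit test can be sketched roughly as follows. This is a minimal illustration, not the code from the video or the code labs: `JUDGE_PROMPT`, `grade`, and `fake_judge` are hypothetical names, and `fake_judge` is a stand-in for a real LLM call so the example runs without any cloud credentials.

```python
from typing import Callable

# Hypothetical grading rubric sent to the judge model (an assumption, not from the video).
JUDGE_PROMPT = """You are grading an AI answer for factual accuracy.
Question: {question}
Reference: {reference}
Answer: {answer}
Reply with exactly PASS or FAIL."""


def grade(judge: Callable[[str], str], question: str, reference: str, answer: str) -> bool:
    """Return True when the judge replies PASS for this answer."""
    prompt = JUDGE_PROMPT.format(question=question, reference=reference, answer=answer)
    verdict = judge(prompt).strip().upper()
    return verdict == "PASS"


def fake_judge(prompt: str) -> str:
    """Stand-in for a real LLM call: passes only if the reference text appears in the answer."""
    reference = prompt.split("Reference: ")[1].split("\n")[0]
    answer = prompt.split("Answer: ")[1].split("\n")[0]
    return "PASS" if reference.lower() in answer.lower() else "FAIL"


# Each evaluation case reads like an ordinary unit test.
def test_capital_of_france() -> None:
    assert grade(fake_judge, "What is the capital of France?", "Paris", "The capital is Paris.")


def test_hallucinated_answer_fails() -> None:
    assert not grade(fake_judge, "What is the capital of France?", "Paris", "The capital is Lyon.")
```

In a real pipeline, `fake_judge` would be replaced by a function that sends the prompt to a model and returns its text reply; the tests themselves would stay the same, which is what makes the unit-test framing convenient.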
Chapters:
0:00 - Intro
0:48 - What is hallucination?
2:35 - Testing AI answers for accuracy
5:43 - Offline evaluation
8:53 - Summary
More resources:
Agent evaluation in Vertex AI Gen AI evaluation service → goo.gle/3M0l3Dw
Watch more Real Terms for AI → goo.gle/AIwordsExplained
🔔 Subscribe to Google Cloud Tech → goo.gle/GoogleCloudTech
#GoogleCloud #AIInfrastructure
Speakers: Aja Hammerly, Jason Davenport
Products Mentioned: AI Infrastructure