Stream Launches AI Vision Agents

First Open-Platform, Video-First SDK for Real-Time Vision AI
 
BOULDER, Colo. - Oct. 14, 2025 - PRLog -- Stream, the leading provider of scalable chat, video, and feeds APIs, today announced Vision Agents, the first open-source, open-platform SDK bringing real-time video and audio intelligence into developer applications.

Unlike existing frameworks that bolt video onto voice-first systems, Vision Agents was designed video-first from day one.

"Most frameworks started with voice and later added video," said Thierry Schellenbach, CEO and Co-Founder of Stream. "We built the opposite: a video-first foundation that's open, extensible, and developer-friendly."

Developers can now create AI-powered agents that see, hear, and remember in real time, enabling a new generation of interactive, multimodal applications.

Open Platform for AI Innovation

Vision Agents works with Stream Video by default but also integrates with other video SDKs and supports AI providers, including OpenAI Realtime, Google Gemini, and custom models. This flexibility lets companies adopt Vision Agents without disrupting existing infrastructure, while Stream Video and Chat users gain deep integrations for memory, messaging, and performance.

Real-Time, Video-First Intelligence

Vision Agents processes live video with low latency, enabling real-time perception, scene detection, and natural audio or text responses. Core features include:
  • Video-first intelligence for scene understanding.
  • Real-time audio with transcription, speech, and voice activity detection.
  • Memory and context to recall details naturally.
  • Action-ready design to connect with external APIs and services.

Wide-Ranging Applications

Use cases span manufacturing (defect detection), collaboration (AI note-taking, transcription), gaming (coaching, avatars), accessibility (captions, descriptions), and customer support (multimodal assistants).

Open Source and Availability

Vision Agents is fully open source and invites community contributions to extend its providers and tools.

"Vision AI today feels like ChatGPT in 2022; it's just beginning to show what's possible," said Schellenbach.

Developers and partners can contribute new processors, adapters, and integrations directly on GitHub: https://github.com/GetStream/Vision-Agents

Contact
GetStream.io
***@getstream.io
End