OpenAI builds WebRTC architecture for low-latency voice AI
Original: How OpenAI delivers low-latency voice AI at scale
Why This Matters
Demonstrates how AI companies solve real-time infrastructure challenges at massive scale
OpenAI developed a split relay plus transceiver architecture, built on WebRTC standards, to deliver low-latency voice AI to over 900 million weekly active users globally while working within real infrastructure constraints at scale.
OpenAI redesigned its WebRTC stack to support real-time voice AI for ChatGPT voice, Realtime API developers, and interactive agents. The team identified three key requirements: global reach for more than 900 million weekly users, fast connection setup, and low media round-trip time with minimal jitter.

To meet those requirements, the team built a split relay plus transceiver architecture that addresses scaling constraints, including the one-port-per-session limitation and the cost of stateful ICE/DTLS session management. WebRTC itself provides standardized solutions for connectivity establishment, NAT traversal, encrypted transport, codec negotiation, and quality control. Rather than reinventing that stack, OpenAI leverages the existing WebRTC ecosystem and hired key contributors Justin Uberti and Sean DuBois to guide its real-time AI development.
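To make the one-port-per-session constraint concrete: a naive relay binds a separate UDP port for every session, which exhausts ports and complicates firewalls at scale, whereas WebRTC servers commonly multiplex many sessions over a single port by demultiplexing incoming packets on the ICE username fragment (ufrag). The following is a toy Python sketch of that demultiplexing idea only, not OpenAI's implementation; the class names and the simplified `ufrag|payload` wire format are invented for illustration (real STUN binding requests carry the ufrag in a USERNAME attribute).

```python
# Toy sketch: many WebRTC sessions sharing one listening port,
# with packets routed to the right session by ICE ufrag.
# Not OpenAI's architecture; names and wire format are illustrative.

class Session:
    def __init__(self, ufrag: str):
        self.ufrag = ufrag
        self.received = []  # payloads delivered to this session

class SinglePortRelay:
    """All sessions share one port; packets are demuxed by ufrag."""

    def __init__(self):
        self.sessions = {}  # ufrag -> Session

    def add_session(self, ufrag: str) -> Session:
        session = Session(ufrag)
        self.sessions[ufrag] = session
        return session

    def handle_packet(self, packet: bytes) -> bool:
        # Pretend wire format "ufrag|payload"; a real relay parses
        # the STUN USERNAME attribute to identify the session.
        ufrag, _, payload = packet.partition(b"|")
        session = self.sessions.get(ufrag.decode())
        if session is None:
            return False  # unknown session: drop the packet
        session.received.append(payload)
        return True

relay = SinglePortRelay()
a = relay.add_session("ufragA")
b = relay.add_session("ufragB")
relay.handle_packet(b"ufragA|hello")
relay.handle_packet(b"ufragB|world")
print(a.received, b.received)  # [b'hello'] [b'world']
```

The payoff is that adding a session is a dictionary insert rather than a new socket bind, which is what lets a relay tier scale horizontally without running out of ports.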