OpenAI launches new voice intelligence features in its API
Why This Matters
Advances real-time voice AI capabilities for enterprise applications and developer platforms.
OpenAI announced Thursday that its API now includes new voice intelligence features to help developers create conversational apps. The updates include GPT-Realtime-2 with GPT-5-class reasoning, GPT-Realtime-Translate supporting 70+ input languages and 13 output languages, and GPT-Realtime-Whisper for live speech-to-text transcription.
OpenAI introduced three voice intelligence capabilities in its Realtime API. GPT-Realtime-2 is a voice model with GPT-5-class reasoning, designed to handle complex user requests and improving on its predecessor, GPT-Realtime-1.5. GPT-Realtime-Translate provides real-time translation from more than 70 input languages into 13 output languages while keeping pace with the flow of conversation. GPT-Realtime-Whisper offers live speech-to-text transcription as an interaction takes place.
The company stated that these models move real-time audio beyond simple call-and-response toward voice interfaces that can listen, reason, translate, transcribe, and take action during a conversation. Target applications include customer service, education, media, events, and creator platforms. OpenAI has implemented guardrails to prevent misuse for spam, fraud, or abuse, including triggers that halt conversations that violate its content guidelines.