OpenAI launches new voice intelligence capabilities in its API
Compiled by KHAO Editorial — aggregated from 7 outlets. See llms.txt for citation guidance.
OpenAI said Thursday that its API will now include several new voice intelligence features designed to help developers create apps that can talk, transcribe, and translate conversations with users.
Key facts
- The company’s new GPT‑Realtime‑2 is a voice model built to generate realistic synthetic speech and hold a conversation with users
- The company is also launching GPT‑Realtime‑Translate, which, as the name suggests, provides real-time translation designed to “keep pace” with the user conversationally
- Finally, the company has launched a new transcription capability, GPT-Realtime-Whisper, which gives users live speech-to-text as the interaction occurs
- All of the new voice models are included in OpenAI’s Realtime API
Summary
The company’s new GPT‑Realtime‑2 is a voice model built to generate realistic synthetic speech and hold a conversation with users. OpenAI is also launching GPT‑Realtime‑Translate, which, as the name suggests, provides real-time translation designed to “keep pace” with the user conversationally. Finally, a new transcription capability, GPT-Realtime-Whisper, gives users live speech-to-text as the interaction occurs. “Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.
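As a rough sketch of how a developer might address these models through the Realtime API: the snippet below builds a WebSocket URL and a `session.update` event in the style of OpenAI's existing Realtime API. The model identifiers (`gpt-realtime-2`, `gpt-realtime-whisper`) are lowercased guesses from the names in this article, and the exact event schema is an assumption that may differ from the shipped API.

```python
import json
from urllib.parse import urlencode

# The Realtime API is accessed over a WebSocket, with the model selected via
# a query parameter. Model names here are assumptions based on the article.
BASE_URL = "wss://api.openai.com/v1/realtime"

def realtime_url(model: str) -> str:
    """Return the WebSocket URL for a given Realtime model."""
    return f"{BASE_URL}?{urlencode({'model': model})}"

def session_update(transcribe: bool = True) -> dict:
    """Sketch of a `session.update` event enabling spoken responses and,
    optionally, live transcription of the user's audio."""
    session = {
        "modalities": ["audio", "text"],  # respond aloud and return text
        "voice": "alloy",                 # a synthetic voice preset
    }
    if transcribe:
        # Hypothetical: route input audio through the new transcription model
        session["input_audio_transcription"] = {"model": "gpt-realtime-whisper"}
    return {"type": "session.update", "session": session}

url = realtime_url("gpt-realtime-2")
event = json.dumps(session_update())
```

In practice the client would open the WebSocket at `url` (with an API key in the headers) and send `event` as its first message; the sketch stops short of the network call.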