OpenAI launches new voice intelligence capabilities in its API

Thu, May 7 · 10:24 PM UTC

Compiled by KHAO Editorial; aggregated from 1 source.

OpenAI said Thursday that its API will now include several new voice intelligence features designed to help developers create apps that can talk, transcribe, and translate conversations with users.

Key facts

GPT-Realtime-2 is a new voice model built to hold realistic spoken conversations with users
GPT-Realtime-Translate is designed to provide real-time translation that can “keep pace” with the user during a conversation
GPT-Realtime-Whisper is a new transcription capability that provides live speech-to-text, captured as interactions occur
All of the new voice models are included in OpenAI’s Realtime API

Summary

OpenAI has added three new voice models to its Realtime API: GPT-Realtime-2 for realistic spoken conversation, GPT-Realtime-Translate for real-time translation that can “keep pace” with the user, and GPT-Realtime-Whisper for live speech-to-text captured as interactions occur. “Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.

Source: TechCrunch AI
