← Back to KHAO

Microsoft ·

Microsoft shivs OpenAI with new AI models for speech, images

2 min read

Compiled by KHAO Editorial — aggregated from 1 outlet. See llms.txt for citation guidance.

◌ Single Source

Image accompanies the article at The Register. No description was extracted from the source.

Microsoft on Thursday unveiled public preview versions of three home-baked machine learning models focused on speech recognition, speech synthesis, and image generation.

Key facts

Summary

The release makes the Windows biz look more like a direct competitor to OpenAI than an investor – Redmond held an OpenAI stake valued at about $135 billion as of last October. The models include: MAI-Transcribe-1, a speech recognition model that delivers "enterprise-grade accuracy across 25 languages at approximately 50 percent lower GPU cost than leading alternatives"; MAI-Voice-1, a speech generation model that can supposedly produce 60 seconds of audio in less than a second on a single GPU; and MAI-Image-2, a text-to-image model, to compound the despair of digital artists. OpenAI happens to offer its own speech recognition, speech generation, and text-to-image models. Microsoft's models are available through Foundry (formerly Azure AI Studio), a platform to develop AI agents and applications.

Read full article at The Register →

#microsoft #openai