Image Credits: David Ryder/Bloomberg / Getty Images
9:48 AM PDT · April 2, 2026
Microsoft AI, the tech giant’s research lab, announced the release of three foundational AI models on Thursday that can generate text, voice, and images.
The release signals Microsoft’s continued push to build out its own stack of multimodal AI models, and to compete with rival AI labs, even though it remains tied to OpenAI.
MAI-Transcribe-1 transcribes speech across 25 different languages into text and is 2.5 times faster than Microsoft’s Azure Fast offering, according to a company press release. MAI-Voice-1 is an audio-generating model. This voice model allows users to generate 60 seconds of audio in one second and to create a custom voice. MAI-Image-2 is an image-generating model.
MAI-Image-2 was originally released on MAI Playground, a new large language model testing software, on March 19. Now, all three models are being released on Microsoft Foundry, and the transcription and voice models are available in MAI Playground as well.
The models were developed by Microsoft’s MAI Superintelligence team, an AI research team led by Mustafa Suleyman, the CEO of Microsoft AI, that was formed and announced in November 2025.
“At Microsoft AI, we’re building Humanist AI. We have a distinct view when creating our AI models: putting humans at the center, optimizing for how people actually communicate, training for practical use,” Suleyman wrote in a blog post. “You’ll see more models from us soon in Foundry and directly in Microsoft products and experiences.”
In an increasingly crowded LLM market, MAI hopes a selling point for these models is that they are cheaper than those from Google and OpenAI, the company wrote in the blog post.
MAI-Transcribe-1 starts at $0.36 per hour. MAI-Voice-1 starts at $22 per 1 million characters, and MAI-Image-2 starts at $5 per 1 million tokens for text input and $33 per 1 million tokens for image output.
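To make those starting prices concrete, here is a rough back-of-the-envelope sketch in Python. It assumes simple linear billing at the quoted starting rates, which the article does not confirm; the function names and the sample workloads are hypothetical and are not part of any Microsoft SDK or price sheet.

```python
# Starting prices quoted in the article, in US dollars.
# Assumption: billing scales linearly with usage at these rates.
TRANSCRIBE_USD_PER_HOUR = 0.36
VOICE_USD_PER_MILLION_CHARS = 22.0
IMAGE_USD_PER_MILLION_TEXT_INPUT_TOKENS = 5.0
IMAGE_USD_PER_MILLION_IMAGE_OUTPUT_TOKENS = 33.0


def transcribe_cost(audio_hours: float) -> float:
    """Estimated MAI-Transcribe-1 cost for a number of audio hours."""
    return audio_hours * TRANSCRIBE_USD_PER_HOUR


def voice_cost(characters: int) -> float:
    """Estimated MAI-Voice-1 cost for a number of input characters."""
    return characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS


def image_cost(text_input_tokens: int, image_output_tokens: int) -> float:
    """Estimated MAI-Image-2 cost from text-input and image-output tokens."""
    input_cost = text_input_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_TEXT_INPUT_TOKENS
    output_cost = image_output_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_IMAGE_OUTPUT_TOKENS
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workloads, chosen only for illustration.
    print(f"100 hours of transcription: ${transcribe_cost(100):.2f}")
    print(f"500,000 characters of speech: ${voice_cost(500_000):.2f}")
    print(f"2M text tokens in, 1M image tokens out: ${image_cost(2_000_000, 1_000_000):.2f}")
```

Under those assumptions, 100 hours of transcription would come to about $36, half a million characters of generated speech to about $11, and the image workload to about $43. Actual bills may differ, since the article gives only "starts at" prices.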
Even as Microsoft releases its own models, Suleyman reaffirmed the company’s commitment to its partnership with OpenAI in an interview with VentureBeat, though a recent renegotiation of that partnership allowed Microsoft to truly pursue this superintelligence research, Suleyman told The Verge.
Microsoft has invested more than $13 billion in OpenAI and hosts the lab’s models in its various products through a multi-year partnership. Microsoft takes the same stance with chips; it both produces its own and buys from outside players.
Becca is a senior writer at TechCrunch who covers venture capital trends and startups. She previously covered the same beat for Forbes and the Venture Capital Journal.
You can contact or verify outreach from Becca by emailing rebecca.szkutak@techcrunch.com.














