The Business & Technology Network
Helping Business Interpret and Use Technology

Mistral unveils open source TTS model for voice agents

DATE POSTED: March 30, 2026

French AI company Mistral has launched Voxtral TTS, an open source text-to-speech model designed for voice AI assistants and enterprise applications such as customer support. The model targets businesses looking to build voice agents for sales and engagement while placing Mistral in competition with companies like ElevenLabs and OpenAI.

Voxtral TTS supports nine languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. Mistral built the model in response to customer demand for speech capabilities, according to Pierre Stock, VP of science operations at Mistral AI.

Stock stated, “We built a small-sized speech model that can fit on a smartwatch, a smartphone, a laptop, or other edge devices. The cost of it is a fraction of anything else on the market, but it offers state-of-the-art performance.”

The model can adapt to a custom voice from a sample of less than five seconds, capturing accents, inflections, intonations, and speech irregularities. Voxtral TTS is based on Ministral 3B and can switch languages without losing voice characteristics, which benefits use cases like dubbing and real-time translation.

Voxtral TTS is built for real-time use, with a time-to-first-audio (TTFA) of 90 milliseconds for a typical input of 500 characters, or about 10 seconds of speech. Its real-time factor (RTF) is about 6x, meaning it generates audio roughly six times faster than playback and can render a 10-second clip in approximately 1.6 seconds.
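
For readers who want to check these figures, here is a minimal arithmetic sketch. It assumes "6x" means audio is generated six times faster than it plays back; the function names are illustrative only and are not part of any Mistral API.

```python
def render_seconds(audio_seconds: float, speedup: float) -> float:
    """Time needed to synthesize a clip when generation runs
    `speedup` times faster than playback (assumed meaning of "6x")."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def implied_speedup(audio_seconds: float, render_time: float) -> float:
    """Speed-up factor implied by a measured render time."""
    if render_time <= 0:
        raise ValueError("render_time must be positive")
    return audio_seconds / render_time


clip = 10.0  # seconds of audio, as quoted in the article

# A flat 6x speed-up gives 10 / 6 seconds of render time.
print(round(render_seconds(clip, 6.0), 2))   # 1.67

# The quoted 1.6 seconds corresponds to a slightly higher factor.
print(round(implied_speedup(clip, 1.6), 2))  # 6.25
```

Under that assumption, the quoted 1.6 seconds and the 6x factor agree to within rounding. The 90-millisecond TTFA is a separate measure: it is how long a listener waits before the first audio starts playing, not how long the full clip takes to generate.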

Earlier this year, Mistral launched two transcription models, one aimed at large batch processing and the other at low-latency real-time applications. Voxtral TTS advances Mistral’s goal of offering a complete suite of voice products for enterprise use.

Stock added, “We plan to have an end-to-end platform that can handle multimodal streams of input, including audio, text, and image.” The aim is a single system that can draw on several data types at once to work with richer information.

Mistral is betting that open-source availability and customization will persuade enterprises to choose its models over those of competitors. Companies will be able to tailor the technology to their own needs, which could give them an advantage in customer engagement.
