Mistral AI has released a new family of AI models that it claims will clear the path to seamless conversation between people speaking different languages.
On Wednesday, the Paris-based AI lab released two new speech-to-text models: Voxtral Mini Transcribe V2 and Voxtral Realtime. The former is built to transcribe audio files in large batches, the latter to transcribe in near real time, within 200 milliseconds; both can translate between 13 languages. Voxtral Realtime is freely available under an open-source license.
At 4 billion parameters, the models are small enough to run locally on a phone or laptop—a first in the speech-to-text field, Mistral claims—meaning that private conversations
→ Continue reading at WIRED