AI startup ElevenLabs launches Scribe model that converts voice to text and supports Ukrainian language with "excellent accuracy"
SOURCE: DEV.UA
FEB 28, 2025
ElevenLabs, an AI startup valued at $3.3 billion whose product was used to dub President Volodymyr Zelenskyy’s interview with US blogger Lex Fridman, has launched a new standalone model, Scribe. The model supports Ukrainian, which is among the languages with the lowest error rates.
As TechCrunch reports, ElevenLabs' Scribe model supports over 99 languages at launch. The company classifies over 25 of them as having "excellent accuracy", meaning a word error rate of less than 5%. This list includes English, Ukrainian, French, German, Hindi, Indonesian, Japanese, Polish, Portuguese, Spanish, Vietnamese, and others.
The remaining languages are grouped into separate accuracy categories.
The company said the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 on the FLEURS and Common Voice benchmarks across various languages.
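For readers unfamiliar with the metric, word error rate (WER) is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not ElevenLabs' evaluation code: the function name `word_error_rate` and the simple lowercase, whitespace-based tokenization are assumptions made for this example, and published benchmark results typically apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption for this sketch: words are compared after lowercasing
    and splitting on whitespace, with no punctuation handling.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from transcript
                curr[j - 1] + 1,     # extra word in transcript
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.1%}")
```

Under this definition, one wrong word in a six-word sentence already yields a WER of about 17%, so staying below the 5% threshold means averaging fewer than one error per 20 reference words.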
ElevenLabs previously developed a speech-to-text component for its conversational AI agent platform, which was released last year, but this is the first time the company has released a standalone speech recognition model.
"We want to better understand what you’re saying in a conversation. We’re working to move beyond just generating content and into understanding and transcribing speech. Many people say that converting speech to text is a solved problem. But for many languages, it’s very bad. We believe we can build better speech recognition models because we have internal teams that annotate the data and give us quick feedback," said CEO Mati Staniszewski.
The model also features speaker diarization to identify who is speaking, word-level timestamps for accurate captioning, and automatic tagging of audio events such as audience laughter. The startup also lets customers transcribe video content directly in its studio to produce subtitles or captions.
Currently, Scribe only works with pre-recorded audio, which means it is not yet suited to transcribing meetings or voice notes in real time. The company says it will soon release a low-latency, real-time version of the model.
Scribe costs $0.40 per hour of transcribed audio. While that price is competitive, some of its competitors offer lower prices for audio transcription with some feature differentiation, TechCrunch notes.
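To put the quoted price in concrete terms, the following sketch estimates the cost of a transcription job from its duration. It assumes simple linear proration of the $0.40 hourly rate, which the article does not confirm; actual billing increments, minimums, or plan discounts may differ, and the function name `estimate_cost_usd` is hypothetical.

```python
PRICE_PER_HOUR_USD = 0.40  # rate quoted in the article


def estimate_cost_usd(duration_seconds: float,
                      price_per_hour: float = PRICE_PER_HOUR_USD) -> float:
    """Estimate transcription cost, assuming linear per-second proration."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 3600 * price_per_hour


if __name__ == "__main__":
    # A hypothetical archive of 250 hours of recorded audio.
    archive_seconds = 250 * 3600
    print(f"Estimated cost: ${estimate_cost_usd(archive_seconds):.2f}")
```

Under the same assumption, a 90-minute recording would come to roughly $0.60.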
In 2023, ElevenLabs, which builds AI-powered dubbing tools, added support for more than 20 languages. Among them were Ukrainian, Polish, Hindi, Portuguese, Spanish, Japanese, and Arabic.
In late January 2025, ElevenLabs raised $180 million in a new funding round and tripled its valuation to $3.3 billion. The Series C funding round was co-led by Andreessen Horowitz and Iconiq Growth with additional new investors NEA, World Innovation Lab, Valor, Endeavor Catalyst Fund, and Lunate.