Latest speech recognition model with improved accuracy across 100+ languages, real-time streaming, and speaker diarization.
#speech-to-text #multilingual #open-weight #diarization
Parameters: 2B
Context Window: N/A
License: Open Source
Released: Oct 1, 2025
Pricing
Input: $0.006 per minute of audio
Output: N/A
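As a rough illustration of the listed input price, the sketch below estimates the cost of transcribing a recording at $0.006 per minute. The helper name `estimate_cost` is hypothetical, and the code assumes billing scales linearly with audio duration (no minimum charge and no rounding up to whole minutes), which this page does not confirm.

```python
from decimal import Decimal

# Listed input price from this page, in USD per minute of audio.
PRICE_PER_MINUTE = Decimal("0.006")


def estimate_cost(duration_seconds: float) -> Decimal:
    """Estimate transcription cost in USD (hypothetical helper).

    Assumes linear per-minute billing with no minimum charge;
    actual billing rules may differ.
    """
    minutes = Decimal(str(duration_seconds)) / Decimal(60)
    # Round to four decimal places for display.
    return (minutes * PRICE_PER_MINUTE).quantize(Decimal("0.0001"))


if __name__ == "__main__":
    # A one-hour meeting recording.
    print(estimate_cost(3600))
```

Under these assumptions, a one-hour recording comes to about $0.36.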
Strengths
100+ languages
Speaker diarization
Real-time streaming
Open weights
Use Cases
Transcription, Meeting notes, Subtitles, Voice apps
API Available
This model is accessible via API for integration into your applications.
Related Models
GPT-5
OpenAI
🔥 96
OpenAI's latest flagship. Advanced multimodal capabilities with native tool use, improved reasoning, and massive knowledge.
proprietary · 256K ctx · $10
Dec 1, 2025
GPT-4o
OpenAI
🔥 82
Omni model supporting text, vision, and audio natively. Fast and capable with strong multimodal understanding.
proprietary · 128K ctx · $2.50
May 13, 2024
GPT-4o Mini
OpenAI
🔥 70
Compact and affordable. Surprisingly capable for its price point, ideal for high-volume applications.
proprietary · 128K ctx · $0.15
Jul 18, 2024