Best Text Generation Models

Qwen: Qwen3 VL 8B Instruct

Alibaba

🔒 proprietary📏 131K ctx💰 $0.08

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon temporal reasoning, DeepStack for fine-grained visual-text alignment, and text-timestamp alignment for precise event localization. The model supports a native 256K-token context window, extensible to 1M tokens, and handles both static and dynamic media inputs for tasks like document parsing, visual question answering, spatial reasoning, and GUI control. It achieves text understanding comparable to leading LLMs while expanding OCR coverage to 32 languages and enhancing robustness under varied visual conditions.

Oct 14, 2025View details →

Qwen: Qwen Plus 0728

Alibaba

🔒 proprietary📏 1.0M ctx💰 $0.26

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced performance, speed, and cost combination.

Sep 8, 2025View details →

Qwen: Qwen2.5 VL 32B Instruct

Alibaba

🔒 proprietary📏 128K ctx💰 $0.20

Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It excels at visual analysis tasks, including object recognition, textual interpretation within images, and precise event localization in extended videos. Qwen2.5-VL-32B demonstrates state-of-the-art performance across multimodal benchmarks such as MMMU, MathVista, and VideoMME, while maintaining strong reasoning and clarity in text-based tasks like MMLU, mathematical problem-solving, and code generation.

Mar 24, 2025View details →

Reka Flash 3

rekaai

🔒 proprietary📏 66K ctx💰 $0.10

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a 32K context length and optimized through reinforcement learning (RLOO), it provides competitive performance comparable to proprietary models within a smaller parameter footprint. Ideal for low-latency, local, or on-device deployments, Reka Flash 3 is compact, supports efficient quantization (down to 11GB at 4-bit precision), and employs explicit reasoning tags ("<reasoning>") to indicate its internal thought process. Reka Flash 3 is primarily an English model with limited multilingual understanding capabilities. The model weights are released under the Apache 2.0 license.

Mar 12, 2025View details →

Anthropic: Claude 3.5 Haiku

Anthropic

🔒 proprietary📏 200K ctx💰 $0.80

Claude 3.5 Haiku features offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic tasks such as chat interactions and immediate coding suggestions. This makes it highly suitable for environments that demand both speed and precision, such as software development, customer service bots, and data management systems. This model is currently pointing to [Claude 3.5 Haiku (2024-10-22)](/anthropic/claude-3-5-haiku-20241022).

Nov 4, 2024View details →

Meta: Llama 3 8B Instruct

NVIDIA: Nemotron Nano 9B V2 (free)

Nvidia

Sep 5, 2025View details →

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.

🔒 proprietary📏 128K ctx

Z.ai: GLM 4.5 Air (free)

z-ai

Jul 25, 2025View details →

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter size. GLM-4.5-Air also supports hybrid inference modes, offering a "thinking mode" for advanced reasoning and tool use, and a "non-thinking mode" for real-time interaction. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config)

🔒 proprietary📏 131K ctx

OpenAI: GPT-5.5 Pro

OpenAI

🔒 proprietary📏 1.1M ctx💰 $30.00

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workloads. It features a 1M+ token context window (922K input, 128K output) with support for...

Apr 24, 2026View details →

Anthropic Claude Sonnet Latest

~anthropic

🔒 proprietary📏 1.0M ctx💰 $3.00

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

Apr 27, 2026View details →

Mistral Large 3

Mistral

📊 123B🔓 open📏 128K ctx💰 $2

Mistral's flagship with strong multilingual support, function calling, and code generation. Open weights.

Jul 1, 2025View details →

Upstage: Solar Pro 3

upstage

🔒 proprietary📏 128K ctx💰 $0.15

Solar Pro 3 is Upstage's powerful Mixture-of-Experts (MoE) language model. With 102B total parameters and 12B active parameters per forward pass, it delivers exceptional performance while maintaining computational efficiency. Optimized for Korean with English and Japanese support.

Jan 27, 2026View details →

Qwen: Qwen3 VL 32B Instruct

Alibaba

🔒 proprietary📏 131K ctx💰 $0.10

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text comprehension, enabling fine-grained spatial reasoning, document and scene analysis, and long-horizon video understanding.Robust OCR in 32 languages, and enhanced multimodal fusion through Interleaved-MRoPE and DeepStack architectures. Optimized for agentic interaction and visual tool use, Qwen3-VL-32B delivers state-of-the-art performance for complex real-world multimodal tasks.

Oct 23, 2025View details →

Anthropic: Claude 3.7 Sonnet

Anthropic

🔒 proprietary📏 200K ctx💰 $3.00

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes. Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks. Read more at the [blog post here](https://www.anthropic.com/news/claude-3-7-sonnet)

Feb 24, 2025View details →

Qwen: Qwen-Plus

Alibaba