Groq
Ultra-fast AI inference on custom LPU (Language Processing Unit) hardware, built for high-speed token generation with open models such as Llama and Mixtral.
Pricing
Free tier / Pay per token
Key Features
AI Models Used
Llama 3.3 70B
Meta
Refined version of Llama 3.1 70B, with performance approaching that of Llama 3.1 405B while keeping the efficiency of a 70B model.
Llama 4 Scout
Meta
Efficient mixture-of-experts (MoE) model with 109B total parameters, of which 17B are active per token. Fits on a single H100 GPU when quantized (Meta cites Int4) while delivering strong performance.
QwQ 32B
Alibaba
Reasoning-focused model that works through problems step by step. Open-weight alternative to OpenAI's o1/o3 for reasoning tasks.
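As a rough check on the single-H100 claim in the Llama 4 Scout entry, the weight footprint can be estimated as parameters times bytes per parameter. The sketch below is illustrative only: it assumes an 80 GB H100 variant, counts weights alone (no KV cache, activations, or runtime overhead), and the function names are hypothetical.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Weights only; KV cache, activations, and runtime overhead are ignored,
# so real memory needs are higher than these figures.

H100_MEMORY_GB = 80  # assumption: 80 GB H100 variant


def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9


def fits_on_gpu(total_params: float, bits_per_param: int,
                gpu_gb: float = H100_MEMORY_GB) -> bool:
    """True if the weights alone fit within the given GPU memory."""
    return weight_memory_gb(total_params, bits_per_param) <= gpu_gb


SCOUT_PARAMS = 109e9  # total parameters, as stated in the listing

for bits in (16, 8, 4):
    gb = weight_memory_gb(SCOUT_PARAMS, bits)
    print(f"{bits}-bit weights: {gb:.1f} GB, "
          f"fits in {H100_MEMORY_GB} GB: {fits_on_gpu(SCOUT_PARAMS, bits)}")
```

Under these assumptions, 109B parameters only fit in 80 GB at around 4 bits per weight; 16-bit and 8-bit weights would need more than one such GPU.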
Similar Tools
Hugging Face
Hugging Face
The GitHub of ML. Model hub, datasets, Spaces, and inference API for the open-source AI community.
Anthropic MCP
Anthropic
Model Context Protocol: an open standard for connecting AI to tools, data sources, and systems.
OpenRouter
OpenRouter
Unified API for 200+ AI models. One API key, automatic fallbacks, and usage-based pricing across providers.