Groq
Ultra-fast AI inference on Groq's custom LPU (Language Processing Unit) hardware, built for high-speed token generation with Llama, Mixtral, and other open models.
💰 Pricing
Free tier / Pay per token
✨ Key Features
🧠 AI Models Used
Llama 3.3 70B
Meta
Refined version of Llama 3.1 70B, with performance approaching that of Llama 3.1 405B while maintaining 70B efficiency.
Llama 4 Scout
Meta
Efficient mixture-of-experts (MoE) model with 109B total parameters, of which 17B are active per token. Fits on a single H100 GPU (with Int4 quantization) while delivering strong performance.
QwQ 32B
Alibaba
Reasoning-focused model that works through problems step by step. Open-weight alternative to OpenAI's o1/o3 for reasoning tasks.
⭐ Similar Tools
Hugging Face
Hugging Face
The GitHub of ML. Model hub, datasets, Spaces, and inference API for the open-source AI community.
Anthropic MCP
Anthropic
Model Context Protocol: an open standard for connecting AI to tools, data sources, and systems.
Ollama
Ollama
Run LLMs locally with one command. The simplest way to get Llama, Mistral, and other models running on your machine.