โ† Back to all models
๐Ÿ’ฌ

Z.ai: GLM 4.6V

z-aiยทText Generation
๐Ÿ”ฅ 74trending

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across images, documents, and mixed media. It supports up to 128K tokens, processes complex page layouts and charts directly as visual inputs, and integrates native multimodal function calling to connect perception with downstream tool execution. The model also enables interleaved image-text generation and UI reconstruction workflows, including screenshot-to-HTML synthesis and iterative visual editing.

#text+image+video->text#top-provider
๐Ÿงฎ

Undisclosed

Parameters

๐Ÿ“

131K tokens

Context Window

๐Ÿ”’

Proprietary

License

๐Ÿ“…

Dec 8, 2025

Released

๐Ÿ’ฐ Pricing

Input

$0.30

per 1M tokens

Output

$0.90

per 1M tokens

๐Ÿ”Œ

API Available

This model is accessible via API for integration into your applications.