Z.aiZ.ai·text

GLM 4.7 Flash

ReasoningWeb searchFunction calling
Quick reference
GLM 4.7 Flash — TLDR
  • 🆕 Fast-inference variant of GLM-4.7, tuned for speed.
  • 🧠 30B-A3B Mixture-of-Experts with roughly 3B active parameters.
  • 📏 128K-token context window for long inputs.
  • 🔧 Reasoning, function-calling, and web search supported.
  • ⚡ Optimized for quick responses on latency-sensitive workloads.
  • 🔒 Open-weight under the MIT license, fp8 quantized.
  • 🏢 Built by Z.ai (formerly Zhipu AI), released January 2026.
💰 Best price on AntSeed
$0.0012 / $0.004999%
per 1M · cheapest in / out
📏 Context
128K tokens
🐜 Sellers
7
advertising on AntSeed
Provider

Z.ai, formally Knowledge Atlas Technology Joint Stock Co., Ltd., is a Chinese technology company specializing in artificial intelligence. Previously known internationally as Zhipu AI, the company rebranded to Z.ai in 2025. Its core focus is the GLM family of large language…

Explore 10 more models by Z.ai
About this model

GLM 4.7 Flash is the speed-optimized member of Z.ai's GLM-4.7 generation, positioned as a lightweight companion to the full [[sibling:zai-org-glm-4.7|GLM 4.7]]. It is a 30B-A3B Mixture-of-Experts model, activating roughly 3B parameters per token. The model ships open-weight under the MIT license with a 128K-token context window and fp8 quantization.

Compared with its same-family parent, Flash is tuned for faster, cheaper inference while inheriting the 4.7 generation's coding, tool-calling, and multi-step reasoning behaviors. Z.ai exposes reasoning, function-calling, and web-search capabilities, making it suitable for agentic and tool-augmented workflows.

Because it is tuned for throughput, Flash is best suited to high-volume, latency-sensitive, or background tasks, whereas the full GLM 4.7 remains the heavier option for the most demanding prompts.

Within the broader family, GLM 4.7 Flash follows earlier releases such as [[sibling:zai-org-glm-4.6|GLM 4.6]] and sits alongside the larger [[sibling:zai-org-glm-5|GLM 5]] line. It is distributed via Hugging Face, where the open weights and model card are published for self-hosting and deployment.

View source on GitHub ↗View model card on HuggingFace ↗
Sources
huggingface.cozai-org/GLM-4.7-Flash · Hugging Face· huggingface.co

This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.

Sellers serving GLM 4.7 Flash (7)
SellerReputationInput $/MCached $/MOutput $/MCategoriesAPI

"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.