Qwen 3 235B A22B Instruct 2507

Web search · Function calling
Quick reference
Qwen 3 235B A22B Instruct 2507 — TLDR
  • 🧠 Mixture-of-experts: 235B total parameters, 22B active per token.
  • ⚡ Non-thinking variant: direct answers, no reasoning traces.
  • 📏 Model card cites 256K native context, extendable toward 1M tokens.
  • 🔧 Strong tool-calling and agentic use via Qwen-Agent and MCP.
  • 🌐 Multilingual coverage across many languages and dialects.
  • 🔒 Apache 2.0 license, offered here in FP8 quantization.
  • 🏢 Built by Alibaba's Qwen team.
  • 🎯 Aimed at long documents, technical work, high-precision tasks.
💰 Best price on AntSeed
FREE / FREE
per 1M · cheapest in / out
📏 Context
128K tokens
🐜 Sellers
7
advertising on AntSeed
Provider

Alibaba Group is a Chinese multinational technology company founded in 1999 and headquartered in Hangzhou, Zhejiang. Originally built around e-commerce and cloud computing, Alibaba has become one of the most prolific contributors to open-weight AI research, developing the Qwen…

About this model

Qwen 3 235B A22B Instruct 2507 is a flagship Mixture-of-Experts model from Alibaba's Qwen team, released in 2025. It holds 235 billion total parameters but activates roughly 22 billion per token, which keeps inference cost well below that of a dense model of the same size. This is the instruction-tuned "non-thinking" line: it returns direct responses without intermediate reasoning blocks, so outputs arrive faster and follow formats more consistently than those of the reasoning-chain variants. It carries an Apache 2.0 license and is offered here in FP8, which roughly halves weight memory relative to 16-bit precision.
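
The parameter counts and the FP8 note above translate into rough memory figures. The sketch below is a back-of-the-envelope estimate only: it assumes every weight is stored at the stated precision and uses 16-bit (BF16) as the comparison baseline, neither of which is confirmed by the listing. It also ignores KV cache, activations, and runtime overhead.

```python
# Rough weight-memory arithmetic for a Mixture-of-Experts model.
# Assumptions (not from the listing): all weights share one precision,
# and KV cache, activations, and runtime overhead are ignored.

TOTAL_PARAMS = 235e9   # total parameters, as stated in the description
ACTIVE_PARAMS = 22e9   # parameters activated per token, as stated

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_gb(params: float, precision: str) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def active_fraction(active: float, total: float) -> float:
    """Share of parameters that take part in one token's forward pass."""
    return active / total


if __name__ == "__main__":
    for precision in ("bf16", "fp8"):
        print(f"{precision}: ~{weight_gb(TOTAL_PARAMS, precision):.0f} GB of weights")
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")
```

Note that all 235B parameters still have to be resident in memory. The 22B active figure lowers compute per token, not the storage requirement.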

The 2507 update is positioned as the refreshed version of the original Qwen3-235B-A22B non-thinking mode. Per Qwen's model card, it brings improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, multilingual understanding, and tool usage over that predecessor. The card also describes enhanced 256K long-context understanding, with configurations enabling ultra-long inputs toward one million tokens.

Its closest sibling is Qwen 3 235B A22B Thinking 2507, which shares the same architecture but generates explicit reasoning chains for complex problems, trading latency and token use for deeper deliberation. For vision and multimodal work, the family extends to Qwen3 VL 235B.

In practice, this Instruct variant suits high-throughput, latency-sensitive workloads such as chatbots, API integrations, document analysis, and code generation, where consistent formatting matters more than visible step-by-step reasoning. Self-hosting is demanding: the model typically requires multi-GPU tensor parallelism.
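
To give a feel for why multi-GPU deployment comes up, here is a minimal sizing sketch. The GPU capacity (80 GB), the share of memory reserved for weights (70%), and the power-of-two rounding of the tensor-parallel degree are all illustrative assumptions, not figures from the model card or from any serving framework.

```python
import math

# Hypothetical sizing helper. Every default below is an illustrative
# assumption, not a documented requirement of the model.


def min_gpus_for_weights(weight_gb: float, gpu_memory_gb: float,
                         weights_share: float = 0.7) -> int:
    """Smallest GPU count whose combined weight budget holds the model.

    weights_share is the fraction of each GPU reserved for weights; the
    remainder is left for KV cache, activations, and overhead.
    """
    if not 0 < weights_share <= 1:
        raise ValueError("weights_share must be in (0, 1]")
    budget_per_gpu = gpu_memory_gb * weights_share
    return math.ceil(weight_gb / budget_per_gpu)


def round_up_to_power_of_two(n: int) -> int:
    """Round n up to the next power of two (assumed tensor-parallel degree)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


if __name__ == "__main__":
    fp8_weights_gb = 235  # 235B parameters at one byte each (assumed uniform FP8)
    needed = min_gpus_for_weights(fp8_weights_gb, gpu_memory_gb=80)
    print("minimum GPUs for weights:", needed)
    print("assumed tensor-parallel degree:", round_up_to_power_of_two(needed))
```

Actual requirements depend on the serving stack, context length, and batch size, so this is a lower bound to start from, not a deployment plan.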

View source on GitHub ↗ · View model card on HuggingFace ↗
Sources
Qwen/Qwen3-235B-A22B-Instruct-2507 · Hugging Face · huggingface.co

This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.

Sellers serving Qwen 3 235B A22B Instruct 2507 (7)
Seller · Reputation · Input $/M · Cached $/M · Output $/M · Categories · API

"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.