- 🆕 Dense 31B open-weights model from Google DeepMind, Apache 2.0.
- 📏 256K-token context with hybrid local/global attention and p-RoPE.
- 👁️ Multimodal: text, image, and video as frame sequences.
- 🧠 Configurable thinking modes for step-by-step reasoning.
- 🔧 Native function calling and structured output for agentic workflows.
- 🌐 Pre-trained across 140+ languages.
- 🏢 Targets consumer GPUs and workstations.
Google is an American multinational technology corporation and one of the world's most valuable brands. A subsidiary of parent company Alphabet Inc., Google operates across search, cloud computing, consumer electronics, and artificial intelligence. Its DeepMind and Google…
Explore 11 more models by Google →Gemma 4 31B Instruct is the dense, maximum-quality member of Google DeepMind's open Gemma 4 family, built for consumer GPUs and workstations rather than edge devices. It handles text and image inputs, processes video as sequences of frames, and generates text, with a 256K-token context window and support for over 140 languages under the Apache 2.0 license.
Against its same-family predecessor [[sibling:google-gemma-3-27b-it|Google Gemma 3 27B Instruct]], Gemma 4 introduces several documented changes: configurable thinking modes that emit internal reasoning before a final answer, and native function calling for agentic workflows.
Architecturally, Gemma 4 uses a hybrid attention mechanism interleaving local sliding-window and full global attention, with unified Keys and Values in global layers and Proportional RoPE to aid long-context performance. It sits alongside the latency-focused [[sibling:google-gemma-4-26b-a4b-it|Google Gemma 4 26B A4B Instruct]], a Mixture-of-Experts sibling that activates only a subset of its parameters per token for faster inference, whereas the 31B Dense model keeps all parameters active for quality.
Both pre-trained and instruction-tuned variants are released as open weights, and the model cards note that Gemma 4 underwent safety evaluations and sensitive-data filtering during training.
This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.
| Seller | Reputation↓ | Input $/M | Cached $/M | Output $/M | Categories | API |
|---|---|---|---|---|---|---|
| Venice.ai Proxy 0x1f22…18c9 | 88 | $0.0875 | $0.0875 | $0.25 | chat,reasoning,vision,video,multimodal,web-search | openai-chat-completions |
| surplusintelligence.ai 0x0e49…8927 | 79 | $0.12 | $0.09 | $0.36 | anon,chat,multimodal,reasoning,video,vision,web-search,research,translate | openai-chat-completions |
| Open Bird 0xc0f1…8183 | 57 | $0.065 | $0.065 | $0.19 | chat,open-source | openai-chat-completions |
| Fire Ant 🔥🐜 0xbe05…bc5d | 45 | $0.0248 | $0.0248 | $0.0829 | anon,chat,coding,json,math,multimodal,open-source,reasoning,research,tools,translate,video,vision,web-search | — |
| ▲ Apex Ant 0x73b4…e736 | 40 | $0.11 | $0.022 | $0.32 | chat,open-source,long-context,vision,multimodal | openai-chat-completions |
| Chutes 0xded6…657c | 32 | $0.165 | $0.0825 | $0.462 | chat,reasoning,vision,tee | openai-chat-completions |
| Leftermute 0x388b…5389 | 26 | $0.0348 | $0.0348 | $0.0929 | chat,coding,json,tools | openai-chat-completions |
| uomi.ai 0x87df…48e3 | 0 | $0.096 | $0.096 | $0.296 | chat,math,coding | openai-chat-completions |
"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.