Google · text

Google Gemma 4 31B Instruct

Vision · Reasoning · Web search · Function calling
Quick reference
Google Gemma 4 31B Instruct — TLDR
  • 🆕 Dense 31B open-weights model from Google DeepMind, Apache 2.0.
  • 📏 256K-token context with hybrid local/global attention and p-RoPE.
  • 👁️ Multimodal: text, image, and video as frame sequences.
  • 🧠 Configurable thinking modes for step-by-step reasoning.
  • 🔧 Native function calling and structured output for agentic workflows.
  • 🌐 Pre-trained across 140+ languages.
  • 🏢 Targets consumer GPUs and workstations.
💰 Best price on AntSeed
$0.025 / $0.08378%
per 1M · cheapest in / out
📏 Context
256K tokens
🐜 Sellers
8
advertising on AntSeed
Provider

Google is an American multinational technology corporation and one of the world's most valuable brands. A subsidiary of parent company Alphabet Inc., Google operates across search, cloud computing, consumer electronics, and artificial intelligence. Its DeepMind and Google…

About this model

Gemma 4 31B Instruct is the dense, maximum-quality member of Google DeepMind's open Gemma 4 family, built for consumer GPUs and workstations rather than edge devices. It handles text and image inputs, processes video as sequences of frames, and generates text, with a 256K-token context window and support for over 140 languages under the Apache 2.0 license.

Compared with its same-family predecessor [[sibling:google-gemma-3-27b-it|Google Gemma 3 27B Instruct]], Gemma 4 adds two documented capabilities: configurable thinking modes that emit internal reasoning before a final answer, and native function calling for agentic workflows.

Architecturally, Gemma 4 uses a hybrid attention mechanism interleaving local sliding-window and full global attention, with unified Keys and Values in global layers and Proportional RoPE to aid long-context performance. It sits alongside the latency-focused [[sibling:google-gemma-4-26b-a4b-it|Google Gemma 4 26B A4B Instruct]], a Mixture-of-Experts sibling that activates only a subset of its parameters per token for faster inference, whereas the 31B Dense model keeps all parameters active for quality.
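The interleaving of local sliding-window and global attention described above can be illustrated with a small mask sketch. This is not Gemma 4's implementation: the window size, the local-to-global layer ratio, and the helper names `causal_mask` and `layer_kinds` are all assumptions chosen for illustration, and the sources here do not state the real values.

```python
def causal_mask(seq_len, window=None):
    """Build a seq_len x seq_len boolean mask.

    mask[q][k] is True when query position q may attend to key position k.
    With window=None the mask is full causal (global) attention; with an
    integer window, each query sees only its most recent `window` positions.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            visible = k <= q  # causal: never look at future tokens
            if window is not None:
                visible = visible and (q - k) < window  # sliding window
            row.append(visible)
        mask.append(row)
    return mask


def layer_kinds(num_layers, global_every):
    """Label layers so every `global_every`-th one is global (hypothetical ratio)."""
    return [
        "global" if (i + 1) % global_every == 0 else "local"
        for i in range(num_layers)
    ]


# Illustrative values only; not taken from the Gemma 4 model card.
local = causal_mask(6, window=3)
full = causal_mask(6)
kinds = layer_kinds(6, global_every=3)

for name, mask in (("local", local), ("global", full)):
    print(name)
    for row in mask:
        print("".join("x" if visible else "." for visible in row))
```

In the local mask the cost per query is bounded by the window, while the occasional global layer lets information travel across the whole context. That trade-off is the usual motivation for hybrid schemes of this kind.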

Both pre-trained and instruction-tuned variants are released as open weights, and the model cards note that Gemma 4 underwent safety evaluations and sensitive-data filtering during training.

Sources
  • gemma-4-31b-it Model by Google (build.nvidia.com)
  • google / gemma-4-31b-it (docs.api.nvidia.com)
  • google/gemma-4-31B · Hugging Face (huggingface.co)

This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.

Sellers serving Google Gemma 4 31B Instruct (8)
Seller | Reputation | Input $/M | Cached $/M | Output $/M | Categories | API

"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.
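Because prices are advertised per 1M tokens, a rough request estimate is simple arithmetic. The sketch below assumes a hypothetical helper `advertised_cost_usd`. The input price is the cheapest advertised input rate shown on this page, while the output price of $0.10 is a placeholder rather than a catalog value. The result is an advertised figure, not a settled amount.

```python
def advertised_cost_usd(input_tokens, output_tokens,
                        input_price_per_m, output_price_per_m):
    """Estimate the advertised cost of one request in USD.

    Prices are quoted in dollars per 1,000,000 tokens, so each token count
    is scaled by its price and the sum is divided by one million.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# 200K input tokens at $0.025/M (from this page) and 4K output tokens at a
# placeholder $0.10/M (an assumption, not an advertised price).
estimate = advertised_cost_usd(200_000, 4_000, 0.025, 0.10)
print(f"${estimate:.4f}")
```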