Nvidia · text

NVIDIA Nemotron 3 Nano 30B

Web search · Function calling

Quick reference

NVIDIA Nemotron 3 Nano 30B — TLDR

🧠 Unified reasoning and non-reasoning model, trained from scratch by NVIDIA.
🔧 Hybrid Mamba-2 plus Transformer backbone with sparse Mixture-of-Experts layers.
📏 3.2B active parameters, 31.6B total; supports up to 1M-token context.
⚡ Optimized for high-throughput inference on a single H200 GPU.
🆕 Better accuracy than prior Nemotron 2 Nano while activating fewer parameters.
🎯 Built for agentic AI, chatbots, RAG, and coding workflows.
🌐 Trained on English plus 19 languages and 43 programming languages.
🔒 Fully open weights, datasets, and training recipes released.

💰 Best price on AntSeed

$0.0069 / $0.027 (−91%)

per 1M · cheapest in / out

📏 Context

128K tokens

🐜 Sellers

advertising on AntSeed

Provider

Nvidia

Nvidia Corporation is an American technology company founded in 1993 by Jensen Huang, Chris Malachowsky, and Curtis Priem, headquartered in Santa Clara, California. Long recognized as the dominant force in graphics processing units, Nvidia has expanded into a central pillar of…


About this model

NVIDIA Nemotron 3 Nano 30B-A3B is a compact Mixture-of-Experts language model in the Nemotron 3 family, trained from scratch by NVIDIA and designed as a unified system for both reasoning and non-reasoning tasks. It first generates a reasoning trace and then concludes with a final response, targeting developers building AI agents, chatbots, and retrieval-augmented systems. Architecturally, it pairs a hybrid Mamba-2 and Transformer backbone with sparse MoE feed-forward layers, activating just 3.2B of its 31.6B total parameters per forward pass.
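To put the sparse activation in perspective, the two parameter counts quoted above can be turned into a ratio. This is only a small arithmetic sketch; the function name `active_fraction` is illustrative and not part of any NVIDIA tooling.

```python
# Figures quoted in the paragraph above, in billions of parameters.
ACTIVE_PARAMS_B = 3.2
TOTAL_PARAMS_B = 31.6


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters activated per forward pass."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active share per forward pass: {fraction:.1%}")  # prints 10.1%
```

Roughly one tenth of the weights take part in each forward pass, which is the basis of the throughput claims that follow.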

NVIDIA states that Nemotron 3 Nano achieves better accuracy than its same-family predecessor, the previous-generation Nemotron 2 Nano, while activating less than half the parameters per forward pass. NVIDIA positions the model for high inference throughput on a single H200 GPU at an 8K-input/16K-output setting.

The model supports context up to 1M tokens, though deployments often cap it at 256K because of default settings and VRAM constraints; this catalog entry exposes a 128K window with FP8 quantization. Training data covers webpages, dialogue, and articles in English, 19 additional languages, and 43 programming languages.
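Because this entry exposes a smaller window than the model's maximum, it may help to check a request against the catalog limit before sending it. The sketch below rests on two assumptions that the listing does not confirm: that "128K" means 128,000 tokens (some providers mean 131,072), and that input and output tokens share the same window. The helper names are hypothetical.

```python
# Assumption: the catalog's "128K" window is read as 128,000 tokens.
CATALOG_WINDOW_TOKENS = 128_000


def fits_window(input_tokens: int, max_output_tokens: int,
                window: int = CATALOG_WINDOW_TOKENS) -> bool:
    """True if prompt plus requested output fits in the window.

    Assumes input and output tokens count against one shared window.
    """
    if input_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return input_tokens + max_output_tokens <= window


def max_output_budget(input_tokens: int,
                      window: int = CATALOG_WINDOW_TOKENS) -> int:
    """Largest output allowance left after the prompt, never below zero."""
    return max(0, window - input_tokens)


# The 8K-input/16K-output setting mentioned above fits comfortably.
print(fits_window(8_000, 16_000))
# A 120K-token prompt leaves little room for a reasoning trace.
print(max_output_budget(120_000))
```

Since the model emits a reasoning trace before its final response, the output allowance should cover both.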

Within Venice's NVIDIA lineup, Nemotron 3 Nano sits alongside the later Nemotron Cascade 2 30B A3B text model, the Nemotron Embed VL 1B v2 embedding model, and the Parakeet ASR speech model. NVIDIA released the weights, training recipe, and redistributable data openly.

Sources

nemotron-3-nano-30b-a3b Model by NVIDIA · build.nvidia.com

nvidia / nemotron-3-nano-30b-a3b · docs.api.nvidia.com

NVIDIA Nemotron 3 Family of Models - NVIDIA Nemotron · research.nvidia.com

NVIDIA Nemotron 3 Nano Omni Powers Multimodal Agent Reasoning in a Single Efficient Open Model | NVIDIA Technical Blog · developer.nvidia.com

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 · Hugging Face · huggingface.co

This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.

Sellers serving NVIDIA Nemotron 3 Nano 30B (7)

Seller	Reputation	Input $/M	Cached $/M	Output $/M	Categories	API
Venice.ai Proxy 0x1f22…18c9	99	$0.0375	$0.0375	$0.15	chat,web-search	openai-chat-completions
surplusintelligence.ai 0x0e49…8927	79	$0.0225	$0.0225	$0.09	agents,anon,chat,cheap,fast,function-calling,research,tasks,tools,web-search	openai-chat-completions
▲ Apex Ant 0x73b4…e736	71	$0.018	$0.018	$0.0718	chat,fast,cheap,open-source,free	openai-chat-completions
Fire Ant 🔥🐜 0xbe05…bc5d	54	$0.061	$0.061	$0.244	agents,anon,chat,cheap,coding,fast,function-calling,json,research,tasks,tools,web-search	—
Meridian AI 0x8c8c…06f5	42	$0.0161	$0.0161	$0.0645	chat	openai-chat-completions
antseed-neon-puma-944e 0x6650…944e	26	$0.0256	$0.0256	$0.1023	chat	openai-chat-completions
Leftermute 0x388b…5389	18	$0.0069	$0.0069	$0.0273	chat,coding,json	openai-chat-completions
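The advertised per-million prices in the table above can be turned into a per-request estimate. The following is a minimal sketch, not an AntSeed API: the `Seller` class and both helper functions are hypothetical, the rows are copied from the table as listed, cached-input pricing is ignored, and advertised prices may differ from settled amounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seller:
    name: str
    reputation: int      # on-chain trust score, 0-100
    input_per_m: float   # advertised USD per 1M input tokens
    output_per_m: float  # advertised USD per 1M output tokens


# Rows copied from the seller table above (advertised prices only).
SELLERS = [
    Seller("Venice.ai Proxy", 99, 0.0375, 0.15),
    Seller("surplusintelligence.ai", 79, 0.0225, 0.09),
    Seller("Apex Ant", 71, 0.018, 0.0718),
    Seller("Fire Ant", 54, 0.061, 0.244),
    Seller("Meridian AI", 42, 0.0161, 0.0645),
    Seller("antseed-neon-puma-944e", 26, 0.0256, 0.1023),
    Seller("Leftermute", 18, 0.0069, 0.0273),
]


def request_cost(seller: Seller, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the seller's advertised rates."""
    total = input_tokens * seller.input_per_m + output_tokens * seller.output_per_m
    return total / 1_000_000


def cheapest(sellers, input_tokens: int, output_tokens: int,
             min_reputation: int = 0) -> Seller:
    """Cheapest seller for this request among those meeting a reputation floor."""
    eligible = [s for s in sellers if s.reputation >= min_reputation]
    if not eligible:
        raise ValueError("no seller meets the reputation floor")
    return min(eligible, key=lambda s: request_cost(s, input_tokens, output_tokens))


# Example: an 8,000-token prompt with a 16,000-token response.
overall = cheapest(SELLERS, 8_000, 16_000)
trusted = cheapest(SELLERS, 8_000, 16_000, min_reputation=70)
print(overall.name, f"${request_cost(overall, 8_000, 16_000):.6f}")
print(trusted.name, f"${request_cost(trusted, 8_000, 16_000):.6f}")
```

The reputation floor is included because the lowest advertised price in the table belongs to the seller with the lowest reputation score, so price alone may not be the right selection criterion.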

"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.