Arcee AI·text

Trinity Large Thinking

CodeReasoningWeb searchFunction calling

Quick reference

Trinity Large Thinking — TLDR

🧠 Reasoning-optimized variant of Arcee AI's Trinity-Large 398B sparse MoE family.
⚡ Roughly 13B active parameters per token for efficient inference.
📏 256K context window for long, multi-step agentic chains.
💬 Emits extended chain-of-thought inside reasoning-trace blocks.
🔧 Tool calling and agentic RL post-training for long-horizon tasks.
🌐 Multilingual training across 14 non-English languages.
🔒 Released under Apache 2.0 in 2026.
🏢 Built on Trinity-Large-Base; trained with Muon optimizer and SMEBU.

💰 Best price on AntSeed

$0.0027 / $0.0090−99%

per 1M · cheapest in / out

📏 Context

256K tokens

🐜 Sellers

advertising on AntSeed

Provider

Arcee AI

Arcee AI is an artificial intelligence company focused on developing advanced language models. The organization has built a reputation in the open-source AI community for its work on model optimization and specialized text generation architectures.

About this model

Trinity Large Thinking is the reasoning-oriented member of Arcee AI's Trinity-Large series, a sparse Mixture-of-Experts model with roughly 398–400B total parameters and about 13B activated per token. It shares the same MoE architecture as the chat-focused Trinity-Large-Preview but is post-trained for extended chain-of-thought reasoning and agentic reinforcement learning, making it suited to long-horizon agents, multi-turn tool calling, and audit-friendly stepwise output.

The chief distinction from its same-family predecessors is reasoning behavior. Where Trinity-Large-Preview is lightly post-trained and chat-ready without trace output, Thinking emits intermediate reasoning inside dedicated reasoning-trace blocks before its final answer, and it is built on the Trinity-Large-Base foundation rather than being a fresh pretraining run.

Architecturally, the Trinity-Large family uses 256 experts with 4 active per token, interleaved local and global attention, gated attention, and sigmoid routing, according to Arcee's technical report. Training used the Muon optimizer plus a load-balancing technique called Soft-clamped Momentum Expert Bias Updates (SMEBU) across a 17-trillion-token pretraining recipe, completing with zero loss spikes.

The model supports tool calling, multilingual input, and a large context window for sustained agentic workflows. It is distributed under Apache 2.0, with FP8 weights and quantized GGUF builds available for self-hosting.

View source on GitHub ↗View model card on HuggingFace ↗

Sources

Trinity-Large-Thinking | Arcee AI Documentation· docs.arcee.ai

Arcee AI | Trinity· arcee.ai

Arcee Trinity Large Technical Report· arxiv.org

arcee-ai/Trinity-Large-Thinking · Hugging Face· huggingface.co

This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.

Sellers serving Trinity Large Thinking (6)

Seller	Reputation↓	Input $/M	Cached $/M	Output $/M	Categories	API
Venice.ai Proxy 0x1f22…18c9	88	$0.1563	$0.0375	$0.5625	chat,reasoning,coding,web-search	openai-chat-completions
surplusintelligence.ai 0x0e49…8927	79	$0.3125	$0.075	$1.125	anon,chat,code,coding,reasoning,web-search,research	openai-chat-completions
Fire Ant 🔥🐜 0xbe05…bc5d	45	$0.0625	$0.0275	$0.225	anon,chat,code,coding,json,reasoning,research,tools,web-search	—
▲ Apex Ant 0x73b4…e736	40	$0.0027	$0.0005	$0.009	chat,reasoning,research	openai-chat-completions
Leftermute 0x388b…5389	26	$0.0631	$0.0631	$0.2273	chat,coding,json,reasoning,tools	openai-chat-completions
Meridian AI 0x8c8c…06f5	2	$0.0675	$0.0675	$0.243	chat,reasoning	openai-chat-completions

"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.