Arcee AI · text

Trinity Large Thinking

Code · Reasoning · Web search · Function calling
Quick reference
Trinity Large Thinking — TLDR
  • 🧠 Reasoning-optimized variant of Arcee AI's Trinity-Large 398B sparse MoE family.
  • ⚡ Roughly 13B active parameters per token for efficient inference.
  • 📏 256K context window for long, multi-step agentic chains.
  • 💬 Emits extended chain-of-thought inside reasoning-trace blocks.
  • 🔧 Tool calling and agentic RL post-training for long-horizon tasks.
  • 🌐 Multilingual training across 14 non-English languages.
  • 🔒 Released under Apache 2.0 in 2026.
  • 🏢 Built on Trinity-Large-Base; trained with Muon optimizer and SMEBU.
💰 Best price on AntSeed
$0.0027 / $0.0090 · 99%
per 1M · cheapest in / out
📏 Context
256K tokens
🐜 Sellers
6
advertising on AntSeed
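
As a rough illustration of how per-1M-token prices turn into a request cost, the sketch below multiplies token counts by advertised rates. The function name, the token counts, and the two prices passed in are illustrative placeholders, not values taken from a real purchase, and advertised catalog prices may differ from settled amounts.

```python
def advertised_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_per_m: float,
    output_per_m: float,
) -> float:
    """Estimate cost in USD from prices quoted per 1M tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens / 1_000_000 * input_per_m
        + output_tokens / 1_000_000 * output_per_m
    )


# Illustrative request: a long prompt with a short answer.
# The prices are placeholder values chosen for this example.
cost = advertised_cost_usd(
    input_tokens=120_000,
    output_tokens=8_000,
    input_per_m=0.0027,
    output_per_m=0.0090,
)
print(f"Estimated advertised cost: ${cost:.6f}")
```

Cached-input pricing, which the seller table lists as a separate column, is left out of this sketch.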
Provider

Arcee AI is an artificial intelligence company focused on developing advanced language models. The organization has built a reputation in the open-source AI community for its work on model optimization and specialized text generation architectures.

About this model

Trinity Large Thinking is the reasoning-oriented member of Arcee AI's Trinity-Large series, a sparse Mixture-of-Experts model with roughly 398–400B total parameters and about 13B activated per token. It shares the same MoE architecture as the chat-focused Trinity-Large-Preview but is post-trained for extended chain-of-thought reasoning and agentic reinforcement learning, making it suited to long-horizon agents, multi-turn tool calling, and audit-friendly stepwise output.
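
The sparsity figures above imply that only a small share of the weights is used for any one token. A minimal sketch of that arithmetic, using the approximate parameter counts quoted in the text (the function name is illustrative):

```python
def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Return the share of parameters activated per token, in the range 0..1."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Approximate figures from the text: about 13B active, 398B to 400B total.
low = active_fraction(13, 400)
high = active_fraction(13, 398)
print(f"{low:.2%} to {high:.2%} of parameters active per token")
```

Under these figures, roughly 3.3% of the parameters participate in each forward pass, which is where the inference-efficiency claim comes from.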

The chief distinction from its same-family predecessors is reasoning behavior. Where Trinity-Large-Preview is lightly post-trained and chat-ready without trace output, Thinking emits intermediate reasoning inside dedicated reasoning-trace blocks before its final answer, and it is built on the Trinity-Large-Base foundation rather than being a fresh pretraining run.
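
Client code that consumes this kind of output usually needs to separate the trace from the final answer. The sketch below shows one way to do that. It assumes the trace is wrapped in `<think>...</think>` tags, which this page does not confirm; the delimiter, the `split_reasoning` helper, and the sample string are all hypothetical, so check the model card for the actual format.

```python
import re
from typing import NamedTuple

# ASSUMPTION: traces are delimited by <think>...</think>. The listing only
# says "reasoning-trace blocks"; the real delimiter may differ.
TRACE_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


class ParsedOutput(NamedTuple):
    reasoning: str  # concatenated trace text, empty if no trace was found
    answer: str     # everything outside the trace blocks


def split_reasoning(raw: str) -> ParsedOutput:
    """Split raw model text into reasoning trace and final answer."""
    traces = [m.strip() for m in TRACE_PATTERN.findall(raw)]
    answer = TRACE_PATTERN.sub("", raw).strip()
    return ParsedOutput(reasoning="\n\n".join(traces), answer=answer)


# Hypothetical sample output, written for this example only.
sample = "<think>2 + 2 is 4, then double it.</think>\nThe result is 8."
parsed = split_reasoning(sample)
print(parsed.answer)
```

Keeping the trace in a separate field makes it easy to log it for auditing while showing only the answer to end users.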

Architecturally, the Trinity-Large family uses 256 experts with 4 active per token, interleaved local and global attention, gated attention, and sigmoid routing, according to Arcee's technical report. Training used the Muon optimizer plus a load-balancing technique called Soft-clamped Momentum Expert Bias Updates (SMEBU) across a 17-trillion-token pretraining recipe, completing with zero loss spikes.
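
To make the routing description concrete, here is a minimal sketch of top-k expert selection with sigmoid scores, using the 256-expert, 4-active configuration from the text. It is not Arcee's implementation: the random router logits, the renormalization of the selected gates, and the function names are assumptions, and the SMEBU bias term is omitted entirely.

```python
import math
import random

NUM_EXPERTS = 256  # from the text
TOP_K = 4          # from the text


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route(logits: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the k experts with the highest sigmoid score.

    Returns (expert_index, gate_weight) pairs. Normalizing the selected
    gates so they sum to 1 is an assumption of this sketch.
    """
    scores = [sigmoid(z) for z in logits]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    total = sum(scores[i] for i in top)
    return [(i, scores[i] / total) for i in top]


# Stand-in router logits for one token; a real router would compute these
# from the token's hidden state.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
chosen = route(logits)
for expert, weight in chosen:
    print(f"expert {expert:3d} gate {weight:.3f}")
```

Unlike softmax routing, each sigmoid score is computed independently of the other experts, so the scores do not compete for a fixed probability mass before the top-k cut.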

The model supports tool calling, multilingual input, and a large context window for sustained agentic workflows. It is distributed under Apache 2.0, with FP8 weights and quantized GGUF builds available for self-hosting.

Sources
  • Trinity-Large-Thinking | Arcee AI Documentation (docs.arcee.ai)
  • Arcee AI | Trinity (arcee.ai)
  • Arcee Trinity Large Technical Report (arxiv.org)
  • arcee-ai/Trinity-Large-Thinking · Hugging Face (huggingface.co)

This About section is AI-generated from public sources via VeniceStats + Venice inference, with no human editing. It may contain inaccuracies.

Sellers serving Trinity Large Thinking (6)
Seller | Reputation | Input $/M | Cached $/M | Output $/M | Categories | API

"Best price" and the seller table are live AntSeed catalog data (advertised $/1M tokens, not settled amounts). Reputation = on-chain trust (0-100). Model knowledge (TLDR, provider, About) via the VeniceStats enrichment layer. Advertised catalog, not the model used in any specific purchase.