Built in Rust · Recursive LLM Engine · Open Source

We need to
go deeper.

// Your LLM's LLM.

Altum is a Recursive Language Model (RLM) engine. When your LLM runs out of context, it spawns a child agent with its own isolated sandbox. That child can spawn another: turtles all the way down, until you have your answer.

recursion depth: 6
LLM providers: 12+
benchmark suites
🦀 written in Rust
altum · depth=3 · claude-sonnet-4
The problem

Context windows
are a lie.

128K tokens sounds like a lot. Until you need to analyze a 200-page document, track 50 steps of state, or recursively decompose a hard planning problem.

Suddenly you're truncating, chunking manually, and hallucinating the parts that didn't fit. The model runs out of brain.

Altum gives it more. When context overflows, the model calls llm_query() and a fresh sub-agent spawns with its own workspace. Divide. Delegate. Conquer.
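From inside the sandbox, that pattern is a few lines of Python. A minimal sketch (llm_query() and FINAL() are Altum's calls; the chunk size, the prompt, and how the oversized input reaches the sandbox are assumptions here):

python
# Hypothetical sandbox snippet. llm_query()/FINAL() are Altum's API; the
# placeholder input and chunking below are illustrative only.
doc = "...200 pages of text..."  # stand-in for the real oversized context

# Divide: split into pieces small enough for a child agent's window.
chunks = [doc[i:i + 20_000] for i in range(0, len(doc), 20_000)]

# Delegate: each llm_query() spawns a fresh sub-agent with its own sandbox.
notes = [llm_query("Extract the key facts:\n" + chunk) for chunk in chunks]

# Conquer: the parent reasons over small results, not the huge input.
FINAL("\n".join(notes))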

context_window.monitor ● LIVE
Without recursion · context stuffing
127,847 / 128K
⚠ truncated · quality collapse · hallucination risk
With Altum · recursive delegation
depth:0 · 35K
depth:1 · 28K
depth:2 · 22K
✓ each agent focused · full precision · no truncation
Quality on hard reasoning tasks: 50% → 100%
Long-context extraction: fails → works
Hallucination risk: high → guarded
Architecture

Strange loops,
all the way down.

Altum implements true recursive LLM execution. Each depth level is a complete agent with its own Python sandbox: a full sub-brain, not a simple prompt chain or tool call.

01

Root RLM receives the task

At depth:0, the engine spawns a Python REPL sandbox. The model can write code, introspect the context, and decide whether it needs help or can proceed independently.

02

Delegate to a sub-agent

When the model needs deeper reasoning, it calls llm_query(chunk). Altum spawns a full child RLM at depth:1 with an isolated sandbox. The parent waits. The child thinks.

03

Recurse as deep as needed

Each child can spawn its own children, down to --max-depth. At the leaf level, calls hit the LLM directly. Divide. Delegate. Conquer. Results bubble back up the chain.

04

Aggregate and surface FINAL()

The root RLM collects sub-agent results, runs aggregation logic in its sandbox, and calls FINAL(answer). You get one clean answer. No assembly required.
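Put together, the control flow is compact. A self-contained Python sketch of the discipline in steps 01–04 (every name is illustrative; Altum's actual Rust internals are not shown):

python
# Illustrative sketch: mirrors the behavior of steps 01-04, not Altum's code.
def call_llm(prompt: str) -> str:
    """Stand-in for a direct provider call at the leaf level."""
    return f"<answer: {prompt[:30]}>"

def run_rlm(task: str, depth: int, max_depth: int) -> str:
    if depth >= max_depth:
        # Step 03, leaf level: hit the LLM directly, no further sub-agents.
        return call_llm(task)

    # Steps 01-02: decompose the task; each delegated piece becomes a
    # child RLM one level deeper, with its own isolated scope.
    def llm_query(sub_task: str) -> str:
        return run_rlm(sub_task, depth + 1, max_depth)

    sub_results = [llm_query(part) for part in task.split(". ")]

    # Step 04: aggregate child results and surface one clean answer.
    return " | ".join(sub_results)

print(run_rlm("Plan the system. Verify the plan. Write the report", 0, 2))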

RLM · depth:0 root

Orchestrator. Analyzes query, delegates chunks, aggregates results.

Python REPL · llm_query() · FINAL() · VFS /plans/ · CHECKPOINT
spawns ↓
RLM · depth:1 sub-agent

Focused analyzer. Processes a context chunk. Can recurse further.

Python REPL · llm_query() · FINAL() · VFS /evidence/
spawns ↓
RLM · depth:2 leaf

Verifier / extractor. Hits LLM directly. Returns result up the chain.

Python REPL · llm_query() → direct · FINAL()
What you get

Built for the
hard problems.

True Recursive Reasoning

Not chain-of-thought. Not tool calls. Full sub-agents at every depth level, each with its own sandbox and decision loop. Inspired by divide-and-conquer algorithms applied to reasoning.

rlm · depth

Sandboxed Python REPL

Code runs in ouros, an isolated Python environment. No filesystem. No network. No surprises. Variables persist across iterations. The model thinks in code; the sandbox keeps it honest.

ouros · sandbox
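That persistence is the point. A sketch (hypothetical snippet; raw_csv stands in for state produced on an earlier turn):

python
# Sketch of ouros-style persistence; the iteration markers are comments only.
raw_csv = "id,item,amount\n1,widget,9.50\n2,gizmo,3.25"  # placeholder input

# --- iteration 1: parse once, the binding persists in the sandbox ---
rows = [line.split(",") for line in raw_csv.splitlines()]

# --- iteration 2, a later model turn: `rows` is still bound, no re-parsing ---
total = sum(float(r[2]) for r in rows[1:])  # 12.75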

Checkpoint & Fork

Branch your reasoning like git branches your code. CHECKPOINT_CREATE() snapshots sandbox state. FORK_CREATE() spawns parallel hypothesis paths. STRATEGY_COMMIT() picks the winner. Science, not vibes.

hypothesis · branching
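In sandbox code, a hypothesis branch might look like the snippet below. CHECKPOINT_CREATE(), FORK_CREATE(), and STRATEGY_COMMIT() are Altum's calls, but their exact signatures and return values are assumptions here:

python
# Hypothetical usage: the call names are Altum's, signatures are assumed.
CHECKPOINT_CREATE("before_hypotheses")         # snapshot sandbox state

fork_a = FORK_CREATE("greedy_decomposition")   # parallel hypothesis paths
fork_b = FORK_CREATE("dynamic_programming")

# ...explore each branch, score the candidate answers...

STRATEGY_COMMIT(fork_b)                        # keep the winner, drop the rest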

Virtual Filesystem

A model-visible scratchpad: /plans/ for strategy, /evidence/ for extracted facts, /artifacts/ for generated work. Think of it as version control for the model's reasoning process.

vfs · artifacts
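A hypothetical snippet, assuming the VFS is mounted over ordinary file I/O inside the sandbox (an assumption; the sandbox has no host filesystem, only these virtual paths):

python
# Assumption: the VFS is reachable through normal file I/O in the sandbox.
with open("/plans/strategy.md", "w") as f:
    f.write("1. chunk the ledger\n2. delegate totals\n3. cross-check\n")

with open("/evidence/q1_total.txt", "w") as f:
    f.write("Q1 = 14230.50\n")  # extracted fact, visible to later iterations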

Multi-Provider Routing

OpenAI, Anthropic, Gemini, xAI, Ollama. One engine to route them all. Auto-detects provider from model name prefix. Swap models without rewriting your workflow.

openai · claude · gemini · grok
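The routing idea in miniature (illustrative only: the real logic lives in the Rust engine, and this prefix table is an assumption based on the model names above):

python
# Illustrative prefix router; not Altum's actual table.
PREFIXES = {
    "gpt": "openai", "o1": "openai", "o3": "openai",
    "claude": "anthropic",
    "gemini": "google",
    "grok": "xai",
}

def detect_provider(model: str) -> str:
    for prefix, provider in PREFIXES.items():
        if model.startswith(prefix):
            return provider
    return "ollama"  # assumed fallback: treat unknown names as local models

assert detect_provider("claude-sonnet-4") == "anthropic"
assert detect_provider("gemini-2.0-flash") == "google"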

MCP Integration

Native Model Context Protocol server. Use Altum as a tool inside Claude Code, any MCP-compatible agent, or your own agentic workflow. Recursive reasoning as a service.

mcp · claude-code
Results

Numbers don't lie.

(Confidence intervals do. So we show those too.)

Hard Planning & Reasoning rlm_challenges.json
with recursion (depth=2) 100%
Tower of Hanoi, 8-Queens, Josephus, LIS, Edit Distance
no recursion (depth=0) 50%
Direct LLM, no decomposition
Long Context · Books + Distractors n=15 · 10K+ tokens
pass rate 73.3%
CI₉₅: [48.1%, 89.1%] · small n, honest error bars
Information-Dense Ledger n=18 · transaction aggregation
pass rate 61.1%
CI₉₅: [38.6%, 79.7%] · deterministic eval, no rubric games
Deep Recursion · Long Horizon depth=6 vs depth=0
depth=6 (max recursion) 100%
depth=0 (flat) 50%
+50pp improvement on long-horizon tasks requiring decomposition
⚗️ Science corner: These benchmarks use 12+ task suites covering cognitive traps, state tracking, multi-step reasoning, and long-context distributional tasks. We show confidence intervals because n=15 is a hypothesis rather than a proof. Codeforces-hard tasks still show 0% across all modes. We report that too.
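The intervals above are consistent with a 95% Wilson score interval; a few lines reproduce them:

python
# Reproduces the CI₉₅ rows above (within rounding) via the Wilson score interval.
from math import sqrt

def wilson_ci(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = passes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

print(wilson_ci(11, 15))  # ≈ (0.480, 0.891) → the 73.3% row's [48.1%, 89.1%]
print(wilson_ci(11, 18))  # ≈ (0.386, 0.797) → the 61.1% row's [38.6%, 79.7%]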

View full benchmark suite →
Providers

One engine.
Every frontier model.

Auto-detect from model name. Zero-config switching. Bring your own keys.

OpenAI gpt-4o · o1 · o3
Anthropic claude-3 · claude-4
Google gemini-1.5 · 2.0
xAI grok-2 · grok-3
Ollama any local model
+ Custom --base-url · any API
Quick start

Up and running
in 60 seconds.

Build from source with Cargo. Set your API key. Pick a model. Altum handles recursion, sandboxing, and context management.

Need Claude Code integration? The MCP server is built in. Just configure the tool path and you're orchestrating recursive agents from inside your editor.

Requirements
Rust 1.75+
Python 3.9+ (for sandboxing)
API key for your preferred provider
install · basic usage · advanced · mcp setup
shell
# clone and build
git clone https://github.com/Diogenesoftoronto/altum
cd altum
cargo build --release

# set your API key
export OPENAI_API_KEY="sk-..."
# or Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
# or Google
export GEMINI_API_KEY="AIza..."
shell
# run a one-shot query with recursion depth 2
./target/release/altum run \
  --model claude-sonnet-4 \
  --max-depth 2 \
  --query "Analyze this document..."

# interactive chat mode with thread persistence
./target/release/altum chat \
  --model gemini-2.0-flash \
  --thread my-project

# upload context from file
./target/release/altum upload \
  --thread my-project \
  --file big-doc.txt
shell
# deep recursion with strict depth enforcement
altum run \
  --model gpt-4o \
  --max-depth 4 \
  --depth-enforcement strict \
  --require-min-depth 3 \
  --query "..."

# enable checkpoints, forks, and VFS
altum run \
  --enable-checkpoints \
  --enable-forks \
  --enable-vfs \
  --max-iterations 20 \
  --model claude-sonnet-4 \
  --query "..."

# verbose mode to see every thought
altum run --verbose --model gemini-flash --query "..."
json
// .claude/settings.json for Claude Code integration
{
  "mcpServers": {
    "altum": {
      "command": "/path/to/altum",
      "args": ["mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}
"The system knows itself. The loop is strange. The answer emerges."
Inspired by Douglas Hofstadter, Gödel, Escher, Bach (1979) · and now, Altum (2025)