Built in Rust · Recursive LLM Engine · Open Source

We need to
go deeper.

// Your LLM's LLM.

Altum is a Recursive Language Model (RLM) engine. When your LLM runs out of context, it spawns a child agent with its own isolated sandbox. That child can spawn another: turtles all the way down, until you have your answer.

recursion depth: 6
LLM providers: 12+
benchmark suites
🦀 written in Rust
altum · depth=3 · claude-sonnet-4
The problem

Context windows
are a lie.

128K tokens sounds like a lot. Until you need to analyze a 200-page document, track 50 steps of state, or recursively decompose a hard planning problem.

Suddenly you're truncating, chunking manually, and hallucinating the parts that didn't fit. The model runs out of brain.

Altum gives it more. When context overflows, the model calls llm_query() and a fresh sub-agent spawns with its own workspace. Divide. Delegate. Conquer.
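From inside the sandbox, that pattern is a few lines of Python. A minimal sketch (llm_query() and FINAL() are Altum's calls; the chunk size, the prompt, and how the oversized input reaches the sandbox are assumptions here):

python
# Hypothetical sandbox snippet. llm_query()/FINAL() are Altum's API; the
# placeholder input and chunking below are illustrative only.
doc = "...200 pages of text..."  # stand-in for the real oversized context

# Divide: split into pieces small enough for a child agent's window.
chunks = [doc[i:i + 20_000] for i in range(0, len(doc), 20_000)]

# Delegate: each llm_query() spawns a fresh sub-agent with its own sandbox.
notes = [llm_query("Extract the key facts:\n" + chunk) for chunk in chunks]

# Conquer: the parent reasons over small results, not the huge input.
FINAL("\n".join(notes))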

context_window.monitor ● LIVE
Without recursion · context stuffing
127,847 / 128K
⚠ truncated · quality collapse · hallucination risk
With Altum · recursive delegation
depth:0 · 35K
depth:1 · 28K
depth:2 · 22K
✓ each agent focused · full precision · no truncation
Quality on hard reasoning tasks: 50% → 100%
Long-context extraction: fails → works
Hallucination risk: high → guarded
Architecture

Strange loops,
all the way down.

Altum implements true recursive LLM execution. Each depth level is a complete agent with its own Python sandbox: a full sub-brain, not a simple prompt chain or tool call.

01

Root RLM receives the task

At depth:0, the engine spawns a Python REPL sandbox. The model can write code, introspect the context, and decide whether it needs help or can proceed independently.

02

Delegate to a sub-agent

When the model needs deeper reasoning, it calls llm_query(chunk). Altum spawns a full child RLM at depth:1 with an isolated sandbox. The parent waits. The child thinks.

03

Recurse as deep as needed

Each child can spawn its own children, down to --max-depth. At the leaf level, calls hit the LLM directly. Divide. Delegate. Conquer. Results bubble back up the chain.

04

Aggregate and surface FINAL()

The root RLM collects sub-agent results, runs aggregation logic in its sandbox, and calls FINAL(answer). You get one clean answer. No assembly required.
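Put together, the control flow is compact. A self-contained Python sketch of the discipline in steps 01–04 (every name is illustrative; Altum's actual Rust internals are not shown):

python
# Illustrative sketch: mirrors the behavior of steps 01-04, not Altum's code.
def call_llm(prompt: str) -> str:
    """Stand-in for a direct provider call at the leaf level."""
    return f"<answer: {prompt[:30]}>"

def run_rlm(task: str, depth: int, max_depth: int) -> str:
    if depth >= max_depth:
        # Step 03, leaf level: hit the LLM directly, no further sub-agents.
        return call_llm(task)

    # Steps 01-02: decompose the task; each delegated piece becomes a
    # child RLM one level deeper, with its own isolated scope.
    def llm_query(sub_task: str) -> str:
        return run_rlm(sub_task, depth + 1, max_depth)

    sub_results = [llm_query(part) for part in task.split(". ")]

    # Step 04: aggregate child results and surface one clean answer.
    return " | ".join(sub_results)

print(run_rlm("Plan the system. Verify the plan. Write the report", 0, 2))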

RLM · depth:0 root

Orchestrator. Analyzes query, delegates chunks, aggregates results.

Python REPL · llm_query() · FINAL() · VFS /plans/ · CHECKPOINT
spawns ↓
RLM · depth:1 sub-agent

Focused analyzer. Processes a context chunk. Can recurse further.

Python REPL · llm_query() · FINAL() · VFS /evidence/
spawns ↓
RLM · depth:2 leaf

Verifier / extractor. Hits LLM directly. Returns result up the chain.

Python REPL · llm_query() → direct · FINAL()
What you get

Built for the
hard problems.

True Recursive Reasoning

Not chain-of-thought. Not tool calls. Full sub-agents at every depth level, each with its own sandbox and decision loop. Inspired by divide-and-conquer algorithms applied to reasoning.

rlm · depth

Sandboxed Python REPL

Code runs in ouros, an isolated Python environment. No filesystem. No network. No surprises. Variables persist across iterations. The model thinks in code; the sandbox keeps it honest.

ouros · sandbox
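That persistence is the point. A sketch (hypothetical snippet; raw_csv stands in for state produced on an earlier turn):

python
# Sketch of ouros-style persistence; the iteration markers are comments only.
raw_csv = "id,item,amount\n1,widget,9.50\n2,gizmo,3.25"  # placeholder input

# --- iteration 1: parse once, the binding persists in the sandbox ---
rows = [line.split(",") for line in raw_csv.splitlines()]

# --- iteration 2, a later model turn: `rows` is still bound, no re-parsing ---
total = sum(float(r[2]) for r in rows[1:])  # 12.75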

Checkpoint & Fork

Branch your reasoning like git branches your code. CHECKPOINT_CREATE() snapshots sandbox state. FORK_CREATE() spawns parallel hypothesis paths. STRATEGY_COMMIT() picks the winner. Science, not vibes.

hypothesis · branching
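In sandbox code, a hypothesis branch might look like the snippet below. CHECKPOINT_CREATE(), FORK_CREATE(), and STRATEGY_COMMIT() are Altum's calls, but their exact signatures and return values are assumptions here:

python
# Hypothetical usage: the call names are Altum's, signatures are assumed.
CHECKPOINT_CREATE("before_hypotheses")         # snapshot sandbox state

fork_a = FORK_CREATE("greedy_decomposition")   # parallel hypothesis paths
fork_b = FORK_CREATE("dynamic_programming")

# ...explore each branch, score the candidate answers...

STRATEGY_COMMIT(fork_b)                        # keep the winner, drop the rest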

Virtual Filesystem

A model-visible scratchpad: /plans/ for strategy, /evidence/ for extracted facts, /artifacts/ for generated work. Think of it as version control for the model's reasoning process.

vfs · artifacts
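A hypothetical snippet, assuming the VFS is mounted over ordinary file I/O inside the sandbox (an assumption; the sandbox has no host filesystem, only these virtual paths):

python
# Assumption: the VFS is reachable through normal file I/O in the sandbox.
with open("/plans/strategy.md", "w") as f:
    f.write("1. chunk the ledger\n2. delegate totals\n3. cross-check\n")

with open("/evidence/q1_total.txt", "w") as f:
    f.write("Q1 = 14230.50\n")  # extracted fact, visible to later iterations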

Multi-Provider Routing

OpenAI, Anthropic, Gemini, xAI, Ollama. One engine to route them all. Auto-detects provider from model name prefix. Swap models without rewriting your workflow.

openai · claude · gemini · grok
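The routing idea in miniature (illustrative only: the real logic lives in the Rust engine, and this prefix table is an assumption based on the model names above):

python
# Illustrative prefix router; not Altum's actual table.
PREFIXES = {
    "gpt": "openai", "o1": "openai", "o3": "openai",
    "claude": "anthropic",
    "gemini": "google",
    "grok": "xai",
}

def detect_provider(model: str) -> str:
    for prefix, provider in PREFIXES.items():
        if model.startswith(prefix):
            return provider
    return "ollama"  # assumed fallback: treat unknown names as local models

assert detect_provider("claude-sonnet-4") == "anthropic"
assert detect_provider("gemini-2.0-flash") == "google"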

MCP Integration

Native Model Context Protocol server. Use Altum as a tool inside Claude Code, any MCP-compatible agent, or your own agentic workflow. Recursive reasoning as a service.

mcp · claude-code
Results

Numbers don't lie.

(Confidence intervals do. So we show those too.)

Hard Planning & Reasoning rlm_challenges.json
with recursion (depth=2) 100%
Tower of Hanoi, 8-Queens, Josephus, LIS, Edit Distance
no recursion (depth=0) 50%
Direct LLM, no decomposition
Long Context · Books + Distractors n=15 · 10K+ tokens
pass rate 73.3%
CI₉₅: [48.1%, 89.1%] · small n, honest error bars
Information-Dense Ledger n=18 · transaction aggregation
pass rate 61.1%
CI₉₅: [38.6%, 79.7%] · deterministic eval, no rubric games
Deep Recursion · Long Horizon depth=6 vs depth=0
depth=6 (max recursion) 100%
depth=0 (flat) 50%
+50pp improvement on long-horizon tasks requiring decomposition
⚗️ Science corner: These benchmarks use 12+ task suites covering cognitive traps, state tracking, multi-step reasoning, and long-context distributional tasks. We show confidence intervals because n=15 is a hypothesis rather than a proof. Codeforces-hard tasks still show 0% across all modes. We report that too.
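The intervals above are consistent with a 95% Wilson score interval; a few lines reproduce them:

python
# Reproduces the CI₉₅ rows above (within rounding) via the Wilson score interval.
from math import sqrt

def wilson_ci(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = passes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

print(wilson_ci(11, 15))  # ≈ (0.480, 0.891) → the 73.3% row's [48.1%, 89.1%]
print(wilson_ci(11, 18))  # ≈ (0.386, 0.797) → the 61.1% row's [38.6%, 79.7%]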

View full benchmark suite →
Providers

One engine.
Every frontier model.

Auto-detect from model name. Zero-config switching. Bring your own keys.

OpenAI gpt-4o · o1 · o3
Anthropic claude-3 · claude-4
Google gemini-1.5 · 2.0
xAI grok-2 · grok-3
Ollama any local model
+ Custom --base-url · any API
Quick start

Up and running
in 60 seconds.

Build from source with Cargo. Set your API key. Pick a model. Altum handles recursion, sandboxing, and context management.

Need Claude Code integration? The MCP server is built in. Just configure the tool path and you're orchestrating recursive agents from inside your editor.

Requirements
Rust 1.75+
Python 3.9+ (for sandboxing)
API key for your preferred provider
install · basic usage · advanced · mcp setup
shell
# clone and build
git clone https://github.com/Diogenesoftoronto/altum
cd altum
cargo build --release

# set your API key
export OPENAI_API_KEY="sk-..."
# or Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
# or Google
export GEMINI_API_KEY="AIza..."
shell
# run a one-shot query with recursion depth 2
./target/release/altum run \
  --model claude-sonnet-4 \
  --max-depth 2 \
  --query "Analyze this document..."

# interactive chat mode with thread persistence
./target/release/altum chat \
  --model gemini-2.0-flash \
  --thread my-project

# upload context from file
./target/release/altum upload \
  --thread my-project \
  --file big-doc.txt
shell
# deep recursion with strict depth enforcement
altum run \
  --model gpt-4o \
  --max-depth 4 \
  --depth-enforcement strict \
  --require-min-depth 3 \
  --query "..."

# enable checkpoints, forks, and VFS
altum run \
  --enable-checkpoints \
  --enable-forks \
  --enable-vfs \
  --max-iterations 20 \
  --model claude-sonnet-4 \
  --query "..."

# verbose mode to see every thought
altum run --verbose --model gemini-flash --query "..."
json
// .claude/settings.json for Claude Code integration
{
  "mcpServers": {
    "altum": {
      "command": "/path/to/altum",
      "args": ["mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}
"The system knows itself. The loop is strange. The answer emerges."
Inspired by Douglas Hofstadter, Gödel, Escher, Bach (1979) · and now, Altum (2025)