// Your LLM's LLM.
Altum is a Recursive Language Model engine. When your LLM runs out of context, it spawns a child agent with its own isolated sandbox. That child can spawn another. It is turtles all the way down until you have your answer.
128K tokens sounds like a lot. Until you need to analyze a 200-page document, track 50 steps of state, or recursively decompose a hard planning problem.
Suddenly you're truncating, chunking manually, and hallucinating the parts that didn't fit. The model runs out of brain.
Altum gives it more. When context overflows, the model calls llm_query() and a fresh sub-agent spawns with its own workspace. Divide. Delegate. Conquer.
Altum implements true recursive LLM execution. Each depth level is a complete agent with its own Python sandbox. This is a full sub-brain rather than a simple prompt chain or tool call.
At depth:0, the engine spawns a Python REPL sandbox. The model can write code, introspect the context, and decide whether it needs help or can proceed independently.
When the model needs deeper reasoning, it calls llm_query(chunk).
Altum spawns a full child RLM at depth:1 with an isolated sandbox.
The parent waits. The child thinks.
Each child can spawn its own children, down to --max-depth.
At the leaf level, calls hit the LLM directly.
Results bubble back up the chain.
The root RLM collects sub-agent results, runs aggregation logic in its sandbox, and calls FINAL(answer). You get one clean answer. No assembly required.
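Concretely, here's a minimal sketch of the kind of code a depth:0 agent writes in its sandbox. llm_query and FINAL are the primitives named above; everything else (the injected context variable, the chunk size, the exact signatures) is an illustrative assumption, not Altum's actual API.

# hypothetical depth:0 sandbox code; `context` and the primitives
# below are injected by the engine at runtime
CHUNK = 50_000  # characters per child, an illustrative number
chunks = [context[i:i + CHUNK] for i in range(0, len(context), CHUNK)]

# delegate each chunk to a child RLM at depth:1
findings = []
for i, chunk in enumerate(chunks):
    findings.append(llm_query(f"List the key claims in part {i}:\n{chunk}"))

# aggregate in the sandbox, then return one clean answer
merged = "\n".join(findings)
FINAL(llm_query(f"Merge these partial summaries into one report:\n{merged}"))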
depth:0 · Orchestrator. Analyzes the query, delegates chunks, aggregates results.
depth:1 · Focused analyzer. Processes a context chunk. Can recurse further.
leaf · Verifier / extractor. Hits the LLM directly. Returns its result up the chain.
Not chain-of-thought. Not tool calls. Full sub-agents at every depth level, each with its own sandbox and decision loop. Inspired by divide-and-conquer algorithms applied to reasoning.
rlm · depth

Code runs in ouros, an isolated Python environment. No filesystem. No network. No surprises. Variables persist across iterations. The model thinks in code; the sandbox keeps it honest.

ouros · sandbox

Branch your reasoning like git branches your code. CHECKPOINT_CREATE() snapshots sandbox state. FORK_CREATE() spawns parallel hypothesis paths. STRATEGY_COMMIT() picks the winner. Science, not vibes.

hypothesis · branching

A model-visible scratchpad: /plans/ for strategy, /evidence/ for extracted facts, /artifacts/ for generated work. Think of it as version control for the model's reasoning process.

vfs · artifacts

OpenAI, Anthropic, Gemini, xAI, Ollama. One engine to route them all. Auto-detects the provider from the model-name prefix. Swap models without rewriting your workflow.

openai · claude · gemini · grok

Native Model Context Protocol server. Use Altum as a tool inside Claude Code, any MCP-compatible agent, or your own agentic workflow. Recursive reasoning as a service.
mcp · claude-code
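The branching and scratchpad primitives compose in sandbox code. A hedged sketch: CHECKPOINT_CREATE, FORK_CREATE, and STRATEGY_COMMIT are named above, but their argument shapes here, the fork handles, and the vfs_write helper are assumptions for illustration.

# hypothetical sandbox code; the branching primitives are injected
# by the engine, and vfs_write() stands in for whatever the real
# scratchpad API looks like
CHECKPOINT_CREATE("before-hypotheses")      # snapshot sandbox state

# spawn two parallel hypothesis paths from the same snapshot
fork_a = FORK_CREATE("report-is-chronological")
fork_b = FORK_CREATE("report-is-thematic")

# ...each fork explores its own reading of the evidence...

STRATEGY_COMMIT(fork_a)                     # keep the winning branch

# record what survived in the model-visible scratchpad
vfs_write("/evidence/structure.md", "The report reads chronologically.")
vfs_write("/plans/next.md", "Extract dates section by section.")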
Auto-detect from the model name. Zero-config switching. Bring your own keys.
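In practice that's the same command with a different --model string; every model name below already appears in this page's examples.

# same command, three providers; the model-name prefix picks the backend
altum run --model gpt-4o           --query "..."   # OpenAI
altum run --model claude-sonnet-4  --query "..."   # Anthropic
altum run --model gemini-2.0-flash --query "..."   # Google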
Build from source with Cargo. Set your API key. Pick a model. Altum handles recursion, sandboxing, and context management.
Need Claude Code integration? The MCP server is built in. Just configure the tool path and you're orchestrating recursive agents from inside your editor.
# clone and build
git clone https://github.com/Diogenesoftoronto/altum
cd altum
cargo build --release

# set your API key
export OPENAI_API_KEY="sk-..."
# or Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
# or Google
export GEMINI_API_KEY="AIza..."
# run a one-shot query with recursion depth 2
./target/release/altum run \
  --model claude-sonnet-4 \
  --max-depth 2 \
  --query "Analyze this document..."

# interactive chat mode with thread persistence
./target/release/altum chat \
  --model gemini-2.0-flash \
  --thread my-project

# upload context from file
./target/release/altum upload \
  --thread my-project \
  --file big-doc.txt
# deep recursion with strict depth enforcement
altum run \
  --model gpt-4o \
  --max-depth 4 \
  --depth-enforcement strict \
  --require-min-depth 3 \
  --query "..."

# enable checkpoints, forks, and VFS
altum run \
  --enable-checkpoints \
  --enable-forks \
  --enable-vfs \
  --max-iterations 20 \
  --model claude-sonnet-4 \
  --query "..."

# verbose mode to see every thought
altum run --verbose --model gemini-flash --query "..."
// .claude/settings.json for Claude Code integration
{
  "mcpServers": {
    "altum": {
      "command": "/path/to/altum",
      "args": ["mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}
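To sanity-check the server outside your editor, launch the same binary and subcommand the config points at; a command/args entry like this implies MCP's stdio transport, so it will sit waiting for a client on stdin.

# run the MCP server directly; it waits for an MCP client on stdin
export ANTHROPIC_API_KEY="sk-ant-..."
./target/release/altum mcp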