v2.0 shipping · 599 tests · do-calculus + plugin SDK + custom KB live

Glass-box quantitative
intelligence.

The verification cortex for serious quantitative work — for humans analyzing hard data, and for AI systems that can't afford to hallucinate. Local. Open. Cited.

// cloud LLMs guess. Aurora computes.

Support on Patreon

// install

Run Aurora locally in 60 seconds.

git clone https://github.com/FantasyLab-ai/aurora.git
cd aurora
python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# Optional substrate-layer extras
pip install cryptography           # Ed25519 bundle signing
pip install mcp                    # MCP server for LLM agents

# Run the Studio
python studio_api.py

Open http://127.0.0.1:8000 → click ▶ Try a demo → 10-second smoke test, or drop your own CSV / Parquet / JSON / XLSX.

Inline comments cover Windows activation + optional substrate-layer extras.

599 Tests Passing: SDK · MCP · Contracts · Causal · Plugin · core
24+ Research-Grade Methods: SINDy · HMM · Granger · do-calculus · VAR · DTW · BOCPD · more
6 Analytical Lenses per dataset
0 Fabricated: contractual · audited live
// one engine, two surfaces

Two ways to call Aurora.

Same code. Same glass-box principles. Two integration shapes — one for humans clicking through findings, one for AI systems calling APIs.

🧠 Copilot
for humans

Drop a dataset.
Read what Aurora found.

A local quantitative copilot for the work that matters too much to trust to a model that hallucinates. Six analytical lenses, 24+ research-grade methods, knowledge-grounded synthesis, every "What This Means" sentence cited to a seed:* entry.

  • Glass-box Studio with Overview, Anomalies, Regimes, Motifs, Forecast, Physics
  • 24+ methods — Isolation Forest, Hampel, Granger, HMM, persistent homology, SINDy, GP, mutual info, Pearl do-calculus, VAR, DTW, BOCPD, Robust PCA, Kalman, EMD, more
  • Multi-dataset joins · composable findings · runs library (pin, A/B compare, share)
  • Custom KB ingestion — drop PDFs/MD/TXT, get a citable knowledge bank
  • Honest disclosure — sampling, timeouts, skips surfaced (never silently faked)
See it in action →
🛡️ Cortex
for AI systems

Give your AI the ability
to cite real math.

Every LLM today invents numbers. Aurora is the structurally different fix — it computes and verifies rather than predicts. Wire it into Claude Desktop, Claude Code, Cursor, or any MCP-compatible agent in 5 minutes. Or call it directly from Python.

  • Aurora MCP server — 7 tools, HTTP transport, path-allowlisted, output-capped, JSON-only
  • Aurora SDK: one pip install away from cited quantitative reasoning in any script or notebook · Jupyter HTML repr
  • Decision Contracts — 6 action types: webhook · Slack · Discord · email · log · file · SSRF-guarded
  • Streaming — file-watcher + SSE bus · Kafka + Postgres CDC connectors
  • BYO-LLM — Anthropic, OpenAI, Gemini, Ollama, OpenAI-compatible · GPU embeddings (CUDA / MPS / auto)
  • Plugin SDK — third-party methods register via aurora_plugins entry-point, isolated failures
  • Aurora Bundle Format v1 — portable, SHA-256-integrity, Ed25519-signable, signer-registry-attested .aurora.json artifacts
Read the SDK + MCP docs →
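The plugin bullet above says third-party methods register through the aurora_plugins entry-point group and that failures are isolated. A minimal sketch of what discovery with isolated failures could look like, using only the standard library. The function name load_plugins and the loaded/failed return shape are assumptions for illustration, not Aurora's actual loader.

```python
from importlib.metadata import entry_points


def load_plugins(group="aurora_plugins"):
    """Load every entry point in `group`, isolating per-plugin failures."""
    try:
        eps = entry_points(group=group)        # keyword selection, Python 3.10+
    except TypeError:
        eps = entry_points().get(group, [])    # older dict-style API
    loaded, failed = {}, {}
    for ep in eps:
        try:
            loaded[ep.name] = ep.load()        # import the registered object
        except Exception as exc:               # one broken plugin must not stop the rest
            failed[ep.name] = repr(exc)
    return loaded, failed


methods, errors = load_plugins()
print(f"loaded: {sorted(methods)}  failed: {sorted(errors)}")
```

With no plugins installed, both dictionaries come back empty. A plugin that raises on import ends up in the failed map while the others still load.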
audit_pipeline.py
import aurora_sdk as aurora

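r = aurora.run("data.csv", depth="standard")

# Keep the results instead of discarding them: critical findings from the
# "iso-forest" method, and the forecast peak over the next 24 hours.
critical = r.findings.critical().by_method("iso-forest")
peak = r.forecast.peak(horizon_hours=24)
print(critical, peak)
r.bundle.save("audit.aurora.json")   # SHA-256 + optional Ed25519 signing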

# Verify on any machine with Aurora installed
b = aurora.Bundle.load("audit.aurora.json")
b.verify()                            # raises if tampered

The same .aurora.json bundle moves between the SDK, the MCP server, Decision Contracts, and the Studio. One format. One verifiable artifact. Four programmable surfaces.

// wire it up

Plug Aurora into your stack.

You already have Aurora running. Now connect it to the agent or pipeline that calls it. Two copy-paste flows, both finished in 5 minutes.

🛡️ Aurora MCP Server
for Claude Desktop · Claude Code · Cursor · any MCP-compatible agent

Install the MCP extras, then drop the config into your agent. Claude Desktop reads claude_desktop_config.json on startup; Claude Code and Cursor expose MCP settings in their UI. Aurora's MCP server exposes 7 path-allowlisted tools.

1 · Install MCP extras
pip install -r requirements-mcp.txt
# or, equivalently:
pip install mcp cryptography
2 · Add to your agent config
{
  "mcpServers": {
    "aurora": {
      "command": "python",
      "args": ["-m", "aurora_mcp"],
      "cwd": "/absolute/path/to/aurora"
    }
  }
}

Restart your agent. Aurora's tools appear under @aurora. Try: "@aurora analyze this CSV and give me the top anomalies."

🐍 Aurora Python SDK
for notebooks · scripts · pipelines · your own AI products

One import away from cited quantitative reasoning. The SDK speaks the same .aurora.json bundle format as the MCP server and Decision Contracts — same artifact, same guarantees, just a different surface.

1 · Install the SDK
pip install -e ./aurora_sdk
# editable install — points at the cloned repo
2 · Run it
import aurora_sdk as aurora

r = aurora.run("data.csv", depth="standard")
r.findings.critical().by_method("iso-forest")
r.forecast.peak(horizon_hours=24)

# Save a portable bundle (SHA-256 integrity; Ed25519 signing is optional)
r.bundle.save("audit.aurora.json")

# Verify on any machine with Aurora installed
b = aurora.Bundle.load("audit.aurora.json")
b.verify()  # raises if tampered

Full API: SDK docs on GitHub →

Decision Contracts? Same install path — predicates live in YAML, the engine ships with Aurora. See Decision Contracts docs →

// continuous learning
Aurora learns while you sleep

Smarter every
night.

Every run feeds back into Aurora's prior library. Confirmed patterns increase confidence. Contradictions surface. Across thousands of runs, Aurora learns which signatures repeat.

Overnight learning mode connects to public data streams — FRED economic releases, NOAA climate observations, NIST reference updates, peer-reviewed literature feeds — and ingests new structured knowledge into the bank. You wake up to a smarter Aurora than you went to bed with.

Public Streams
FRED · NOAA · NIST · IPCC · arXiv · Wikidata
Cadence
Nightly · per-run · transparent changelog
Knowledge Bank Target
2–3M entries · all public-licensed
// what it actually does

Six lenses. One dataset.

Drop a CSV, Parquet, JSON, or XLSX file. Aurora analyzes it through six purpose-built lenses: overview, anomalies, regimes, motifs, forecast, and physics. Every finding cited. Every method inspectable. No black boxes.

Lens · 01

Overview grounded

Classified shape, learned-from-past-runs priors, advanced methods at a glance — SINDy governing equations, HMM latent regimes, mutual info, Granger causality, wavelet, Lomb-Scargle, persistent homology, multivariate outliers. Research-grade methods, surfaced in plain English.

// methods: 8+ · confidence: explicit · output: substantive
[Screenshot: Overview lens · classified shape · advanced methods]
Lens · 02

Anomalies critical

Multivariate outlier consensus across robust Mahalanobis distance, isolation forest, and LOF: a point is flagged only when at least 2 of the 3 detectors agree, never on one alone. Z-scores, predictive-maintenance precursors, and AR(1) forecasts of when the next breach hits. Each finding pinned to a peer-reviewed reference.
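The 2-of-3 rule is a plain vote over per-row detector flags. A minimal sketch, assuming each detector has already produced one boolean per row. The function name consensus_flags and the sample flags are made up for illustration and are not Aurora's internals.

```python
def consensus_flags(detector_flags, min_votes=2):
    """Return row indices flagged by at least `min_votes` detectors.

    detector_flags maps a detector name to a list of booleans, one per row.
    """
    lengths = {len(flags) for flags in detector_flags.values()}
    if len(lengths) != 1:
        raise ValueError("all detectors must score the same number of rows")
    n_rows = lengths.pop()
    votes = [sum(flags[i] for flags in detector_flags.values()) for i in range(n_rows)]
    return [i for i, v in enumerate(votes) if v >= min_votes]


# Hypothetical per-row flags from three detectors
flags = {
    "mahalanobis_robust": [False, True, False, True, False],
    "isolation_forest":   [False, True, True, False, False],
    "lof":                [False, False, True, True, True],
}
print(consensus_flags(flags))  # → [1, 2, 3]
```

Row 4 is flagged by LOF alone, so it is dropped. That is the false-positive defense: a single detector's opinion never becomes a finding.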

// detectors: 3 consensus · false-positive defense: built-in
[Screenshot: Anomalies lens · top findings · seed citations]
Lens · 03

Regimes latent

An HMM fitted with Baum-Welch yields decoded latent states, mean shifts, expected dwell times, and posterior probabilities. PELT change-point detection runs on top. Know when your system actually changed, not just when a metric moved.
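Expected dwell time follows directly from a fitted transition matrix: time spent in state i is geometric, so the expectation is 1 / (1 - a_ii), where a_ii is the self-transition probability. A small sketch with a made-up two-state matrix. The numbers and regime labels are illustrative, not output from an Aurora run.

```python
def expected_dwell_times(transition_matrix):
    """Expected consecutive steps spent in each state: 1 / (1 - a_ii)."""
    dwell = []
    for i, row in enumerate(transition_matrix):
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError(f"row {i} does not sum to 1")
        stay = row[i]  # probability of remaining in state i for one more step
        dwell.append(float("inf") if stay >= 1.0 else 1.0 / (1.0 - stay))
    return dwell


# Hypothetical fitted transition matrix
A = [
    [0.95, 0.05],   # "LOW" regime
    [0.10, 0.90],   # "HIGH" regime
]
print(expected_dwell_times(A))  # roughly 20 and 10 steps
```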

// hmm: baum-welch · changepoints: PELT · posterior: full
[Screenshot: Regimes lens · latent states · dwell times]
Lens · 04

Motifs topological

Persistent homology surfaces structural patterns invisible to traditional statistics. Bootstrap-validated cluster stability with silhouette scores. The shape of your data, made legible.

// vietoris-rips · betti numbers · k-means stability
[Screenshot: Motifs lens · topology · cluster structure]
Lens · 05

Forecast honest

Multiple forecasters compete on a held-out fold — AR(1), kNN-window, exponential smoothing — with calibrated CRPS scores. The winner is selected on out-of-sample performance, not in-sample fit. Threshold breach probabilities and confidence intervals reported transparently.
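The competition idea can be sketched in a few lines: hold out the last 24 samples, fit each candidate on the rest, and pick the lowest out-of-sample error. This sketch covers only AR(1) and simple exponential smoothing and scores with mean absolute error, which is what CRPS reduces to for a point forecast. The function names, the smoothing constant, and the synthetic series are assumptions, not Aurora's code.

```python
import math


def ar1_forecast(train, steps):
    """Fit x[t] = c + phi * x[t-1] by least squares, then iterate forward."""
    x_prev, x_next = train[:-1], train[1:]
    mean_prev = sum(x_prev) / len(x_prev)
    mean_next = sum(x_next) / len(x_next)
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    phi = cov / var if var else 0.0
    c = mean_next - phi * mean_prev
    out, last = [], train[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


def ses_forecast(train, steps, alpha=0.3):
    """Simple exponential smoothing: a flat forecast at the final level."""
    level = train[0]
    for x in train[1:]:
        level = alpha * x + (1 - alpha) * level
    return [level] * steps


def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pick_winner(series, holdout=24):
    """Score every candidate on the held-out tail; lowest error wins."""
    train, test = series[:-holdout], series[-holdout:]
    candidates = {"ar1": ar1_forecast, "ses": ses_forecast}
    scores = {name: mean_abs_error(test, f(train, holdout)) for name, f in candidates.items()}
    return min(scores, key=scores.get), scores


series = [50 + 10 * math.sin(t / 6) for t in range(120)]  # synthetic demo data
winner, scores = pick_winner(series)
print(winner, scores)
```

The held-out points never touch the fitting step, so the reported scores are out-of-sample by construction.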

// holdout: 24 samples · calibration: CRPS · winner: stated
[Screenshot: Forecast lens · peak prediction · alternates]
Lens · 06

Physics discovered

SINDy fits sparse governing equations to your data. Aurora cross-references the discovered ODE against known physical laws — exponential growth/decay, damped harmonic oscillator, logistic — and reports matches with RMSE and AIC. Real physics discovery. Not metaphor.
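To show what "reports matches with RMSE and AIC" means in practice, here is a small law-matching sketch. It is not SINDy and not Aurora's implementation: it fits two candidate forms (a straight line and an exponential, the latter by log-linear least squares) to synthetic decay data and ranks them by the least-squares AIC, n·ln(RSS/n) + 2k. All names and data are illustrative assumptions.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def score(ys, preds, n_params):
    """Return (RMSE, AIC) with AIC = n * ln(RSS / n) + 2k."""
    n = len(ys)
    rss = sum((y - p) ** 2 for y, p in zip(ys, preds))
    rss = max(rss, 1e-300)  # guard against log(0) on a perfect fit
    return math.sqrt(rss / n), n * math.log(rss / n) + 2 * n_params


def match_laws(ts, xs):
    """Score two candidate laws against positive-valued observations."""
    a, b = linear_fit(ts, xs)                              # x(t) = a + b t
    ln_x0, k = linear_fit(ts, [math.log(x) for x in xs])   # ln x = ln x0 + k t
    results = {
        "linear": score(xs, [a + b * t for t in ts], 2),
        "exponential": score(xs, [math.exp(ln_x0 + k * t) for t in ts], 2),
    }
    best = min(results, key=lambda name: results[name][1])  # lowest AIC wins
    return best, results


ts = [0.5 * i for i in range(20)]
# Synthetic exponential decay with a small deterministic ripple
xs = [3.0 * math.exp(-0.4 * t) * (1 + 0.01 * math.sin(7 * t)) for t in ts]
best, results = match_laws(ts, xs)
print(best, results[best])
```

Both candidates have two parameters here, so AIC ranks them by fit alone. The penalty term starts to matter once candidates differ in complexity, such as a damped oscillator against a plain exponential.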

// sindy: pareto · law match: 100% reported · candidates: 3+ forms
[Screenshot: Physics lens · governing equation · matched law]
// capabilities

Built like a quantitative brain.

Every layer engineered to produce defensible findings — not impressive-looking ones. Math first, narrative second.

📐

Deep Math, Real Physics

SINDy governing equations, HMM latent regimes, mutual info, Granger causality, wavelets, Lomb-Scargle, persistent homology. The methods quants and physicists actually use — not LLM party tricks.

📚

2–3M Knowledge Entries (target)

Curated from public, licensed sources: peer-reviewed papers, FRED metadata, NOAA, NIST, IPCC, ontologies, Wikidata. The bank is currently listed at 52K entries and grows nightly toward the 2–3M target. Every claim Aurora makes traces back to an inspectable entry.

🔍

Glass-Box By Design

Every finding has a method tag. Every method has source code. Every claim has a citation. Every confidence number is calibrated. Click any sentence to see where it came from.

💻

100% Local Execution

Your CPU. Your data. Your runs. Zero cloud dependencies. No API costs. No data egress. No subscription required to analyze. No telemetry, no phone-home, no analytics. The entire pipeline runs on your machine, offline.

⚖️

Hallucination Defense

The local LLM only rewords retrieved knowledge — it never invents. Every claim is verified against retrieved entries; ungrounded statements are flagged. The same verifier protects MCP and SDK callers — AI agents calling Aurora get the same glass-box guarantees.

🌙

Overnight Learning

Aurora connects to public data streams nightly — ingesting new peer-reviewed papers, economic releases, climate observations, reference updates. Every run also strengthens internal priors. You wake up to a smarter system.

🔌

MCP + SDK + Decision Contracts

Four programmable surfaces consuming the same .aurora.json bundle format. Wire Aurora into any LLM agent, notebook, pipeline, or automation in minutes. Path-allowlisted, output-capped, SSRF-guarded.
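"Path-allowlisted" generally means a tool refuses any file that does not resolve to a location under an approved root. A minimal sketch of that check, assuming a list of allowed root directories. The function name is_allowed and the datasets directory are hypothetical and say nothing about how Aurora's MCP server implements it.

```python
from pathlib import Path


def is_allowed(candidate, allowed_roots):
    """True if `candidate` resolves to a location inside one of the roots."""
    resolved = Path(candidate).resolve()   # collapses ".." and follows symlinks
    for root in allowed_roots:
        try:
            resolved.relative_to(Path(root).resolve())
            return True
        except ValueError:                 # not under this root, try the next one
            continue
    return False


root = Path.cwd() / "datasets"             # hypothetical allowlisted directory
print(is_allowed(root / "run1" / "data.csv", [root]))
print(is_allowed(root / ".." / "secrets.csv", [root]))
```

Resolving before comparing is the important step: a path that merely starts with the allowed prefix can still escape through ".." or a symlink.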

🔐

Signed, Verifiable Bundles

Every Aurora run produces a portable .aurora.json artifact with SHA-256 content hash and optional Ed25519 signing. Move it between machines, attach it to audits, ship it with research papers. Tampering raises on verification.
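The tamper check can be illustrated with the standard library alone. This sketch hashes a canonical JSON encoding of a payload and raises when the stored hash no longer matches. The payload/sha256 field names are hypothetical; the real Aurora Bundle Format layout and its Ed25519 signing step are not shown here.

```python
import hashlib
import json


def content_hash(payload):
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def seal(payload):
    """Wrap a payload with its hash. The field names are hypothetical."""
    return {"payload": payload, "sha256": content_hash(payload)}


def verify(bundle):
    """Raise ValueError if the payload no longer matches the recorded hash."""
    if content_hash(bundle["payload"]) != bundle["sha256"]:
        raise ValueError("bundle payload does not match its SHA-256 hash")
    return True


bundle = seal({"findings": [{"method": "iso-forest", "row": 993}]})
verify(bundle)                                  # passes on an untouched bundle
bundle["payload"]["findings"][0]["row"] = 994   # simulate tampering
try:
    verify(bundle)
except ValueError as exc:
    print("rejected:", exc)
```

A bare hash only detects accidental or naive modification, since anyone who edits the payload can also recompute the hash. That gap is what a signature over the hash closes.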

// how the intelligence works

Inside Aurora.

Real engineering, not magic. Seven subsystems make up the analytical brain — six driving the analysis, one exposing it to AI agents and pipelines. Each inspectable, each debuggable, each running locally.

subsystem · 01

Quantitative Engine

Runs every dataset through 24+ research-grade methods in parallel — SINDy, HMM Baum-Welch, mutual info, Granger, wavelet CWT, Lomb-Scargle, persistent homology, Pearl do-calculus, VAR, DTW, BOCPD, Robust PCA, Kalman, EMD, spectral entropy, robust z-score, isolation forest, LOF, mahalanobis, PELT, AR(1), kNN-window, exponential smoothing, GP. Each method outputs structured findings with explicit confidence.

methods: 24+ · runtime: deterministic · output: structured findings
subsystem · 02

Knowledge Bank

Targeting 2–3M structured entries from public, licensed sources. Peer-reviewed papers (Chandola, Pearl, Hyndman/Athanasopoulos, Brunton/Proctor/Kutz, Malthus, Hampel, Newton), reference databases (FRED, NOAA, NIST, IPCC), ontologies, and Wikidata. Every entry inspectable, version-tracked, and linkable.

target: 2–3M · sources: peer-reviewed + reference DBs · license: public
subsystem · 03

Synthesis Engine (RAG)

A local LLM (Gemma 3 12B via Ollama) takes the structured findings and retrieved knowledge entries — and writes the human-readable "what this means" narrative. Strict prompts: use only the retrieved facts. Never invent. Every sentence carries a seed-citation tag back to its source entry.

model: gemma3:12b · runtime: ollama · pattern: strict RAG
subsystem · 04

Verification Layer

After synthesis, a post-hoc verifier checks every claim in the narrative against the retrieved knowledge entries. Anything that doesn't trace back gets flagged or rewritten. This is what makes Aurora glass-box even with an LLM in the loop.
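A stripped-down version of a per-claim trace: pull the seed:* tags out of each sentence and flag any sentence that cites nothing, or cites an entry the bank does not contain. This checks tag presence only, not whether the entry actually supports the sentence. The function name, the status labels, and the sample sentences are invented for illustration.

```python
import re

SEED_TAG = re.compile(r"seed:([A-Za-z0-9_]+)")


def check_claims(sentences, knowledge_ids):
    """Label each sentence as grounded, unknown-citation, or ungrounded."""
    report = []
    for sentence in sentences:
        tags = SEED_TAG.findall(sentence)
        unknown = [t for t in tags if t not in knowledge_ids]
        if not tags:
            status = "ungrounded"          # no citation at all
        elif unknown:
            status = "unknown-citation"    # cites an entry the bank lacks
        else:
            status = "grounded"
        report.append((status, sentence))
    return report


bank = {"predictive_maintenance", "robust_zscore", "ar1_persistence"}
narrative = [                              # illustrative sentences
    "Elevated vibration appears at row 220 seed:robust_zscore.",
    "A forecast peak is anticipated seed:ar1_persistence.",
    "The bearing will fail within two days.",
]
for status, sentence in check_claims(narrative, bank):
    print(f"{status:17} {sentence}")
```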

checks: per-claim trace · flag: ungrounded statements
subsystem · 05

Spacetime Graph

Builds a system graph for every run — nodes for variables and processes, edges for discovered relationships, worldlines projected forward and backward in time. Threshold-cross events surface where the math says something is about to break. Scrub the timeline, simulate counterfactuals.

graph: per-dataset · projection: ±6h · simulation: counterfactual
subsystem · 06

Continuous Learner

Every run feeds back into Aurora's prior library. Confirmed patterns increase confidence. Contradictions surface. Overnight learning mode connects to public data streams (FRED, NOAA, NIST, arXiv, Wikidata) and ingests new structured knowledge. Aurora learns. Reproducibly. Transparently.

mode: nightly · streams: 5+ · changelog: transparent
subsystem · 07

Programmable Surfaces

MCP server (7 tools), Python SDK, Decision Contracts engine, and the Aurora Bundle Format. The same engine that drives the Studio answers LLM agents, runs in notebooks, and fires automation predicates. Four shapes, one core, one glass-box.

surfaces: 4 · clients: Claude Desktop, Claude Code, Cursor, custom · format: .aurora.json
// what grounding looks like

Every claim. Every citation.

Real output from a real Aurora run on factory bearing sensor data. Each tagged citation links to a specific entry in the knowledge bank. Click. Verify. Trust.

⚙ Knowledge-Grounded · 12 entries cited

The most significant finding is the frequent detection of anomalies related to motor temperature, evidenced by alerts at rows 993, 971, and in the last few rows seed:predictive_maintenance. These anomalies, characterized by a high z-score (+13.6 to +15.1), alongside high vibration and shifted timestamps, suggest a potential developing issue that requires attention seed:predictive_maintenance. These deviations from normal operating conditions could indicate progression towards equipment failure, necessitating intervention to prevent downtime seed:predictive_maintenance. The correlation of motor temperature with vibration and timestamp anomalies is notable, though the exact causal relationship requires further investigation seed:robust_zscore.

Several other anomalies are also present, including elevated vibration at row 220 and vibration coupled with timestamp and rpm anomalies seed:robust_zscore. A forecast peak is anticipated, potentially indicating increased load or stress on the system seed:ar1_persistence. The system is currently operating in a "HIGH" regime, where the mean has shifted to 90 seed:ar1_persistence. There's evidence of a causal effect where vibration increases are associated with changes in timestamps seed:exponential_decay.

Chandola, Banerjee, Kumar (2009)
Anomaly detection: A survey. ACM Computing Surveys.
Brunton, Proctor, Kutz (2016)
Discovering governing equations from data — SINDy. PNAS.
Hyndman, Athanasopoulos (2021)
Forecasting: Principles and Practice (3rd ed). OTexts.
// pipeline

From dataset to defensible insight.

One pass. Six lenses. Cited findings. Inspectable math. The entire pipeline runs on your machine, end-to-end.

01 · Ingest

Drop the data.

CSV, Parquet, JSON, or XLSX. Aurora detects schema, time axis, gaps, and duplicates automatically.
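Gap and duplicate detection on a time axis is simple to sketch: count repeated timestamps, take the most common step as the sampling interval, and report any larger jump as a gap. The function name scan_time_axis and the sample timestamps are assumptions for illustration, not Aurora's ingest code.

```python
from collections import Counter
from datetime import datetime, timedelta


def scan_time_axis(timestamps):
    """Report duplicate timestamps and gaps larger than the typical step."""
    counts = Counter(timestamps)
    dupes = sorted(t for t, c in counts.items() if c > 1)
    unique = sorted(counts)
    steps = [b - a for a, b in zip(unique, unique[1:])]
    if not steps:
        return {"step": None, "duplicates": dupes, "gaps": []}
    typical = Counter(steps).most_common(1)[0][0]   # modal sampling interval
    gaps = [(a, b) for a, b in zip(unique, unique[1:]) if b - a > typical]
    return {"step": typical, "duplicates": dupes, "gaps": gaps}


start = datetime(2024, 1, 1)
# Minute offsets with one repeated sample (2) and one missing stretch (3 to 7)
ts = [start + timedelta(minutes=m) for m in (0, 1, 2, 2, 3, 7, 8)]
report = scan_time_axis(ts)
print(report["step"], len(report["duplicates"]), len(report["gaps"]))
```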

02 · Analyze

Run the math.

24+ research-grade methods run in parallel — anomalies, regimes, motifs, forecast, physics, structure, causal.

03 · Ground

Cite the sources.

RAG retrieves matching entries from the knowledge bank. Local Gemma writes a grounded narrative.

04 · Verify

Show your work.

Every claim traced to a source. Findings ranked. Spacetime graph rendered. Done.

// differentiation

Not another BI dashboard.

Most quantitative AI tools fail in the same three ways — they hallucinate, they can't show their work, and they can't be called by other AI systems. Aurora fixes all three.

Capability | LLM Analytics | BI Dashboards | AutoML Tools | Aurora
Hallucination resistant | No | N/A | Mixed | Yes (RAG + verifier)
Cited claims | No | No | No | Every claim
Real physics / ODE discovery | No | No | No | SINDy
Continuous learning | No | No | No | Nightly · streams
Runs locally | No | No | Mixed | Yes
Glass-box methods | No | Partial | No | Full
MCP / AI-callable | No | No | No | 7 tools, path-allowlisted
Signed, portable artifacts | No | No | No | .aurora.json + Ed25519
Usage limits | Tokens | Seats | Compute | Unlimited
Source available | No | No | No | Apache 2.0
// built in public

Live status.

Aurora is being built in public on GitHub, in real time. These tiles update from the public API.

GitHub stars
github.com/FantasyLab-ai/aurora
Knowledge entries
52K
growing nightly · target 2–3M
v2.0 shipping
599
tests passing · public source
// roadmap

Where we're going.

Honest about timelines being aspirational. Quality compounds — we ship Now and Next before getting clever.

Now
v2.0 · shipping

The verification cortex.

  • Six analytical lenses + spacetime system graph + phase-space
  • 24+ research-grade methods incl. Pearl do-calculus
  • Multi-dataset joins · composable findings · runs library
  • Custom KB ingestion (PDF/MD/TXT → bank)
  • Bundle attestation (SHA-256 + Ed25519 + signer registry)
  • Plugin SDK (aurora_plugins entry-point)
  • Streaming connectors (file-watcher, Kafka, Postgres CDC)
  • BYO-LLM (Anthropic, OpenAI, Gemini, Ollama, OpenAI-compatible)
  • GPU embeddings (CUDA / MPS / auto · CPU fallback)
  • MCP HTTP transport · Jupyter integration · Research Kit (DOI-mintable)
  • 599 tests passing
Next
v2.1 · ~4–8 weeks

Polish + reach.

  • Mobile / tablet responsive Studio
  • KB marketplace packs hosted (Climate, Finance, Biomed, Industrial)
  • Expanded streaming sinks + connector library
  • Cloud deployment hardening (multi-tenant SLOs, audit log export)
Soon
v2.x · 3–6 months

Domain depth + enterprise.

  • Customer-hosted enterprise deployment + SSO
  • Federated knowledge contribution
  • Domain-specialist sub-agents (per KB pack)
  • Signed-bundle attestation service (registry hub)
Future
v3.0+ · 6–12 months

The platform play.

  • Plugin marketplace (third-party methods, creator revenue share)
  • Knowledge bank marketplace with curator revenue share
  • Real-time collaborative runs (multiplayer Studio)
  • Causal sandbox UI — do-calculus by drag-drop
// timelines aspirational · dependent on traction · feedback shapes priority

Aurora is live. Run it now.

Install in 60 seconds. 599 tests passing. v2.0 shipping with causal inference, plugin SDK, custom KB, and streaming connectors. Real users running it on real data.

📺 Watch the build on YouTube · 💜 Back on Patreon · 🐦 Daily progress on X