Architecture¶
STELLAR is three layers: a fixed-convention storage layer, a FastAPI
app factory that mounts modules dynamically, and a single compiled
React SPA whose tabs are gated at runtime by /api/config. Each layer
is deliberately small.
```mermaid
flowchart TB
    subgraph "User project"
        Y["stellar.yaml"]
        H["data/raw/*.h5ad"]
        E["data/external/{de,hdwgcna,...}"]
    end
    Y -->|stellar ingest| ING
    H --> ING
    E --> ING
    subgraph "Storage layer (fixed convention)"
        ING[ingest orchestrator]
        L["data/lance/expression_*.lance<br/>(gene-major)"]
        D["data/duckdb/atlas.duckdb<br/>(metadata + module tables)"]
        S["data/static/coords_*.arrow<br/>(pre-baked UMAP)"]
        P["data/parquet/<module>/*.parquet"]
        ING --> L
        ING --> D
        ING --> S
        ING --> P
        P --> D
    end
    L -.read.-> API
    D -.read.-> API
    S -.read.-> API
    subgraph "Serving layer"
        API["FastAPI app<br/>core routes + module routers"]
        SPA["React SPA<br/>(pre-built bundle, tabs gated by /api/config)"]
        API --> SPA
    end
    SPA -->|stellar deploy| WEB[Static webserver / nginx]
```
Storage layer — LanceDB + DuckDB¶
Every project gets the same two stores; no project may bypass the convention.
LanceDB — the cell × gene matrix in gene-major layout¶
The cell × gene matrix lives in LanceDB as
one row per gene with sparse (cell_idx, value) columns. The
critical access pattern in a single-cell atlas is "colour the UMAP by
gene X". Against a gene-major store that is a single-row read instead
of a column slice through millions of cells. The same Lance files hold
IVF-PQ vector indexes for cell- and gene-similarity k-NN queries.
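To make the gene-major idea concrete, here is a minimal sketch in plain Python. It does not use the LanceDB API; the `GeneRow` class, the `densify` helper and the in-memory `store` dict are hypothetical stand-ins that only illustrate how one sparse row expands into a per-cell colour vector.

```python
from dataclasses import dataclass


@dataclass
class GeneRow:
    """One gene-major row with parallel sparse columns (hypothetical shape)."""
    gene: str
    cell_idx: list[int]   # indices of cells with non-zero expression
    value: list[float]    # expression values, aligned with cell_idx


def densify(row: GeneRow, n_cells: int) -> list[float]:
    """Expand a sparse gene row into one value per cell, zero-filled."""
    dense = [0.0] * n_cells
    for idx, val in zip(row.cell_idx, row.value):
        dense[idx] = val
    return dense


# Stand-in for the store: keyed by gene, so a lookup touches one row only.
store = {
    "GFAP": GeneRow("GFAP", cell_idx=[1, 4], value=[2.5, 0.7]),
    "AQP4": GeneRow("AQP4", cell_idx=[0], value=[1.2]),
}

colours = densify(store["GFAP"], n_cells=6)
print(colours)  # [0.0, 2.5, 0.0, 0.0, 0.7, 0.0]
```

The work done per query depends on the one row that is read, not on how many other genes the store holds.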
If your project ships a wide matrix alongside the primary matrix
(see input.matrices),
the expression routes fall back to the wide matrix when the user
queries a gene that's missing from the primary panel.
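The fallback rule can be sketched as a two-step lookup. This is an illustration under assumptions, not the actual route code: `lookup_gene` and the dict-based matrices are hypothetical, and the real expression routes may resolve matrices differently.

```python
from typing import Optional

# (cell_idx, value) sparse columns for one gene; hypothetical representation.
SparseRow = tuple[list[int], list[float]]


def lookup_gene(
    gene: str,
    primary: dict[str, SparseRow],
    wide: Optional[dict[str, SparseRow]] = None,
) -> tuple[str, SparseRow]:
    """Return (matrix_name, row), preferring the primary matrix."""
    if gene in primary:
        return "primary", primary[gene]
    # Fall back only when the project actually ships a wide matrix.
    if wide is not None and gene in wide:
        return "wide", wide[gene]
    raise KeyError(f"gene {gene!r} not found in any matrix")


primary = {"GFAP": ([1, 4], [2.5, 0.7])}
wide = {"GFAP": ([1, 4], [2.4, 0.8]), "XIST": ([0, 2], [3.1, 1.0])}

hit_primary = lookup_gene("GFAP", primary, wide)
hit_wide = lookup_gene("XIST", primary, wide)
print(hit_primary[0], hit_wide[0])  # primary wide
```

Returning the matrix name alongside the row lets a caller tell the user which panel answered the query.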
DuckDB — every non-expression table¶
Everything that isn't expression — cells, donors, genes, DE
comparisons + results, hdWGCNA modules, CellChat networks, Milo
neighbourhoods, pseudobulk pre-compute — lives in a single embedded
DuckDB file: data/duckdb/atlas.duckdb.
DuckDB gives us:
- Sub-second columnar SQL aggregates against millions of rows.
- No separate database server to manage, no port to expose.
- A single file you can rsync to the deploy target alongside the Lance directories.
Module ingest writes parquet under data/parquet/<module>/; the
orchestrator then runs a CREATE OR REPLACE TABLE … AS SELECT * FROM
read_parquet(...) per module. The DuckDB file is rebuilt every
ingest (no in-place migrations).
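A rough sketch of the rebuild step follows. It only generates the SQL text, so it runs without DuckDB installed. The table-naming rule (`<module>_<file stem>`) and the `rebuild_statements` helper are assumptions made for illustration; the source does not specify how tables are named.

```python
import tempfile
from pathlib import Path


def rebuild_statements(parquet_root: Path) -> list[str]:
    """Build one CREATE OR REPLACE TABLE statement per parquet file."""
    statements = []
    for path in sorted(parquet_root.glob("*/*.parquet")):
        # Assumption: table name is "<module>_<file stem>" (hypothetical rule).
        table = f"{path.parent.name}_{path.stem}"
        statements.append(
            f'CREATE OR REPLACE TABLE "{table}" AS '
            f"SELECT * FROM read_parquet('{path.as_posix()}')"
        )
    return statements


# Demo against a throwaway directory that mimics data/parquet/<module>/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "de").mkdir()
    (root / "de" / "results.parquet").touch()
    statements = rebuild_statements(root)

print(statements[0])
```

With the `duckdb` Python package, each statement could then be passed to `duckdb.connect("data/duckdb/atlas.duckdb").execute(...)`. Because every table is recreated from parquet, no migration logic is needed.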
Static coords — pre-baked Arrow¶
data/static/coords_*.arrow holds the UMAP projection as Apache Arrow
IPC, one file per matrix. The frontend pulls the file once at boot —
no round-trip per zoom or pan.
Serving layer — FastAPI app factory + module mounting¶
stellar.backend.app.create_app(config, project_root, modules) is a
factory that returns a fresh FastAPI instance:
```python
app = FastAPI(title=config.project.title, openapi_url=None)

# Always-on core routes (mounted under /api):
#   /api/config, /api/describe, /api/genes, /api/embedding/coords,
#   /api/embedding/expression, /api/embedding/colorby, …
app.include_router(core_routes.router, prefix="/api")

# For every module enabled in stellar.yaml:
for module in enabled_modules(config):
    router = module.routes()
    if router is not None:
        app.include_router(router, prefix="/api")

# The SPA is mounted at config.project.base_url, with /assets/ for hashed
# bundle output.
_mount_frontend(app)
```
A module that isn't enabled isn't mounted — no runtime branching, no unused routes in the OpenAPI surface, no import of its dependencies.
The factory is the only place that knows the difference between core and module routes. Anything else operates on the assembled FastAPI instance.
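The enable-then-mount filter can be sketched without FastAPI. Everything below is a hypothetical stand-in: `DemoModule`, `select_enabled` and the dict-shaped config are assumptions, and the real `enabled_modules(config)` may have a different signature and config type.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DemoModule:
    """Hypothetical module stand-in: a name plus an optional router object."""
    name: str
    router: Optional[Any] = None

    def routes(self) -> Optional[Any]:
        return self.router


def select_enabled(config: dict, registry: list[DemoModule]) -> list[DemoModule]:
    """Keep only the modules whose modules.<name>.enabled flag is true."""
    modules_cfg = config.get("modules", {})
    return [m for m in registry if modules_cfg.get(m.name, {}).get("enabled", False)]


config = {"modules": {"de": {"enabled": True}, "cellchat": {"enabled": False}}}
registry = [
    DemoModule("de", router="de-router"),
    DemoModule("cellchat", router="cellchat-router"),
    DemoModule("milo", router="milo-router"),  # absent from config: treated as disabled
]

# Mirror the factory loop: mount only routers that exist for enabled modules.
mounted = [m.routes() for m in select_enabled(config, registry) if m.routes() is not None]
print(mounted)  # ['de-router']
```

Filtering once, before any router is touched, is what keeps the rest of the code free of per-module conditionals.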
Frontend — one compiled bundle, config-driven nav¶
The SPA ships pre-built in the wheel. There is no per-project
React build at deploy time — stellar build-frontend is needed only
when you want to rebuild against a different base_url or change the
SPA source.
At boot the SPA fetches GET /api/config and reads:
- project.title, branding.primary_color, branding.logo: header + theme
- modules.<name>.enabled: which tabs to render
- tabs[*]: the frontend_tabs() output from every enabled module, merged + sorted by order
Modules that aren't enabled contribute no tabs and no nav entries. Adding a new module is therefore a server-side change only — the SPA needs no rebuild as long as the module's tab definition points at an existing frontend route (or a stock module-tab component).
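The merge-and-sort step behind tabs[*] might look like the sketch below. The `order` key comes from the description above; the `id` and `label` keys, the `DemoModule` class and the `merged_tabs` helper are hypothetical and only serve the illustration.

```python
class DemoModule:
    """Hypothetical module exposing frontend_tabs() as a list of dicts."""

    def __init__(self, tabs: list[dict]):
        self._tabs = tabs

    def frontend_tabs(self) -> list[dict]:
        return list(self._tabs)


def merged_tabs(modules: list[DemoModule]) -> list[dict]:
    """Concatenate every module's tabs, then sort by their 'order' key."""
    tabs: list[dict] = []
    for module in modules:
        tabs.extend(module.frontend_tabs())
    # sorted() is stable, so tabs with equal order keep registration order.
    return sorted(tabs, key=lambda tab: tab["order"])


de = DemoModule([{"id": "de", "label": "Differential expression", "order": 20}])
wgcna = DemoModule([{"id": "hdwgcna", "label": "Co-expression", "order": 10}])

tabs = merged_tabs([de, wgcna])
print([tab["id"] for tab in tabs])  # ['hdwgcna', 'de']
```

A disabled module never reaches `merged_tabs`, so it contributes nothing to the nav.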
Lifecycle summary¶
| CLI command | What it does |
|---|---|
| stellar init <slug> | Scaffold stellar.yaml + data/raw/ skeleton |
| stellar ingest | Orchestrator runs core ingest + every enabled module's ingest() |
| stellar build-frontend | Vite build of the React SPA (only needed for non-default base_url) |
| stellar serve | Uvicorn on :18901 against the app factory |
| stellar deploy | rsync dist/ + static artefacts to the deploy target |
| stellar doctor | Validate stellar.yaml against the data directory |
The contract is: the same package code serves every project. No
per-project Python forks; no per-project frontend forks; differences
live in stellar.yaml and the contents of data/.