FAQ
Quick answers to recurring questions. For symptom → fix entries, see Troubleshooting instead.
Why LanceDB + DuckDB instead of just Postgres or Parquet?
Single-cell access patterns are bimodal. "Recolor the UMAP by gene
X" is a point lookup on a wide table (n_cells columns); doing that
in Postgres or in a Parquet scan means reading every cell for every
recolour. LanceDB stores the matrix gene-major (one row per gene with
sparse cell indices), making each gene lookup O(1) regardless of cell
count. "Top 25 DE genes for cell type T, condition A vs B" is a
classic columnar aggregate — exactly what DuckDB is fastest at, with
zero infrastructure overhead (one file, no server). Pairing them
covers both modes without bolting on a separate cache.
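The gene-major idea can be illustrated without either store. The following is a minimal pure-Python sketch, not STELLAR's or LanceDB's actual storage code: the function names and toy gene names are hypothetical, and the real matrix lives in Lance files rather than dictionaries. It shows why a recolor touches only one gene's row instead of every cell.

```python
from collections import defaultdict


def to_gene_major(cell_rows):
    """Pivot cell-major sparse rows into one sparse row per gene.

    cell_rows[i] is a dict {gene: value} holding the non-zero entries of cell i.
    Returns {gene: (cell_indices, values)}.
    """
    by_gene = defaultdict(lambda: ([], []))
    for cell_idx, row in enumerate(cell_rows):
        for gene, value in row.items():
            indices, values = by_gene[gene]
            indices.append(cell_idx)
            values.append(value)
    return dict(by_gene)


def recolor_by_gene(gene_major, gene, n_cells):
    """Build the dense per-cell vector for one gene from a single row read."""
    dense = [0.0] * n_cells
    indices, values = gene_major.get(gene, ([], []))
    for cell_idx, value in zip(indices, values):
        dense[cell_idx] = value
    return dense


# Toy data: 4 cells, illustrative gene names only.
cells = [
    {"GFAP": 5.0, "SNAP25": 0.5},
    {"SNAP25": 7.0},
    {"GFAP": 2.0},
    {},
]
gene_major = to_gene_major(cells)
umap_colors = recolor_by_gene(gene_major, "GFAP", n_cells=len(cells))
```

The pivot is paid once at ingest; after that, each recolor reads only the sparse entries of the requested gene.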
Can I use STELLAR for spatial transcriptomics?
Not in v1.0. The whole cells_v view + gene-major matrix layout
assumes dissociated single-cell / single-nucleus RNA-seq. Spatial
adds tissue coordinates, sometimes images, and different query
patterns (neighborhood vs. cell-type). A spatial extension would
likely warrant its own module rather than retrofitting the core. If
you have a spatial atlas you'd like to host, open an issue —
contributions welcome.
Do I need an Anthropic key to use STELLAR?
No. The Anthropic key only powers the copilot module. With
modules.copilot.enabled: false (the default in
stellar init scaffolds) STELLAR has no AI surface at all. UMAP,
expression, DE, hdWGCNA, CellChat, and Milo run fully offline with
no external API calls. Enrichment is the one exception: it queries
EnrichR, which is unrelated to the Anthropic key.
Is the literature search safe / private?
The PubMed lookup hits the public NCBI E-utilities API; the
queries are not authenticated and not associated with your atlas
identity unless you pass an NCBI_API_KEY (which only raises your
rate limit). Configure your scope in stellar.yaml to constrain
what the model can search for. The actual cell-level data never
leaves your DuckDB — only the keyword query (with the
configured scope AND-ed onto it) is sent to NCBI.
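To make concrete what leaves the machine, here is a rough sketch of how such a request could be assembled. The ESearch endpoint and its db, term, retmode, retmax, and api_key parameters are part of the public NCBI E-utilities API; the function name, the parenthesised AND composition, and the example scope string are assumptions for illustration, not STELLAR's actual code.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(keywords, scope, api_key=None, retmax=20):
    """Return the ESearch URL for a keyword query with the scope AND-ed on.

    Only the keyword string and the configured scope appear in the URL;
    no cell-level data is involved. (Hypothetical helper, for illustration.)
    """
    term = f"({keywords}) AND ({scope})" if scope else keywords
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    if api_key:
        # Optional: an API key only raises the rate limit.
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_search_url("TREM2 microglia", "Alzheimer disease")
```

Nothing in this function touches the network, so printing the returned URL is a simple way to inspect exactly what a query would disclose.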
Can I bring my own analysis as a module?
Yes. Adding a module is one folder under stellar/modules/<name>/
exposing a single Module subclass that overrides the lifecycle
hooks you need (ingest → parquet, FastAPI router, copilot tools,
SPA tab). The worked example in Extending walks
through a SCENIC regulon module end-to-end. Third-party packages
can also pass their own Module instances directly to
run_ingest() / create_app() without editing the built-in
registry.
How big can the atlas get?
The reference deployment STELLAR was extracted from is ~3 M cells × ~25 k genes. LanceDB scales further (it's used for billion-row embedding stores in production); DuckDB handles tens of millions of rows in metadata tables on commodity hardware. The constraint in practice is the ingest box's RAM during h5ad → Lance conversion, not the serve-time stores. Plan for ~1.5× the h5ad size in disk space for the built stores.
Why not just use cellxgene?
cellxgene is great if all you need is UMAP + expression browsing on a single h5ad. STELLAR's value is the surface area beyond that: pluggable analysis modules (DE, hdWGCNA, CellChat, Milo, Enrichment), a Claude-backed copilot wired to each module's data, a deployable multi-project FastAPI surface, and per-project branding. Use cellxgene if "view one h5ad" is the whole job; use STELLAR if you want a custom deployable atlas with extra analyses bolted on.
Does the copilot send my data to Anthropic?
It sends summarised tool results, not raw data. When the model calls
compare_groups(comparison_id="X") STELLAR runs the SQL on your
DuckDB locally and feeds the result (top-25 gene names, log2fc,
padj — already summarised) back to the model. Cell-level vectors
never leave your machine. The system prompt does include cohort
vocabulary (distinct cell-type and condition labels) so the
model can name what's in this atlas, but that's a small finite
list, not data.
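The shape of that round trip can be sketched as follows. This is an illustrative stand-in, not the real compare_groups implementation: the function name, the ranking by absolute log2fc, and the 0.05 cutoff are assumptions, and the rows would come from a local DuckDB query rather than a Python list.

```python
def summarise_de(rows, top_n=25, padj_cutoff=0.05):
    """Reduce a locally computed DE table to the compact payload a tool returns.

    Only gene name, log2fc, and padj survive; any other per-row fields
    (for example cell identifiers) are dropped before anything is sent.
    """
    significant = [r for r in rows if r["padj"] < padj_cutoff]
    significant.sort(key=lambda r: abs(r["log2fc"]), reverse=True)
    return [
        {"gene": r["gene"], "log2fc": round(r["log2fc"], 3), "padj": r["padj"]}
        for r in significant[:top_n]
    ]


# Toy local result with an extra field that must never leave the machine.
local_rows = [
    {"gene": "G1", "log2fc": 2.5, "padj": 1e-8, "cell_ids": [0, 3, 9]},
    {"gene": "G2", "log2fc": -3.1, "padj": 1e-4, "cell_ids": [1, 2]},
    {"gene": "G3", "log2fc": 0.2, "padj": 0.6, "cell_ids": [4]},
]
payload = summarise_de(local_rows)
```

The payload is bounded by top_n regardless of how many cells the atlas holds, which is what keeps the model's view to a summary.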
How do I migrate from cellxgene / Cirro / a Seurat lab pipeline?
You don't migrate — you point STELLAR at whatever your lab already
produces. If that's an h5ad, run stellar init and edit
stellar.yaml. If it's a Seurat .rds, the
Seurat recipe handles the auto-conversion
on first ingest. Existing analysis outputs (DE results, hdWGCNA
modules, CellChat networks, Milo neighbourhoods) load via small
parquet files documented per-module — no need to recompute.
Can I serve multiple atlases from one box?
Yes — give each atlas its own stellar.yaml (with distinct
project.base_url, e.g. /atlas_a/ and /atlas_b/) and run one
stellar serve process per atlas on a different port. nginx
proxies each base_url to the matching uvicorn. The
Deploy recipe shows the systemd unit; copy it once
per project. Each process has its own StoreRegistry, so they
don't share memory.
What does "core" actually do without any modules enabled?
Core gives you:
- A UMAP rendered from your h5ad's obsm["X_umap"] (or explicit obs columns).
- Per-cell-type coloring + roster (/api/describe).
- Gene search and color-by-gene expression on the UMAP.
- Per-cell-type violin plots for any gene.
- Per-cell-type top-expressed-genes panel.
That alone is enough for a basic exploratory atlas; modules layer DE, networks, communication, abundance, enrichment, and chat on top.
How is configuration validated?
stellar.yaml is parsed and validated by Pydantic at startup. Unknown
keys raise an error, so a typo like modues: fails fast with a precise
pointer instead of being silently ignored. Module sub-blocks (e.g.
modules.de.source_dir) are open: the core only reads enabled;
everything else is forwarded to the module, which owns its own keys.
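The strict-core, open-module split can be sketched in plain Python. This is a dependency-free illustration of the rule, not STELLAR's Pydantic models: the set of known top-level keys, the ConfigError class, and the default of enabled to false are assumptions. In Pydantic itself the same behaviour is usually expressed with extra="forbid" on the top-level model and extra="allow" on the module blocks.

```python
# Assumed top-level keys; the real schema may define more.
KNOWN_TOP_LEVEL = {"project", "modules"}


class ConfigError(ValueError):
    """Raised when the configuration does not match the expected shape."""


def validate_config(raw):
    """Reject unknown top-level keys; pass module options through untouched."""
    unknown = sorted(set(raw) - KNOWN_TOP_LEVEL)
    if unknown:
        raise ConfigError(f"unknown top-level key(s): {', '.join(unknown)}")

    modules = {}
    for name, block in (raw.get("modules") or {}).items():
        options = dict(block or {})
        # The core reads only `enabled`; defaulting to False is an assumption.
        enabled = options.pop("enabled", False)
        if not isinstance(enabled, bool):
            raise ConfigError(f"modules.{name}.enabled must be true or false")
        # Everything else is forwarded to the module, which owns its own keys.
        modules[name] = {"enabled": enabled, "options": options}
    return {"project": raw.get("project", {}), "modules": modules}


config = validate_config(
    {
        "project": {"base_url": "/atlas_a/"},
        "modules": {"de": {"enabled": True, "source_dir": "de/"}},
    }
)
```

Passing {"modues": {}} to validate_config raises ConfigError naming the offending key, while source_dir above is accepted without the core knowing anything about it.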
Can I change the Claude model used by the copilot?
Yes. Set modules.copilot.model in stellar.yaml to the identifier of the model you want.
This is useful for cost-sensitive deploys where Sonnet's reasoning is enough and Opus's price isn't justified.
How do I keep an atlas working without internet?
Disable the modules that need outbound HTTP and run on a closed network:
modules:
  enrichment: { enabled: false }  # EnrichR
  copilot: { enabled: false }     # Anthropic + (optionally) PubMed
Everything else (UMAP, expression, DE, hdWGCNA, CellChat, Milo) is fully offline.
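Before moving a deployment onto a closed network, a quick pre-flight check over the parsed configuration can confirm that nothing still expects outbound HTTP. The sketch below is hypothetical and not part of STELLAR: the helper name is invented, the module-to-service mapping simply restates the two modules named above, and treating a missing enabled key as disabled is an assumption.

```python
# Modules that need outbound HTTP, and the services they contact.
NETWORK_MODULES = {
    "enrichment": "EnrichR",
    "copilot": "Anthropic (+ optionally PubMed)",
}


def outbound_dependencies(config):
    """Return {module: service} for enabled modules that need the network.

    `config` is the parsed stellar.yaml as a plain dict. A module with no
    `enabled` key is treated as disabled here (an assumption).
    """
    modules = config.get("modules") or {}
    return {
        name: service
        for name, service in NETWORK_MODULES.items()
        if (modules.get(name) or {}).get("enabled", False)
    }


offline_ready = outbound_dependencies(
    {"modules": {"enrichment": {"enabled": False}, "copilot": {"enabled": False}}}
)
still_online = outbound_dependencies(
    {"modules": {"enrichment": {"enabled": True}, "de": {"enabled": True}}}
)
```

An empty result means the configuration matches the offline setup described above; a non-empty one lists what still has to be disabled.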
Does STELLAR run on Windows?
Tested on Linux (Ubuntu 22.04+) and macOS (Apple Silicon). Windows
isn't a supported target — the deploy recipes assume systemd / nginx
and the Seurat path expects a Unix Rscript. WSL2 works.