FAQ

Quick answers to recurring questions. For symptom → fix entries, see Troubleshooting instead.

Why LanceDB + DuckDB instead of just Postgres or parquet?

Single-cell access patterns are bimodal. "Recolor the UMAP by gene X" is a point lookup on a wide table (n_cells columns); doing that in Postgres or in a Parquet scan means reading every cell for every recolour. LanceDB stores the matrix gene-major (one row per gene with sparse cell indices), making each gene lookup O(1) regardless of cell count. "Top 25 DE genes for cell type T, condition A vs B" is a classic columnar aggregate — exactly what DuckDB is fastest at, with zero infrastructure overhead (one file, no server). Pairing them covers both modes without bolting on a separate cache.
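To make the gene-major idea concrete, here is a minimal pure-Python sketch of the lookup pattern described above. It is not the actual Lance schema: the `GeneRow` record, the toy `store` dict, and the `color_vector` helper are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GeneRow:
    """One row of a gene-major store: only cells with non-zero expression are kept."""
    cell_idx: List[int]
    values: List[float]


# Hypothetical toy store keyed by gene symbol; the real table layout may differ.
store: Dict[str, GeneRow] = {
    "CD3E": GeneRow(cell_idx=[0, 3], values=[2.5, 1.0]),
    "MS4A1": GeneRow(cell_idx=[1], values=[4.0]),
}


def color_vector(gene: str, n_cells: int) -> List[float]:
    """Expand one gene's sparse row into a dense per-cell vector for UMAP coloring."""
    row = store[gene]  # one keyed read; no scan over the other genes
    dense = [0.0] * n_cells
    for i, v in zip(row.cell_idx, row.values):
        dense[i] = v
    return dense


print(color_vector("CD3E", n_cells=5))  # [2.5, 0.0, 0.0, 1.0, 0.0]
```

The point of the sketch is that a recolor touches exactly one gene's row rather than every cell's record.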

Can I use STELLAR for spatial transcriptomics?

Not in v1.0. The whole cells_v view + gene-major matrix layout assumes dissociated single-cell / single-nucleus RNA-seq. Spatial adds tissue coordinates, sometimes images, and different query patterns (neighborhood vs. cell-type). A spatial extension would likely warrant its own module rather than retrofitting the core. If you have a spatial atlas you'd like to host, open an issue — contributions welcome.

Do I need an Anthropic key to use STELLAR?

No. The Anthropic key only powers the copilot module. With modules.copilot.enabled: false (the default in stellar init scaffolds) STELLAR has no AI surface at all: UMAP, expression, DE, hdWGCNA, CellChat, Milo, and Enrichment all run without any Anthropic call. Enrichment does make outbound requests to EnrichR; the others are fully offline.

Is the literature search safe / private?

The PubMed lookup hits the public NCBI E-utilities API; the queries are not authenticated and not associated with your atlas identity unless you pass an NCBI_API_KEY (which only raises your rate limit). Configure your scope in stellar.yaml to constrain what the model can search for. The actual cell-level data never leaves your DuckDB — only the keyword query (with the configured scope AND-ed onto it) is sent to NCBI.
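As a rough illustration of what leaves the machine, the sketch below builds an E-utilities esearch URL with the scope AND-ed onto the keyword query. The `build_pubmed_url` helper and the exact parenthesisation are assumptions, not STELLAR's actual code; the endpoint and the `db`, `term`, `retmode`, and `api_key` parameters are standard E-utilities parameters.

```python
from typing import Optional
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_url(keywords: str, scope: str, api_key: Optional[str] = None) -> str:
    """Build an esearch URL whose term is the keyword query AND-ed with the scope."""
    params = {
        "db": "pubmed",
        "term": f"({keywords}) AND ({scope})",
        "retmode": "json",
    }
    if api_key:  # optional; only raises the NCBI rate limit
        params["api_key"] = api_key
    return f"{ESEARCH}?{urlencode(params)}"


# Example values are made up; nothing here touches the network.
url = build_pubmed_url("TREM2 microglia", "Alzheimer disease")
print(url)
```

Only the query string shown in the URL is transmitted; no cell-level values appear in it.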

Can I bring my own analysis as a module?

Yes. Adding a module is one folder under stellar/modules/<name>/ exposing a single Module subclass that overrides the lifecycle hooks you need (ingest → parquet, FastAPI router, copilot tools, SPA tab). The worked example in Extending walks through a SCENIC regulon module end-to-end. Third-party packages can also pass their own Module instances directly to run_ingest() / create_app() without editing the built-in registry.

How big can the atlas get?

The reference deployment STELLAR was extracted from is ~3 M cells × ~25 k genes. LanceDB scales further (it's used for billion-row embedding stores in production); DuckDB handles tens of millions of rows in metadata tables on commodity hardware. The constraint in practice is the ingest box's RAM during h5ad → Lance conversion, not the serve-time stores. Plan for ~1.5× the h5ad size in disk space for the built stores.

Why not just use cellxgene?

cellxgene is great if all you need is UMAP + expression browsing on a single h5ad. STELLAR's value is the surface area beyond that: pluggable analysis modules (DE, hdWGCNA, CellChat, Milo, Enrichment), a Claude-backed copilot wired to each module's data, a deployable multi-project FastAPI surface, and per-project branding. Use cellxgene if "view one h5ad" is the whole job; use STELLAR if you want a custom deployable atlas with extra analyses bolted on.

Does the copilot send my data to Anthropic?

It sends summarised tool results, not raw data. When the model calls compare_groups(comparison_id="X"), STELLAR runs the SQL on your DuckDB locally and feeds the result (top-25 gene names, log2fc, padj, already summarised) back to the model. Cell-level vectors never leave your machine. The system prompt does include cohort vocabulary (distinct cell-type and condition labels) so the model can name what's in this atlas, but that's a small finite list of labels, not cell-level data.
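The flow can be sketched as a tool function that queries a local store and returns only a short summary. This is an illustration under stated assumptions: the `de_results` table, its columns, and the ordering by padj are hypothetical, and sqlite3 stands in for DuckDB purely to keep the example stdlib-only.

```python
import sqlite3
from typing import Dict, List

# In-memory stand-in for the local DE store (hypothetical schema).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE de_results (comparison_id TEXT, gene TEXT, log2fc REAL, padj REAL)"
)
con.executemany(
    "INSERT INTO de_results VALUES (?, ?, ?, ?)",
    [
        ("X", "GENE_A", 2.1, 1e-8),
        ("X", "GENE_B", -1.7, 3e-5),
        ("X", "GENE_C", 0.2, 0.4),
        ("Y", "GENE_D", 3.0, 1e-9),
    ],
)


def compare_groups(comparison_id: str, top_n: int = 25) -> List[Dict]:
    """Run the query locally and return only a small per-gene summary."""
    rows = con.execute(
        "SELECT gene, log2fc, padj FROM de_results "
        "WHERE comparison_id = ? ORDER BY padj ASC LIMIT ?",
        (comparison_id, top_n),
    ).fetchall()
    return [{"gene": g, "log2fc": fc, "padj": p} for g, fc, p in rows]


# This list is the only thing that would be handed back to the model.
summary = compare_groups("X", top_n=2)
print(summary)
```

The returned payload holds gene-level statistics only; per-cell expression never appears in it.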

How do I migrate from cellxgene / Cirro / a Seurat lab pipeline?

You don't migrate — you point STELLAR at whatever your lab already produces. If that's an h5ad, run stellar init and edit stellar.yaml. If it's a Seurat .rds, the Seurat recipe handles the auto-conversion on first ingest. Existing analysis outputs (DE results, hdWGCNA modules, CellChat networks, Milo neighbourhoods) load via small parquet files documented per-module — no need to recompute.

Can I serve multiple atlases from one box?

Yes. Give each atlas its own stellar.yaml (with a distinct project.base_url, e.g. /atlas_a/ and /atlas_b/) and run one stellar serve process per atlas, each on a different port. Configure nginx to proxy each base_url to the matching uvicorn process. The Deploy recipe shows the systemd unit; copy it once per project. Each process has its own StoreRegistry, so they don't share memory.

What does "core" actually do without any modules enabled?

Core gives you:

  • A UMAP rendered from your h5ad's obsm["X_umap"] (or explicit obs columns).
  • Per-cell-type coloring + roster (/api/describe).
  • Gene search and color-by-gene expression on the UMAP.
  • Per-cell-type violin plots for any gene.
  • Per-cell-type top-expressed-genes panel.

That alone is enough for a basic exploratory atlas; modules layer DE, networks, communication, abundance, enrichment, and chat on top.

How is configuration validated?

stellar.yaml is parsed and validated by Pydantic at startup. Unknown keys raise an error, so a typo like modues: fails fast with a precise pointer instead of being silently ignored. Module sub-blocks (e.g. modules.de.source_dir) are open: the core only reads enabled; everything else is forwarded to the module, which owns its own keys.
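The policy (strict at the top level, open inside each module block) can be sketched in plain Python. This is not STELLAR's Pydantic model: `ALLOWED_TOP_LEVEL`, `validate_config`, the `options` key, and the default of `enabled` to false are all assumptions made for illustration.

```python
from typing import Any, Dict

# Hypothetical set of recognised top-level keys.
ALLOWED_TOP_LEVEL = {"project", "modules"}


def validate_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Reject unknown top-level keys; read only `enabled` from each module block."""
    unknown = set(raw) - ALLOWED_TOP_LEVEL
    if unknown:
        raise ValueError(f"unknown top-level key(s): {sorted(unknown)}")
    modules = {}
    for name, block in raw.get("modules", {}).items():
        block = dict(block)  # copy so the caller's dict is not mutated
        enabled = bool(block.pop("enabled", False))
        # Everything else is passed through untouched for the module to interpret.
        modules[name] = {"enabled": enabled, "options": block}
    return {"project": raw.get("project", {}), "modules": modules}


cfg = validate_config({"modules": {"de": {"enabled": True, "source_dir": "de/"}}})
print(cfg["modules"]["de"])  # {'enabled': True, 'options': {'source_dir': 'de/'}}

try:
    validate_config({"modues": {}})
except ValueError as err:
    print(err)  # unknown top-level key(s): ['modues']
```

In Pydantic itself, the comparable behaviour is usually obtained by forbidding extra fields on the strict model and allowing them on the module sub-models.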

Can I change the Claude model used by the copilot?

Yes — set modules.copilot.model in stellar.yaml:

modules:
  copilot:
    enabled: true
    model: claude-sonnet-4-7   # default is claude-opus-4-7

This is useful for cost-sensitive deploys where Sonnet's reasoning is enough and Opus's price isn't justified.

How do I keep an atlas working without internet?

Disable the modules that need outbound HTTP and run on a closed network:

modules:
  enrichment: { enabled: false }   # EnrichR
  copilot:    { enabled: false }   # Anthropic + (optionally) PubMed

Everything else (UMAP, expression, DE, hdWGCNA, CellChat, Milo) is fully offline.

Does STELLAR run on Windows?

Not natively. STELLAR is tested on Linux (Ubuntu 22.04+) and macOS (Apple Silicon). Windows isn't a supported target: the deploy recipes assume systemd / nginx, and the Seurat path expects a Unix Rscript. WSL2 works.