Extending — write your own module

Adding a new analysis type to STELLAR is one folder with one class. The core never imports a module by name; instead every module is a sub-package under stellar/modules/<name>/ that exposes exactly one Module subclass. The orchestrator (for stellar ingest) and the app factory (for stellar serve) iterate the registry and call each enabled module's lifecycle hooks.
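As a rough illustration of that pattern (not STELLAR's actual orchestrator code), the sketch below uses a stand-in `Module` base class and a hypothetical `run_ingest_hooks` helper. It walks a list of module instances and calls `ingest` only for modules whose `config_key` is in an `enabled` set. The rule for deciding that a module is enabled is an assumption made for the example.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class Module(ABC):
    """Stand-in for the real base class; only what this sketch needs."""

    name: str = ""
    config_key: str = ""

    @abstractmethod
    def ingest(self, ctx: Any) -> None: ...


class EchoModule(Module):
    """Hypothetical module that records that its hook ran."""

    name = "echo"
    config_key = "echo"

    def ingest(self, ctx: Any) -> None:
        ctx["ran"].append(self.name)


def run_ingest_hooks(modules: list[Module], enabled: set[str], ctx: Any) -> None:
    """Call ingest() on each module whose config_key is enabled (assumed rule)."""
    for mod in modules:
        if mod.config_key in enabled:
            mod.ingest(ctx)


ctx = {"ran": []}
run_ingest_hooks([EchoModule()], enabled={"echo"}, ctx=ctx)
```

The loop never names a concrete module, which is the property that lets a new module be added without touching the core.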

This page walks you through adding a fictional SCENIC module (regulon analysis) end-to-end.

The contract

Bases: ABC

Base class for an opt-in STELLAR analysis module.

Subclasses set class-level attributes:

Attributes:

  • name (str): Slug used in URLs and config keys (e.g. "de").
  • title (str): Human-readable name shown in nav / docs.
  • extras_key (str | None): If set, the optional-deps group that gates this module (matches the key under [project.optional-dependencies] in pyproject.toml).
  • config_key (str): Key under modules in stellar.yaml.

ingest

ingest(ctx: ModuleContext) -> None

Read raw inputs (located via config_key in stellar.yaml) and write parquet under ctx.parquet_dir. Must be idempotent: re-running it on unchanged inputs must leave the same output.
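One common way to keep a write step idempotent and safe against interrupted runs is to write to a temporary file and then rename it over the destination. The helper below, `write_atomic`, is a hypothetical name used for illustration; it is not part of STELLAR, and the worked example later on this page simply overwrites its output files instead.

```python
import os
import tempfile
from pathlib import Path


def write_atomic(dest: Path, payload: bytes) -> None:
    """Write payload to dest via a temp file in the same directory, then rename."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
        # Rename over the destination; both paths share a directory,
        # so a reader never sees a half-written file.
        os.replace(tmp_name, dest)
    except BaseException:
        # Remove the partial temp file, then re-raise the original error.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Calling it twice with the same payload leaves exactly one file with the same content, which is the behaviour the contract asks for.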

duckdb_schema

duckdb_schema() -> str

Return extra SQL to run after parquet load — typically views that join module tables to cells_v. Empty string means no-op.
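To make "extra SQL" concrete, here is a hedged sketch of what a non-empty return value could look like for the SCENIC example further down this page. The view name `scenic_cell_type_counts_v` is hypothetical, and the sketch assumes that `cells_v` has a `cell_type` column and that the module's parquet is loaded as a table called `scenic_regulons`; check both against your atlas before copying it.

```python
def duckdb_schema() -> str:
    """Return SQL that defines one view joining regulons to cells_v."""
    # Assumption: cells_v exposes a cell_type column.
    return """
CREATE OR REPLACE VIEW scenic_cell_type_counts_v AS
SELECT r.regulon_id, r.tf, r.cell_type, COUNT(*) AS n_cells
FROM scenic_regulons AS r
JOIN cells_v AS c ON c.cell_type = r.cell_type
GROUP BY r.regulon_id, r.tf, r.cell_type;
""".strip()


sql = duckdb_schema()
```

Using `CREATE OR REPLACE VIEW` keeps the statement safe to run on every load.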

routes

routes() -> APIRouter | None

Return a FastAPI router or None. Mounted under /api so the router's own prefix (e.g. /de) becomes /api/de/....

claude_tools

claude_tools() -> list[dict[str, Any]]

Anthropic tool schemas exposed when the copilot module is on.

claude_dispatch

claude_dispatch(
    stores: Any = None,
) -> dict[str, Callable[..., Any]]

{tool_name: callable} matching the names in claude_tools.

Each callable is invoked with the JSON arguments Claude emitted. stores is the live stellar.core.stores.StoreRegistry; close over it to read the DuckDB / Lance stores backing this atlas. Leaving stores at its default of None is fine for modules that don't need atlas state (e.g. Enrichment, which only calls EnrichR).

Example:

def claude_dispatch(self, stores):
    duck = stores.duck
    return {
        "list_things": lambda: duck.query("SELECT …").to_pylist(),
    }
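Because the tool schemas and the dispatch table are declared in two separate methods, they can drift apart. The sketch below shows one way to check that every declared tool has a dispatcher, and how a call could be routed. `missing_dispatch` and `call_tool` are hypothetical helpers, and passing the JSON arguments as keyword arguments is an assumption about how the copilot invokes the callables.

```python
from __future__ import annotations

from typing import Any, Callable


def missing_dispatch(
    tools: list[dict[str, Any]],
    dispatch: dict[str, Callable[..., Any]],
) -> set[str]:
    """Return tool names that are declared but have no dispatcher."""
    declared = {tool["name"] for tool in tools}
    return declared - set(dispatch)


def call_tool(
    dispatch: dict[str, Callable[..., Any]],
    name: str,
    arguments: dict[str, Any],
) -> Any:
    """Look up a tool by name and call it with its JSON arguments as kwargs."""
    return dispatch[name](**arguments)


tools = [{"name": "list_things", "description": "List things.",
          "input_schema": {"type": "object", "properties": {}}}]
dispatch = {"list_things": lambda: []}
```

A check like `missing_dispatch(mod.claude_tools(), mod.claude_dispatch(stores))` fits naturally in a module's test file.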

claude_system_prompt

claude_system_prompt() -> str | None

Optional prompt fragment appended to the copilot's system prompt. Keep it concise (3–10 lines) and module-scoped.

frontend_tabs

frontend_tabs() -> list[TabDef]

Tabs to surface in the SPA nav. The frontend reads /api/config and renders only the tabs declared here for enabled modules.

Module context

Read-only build-time context passed to Module.ingest.

Attributes:

  • config (StellarConfig): Parsed stellar.yaml.
  • project_root (Path): Directory containing stellar.yaml; paths in the config are resolved against this.
  • parquet_dir (Path): Destination for module parquet output (project_root/data/parquet).
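The shape of this context can be sketched as a frozen dataclass. This is only an illustration: `ModuleContextSketch` and its `resolve` helper are hypothetical, `config` is typed as `Any` in place of `StellarConfig`, and `parquet_dir` is derived here as a property, whereas the real class documents it as a plain attribute.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class ModuleContextSketch:
    config: Any          # stands in for StellarConfig
    project_root: Path   # directory containing stellar.yaml

    @property
    def parquet_dir(self) -> Path:
        """Module parquet output lives under project_root/data/parquet."""
        return self.project_root / "data" / "parquet"

    def resolve(self, configured: str) -> Path:
        """Resolve a path from stellar.yaml against project_root."""
        return (self.project_root / configured).resolve()


ctx = ModuleContextSketch(config={}, project_root=Path("/tmp/atlas"))
```

`frozen=True` makes attribute assignment raise, which is one way to model "read-only".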

Tab definition

Bases: TypedDict

One entry in the SPA nav. Returned by Module.frontend_tabs.

Fields

  • path (str): URL path under the SPA base, e.g. /de/conditions.
  • label (str): Display text in the nav bar.
  • icon (str): Optional emoji / glyph rendered before the label.
  • order (int): Sort key; lower numbers appear first. Core tabs sit at 0..9.
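The effect of `order` can be sketched with a local `TabDef` stand-in and a hypothetical `nav_order` helper. The tabs other than `/scenic` are invented for the example, and breaking ties by label is an assumption; the frontend may order equal keys differently.

```python
from __future__ import annotations

from typing import TypedDict


class TabDef(TypedDict):
    """Local stand-in mirroring the four documented fields."""

    path: str
    label: str
    icon: str
    order: int


def nav_order(tabs: list[TabDef]) -> list[TabDef]:
    """Sort by order, then by label so ties resolve deterministically."""
    return sorted(tabs, key=lambda t: (t["order"], t["label"]))


tabs = [
    TabDef(path="/scenic", label="Regulons", icon="🧬", order=50),
    TabDef(path="/", label="Overview", icon="", order=0),          # made-up core tab
    TabDef(path="/de/conditions", label="DE", icon="", order=20),  # made-up order value
]
labels = [t["label"] for t in nav_order(tabs)]
```

Picking an `order` well above 9 keeps a module tab after the core tabs.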


Worked example — a SCENIC module

SCENIC infers transcription-factor regulons from single-cell expression data. We'll add a scenic module that:

  • Reads a parquet of regulons + their target genes from modules.scenic.source_dir,
  • Serves GET /api/scenic/regulons and GET /api/scenic/regulon/{id},
  • Adds a Regulons tab to the SPA nav,
  • Surfaces two Claude tools when copilot is on.

1. Directory layout

stellar/modules/scenic/
├── __init__.py            # exposes SCENICModule
├── ingest.py              # parquet → parquet (validate + copy)
├── routes.py              # /scenic/* FastAPI router
└── README.md              # format spec + producing-the-input recipe

2. __init__.py — the Module subclass

"""SCENIC module — regulon viewer."""
from __future__ import annotations

from typing import TYPE_CHECKING, Any

from ...module_api import Module, ModuleContext, TabDef

if TYPE_CHECKING:
    from fastapi import APIRouter


class SCENICModule(Module):
    name        = "scenic"
    title       = "Regulons (SCENIC)"
    extras_key  = "scenic"
    config_key  = "scenic"

    # ------- 1. ingest -----------------------------------------------------
    def ingest(self, ctx: ModuleContext) -> None:
        cfg = ctx.config.modules.get(self.config_key)
        if cfg is None:
            return
        source_dir = getattr(cfg, "source_dir", None)
        if source_dir is None:
            raise ValueError(
                "modules.scenic.source_dir is required; see "
                "stellar/modules/scenic/README.md"
            )
        from .ingest import ingest_scenic

        ingest_scenic(
            source_dir=(ctx.project_root / source_dir).resolve(),
            parquet_dir=ctx.parquet_dir,
        )

    # ------- 2. duckdb_schema ---------------------------------------------
    def duckdb_schema(self) -> str:
        # Optional: a view that joins regulons → cell types via cells_v.
        return ""

    # ------- 3. routes ----------------------------------------------------
    def routes(self) -> APIRouter | None:
        from .routes import router

        return router

    # ------- 4 + 5. copilot tools + dispatch -------------------------------
    def claude_tools(self) -> list[dict[str, Any]]:
        return [
            {
                "name": "list_regulons",
                "description": "List SCENIC regulons, optionally filtered by cell type.",
                "input_schema": {
                    "type": "object",
                    "properties": {"cell_type": {"type": "string"}},
                    "required": [],
                },
            },
            {
                "name": "get_regulon",
                "description": "Return a regulon's metadata + top target genes.",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "regulon_id": {"type": "string"},
                        "top_n":      {"type": "integer", "default": 25},
                    },
                    "required": ["regulon_id"],
                },
            },
        ]

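    def claude_dispatch(self, stores: Any = None):
        # stores is accepted to match the base signature; these dispatchers
        # open their own DuckDB connection and do not need it.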
        from .routes import dispatch_list_regulons, dispatch_get_regulon

        return {
            "list_regulons": dispatch_list_regulons,
            "get_regulon":   dispatch_get_regulon,
        }

    # ------- 6. system prompt fragment ------------------------------------
    def claude_system_prompt(self) -> str | None:
        return (
            "Regulon IDs from SCENIC look like `TF(+)` (e.g. `STAT1(+)`). "
            "Use `list_regulons` to discover them; never guess."
        )

    # ------- 7. frontend tab ----------------------------------------------
    def frontend_tabs(self) -> list[TabDef]:
        return [TabDef(path="/scenic", label="Regulons", icon="🧬", order=50)]


__all__ = ["SCENICModule"]

3. ingest.py — validate + copy parquet

"""SCENIC ingest — strict-validate input parquet, copy into the project store."""
from __future__ import annotations

from pathlib import Path

import pyarrow.parquet as pq

_REQUIRED_COLS = {
    "scenic_regulons.parquet":     {"regulon_id", "tf", "cell_type", "size"},
    "scenic_regulon_genes.parquet": {"regulon_id", "gene", "weight"},
}


def ingest_scenic(*, source_dir: Path, parquet_dir: Path) -> None:
    """Validate the two SCENIC parquet files and copy them under
    ``parquet_dir/scenic/``. Idempotent."""
    out = parquet_dir / "scenic"
    out.mkdir(parents=True, exist_ok=True)

    for fname, required in _REQUIRED_COLS.items():
        src = source_dir / fname
        if not src.exists():
            raise FileNotFoundError(f"SCENIC: missing {src}")
        table = pq.read_table(src)
        missing = required - set(table.schema.names)
        if missing:
            raise ValueError(f"{fname}: missing required columns: {sorted(missing)}")
        # Copy verbatim — DuckDB reads parquet directly.
        pq.write_table(table, out / fname)

4. routes.py — FastAPI router + dispatchers

"""SCENIC routes — /api/scenic/*."""
from __future__ import annotations

from fastapi import APIRouter, HTTPException

from ...backend.util import get_duckdb_conn

router = APIRouter(prefix="/scenic", tags=["scenic"])


@router.get("/regulons")
def list_regulons(cell_type: str | None = None):
    sql = "SELECT regulon_id, tf, cell_type, size FROM scenic_regulons"
    params: list = []
    if cell_type:
        sql += " WHERE cell_type = ?"
        params.append(cell_type)
    sql += " ORDER BY size DESC"
    with get_duckdb_conn() as con:
        rows = con.execute(sql, params).fetch_arrow_table().to_pylist()
    return {"regulons": rows}


@router.get("/regulon/{regulon_id}")
def get_regulon(regulon_id: str, top_n: int = 25):
    with get_duckdb_conn() as con:
        meta = con.execute(
            "SELECT * FROM scenic_regulons WHERE regulon_id = ?", [regulon_id]
        ).fetch_arrow_table().to_pylist()
        if not meta:
            raise HTTPException(status_code=404, detail=f"regulon {regulon_id} not found")
        genes = con.execute(
            "SELECT gene, weight FROM scenic_regulon_genes "
            "WHERE regulon_id = ? ORDER BY weight DESC LIMIT ?",
            [regulon_id, top_n],
        ).fetch_arrow_table().to_pylist()
    return {"regulon": meta[0], "genes": genes}


# --- Claude dispatchers — thin wrappers so the copilot can call them ---

def dispatch_list_regulons(cell_type: str | None = None):
    return list_regulons(cell_type=cell_type)


def dispatch_get_regulon(regulon_id: str, top_n: int = 25):
    return get_regulon(regulon_id=regulon_id, top_n=top_n)
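The query-building pattern used in `list_regulons` can be exercised without a running atlas. The sketch below uses the standard library's `sqlite3` as a stand-in for DuckDB (both accept `?` placeholders); `build_regulon_query` is a hypothetical helper, and the two rows are made-up sample data.

```python
from __future__ import annotations

import sqlite3


def build_regulon_query(cell_type: str | None) -> tuple[str, list[str]]:
    """Build the SQL text and its bound parameters separately."""
    sql = "SELECT regulon_id, tf, cell_type, size FROM scenic_regulons"
    params: list[str] = []
    if cell_type:
        sql += " WHERE cell_type = ?"   # value is bound, never interpolated
        params.append(cell_type)
    sql += " ORDER BY size DESC"
    return sql, params


con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE scenic_regulons "
    "(regulon_id TEXT, tf TEXT, cell_type TEXT, size INTEGER)"
)
con.executemany(
    "INSERT INTO scenic_regulons VALUES (?, ?, ?, ?)",
    [("STAT1(+)", "STAT1", "T cell", 120), ("IRF8(+)", "IRF8", "B cell", 80)],
)

# A hostile value is treated as a literal string, so it matches no rows.
sql, params = build_regulon_query("T cell' OR '1'='1")
rows = con.execute(sql, params).fetchall()
```

Keeping the SQL text and the parameter list apart is what makes the route safe against injection through `cell_type`.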

5. Register the module

Add it to stellar/modules/registry.py:

def builtin_modules() -> list[Module]:
    from .cellchat import CellChatModule
    from .copilot import CopilotModule
    from .de import DEModule
    from .enrichment import EnrichmentModule
    from .hdwgcna import HDWGCNAModule
    from .milo import MiloModule
    from .scenic import SCENICModule          # 1. import

    return [
        DEModule(),
        HDWGCNAModule(),
        CellChatModule(),
        MiloModule(),
        EnrichmentModule(),
        SCENICModule(),                       # 2. append (order = nav order hint)
        CopilotModule(),                      # copilot last so it sees other tools
    ]

Third-party modules

You don't have to edit registry.py. Any caller of stellar.core.orchestrate.run_ingest or stellar.backend.app.create_app can pass a custom modules=[...] list. For example, your own package can expose a SCENICModule and ship a small wrapper CLI that passes it in. The built-in registry is a convenience, not a chokepoint.
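When building such a custom list, it helps to preserve the "copilot last" ordering used by the built-in registry. The sketch below is one way to do that; `with_extra_modules` is a hypothetical helper, and it assumes the copilot module's `name` attribute is `"copilot"`.

```python
from __future__ import annotations

from typing import Any


def with_extra_modules(
    builtin: list[Any],
    extra: list[Any],
    last_name: str = "copilot",
) -> list[Any]:
    """Insert extra modules before the module named last_name, if present."""
    head = [m for m in builtin if getattr(m, "name", None) != last_name]
    tail = [m for m in builtin if getattr(m, "name", None) == last_name]
    return head + list(extra) + tail
```

A wrapper could then call `with_extra_modules(builtin_modules(), [SCENICModule()])` and pass the result as `modules=` to both `run_ingest` and `create_app`.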

6. Add the extra to pyproject.toml

[project.optional-dependencies]
scenic = ["pyscenic>=0.12"]   # or [] if you only consume their parquet output
full   = ["stellar-atlas[de,hdwgcna,cellchat,milo,enrichment,copilot,scenic]"]
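If a module does import a package from its extra, it can fail with a readable message when the extra is missing. The sketch below shows one way to do that with the standard library; `require_extra` is a hypothetical helper, and how STELLAR itself uses `extras_key` to gate modules is not shown on this page.

```python
from importlib.util import find_spec


def require_extra(package: str, extra: str, dist: str = "stellar-atlas") -> None:
    """Raise a readable ImportError if a top-level package is not importable."""
    if find_spec(package) is None:
        raise ImportError(
            f"{package!r} is not installed; install it with "
            f"`pip install '{dist}[{extra}]'`"
        )
```

For the SCENIC example this would be `require_extra("pyscenic", "scenic")`, called at the top of whichever function first needs the package.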

7. Write a test

Mirror tests/test_de_module.py — build a tiny synthetic atlas, drop the SCENIC parquet files in, run ingest, hit the routes via TestClient. The DE test is the canonical template:

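from pathlib import Path

import pytest
from fastapi.testclient import TestClient

from stellar.backend.app import create_app
from stellar.core.orchestrate import run_ingest
from stellar.modules.registry import builtin_modules
# load_config: import it from wherever tests/test_de_module.py does.


@pytest.fixture(scope="module")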
def project(tmp_path_factory) -> Path:
    root = tmp_path_factory.mktemp("scenic_atlas")
    # ... build a minimal h5ad + scenic_regulons.parquet + scenic_regulon_genes.parquet ...
    config = load_config(root / "stellar.yaml")
    run_ingest(config, project_root=root, modules=builtin_modules(), verbose=False)
    return root


@pytest.fixture(scope="module")
def client(project) -> TestClient:
    config = load_config(project / "stellar.yaml")
    return TestClient(create_app(config, project_root=project, modules=builtin_modules()))


def test_regulons_endpoint(client: TestClient) -> None:
    r = client.get("/api/scenic/regulons")
    assert r.status_code == 200
    assert {"regulons"} <= set(r.json().keys())

8. Document it

Write stellar/modules/scenic/README.md matching the existing module READMEs:

  • Enable snippet (stellar.yaml)
  • Input format (one section per parquet file, with column tables)
  • Producing the input (one recipe per common source)
  • API surface (route table)
  • Copilot tools
  • Frontend tab

Then drop a one-page summary at docs/modules/scenic.md and add it to the nav: block in mkdocs.yml.


Checklist for a new module

  • One folder under stellar/modules/<name>/
  • One Module subclass in __init__.py with name + config_key + extras_key
  • ingest.py is strict-validated and idempotent
  • routes.py uses parameterised DuckDB queries (no string-interp SQL)
  • Module appears in stellar/modules/registry.py
  • Module README at stellar/modules/<name>/README.md
  • Tests under tests/test_<name>_module.py
  • pyproject.toml extras updated
  • Docs page at docs/modules/<name>.md, added to mkdocs.yml