Enrichment — live pathway enrichment

Live pathway enrichment for arbitrary gene lists, powered by the public EnrichR REST API. The module adds an Enrichment tab to the SPA and a POST /api/enrichment route. When the copilot module is also enabled, the copilot calls that route as a Claude tool.

extras_key    enrichment
config_key    enrichment
install       pip install 'stellar-atlas[enrichment]'
frontend tab  Enrichment

Enable

modules:
  enrichment:
    enabled: true

That's it — there is no ingest step and no source data. The module makes outbound HTTP calls to EnrichR at request time, so the host running stellar serve needs internet access to maayanlab.cloud.

Libraries

EnrichR ships >200 gene-set libraries (GO, KEGG, MSigDB, Reactome, ChEA, CellMarker, …). The full list is queryable at https://maayanlab.cloud/Enrichr/#libraries. The frontend enables three libraries by default:

  • GO_Biological_Process_2023
  • KEGG_2021_Human
  • MSigDB_Hallmark_2020

You can pass any library name in the request body's libraries array to override.

API surface

When enabled:

POST /api/enrichment
content-type: application/json

{
  "genes":       ["APOE", "TREM2", "CD33"],
  "libraries":   ["GO_Biological_Process_2023", "KEGG_2021_Human"],
  "top_n":       20,
  "description": "stellar-enrichment"
}

Returns an Arrow IPC payload with the columns (library, term, pval, padj, odds_ratio, combined_score, overlap, genes_in_term), sorted by EnrichR's rank (combined-score order).
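For callers that prefer Python to curl, the request above can be assembled with the standard library alone. This is a minimal sketch, not the module's own client: the helper names build_enrichment_request and fetch_enrichment are hypothetical, the base URL is taken from the curl examples on this page, and omitting "libraries" is assumed to fall back to the defaults, as the first curl example suggests.

```python
import json
import urllib.request


def build_enrichment_request(genes, libraries=None, top_n=20,
                             base_url="http://localhost:18901"):
    """Build (but do not send) a POST request for /api/enrichment."""
    if not genes:
        raise ValueError("genes must be a non-empty list")
    body = {"genes": list(genes), "top_n": top_n}
    if libraries:
        # Only sent when overriding; assumed to default server-side otherwise.
        body["libraries"] = list(libraries)
    return urllib.request.Request(
        url=f"{base_url}/api/enrichment",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def fetch_enrichment(genes, out_path, timeout=60, **kwargs):
    """Send the request and write the raw Arrow IPC bytes to out_path."""
    request = build_enrichment_request(genes, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = resp.read()
    with open(out_path, "wb") as f:
        f.write(payload)
    return len(payload)  # number of bytes written
```

Keeping request construction separate from sending makes the payload easy to inspect or log before anything leaves the host.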

Example calls

Hit EnrichR with a small gene list against the default three libraries:

curl -sX POST http://localhost:18901/api/enrichment \
     -H 'content-type: application/json' \
     -d '{"genes": ["CD8A", "GZMB", "PRF1", "IFNG", "TBX21"]}' \
     -o /tmp/enrich.arrow

Override the libraries (any of EnrichR's 200+ library names is valid):

curl -sX POST http://localhost:18901/api/enrichment \
     -H 'content-type: application/json' \
     -d '{"genes":     ["CD8A","GZMB","PRF1","IFNG","TBX21"],
          "libraries": ["MSigDB_Hallmark_2020", "Reactome_2022"],
          "top_n":     10}' \
     -o /tmp/enrich.arrow

Read the result in Python:

import pyarrow.ipc as ipc
with open("/tmp/enrich.arrow", "rb") as f:
    table = ipc.open_stream(f).read_all()
print(table.to_pandas().head())

EnrichR is third-party

If any HTTP or network failure occurs while talking to EnrichR, the route returns HTTP 503 with an error message, which the SPA renders verbatim in an inline banner. Do not paste anything sensitive into the gene list; EnrichR logs userListIds server-side. See Troubleshooting → 503 from /api/enrichment for what to do when it fires.
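A script that calls the route may want to treat a 503 as a transient upstream outage and retry a few times. The sketch below is one way to do that with the standard library; post_with_retry and EnrichmentUnavailable are hypothetical names, and the retry count and backoff delays are illustrative assumptions rather than anything the module prescribes.

```python
import time
import urllib.error
import urllib.request


class EnrichmentUnavailable(RuntimeError):
    """Raised when /api/enrichment keeps answering 503."""


def post_with_retry(request, opener=urllib.request.urlopen, attempts=3,
                    base_delay=1.0, timeout=60, sleep=time.sleep):
    """Return the response body, retrying on HTTP 503 with backoff."""
    last_message = ""
    for attempt in range(attempts):
        try:
            with opener(request, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 503:
                raise  # other statuses are not upstream outages
            # Keep the server's error text so it can be shown to the user.
            last_message = err.read().decode("utf-8", errors="replace")
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, ...
    raise EnrichmentUnavailable(last_message or "EnrichR unavailable")
```

Because EnrichR is IP-rate-limited, a small attempt count with growing delays is gentler than an aggressive retry loop.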

Caching & rate limits

EnrichR is a public, IP-rate-limited service. The module keeps an in-process LRU cache (1 h TTL, 64 entries) keyed by the sorted gene list + library set, so repeated clicks of Run on the same input do not re-hit the API. Restarting the process clears the cache.
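To make the caching behaviour concrete, here is a minimal sketch of an in-process LRU cache with a TTL, sized like the one described above (64 entries, 1 h). It is an illustration of the technique, not the module's actual code: cache_key and TTLLRUCache are hypothetical names, and the text does not say whether top_n takes part in the real key, so it is left out here.

```python
import time
from collections import OrderedDict


def cache_key(genes, libraries):
    """Order-insensitive key: sorted gene list plus the library set."""
    return (tuple(sorted(genes)), frozenset(libraries))


class TTLLRUCache:
    def __init__(self, max_entries=64, ttl_seconds=3600.0,
                 clock=time.monotonic):
        self._data = OrderedDict()  # key -> (expires_at, value)
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]  # entry outlived its TTL
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used
```

Sorting the genes and using a set for the libraries means the same input pasted in a different order still maps to the same entry.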

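If you expect heavy traffic, put nginx (or your CDN) in front of /api/enrichment with its own cache layer; the response is deterministic for a given (genes, libraries) tuple. Because the route is a POST, the proxy has to be configured explicitly to cache it and to include the request body in the cache key.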

Frontend tab

An Enrichment tab appears in the SPA nav when this module is enabled. UI flow:

  1. Paste a gene list (one per line or comma-separated) into the left rail.
  2. Tick libraries; click Run.
  3. The main pane renders a per-library accordion, each with a horizontal barplot of the top-15 terms by -log10(padj) (coloured by combined score) plus a sortable table.
  4. Download CSV on each section exports that library's full table.
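Step 1 accepts either one gene per line or a comma-separated list. A script that prepares input for POST /api/enrichment could normalise pasted text the same way. The sketch below is an assumption about reasonable behaviour, not the SPA's actual parser: parse_gene_list is a hypothetical helper, and dropping duplicates while keeping first-seen order is a choice made here for illustration.

```python
import re


def parse_gene_list(text):
    """Split pasted text on commas and newlines; drop blanks and repeats."""
    seen = set()
    genes = []
    for token in re.split(r"[,\n\r]+", text):
        symbol = token.strip()
        if symbol and symbol not in seen:
            seen.add(symbol)
            genes.append(symbol)  # keep first-seen order
    return genes
```

Symbols are left in their original case, since some HGNC symbols (for example C9orf72) contain lowercase letters.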

Copilot tool

When both enrichment and copilot are enabled the module contributes one tool to the Claude agent loop:

enrich_genes

{
  "name": "enrich_genes",
  "description": "Run pathway enrichment on a gene list via the public EnrichR API and return the top terms across one or more libraries.",
  "input_schema": {
    "type": "object",
    "properties": {
      "genes":     {"type": "array", "items": {"type": "string"},
                    "description": "Gene symbols to test (uppercase HGNC)."},
      "libraries": {"type": "array", "items": {"type": "string"},
                    "description": "EnrichR library names. Defaults to GO_Biological_Process_2023, KEGG_2021_Human, MSigDB_Hallmark_2020."},
      "top_n":     {"type": "integer", "default": 20,
                    "description": "Top terms to return per library."}
    },
    "required": ["genes"]
  }
}
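As an illustration of how a tool call with this schema could be turned into a request body for POST /api/enrichment, the sketch below applies the documented defaults and does some light validation. tool_input_to_request_body is a hypothetical helper, not the function in the module, and the extra checks (non-empty gene list, top_n of at least 1) are assumptions that go slightly beyond what the schema itself enforces.

```python
# Defaults named in the tool's "libraries" description.
DEFAULT_LIBRARIES = [
    "GO_Biological_Process_2023",
    "KEGG_2021_Human",
    "MSigDB_Hallmark_2020",
]


def tool_input_to_request_body(tool_input):
    """Map an enrich_genes tool input dict to a request body dict."""
    genes = tool_input.get("genes")
    if (not isinstance(genes, list) or not genes
            or not all(isinstance(g, str) for g in genes)):
        raise ValueError("'genes' must be a non-empty list of strings")

    libraries = tool_input.get("libraries") or DEFAULT_LIBRARIES

    top_n = tool_input.get("top_n", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_n, bool) or not isinstance(top_n, int) or top_n < 1:
        raise ValueError("'top_n' must be a positive integer")

    return {"genes": genes, "libraries": list(libraries), "top_n": top_n}
```

Raising ValueError with a readable message gives the agent loop something it can pass back to the model as a tool error.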

System prompt fragment

enrich_genes runs live EnrichR on a gene list and returns the top terms across user-chosen libraries. Use it after a DE pull to turn a top-gene set into pathways. Default libraries cover GO BP, KEGG, MSigDB Hallmark.

This is what lets the user ask the chat "What's enriched among the top up-regulated genes for X vs Y?": the copilot pulls the DE hits via compare_groups, then chains into enrich_genes without leaving the conversation. The implementation lives at stellar/modules/enrichment/__init__.py; see Extending for the pattern.
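The chaining step amounts to reducing a DE result to a gene list before calling enrich_genes. The sketch below shows that reduction under loudly stated assumptions: the output format of compare_groups is not documented on this page, so the row keys gene, log2fc and padj are hypothetical, as are the helper name top_upregulated and the cut-offs it uses.

```python
def top_upregulated(de_rows, n=50, padj_max=0.05):
    """Pick the n most up-regulated significant genes from DE rows.

    de_rows is assumed to be a list of dicts with the hypothetical
    keys "gene", "log2fc" and "padj".
    """
    hits = [r for r in de_rows
            if r["padj"] <= padj_max and r["log2fc"] > 0]
    hits.sort(key=lambda r: r["log2fc"], reverse=True)  # strongest first
    return [r["gene"] for r in hits[:n]]
```

The returned list can be passed as the genes argument of enrich_genes, or as the "genes" field of a POST /api/enrichment body.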

Constraints

The route hits a third-party public API. For fully-offline atlases, disable this module.