hdWGCNA — co-expression modules

Serves precomputed hdWGCNA / WGCNA co-expression modules as a Network tab: cell-type and module pickers, a sortable hub-gene table, and a simple radial network that the frontend renders as SVG. Module-trait differential expression results (DME; hdWGCNA's differential module eigengene analysis) are optional; when the file is present, they are exposed under /api/wgcna/dme.

extras_key: hdwgcna
config_key: hdwgcna
install: pip install 'stellar-atlas[hdwgcna]'
frontend tab: Network

Enable

modules:
  hdwgcna:
    enabled: true
    source_dir: data/external/hdwgcna   # relative to project root

Input format

Three required parquet files plus one optional, all under source_dir/.
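
Before enabling the module, it can help to confirm that the directory actually holds the expected files. The following is a minimal sketch using only the standard library; the file names are the ones described on this page, while `check_source_dir` is a hypothetical helper, not part of STELLAR.

```python
from pathlib import Path

# File names come from this page; the helper itself is hypothetical.
REQUIRED_FILES = (
    "wgcna_modules.parquet",
    "wgcna_genes.parquet",
    "wgcna_hub_genes.parquet",
)
OPTIONAL_FILES = ("wgcna_dme.parquet",)


def check_source_dir(source_dir):
    """Return (missing_required, present_optional) for an hdWGCNA source_dir."""
    root = Path(source_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    optional = [name for name in OPTIONAL_FILES if (root / name).is_file()]
    return missing, optional
```

Calling `check_source_dir("data/external/hdwgcna")` from the project root returns an empty first list when all three required files are in place.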

wgcna_modules.parquet — one row per co-expression module

| column | type | required | notes |
| --- | --- | --- | --- |
| module_id | string | yes | Unique handle; commonly <cell_type>:<color> (e.g. T:turquoise). |
| cell_type | string | yes | Cell type the module was discovered in. |
| color | string | yes | Hex or CSS-named color for plotting. |
| size | int64 | yes | Number of member genes. |
| kme_threshold | float64 | no | kME cutoff used during construction (informational). |

wgcna_genes.parquet — long table of (module × gene)

| column | type | required |
| --- | --- | --- |
| module_id | string | yes |
| gene | string | yes |
| kme | float32 | yes |
| is_hub | bool | yes |

wgcna_hub_genes.parquet — pre-ranked top hub genes

Materialised separately so the API can serve the table without an ORDER BY across the full long table.

| column | type | required |
| --- | --- | --- |
| module_id | string | yes |
| gene | string | yes |
| hub_rank | int64 | yes |
| kme | float32 | yes |
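
The relationship between the long gene table and this pre-ranked table can be sketched in plain Python. This assumes hub_rank 1 means the highest kME within a module; `rank_hub_genes` is a hypothetical helper, the default of 25 mirrors the n_hubs value used in the R example on this page, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def rank_hub_genes(gene_rows, n_hubs=25):
    """Build wgcna_hub_genes-style rows from wgcna_genes-style rows.

    gene_rows: iterable of dicts with module_id, gene and kme keys.
    Returns rows ordered by module_id, then hub_rank (1 = highest kME).
    """
    by_module = defaultdict(list)
    for row in gene_rows:
        by_module[row["module_id"]].append(row)

    hub_rows = []
    for module_id in sorted(by_module):
        # Highest kME first; ties are broken by gene name so the order is stable.
        ranked = sorted(by_module[module_id], key=lambda r: (-r["kme"], r["gene"]))
        for rank, row in enumerate(ranked[:n_hubs], start=1):
            hub_rows.append(
                {"module_id": module_id, "gene": row["gene"],
                 "hub_rank": rank, "kme": row["kme"]}
            )
    return hub_rows


# Illustrative rows only; not real results.
genes = [
    {"module_id": "T:turquoise", "gene": "NKG7", "kme": 0.81},
    {"module_id": "T:turquoise", "gene": "GZMB", "kme": 0.92},
    {"module_id": "T:blue", "gene": "IL7R", "kme": 0.77},
]
hubs = rank_hub_genes(genes, n_hubs=2)
```

Doing the ranking once, at export time, is what lets the API read the hub table as-is.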

wgcna_dme.parquet — (optional) module-trait DE

| column | type | required |
| --- | --- | --- |
| module_id | string | yes |
| comparison_id | string | yes |
| term | string | yes |
| log2fc | float32 | yes |
| padj | float64 | yes |

Extras are ignored

STELLAR enforces the columns above and ignores extras (meta_kme, correlation, pval, …). They round-trip through the parquet but are not surfaced by the API.
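
A quick pre-ingest check of column names might look like the sketch below. The required and optional column sets are copied from the tables above; `split_columns` is a hypothetical helper and does not reproduce STELLAR's own validation. The column names for a file could be obtained with, for example, `pyarrow.parquet.read_schema(path).names`.

```python
# Column sets copied from the tables on this page; the helper is hypothetical.
REQUIRED_COLUMNS = {
    "wgcna_modules": {"module_id", "cell_type", "color", "size"},
    "wgcna_genes": {"module_id", "gene", "kme", "is_hub"},
    "wgcna_hub_genes": {"module_id", "gene", "hub_rank", "kme"},
    "wgcna_dme": {"module_id", "comparison_id", "term", "log2fc", "padj"},
}
OPTIONAL_COLUMNS = {"wgcna_modules": {"kme_threshold"}}


def split_columns(table, columns):
    """Return (missing_required, ignored_extras) for one table's column names."""
    present = set(columns)
    required = REQUIRED_COLUMNS[table]
    optional = OPTIONAL_COLUMNS.get(table, set())
    missing = sorted(required - present)
    extras = sorted(present - required - optional)
    return missing, extras


missing, extras = split_columns(
    "wgcna_genes", ["module_id", "gene", "kme", "meta_kme"]
)
```

Here `missing` lists what would block ingest and `extras` lists columns that would be carried along but not surfaced.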

Producing the input

hdWGCNA stores its results inside the Seurat object. STELLAR doesn't shell out to R, so the conversion to parquet lives upstream of ingest.

library(hdWGCNA)
library(arrow)
library(dplyr)

# 1. wgcna_modules.parquet
mods <- GetModules(seurat_obj) |>
  dplyr::count(module, color, name = "size") |>   # one row per module; size = member genes
  dplyr::transmute(
    module_id = as.character(module),
    cell_type = "INH",                            # whatever cell type you ran on
    color,
    size
  )
arrow::write_parquet(mods, "wgcna_modules.parquet")

# 2. wgcna_genes.parquet
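# Recent hdWGCNA versions return one kME_<module> column per module; keep each gene's own.
genes <- GetModules(seurat_obj) |>
  tidyr::pivot_longer(
    dplyr::starts_with("kME_"),
    names_to = "kme_module", names_prefix = "kME_", values_to = "kME"
  ) |>
  dplyr::filter(kme_module == as.character(module)) |>
  dplyr::transmute(
    module_id = as.character(module),
    gene      = gene_name,
    kme       = kME,
    is_hub    = kME >= 0.2
  )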
arrow::write_parquet(genes, "wgcna_genes.parquet")

# 3. wgcna_hub_genes.parquet
hubs <- GetHubGenes(seurat_obj, n_hubs = 25) |>
  dplyr::group_by(module) |>
  dplyr::mutate(hub_rank = dplyr::row_number(dplyr::desc(kME))) |>   # rank 1 = highest kME
  dplyr::ungroup() |>
  dplyr::transmute(module_id = as.character(module), gene = gene_name, hub_rank, kme = kME)
arrow::write_parquet(hubs, "wgcna_hub_genes.parquet")

# 4. wgcna_dme.parquet  (optional; only if you ran FindDMEs() or FindAllDMEs())
# dmes <- FindAllDMEs(seurat_obj, group.by = "condition") |> ...
# arrow::write_parquet(dmes, "wgcna_dme.parquet")

Any pure-Python WGCNA replacement works too: as long as the output ends up in the parquet files above (three required, one optional) with the required columns, STELLAR doesn't care which tool produced them.

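import pandas as pd

# module_df, gene_df, hub_df are pandas DataFrames from your favourite WGCNA replacement.
# index=False keeps the pandas index from being written as a stray extra column.
module_df.to_parquet("wgcna_modules.parquet", index=False)
gene_df.to_parquet("wgcna_genes.parquet", index=False)
hub_df.to_parquet("wgcna_hub_genes.parquet", index=False)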

API surface

When enabled:

| route | what |
| --- | --- |
| GET /api/wgcna/modules?cell_type= | list modules, filterable by cell type |
| GET /api/wgcna/module/{module_id} | module metadata + top N hub genes |
| GET /api/wgcna/module/{module_id}/network | flat radial network spec (nodes + edges) |
| GET /api/wgcna/dme?module_id= | DME rows; 404 if wgcna_dme.parquet is absent |

All routes use parameterised DuckDB queries; path / query strings are never interpolated as SQL.
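
To illustrate what parameter binding buys, here is a small sketch. It uses the standard-library sqlite3 module as a stand-in so that it runs without extra dependencies; it is not STELLAR's query code. DuckDB's Python API accepts `?` placeholders with a parameter list in the same style. The table contents and the `list_modules` function are made up for the example.

```python
import sqlite3

# sqlite3 stands in for DuckDB here; rows are illustrative only.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE wgcna_modules (module_id TEXT, cell_type TEXT, size INTEGER)"
)
con.executemany(
    "INSERT INTO wgcna_modules VALUES (?, ?, ?)",
    [("T:turquoise", "T", 147), ("INH:blue", "INH", 80)],
)


def list_modules(con, cell_type=None):
    """List modules, optionally filtered; user input is bound, never spliced into SQL."""
    if cell_type is None:
        return con.execute(
            "SELECT module_id, cell_type, size FROM wgcna_modules ORDER BY module_id"
        ).fetchall()
    return con.execute(
        "SELECT module_id, cell_type, size FROM wgcna_modules "
        "WHERE cell_type = ? ORDER BY module_id",
        (cell_type,),
    ).fetchall()


rows = list_modules(con, "T")
# A hostile value is compared as a plain string and simply matches nothing.
hostile = list_modules(con, "T' OR '1'='1")
```

Because the value travels separately from the SQL text, the quoting tricks in `hostile` have no effect on the query structure.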

Example calls

List modules for one cell type:

curl -s 'http://localhost:18901/api/wgcna/modules?cell_type=T' | python -m json.tool
# {"modules": [{"module_id": "T:turquoise",
#               "cell_type": "T",
#               "color":     "turquoise",
#               "size":      147,
#               "kme_threshold": 0.2}, ...]}

Detail + top hub genes for one module:

curl -s 'http://localhost:18901/api/wgcna/module/T:turquoise?top_n=10' \
     | python -m json.tool
# {"module": {"module_id": "T:turquoise", ...},
#  "hub_genes": [{"gene": "GZMB", "kme": 0.92, "hub_rank": 1}, ...]}

Radial network spec (nodes + edges) for the rendering layer:

curl -s 'http://localhost:18901/api/wgcna/module/T:turquoise/network' \
     | python -m json.tool
# {"nodes": [{"id": "GZMB", "kme": 0.92}, ...],
#  "edges": [{"source": "T:turquoise", "target": "GZMB"}, ...]}
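
The same calls can be made from Python with the standard library. This is a sketch under a few assumptions: the host and port are those of the curl examples, `module_url` and `fetch_json` are hypothetical helpers, and the server is assumed to decode percent-encoded path segments (as most web frameworks do), so the module id is encoded defensively.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:18901"  # same host/port as the curl examples


def module_url(module_id, suffix="", **params):
    """Build a /api/wgcna/module/... URL, percent-encoding the module id."""
    path = "/api/wgcna/module/" + quote(module_id, safe="") + suffix
    query = "?" + urlencode(params) if params else ""
    return BASE_URL + path + query


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


detail_url = module_url("T:turquoise", top_n=10)
network_url = module_url("T:turquoise", suffix="/network")
# detail = fetch_json(detail_url)  # needs a running STELLAR server
```

Encoding the id matters mostly for ids containing characters such as spaces or slashes; for T:turquoise the raw form used by curl works as well.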

Copilot tools

When both hdwgcna and copilot are enabled, the module contributes two tools to the Claude agent loop.

list_modules

{
  "name": "list_modules",
  "description": "List hdWGCNA co-expression modules, optionally filtered by cell type.",
  "input_schema": {
    "type": "object",
    "properties": {"cell_type": {"type": "string"}},
    "required": []
  }
}

get_module

{
  "name": "get_module",
  "description": "Pull metadata and top hub genes for one co-expression module by id.",
  "input_schema": {
    "type": "object",
    "properties": {
      "module_id": {"type": "string"},
      "top_n":     {"type": "integer", "default": 20}
    },
    "required": ["module_id"]
  }
}
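
How the two input schemas translate into concrete arguments can be sketched as follows. The schemas are copied from the definitions above; `prepare_tool_input` is a hypothetical helper that checks required keys and basic types and fills in defaults, and it is not STELLAR's actual dispatcher.

```python
# Schemas copied from the tool definitions on this page; the helper is hypothetical.
TOOL_SCHEMAS = {
    "list_modules": {
        "type": "object",
        "properties": {"cell_type": {"type": "string"}},
        "required": [],
    },
    "get_module": {
        "type": "object",
        "properties": {
            "module_id": {"type": "string"},
            "top_n": {"type": "integer", "default": 20},
        },
        "required": ["module_id"],
    },
}

_JSON_TYPES = {"string": str, "integer": int}


def prepare_tool_input(tool_name, raw_input):
    """Check required keys and basic types, then fill in schema defaults."""
    schema = TOOL_SCHEMAS[tool_name]
    missing = [key for key in schema["required"] if key not in raw_input]
    if missing:
        raise ValueError(f"{tool_name}: missing required input {missing}")

    prepared = {}
    for key, spec in schema["properties"].items():
        if key in raw_input:
            value = raw_input[key]
            expected = _JSON_TYPES[spec["type"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                raise TypeError(f"{tool_name}.{key}: expected {spec['type']}")
            prepared[key] = value
        elif "default" in spec:
            prepared[key] = spec["default"]
    return prepared


args = prepare_tool_input("get_module", {"module_id": "T:turquoise"})
```

With only module_id supplied, `args` also carries the schema default for top_n.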

System prompt fragment

Co-expression modules from hdWGCNA are precomputed. module_id follows the convention <cell_type>:<color> (e.g. T:turquoise); always discover real ids via list_modules. Hub genes are ranked by kME within each module.

Implementation lives at stellar/modules/hdwgcna/__init__.py; mirror the pattern in your own module — see Extending.

Frontend tab

The Network tab appears in the SPA nav when this module is enabled: cell-type filter → color-coded module picker → sortable hub-gene table → optional radial SVG of the top 20 hub genes around a center node labelled with the module color.
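
The geometry of such a radial view is simple enough to sketch in a few lines. This is an illustration of the idea, not the SPA's rendering code: `radial_layout` is a hypothetical function, it assumes the nodes + edges shape shown in the example network response, and it takes the source of the first edge as the center node.

```python
import math


def radial_layout(network, radius=100.0):
    """Place the center node at the origin and the other nodes on a circle.

    network: dict with "nodes" and "edges" ([{"source": center, "target": gene}]).
    Returns {node_id: (x, y)} in SVG-style coordinates (y grows downwards).
    """
    edges = network["edges"]
    center = edges[0]["source"] if edges else None
    # Spoke order follows the edge list.
    spokes = [e["target"] for e in edges if e["target"] != center]

    positions = {}
    if center is not None:
        positions[center] = (0.0, 0.0)
    for i, node_id in enumerate(spokes):
        # Start at 12 o'clock and step around the circle in equal angles.
        angle = 2 * math.pi * i / len(spokes) - math.pi / 2
        positions[node_id] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


# Illustrative spec; the second gene and its kME are made up.
spec = {
    "nodes": [{"id": "GZMB", "kme": 0.92}, {"id": "NKG7", "kme": 0.81}],
    "edges": [
        {"source": "T:turquoise", "target": "GZMB"},
        {"source": "T:turquoise", "target": "NKG7"},
    ],
}
layout = radial_layout(spec)
```

Translating the coordinates by half the SVG width and height centers the drawing in the viewport.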

FAQ

Do I need the DME table?

No — it's the one optional parquet. Without it, the /api/wgcna/dme route returns 404 and the SPA hides the DME sub-tab.
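
A client can treat that 404 as "no DME data" rather than as a failure. The sketch below assumes the host and port from the curl examples; `fetch_dme` and its `opener` parameter are hypothetical and exist only to make the function easy to exercise without a server.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_dme(module_id, base_url="http://localhost:18901", opener=urlopen):
    """Return the decoded DME response, or None when the route answers 404."""
    query = urlencode({"module_id": module_id})
    url = f"{base_url}/api/wgcna/dme?{query}"
    try:
        with opener(url) as resp:
            return json.load(resp)
    except HTTPError as err:
        if err.code == 404:  # wgcna_dme.parquet was not provided
            return None
        raise  # any other HTTP error is a real problem
```

Returning None keeps the calling code to a single `if result is None` branch, which mirrors how the SPA simply hides the sub-tab.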

Can I have modules from multiple cell types?

Yes — cell_type is on every row of wgcna_modules.parquet and the picker filters on it.