Case Study

Causal network analysis
running in the browser.

A Rust/WASM graph analysis engine designed to map and analyze causal relationships across biological mechanisms, with every algorithm running in the browser without blocking the UI.

6+
Analysis algorithms
17
Relation types
7
Causal confidence levels
4
Export formats
01

The Problem

The research model tracks how multiple neurodegenerative diseases converge on related iron-driven cell-death pathways. That model is a directed graph: nodes are biological mechanisms, and edges are causal relationships with varying levels of evidence behind them.

Analyzing that graph means answering questions like: which mechanism has the most downstream influence? What's the strongest evidence path between iron dysregulation and oligodendrocyte death? Where are the reinforcing feedback loops that make the cascade self-sustaining? Which node, if disrupted, causes the most damage to the network?

Existing graph tools either required desktop software (Gephi, Cytoscape), couldn't handle causal confidence as edge weights, or blocked the browser's main thread during computation. I needed something that ran natively in the browser, treated evidence quality as a first-class concept, and stayed responsive while crunching a network with hundreds of relationships.

The graph isn't the visualization.
The graph is the analysis.

02

The Framework

Before writing a line of Rust, I needed a formal system for representing disease mechanisms. Existing standards didn't fit: BEL (Biological Expression Language) captured causal semantics but not stock-flow dynamics. SBGN had visual notation but no computational model for path analysis. Systems dynamics frameworks tracked flows but lacked biological entity classification.

So I designed the Systems Biology Stock-Flow (SBSF) framework, combining Donella Meadows' stock-flow thinking with BEL causal semantics, SBGN entity classification, and the OBO Relation Ontology. The result is a typed graph representation where every node carries biological meaning and every edge carries evidence provenance, and where the graph structure itself enables computational reasoning about disease mechanisms.

Typed Nodes

Four categories—stocks (accumulations like protein levels), states (qualitative conditions like cell phenotype), processes (dynamic flows like phagocytosis), and boundaries (system edges like genes or drug interventions)—with deep subtypes that preserve biological semantics.

Evidence-Weighted Edges

Seven causal confidence levels (L1–L7) tied to experimental methods: randomized controlled trials at L1, genetic knockouts at L3, in vitro assays at L5, case reports at L7. These map directly to mathematical weights for path analysis.
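How a confidence level becomes a pair of weights is easiest to show in code. A minimal TypeScript sketch, assuming strength decays geometrically with level and distance is its reciprocal (the 0.6 decay factor is illustrative, not the engine's actual constant):

```typescript
// Hypothetical mapping from causal confidence level (L1-L7) to edge weights.
type ConfidenceLevel = 1 | 2 | 3 | 4 | 5 | 6 | 7;

interface EdgeWeights {
  strength: number; // higher = stronger evidence (used by strongest-path search)
  distance: number; // lower = "closer" (used by Dijkstra-style shortest path)
}

function confidenceWeights(level: ConfidenceLevel, decay = 0.6): EdgeWeights {
  // L1 -> strength 1.0; each weaker level multiplies strength by `decay`.
  const strength = Math.pow(decay, level - 1);
  // Distance is the reciprocal, so strong evidence means short path segments.
  return { strength, distance: 1 / strength };
}
```

With this shape, distance-based algorithms consume `distance` while the strongest-path search consumes `strength`, so one annotation drives both families of analysis.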

Quantitative Annotations

Edges can carry stoichiometric coefficients, kinetic parameters (Km, Vmax, IC50), effect sizes, and dose-response curves—enabling rough calculations through causal chains rather than just qualitative reasoning.

Interoperable Export

Four export formats—NetworkX JSON for Python workflows, GraphML for Cytoscape and yEd, GEXF for Gephi, and CSV for spreadsheet analysis—so the same graph moves between browser, desktop tools, and programmatic pipelines.

03

The Architecture

The engine is a three-layer system. Rust handles all graph algorithms, compiled to a 331KB WASM binary. A Web Worker loads that binary and runs computation in a background thread so the UI never freezes. A React hook wraps the worker with a promise-based API that feels like calling any other async function.

A user can trigger a betweenness centrality calculation on the full network, keep interacting with the page, and see the results when they land. No loading spinners, no frozen tabs.

Rust Core

petgraph-backed graph engine compiled to WebAssembly with wasm-bindgen. All algorithms run in pure Rust with zero JavaScript dependencies in the computation layer.

Web Worker

Message-driven interface that dynamically loads the WASM binary and runs all computation off the main thread. Twenty message types with async request/response ID tracking.

React Hook

useGraph() manages the full worker lifecycle, exposes promise-based methods for every algorithm, and supports automatic layout on mount with useCallback optimization.

Not all evidence is equal.

Every edge in the graph carries a causal confidence level from L1 (direct experimental manipulation) to L7 (theoretical). These levels map to mathematical weights that shape path analysis: stronger evidence means shorter distance and higher strength.

L1
Randomized controlled trial
L2
Mendelian randomization
L3
Genetic knockout / knock-in
L4
Animal or human intervention
L5
In vitro / ex vivo mechanistic
L6
Cohort or case-control
L7
Cross-sectional / case report
04

The Analysis

The engine exposes six categories of graph analysis, each designed for a specific research question. All run in the Web Worker and return structured results that the React layer can render immediately.

Centrality

Degree, betweenness (Brandes algorithm with weighted Dijkstra variant), harmonic closeness for disconnected graphs, and PageRank with configurable damping.
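As one example from this category, PageRank with configurable damping is compact enough to sketch in TypeScript (a simplified rendering of the standard algorithm, with a fixed iteration count standing in for a convergence check; the Rust core's internals are not shown here):

```typescript
// PageRank over an adjacency list; dangling nodes redistribute uniformly.
type AdjList = Record<string, string[]>;

function pagerank(g: AdjList, damping = 0.85, iters = 50): Record<string, number> {
  const nodes = Object.keys(g);
  const n = nodes.length;
  let rank: Record<string, number> =
    Object.fromEntries(nodes.map(v => [v, 1 / n] as [string, number]));
  for (let i = 0; i < iters; i++) {
    const next: Record<string, number> =
      Object.fromEntries(nodes.map(v => [v, (1 - damping) / n] as [string, number]));
    for (const u of nodes) {
      const out = g[u];
      if (out.length === 0) {
        // Dangling node: spread its rank across all nodes.
        for (const v of nodes) next[v] += (damping * rank[u]) / n;
      } else {
        for (const v of out) next[v] += (damping * rank[u]) / out.length;
      }
    }
    rank = next;
  }
  return rank;
}
```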

Path Analysis

Shortest path (BFS and Dijkstra), strongest path maximizing minimum confidence along the route, all simple paths with bounded enumeration, and neighborhood exploration.
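The strongest-path computation is a bottleneck (widest-path) variant of Dijkstra: instead of summing distances, it maximizes the minimum edge strength along the route. A sketch, with a plain adjacency list standing in for the petgraph structure and a linear scan in place of a priority queue:

```typescript
type Edge = { to: string; strength: number };
type Graph = Record<string, Edge[]>;

function strongestPath(g: Graph, source: string, target: string):
  { path: string[]; strength: number } | null {
  // best[n] = strongest bottleneck value found so far on any path source -> n
  const best: Record<string, number> = { [source]: Infinity };
  const prev: Record<string, string> = {};
  const visited = new Set<string>();

  while (true) {
    // Pick the unvisited node with the highest bottleneck strength.
    let u: string | null = null;
    for (const n of Object.keys(best)) {
      if (!visited.has(n) && (u === null || best[n] > best[u])) u = n;
    }
    if (u === null) return null; // target unreachable
    if (u === target) break;
    visited.add(u);
    for (const e of g[u] ?? []) {
      const bottleneck = Math.min(best[u], e.strength);
      if (bottleneck > (best[e.to] ?? -Infinity)) {
        best[e.to] = bottleneck;
        prev[e.to] = u;
      }
    }
  }
  const path = [target];
  while (path[0] !== source) path.unshift(prev[path[0]]);
  return { path, strength: best[target] };
}
```

Given a strong indirect route and a weak direct edge, the search prefers the indirect route, which is exactly the behavior wanted for evidence-weighted causal chains.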

Feedback Loops

Tarjan’s SCC-based cycle detection that classifies each loop as reinforcing or balancing, with minimum confidence scoring across edges.

Community Detection

Label propagation algorithm identifying clusters of tightly connected mechanisms, with modularity scoring and cross-module connectivity matrices.

Robustness

Systematic node removal simulation that ranks which mechanisms, if disrupted, cause the greatest connectivity loss across the network.
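A minimal version of that simulation, using the count of ordered reachable pairs as the connectivity measure (the engine's actual metric may differ):

```typescript
type Adj = Record<string, string[]>;

// Count ordered (start, end) pairs where end is reachable from start,
// optionally pretending one node has been removed.
function reachablePairs(g: Adj, skip?: string): number {
  const nodes = Object.keys(g).filter(n => n !== skip);
  let pairs = 0;
  for (const start of nodes) {
    const seen = new Set([start]);
    const stack = [start];
    while (stack.length) {
      const u = stack.pop()!;
      for (const v of g[u] ?? []) {
        if (v !== skip && !seen.has(v)) { seen.add(v); stack.push(v); }
      }
    }
    pairs += seen.size - 1; // reachable nodes, excluding start itself
  }
  return pairs;
}

// Rank nodes by how much reachability their removal destroys.
function rankByDisruption(g: Adj): Array<{ node: string; loss: number }> {
  const baseline = reachablePairs(g);
  return Object.keys(g)
    .map(node => ({ node, loss: baseline - reachablePairs(g, node) }))
    .sort((a, b) => b.loss - a.loss);
}
```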

Sugiyama Layout

Hierarchical graph drawing with longest-path layering, ghost node insertion for clean multi-layer edges, and barycentric crossing minimization.

05

The Data Model

The SBSF framework defines 17 relation types (increases, decreases, produces, degrades, binds, transports, traps, protects, disrupts, and more) that capture the semantics of mechanistic biology rather than flattening everything to generic edges. The engine knows that “traps” and “degrades” are inhibitory, so feedback loop polarity is derived automatically from edge composition.

When quantitative data exists, edges carry stoichiometric coefficients (reactant and product ratios), kinetic parameters (Km, Vmax, IC50), or effect sizes with confidence intervals. The engine traces how much A affects B, turning a qualitative causal map into a semi-quantitative reasoning tool.
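Loop polarity falls out of relation signs: a cycle with an even number of inhibitory edges reinforces, an odd number balances. A sketch with a partial, illustrative sign table (the full 17-type mapping lives in the engine):

```typescript
// Illustrative subset of relation-type signs; +1 = activating, -1 = inhibitory.
const RELATION_SIGN: Record<string, 1 | -1> = {
  increases: 1, produces: 1, transports: 1,
  decreases: -1, degrades: -1, traps: -1, disrupts: -1,
};

function loopPolarity(relations: string[]): "reinforcing" | "balancing" {
  const inhibitory = relations.filter(r => RELATION_SIGN[r] === -1).length;
  // Two inhibitions cancel: an even count leaves the loop self-amplifying.
  return inhibitory % 2 === 0 ? "reinforcing" : "balancing";
}
```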

Generic graph tools vs. the mechanistic graph

Nodes and edges with optional labels → typed nodes (stock, state, boundary, process) with semantic meaning.
A single numeric edge weight → causal confidence (L1–L7) mapping to both strength and distance weights.
Layout that optimizes for aesthetics → Sugiyama layout that respects information flow direction in causal networks.
Analysis that treats all paths equally → a strongest-path algorithm that finds the most evidence-supported causal chain.
06

Graph Organization

A causal network with hundreds of nodes resists flat layout. Sugiyama handles hierarchy within a single component, but real disease models have semi-independent modules (iron metabolism, neuroinflammation, myelin dynamics) that overlap at their boundaries. Laying everything out in one pass means biological structure disappears into edge-crossing minimization.

Spectral clustering partitions the network before layout touches it. The engine builds the graph Laplacian, decomposes it via power iteration, and uses the eigengap heuristic to detect how many clusters the network naturally contains. Each cluster gets its own Sugiyama layout, then a meta-graph positions clusters relative to each other. This produces a two-level hierarchy that preserves biological modularity as the network grows.

Graph Laplacian

Symmetric Laplacian L = D − A built from the symmetrized adjacency of the directed graph, decomposed via a shifted-matrix power-iteration trick with Gram–Schmidt orthogonalization.

Eigengap Heuristic

Scans consecutive eigenvalue gaps to auto-detect the optimal cluster count k. The largest gap marks where the network’s natural community boundaries lie.
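The heuristic itself is small enough to sketch: sort the eigenvalues ascending and choose k where the consecutive gap is largest:

```typescript
// Eigengap heuristic: k small eigenvalues followed by a large jump suggests
// k natural clusters. `maxK` optionally caps the search range.
function eigengapK(eigenvalues: number[], maxK?: number): number {
  const sorted = [...eigenvalues].sort((a, b) => a - b);
  const limit = Math.min(maxK ?? sorted.length - 1, sorted.length - 1);
  let bestK = 1;
  let bestGap = -Infinity;
  for (let k = 1; k <= limit; k++) {
    const gap = sorted[k] - sorted[k - 1]; // gap after the k-th smallest eigenvalue
    if (gap > bestGap) { bestGap = gap; bestK = k; }
  }
  return bestK;
}
```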

Two-Level Layout

Sugiyama runs independently per cluster. A meta-graph of inter-cluster edges determines global positioning, and the final layout composites internal coordinates with cluster offsets.

The engine reports diagnostics after each partition: module agreement scores measuring how well spectral clusters align with biologically defined modules, cross-cluster edge ratios, and the full eigenvalue spectrum. Unexpected groupings surface immediately.

07

Filters & Views

A network of this density is unusable without interaction. The interface layers five independent filter systems that compose freely: for example, examining strong-evidence edges within a single module's upstream pathway while keeping the rest of the network visible but dimmed.

Module Filters

Three-state controls per biological category—upstream, core, downstream, therapeutic, boundary. Full visibility, dimmed partial view, or hidden, without destroying graph structure.

Evidence Levels

Three confidence thresholds: strong (L1–L3), moderate (L1–L5), or all (L1–L7). Higher thresholds surface only relationships backed by direct experimental manipulation.

Layout Modes

Flat Sugiyama or hierarchical clustered layout, each in top-to-bottom or left-to-right orientation. Hierarchical mode uses spectral clustering to group related mechanisms.

Edge Visibility

Toggle feedback loops on or off, hide transitive redundancies where a direct edge is implied by a stronger indirect path, and remove orphan nodes left isolated by other filters.

Focus Mode

Click any node to trace its upstream and downstream causal chain. Everything outside the pathway dims to 20% opacity, isolating the relevant path within full network context.

Same network, different patient.

08

Boundary Parameters

Boundary nodes represent inputs to the disease system: genetic variants, demographic factors, and lifestyle variables that aren't caused by other mechanisms in the graph but shape everything downstream. Unlike internal nodes, each boundary carries a set of parameterized variants drawn from published epidemiological data.

Seven boundary nodes define 25 total variants. Each carries an effect direction (protective, neutral, or risk), a magnitude multiplier, and where available, odds ratios with confidence intervals tied to specific PubMed citations. Selecting a different variant modulates the weights of all outgoing edges from that boundary, propagating the change through the entire causal network.

APOE Genotype

4 variants

ε4 homozygous → OR 12.0 (risk)

TREM2 Variants

4 variants

R47H → OR 2.9, CI 2.2–3.8

Aging

4 variants

85+ vs <65 age brackets

Biological Sex

2 variants

Female ↔ Male baseline

Familial AD Mutations

4 variants

APP, PSEN1, PSEN2 carriers

Sleep Quality

4 variants

Normal → sleep apnea spectrum

Menopausal Status

3 variants

Pre, peri, post transitions

This turns the static causal map into something closer to a simulator: set the patient's APOE genotype to ε4 homozygous and TREM2 to R47H, and the downstream path strengths shift to reflect that specific genetic profile. Same network, different risk landscape.
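The propagation step reduces to scaling every edge that leaves the selected boundary node by the variant's magnitude multiplier. A sketch with hypothetical node names and magnitudes (not the model's published values):

```typescript
type WeightedEdge = { from: string; to: string; weight: number };

// Return a new edge list with the boundary's outgoing weights scaled;
// the input is left untouched so variant selection stays reversible.
function applyVariant(edges: WeightedEdge[], boundary: string, magnitude: number): WeightedEdge[] {
  return edges.map(e =>
    e.from === boundary ? { ...e, weight: e.weight * magnitude } : e
  );
}
```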

Every edge carries the weight of its evidence.

09

What I Learned

I learned when to reach for a systems language.

I initially tried the graph algorithms in TypeScript. They worked for small networks but froze the browser above 200 nodes. Moving to Rust/WASM dropped the same computation from seconds to milliseconds. The lesson was knowing when JavaScript is the wrong tool and having the skills to reach for something lower-level. I now evaluate performance requirements early and choose the language accordingly.

I learned why I build custom tools instead of using existing ones.

Generic graph libraries (Cytoscape, D3-force) optimize for general visualization. This project needed domain-specific semantics: causal confidence levels, typed relations, boundary parameters that reshape the entire network. Every time I tried to adapt a general tool, I spent more time fighting its abstractions than building the feature. The custom engine was more work upfront and less work for every feature after.

I learned that the API surface is the user experience.

The React hook that wraps the WASM worker is what other developers would actually use. I spent as much time designing that interface—what it accepts, what it returns, how it handles loading states—as I did on the graph algorithms themselves. A powerful engine behind a confusing API is a product nobody ships.

Built with

Rust · WebAssembly · petgraph · wasm-bindgen · TypeScript · React 18/19 · Web Workers · XY Flow · vitest