Case Study
Causal network analysis
running in the browser.
A Rust/WASM graph analysis engine designed to map and analyze causal relationships across biological mechanisms, with every algorithm running in the browser without blocking the UI.
The Problem
The research model tracks how multiple neurodegenerative diseases converge on related iron-driven cell-death pathways. That model is a directed graph: nodes are biological mechanisms, and edges are causal relationships with varying levels of evidence behind them.
Analyzing that graph means answering questions like: which mechanism has the most downstream influence? What's the strongest evidence path between iron dysregulation and oligodendrocyte death? Where are the reinforcing feedback loops that make the cascade self-sustaining? Which node, if disrupted, causes the most damage to the network?
Existing graph tools either required desktop software (Gephi, Cytoscape), couldn't handle causal confidence as edge weights, or blocked the browser's main thread during computation. I needed something that ran natively in the browser, treated evidence quality as a first-class concept, and stayed responsive while crunching a network with hundreds of relationships.
The graph isn't the visualization.
The graph is the analysis.
The Framework
Before writing a line of Rust, I needed a formal system for representing disease mechanisms. Existing standards didn't fit: BEL (Biological Expression Language) captured causal semantics but not stock-flow dynamics. SBGN had visual notation but no computational model for path analysis. System dynamics frameworks tracked flows but lacked biological entity classification.
So I designed the Systems Biology Stock-Flow (SBSF) framework, combining Donella Meadows' stock-flow thinking with BEL causal semantics, SBGN entity classification, and the OBO Relation Ontology. The result is a typed graph representation where every node carries biological meaning and every edge carries evidence provenance, and where the graph structure itself enables computational reasoning about disease mechanisms.
Typed Nodes
Four categories—stocks (accumulations like protein levels), states (qualitative conditions like cell phenotype), processes (dynamic flows like phagocytosis), and boundaries (system edges like genes or drug interventions)—with deep subtypes that preserve biological semantics.
Evidence-Weighted Edges
Seven causal confidence levels (L1–L7) tied to experimental methods: randomized controlled trials at L1, genetic knockouts at L3, in vitro assays at L5, case reports at L7. These map directly to mathematical weights for path analysis.
Quantitative Annotations
Edges can carry stoichiometric coefficients, kinetic parameters (Km, Vmax, IC50), effect sizes, and dose-response curves—enabling rough calculations through causal chains rather than just qualitative reasoning.
Interoperable Export
Four export formats—NetworkX JSON for Python workflows, GraphML for Cytoscape and yEd, GEXF for Gephi, and CSV for spreadsheet analysis—so the same graph moves between browser, desktop tools, and programmatic pipelines.
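As a rough illustration of the typed representation described above, the node categories and confidence levels could be modeled as Rust enums. This is a minimal sketch, not the engine's actual schema: the names NodeKind, Confidence, Node, and Edge are hypothetical, the citation field is an assumed stand-in for evidence provenance, and subtypes and quantitative annotations are left out.

```rust
/// The four SBSF node categories (deep subtypes omitted in this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind {
    Stock,    // accumulations such as protein levels
    State,    // qualitative conditions such as cell phenotype
    Process,  // dynamic flows such as phagocytosis
    Boundary, // system edges such as genes or drug interventions
}

/// Causal confidence levels. The derived ordering sorts L1 before L7.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Confidence {
    L1 = 1,
    L2,
    L3,
    L4,
    L5,
    L6,
    L7,
}

impl Confidence {
    /// Converts a numeric level, rejecting anything outside 1..=7.
    pub fn from_level(level: u8) -> Option<Self> {
        use Confidence::*;
        match level {
            1 => Some(L1),
            2 => Some(L2),
            3 => Some(L3),
            4 => Some(L4),
            5 => Some(L5),
            6 => Some(L6),
            7 => Some(L7),
            _ => None,
        }
    }

    /// Numeric level, 1 through 7.
    pub fn level(self) -> u8 {
        self as u8
    }
}

/// A biological mechanism in the graph (hypothetical field names).
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: usize,
    pub label: String,
    pub kind: NodeKind,
}

/// A causal relationship with its evidence attached (hypothetical field names).
#[derive(Debug, Clone, PartialEq)]
pub struct Edge {
    pub source: usize,
    pub target: usize,
    pub relation: String,         // e.g. "increases" or "traps"
    pub confidence: Confidence,
    pub citation: Option<String>, // assumed provenance field
}
```

Keeping the confidence level as an enum rather than a bare number means an out-of-range level cannot be stored on an edge.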
The Architecture
The engine is a three-layer system. Rust handles all graph algorithms, compiled to a 331 KB WASM binary. A Web Worker loads that binary and runs computation in a background thread so the UI never freezes. A React hook wraps the worker with a promise-based API that feels like calling any other async function.
A user can trigger a betweenness centrality calculation on the full network, keep interacting with the page, and see the results when they land. No loading spinners, no frozen tabs.
Rust Core
petgraph-backed graph engine compiled to WebAssembly with wasm-bindgen. All algorithms run in pure Rust with zero JavaScript dependencies in the computation layer.
Web Worker
Message-driven interface that dynamically loads the WASM binary and runs all computation off the main thread. Twenty message types with async request/response ID tracking.
React Hook
useGraph() manages the full worker lifecycle, exposes promise-based methods for every algorithm, and supports automatic layout on mount with useCallback optimization.
Not all evidence is equal.
Every edge in the graph carries a causal confidence level from L1 (direct experimental manipulation) to L7 (theoretical). These levels map to mathematical weights that shape path analysis: stronger evidence means shorter distance and higher strength.
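The actual weight values are not listed here, so the mapping below is an assumption chosen only to show the shape of the idea: strength falls linearly from 1.0 at L1 to 1/7 at L7, and distance is its reciprocal. The function names strength and distance are hypothetical.

```rust
/// Hypothetical strength in (0, 1]: level 1 maps to 1.0, level 7 to 1/7.
/// Returns None for levels outside 1..=7.
pub fn strength(level: u8) -> Option<f64> {
    if (1..=7).contains(&level) {
        Some(f64::from(8 - level) / 7.0)
    } else {
        None
    }
}

/// Hypothetical path distance: the reciprocal of strength, so stronger
/// evidence yields a shorter edge.
pub fn distance(level: u8) -> Option<f64> {
    strength(level).map(|s| 1.0 / s)
}
```

Any monotone mapping would preserve the property that matters for path analysis: a better-evidenced edge is never longer or weaker than a worse-evidenced one.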
The Analysis
The engine exposes six categories of graph analysis, each designed for a specific research question. All run in the Web Worker and return structured results that the React layer can render immediately.
Centrality
Degree, betweenness (Brandes' algorithm, with a Dijkstra-based variant for weighted edges), harmonic closeness, which stays well defined on disconnected graphs, and PageRank with configurable damping.
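To show why harmonic closeness suits disconnected graphs, here is a minimal sketch on an unweighted directed graph stored as an adjacency list. It is an illustration under simplifying assumptions (hop counts instead of evidence-weighted distances, outgoing distances only), and harmonic_closeness is a hypothetical name.

```rust
use std::collections::VecDeque;

/// Harmonic closeness of every node: the sum of 1/d(v, u) over all nodes u
/// reachable from v. Unreachable nodes contribute nothing, so the score is
/// finite even when the graph is disconnected.
/// `adj[v]` lists the nodes that `v` points to.
pub fn harmonic_closeness(adj: &[Vec<usize>]) -> Vec<f64> {
    let n = adj.len();
    let mut scores = vec![0.0; n];
    for source in 0..n {
        // Breadth-first search gives hop distances from `source`.
        let mut dist = vec![usize::MAX; n];
        dist[source] = 0;
        let mut queue = VecDeque::new();
        queue.push_back(source);
        while let Some(v) = queue.pop_front() {
            for &w in &adj[v] {
                if dist[w] == usize::MAX {
                    dist[w] = dist[v] + 1;
                    scores[source] += 1.0 / dist[w] as f64;
                    queue.push_back(w);
                }
            }
        }
    }
    scores
}
```

For the chain 0 → 1 → 2 plus an isolated node 3, node 0 scores 1 + 1/2 and the isolated node scores 0, where classic closeness would have to divide by an infinite distance.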
Path Analysis
Shortest path (BFS and Dijkstra), strongest path maximizing minimum confidence along the route, all simple paths with bounded enumeration, and neighborhood exploration.
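The strongest-path idea (maximize the weakest link) can be sketched as a bottleneck variant of Dijkstra's algorithm. This is an illustrative version, not the engine's code: edges carry the raw confidence level as a u8 (1 strongest, 7 weakest), so maximizing minimum confidence becomes minimizing the largest level on the path. The name strongest_path is hypothetical, and start and goal are assumed to be valid indices.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// `adj[v]` holds `(target, level)` pairs.
/// Returns the weakest level on the best path together with the path,
/// or None when `goal` cannot be reached from `start`.
pub fn strongest_path(
    adj: &[Vec<(usize, u8)>],
    start: usize,
    goal: usize,
) -> Option<(u8, Vec<usize>)> {
    let n = adj.len();
    // best[v] = smallest achievable "worst level" over all paths start -> v.
    let mut best = vec![u8::MAX; n];
    let mut prev = vec![usize::MAX; n];
    let mut heap = BinaryHeap::new();
    best[start] = 0; // the empty path has no weak link
    heap.push(Reverse((0u8, start)));
    while let Some(Reverse((worst, v))) = heap.pop() {
        if worst > best[v] {
            continue; // stale heap entry
        }
        if v == goal {
            break;
        }
        for &(w, level) in &adj[v] {
            // A path is only as strong as its weakest edge.
            let candidate = worst.max(level);
            if candidate < best[w] {
                best[w] = candidate;
                prev[w] = v;
                heap.push(Reverse((candidate, w)));
            }
        }
    }
    if best[goal] == u8::MAX {
        return None;
    }
    // Walk the predecessor links back to the start.
    let mut path = vec![goal];
    let mut v = goal;
    while v != start {
        v = prev[v];
        path.push(v);
    }
    path.reverse();
    Some((best[goal], path))
}
```

With edges 0 → 1 (L2), 1 → 3 (L6), 0 → 2 (L4), and 2 → 3 (L3), the route through node 2 wins: its weakest edge is L4, while the route through node 1 bottoms out at L6.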
Feedback Loops
Tarjan’s SCC-based cycle detection that classifies each loop as reinforcing or balancing, with minimum confidence scoring across edges.
Community Detection
Label propagation algorithm identifying clusters of tightly connected mechanisms, with modularity scoring and cross-module connectivity matrices.
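A minimal sketch of label propagation, under stated simplifications: edges are treated as undirected, nodes are visited in index order, and ties go to the largest label so the example is reproducible. Label propagation normally breaks ties at random, so this deterministic rule is an assumption and its output depends on node numbering. Modularity scoring is omitted, and label_propagation is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Returns one community label per node. Every node starts in its own
/// community and repeatedly adopts the most common label among its neighbors.
pub fn label_propagation(n: usize, edges: &[(usize, usize)], max_rounds: usize) -> Vec<usize> {
    let mut neighbors: Vec<Vec<usize>> = vec![Vec::new(); n];
    for &(a, b) in edges {
        neighbors[a].push(b);
        neighbors[b].push(a);
    }
    let mut labels: Vec<usize> = (0..n).collect();
    for _ in 0..max_rounds {
        let mut changed = false;
        for v in 0..n {
            // Count how often each label occurs among the neighbors of v.
            let mut counts: BTreeMap<usize, usize> = BTreeMap::new();
            for &w in &neighbors[v] {
                *counts.entry(labels[w]).or_insert(0) += 1;
            }
            // BTreeMap iterates labels in ascending order, so `>=` keeps the
            // largest label among equally frequent ones.
            let mut best: Option<(usize, usize)> = None; // (label, count)
            for (&label, &count) in &counts {
                if best.map_or(true, |(_, c)| count >= c) {
                    best = Some((label, count));
                }
            }
            if let Some((label, _)) = best {
                if label != labels[v] {
                    labels[v] = label;
                    changed = true;
                }
            }
        }
        if !changed {
            break; // labels are stable
        }
    }
    labels
}
```

On two triangles joined by a single bridge edge, each triangle settles on its own label, because inside a triangle a node's two same-cluster neighbors outvote the one neighbor across the bridge.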
Robustness
Systematic node removal simulation that ranks which mechanisms, if disrupted, cause the greatest connectivity loss across the network.
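One way to sketch the node-removal simulation is to count how many ordered pairs of nodes remain connected by a directed path once a node is deleted. The connectivity metric here is an assumption (the engine's actual measure is not specified), and both function names are hypothetical.

```rust
/// Counts ordered pairs (u, v), u != v, where v is reachable from u,
/// ignoring the node given in `removed`.
fn reachable_pairs(adj: &[Vec<usize>], removed: Option<usize>) -> usize {
    let n = adj.len();
    let mut total = 0;
    for source in 0..n {
        if Some(source) == removed {
            continue;
        }
        let mut seen = vec![false; n];
        seen[source] = true;
        let mut stack = vec![source];
        while let Some(v) = stack.pop() {
            for &w in &adj[v] {
                if Some(w) != removed && !seen[w] {
                    seen[w] = true;
                    total += 1;
                    stack.push(w);
                }
            }
        }
    }
    total
}

/// Returns `(node, pairs_lost)` sorted by greatest loss first; ties are
/// ordered by node index.
pub fn rank_by_connectivity_loss(adj: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let baseline = reachable_pairs(adj, None);
    let mut ranking: Vec<(usize, usize)> = (0..adj.len())
        .map(|v| (v, baseline - reachable_pairs(adj, Some(v))))
        .collect();
    ranking.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranking
}
```

In the chain 0 → 1 → 2, removing the middle node destroys all three connected pairs, while removing either end destroys two, so the middle node ranks first.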
Sugiyama Layout
Hierarchical graph drawing with longest-path layering, ghost node insertion for clean multi-layer edges, and barycentric crossing minimization.
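The longest-path layering step can be sketched as follows. This version assumes the input is already acyclic and returns None otherwise; a graph with feedback loops would need its cycles broken first, which is not shown. The name longest_path_layers is hypothetical.

```rust
/// Assigns each node the length of the longest path leading into it, so
/// sources sit on layer 0 and every edge points to a strictly deeper layer.
/// Returns None if the graph contains a cycle.
pub fn longest_path_layers(n: usize, edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut adj: Vec<Vec<usize>> = vec![Vec::new(); n];
    let mut indegree = vec![0usize; n];
    for &(from, to) in edges {
        adj[from].push(to);
        indegree[to] += 1;
    }
    let mut layer = vec![0usize; n];
    // Nodes whose predecessors have all been placed (Kahn's algorithm).
    let mut ready: Vec<usize> = (0..n).filter(|&v| indegree[v] == 0).collect();
    let mut placed = 0;
    while let Some(v) = ready.pop() {
        placed += 1;
        for &w in &adj[v] {
            layer[w] = layer[w].max(layer[v] + 1);
            indegree[w] -= 1;
            if indegree[w] == 0 {
                ready.push(w);
            }
        }
    }
    if placed == n {
        Some(layer)
    } else {
        None // some nodes were never freed, so a cycle exists
    }
}
```

With edges 0 → 1, 1 → 2, 0 → 2, and 3 → 2, the layers come out as 0, 1, 2, 0. The edge 0 → 2 then spans two layers, which is the situation ghost nodes are inserted to handle.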
The Data Model
The SBSF framework defines 17 relation types (increases, decreases, produces, degrades, binds, transports, traps, protects, disrupts, and more) that capture the semantics of mechanistic biology rather than flattening everything to generic edges. The engine knows that “traps” and “degrades” are inhibitory, so feedback loop polarity is derived automatically from edge composition.
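Deriving loop polarity from edge composition can be sketched with the usual sign rule: a loop with an even number of inhibitory edges is reinforcing, and an odd number makes it balancing. Only "traps" and "degrades" are stated above to be inhibitory; treating "decreases" the same way is an assumption, and the other relation types are left as non-inhibitory purely for illustration. All names below are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Polarity {
    Reinforcing,
    Balancing,
}

/// "traps" and "degrades" come from the text; "decreases" is an assumption.
fn is_inhibitory(relation: &str) -> bool {
    matches!(relation, "traps" | "degrades" | "decreases")
}

/// Classifies a loop from the relation types of its edges.
/// An even count of inhibitory edges cancels out, leaving a reinforcing loop.
pub fn loop_polarity(relations: &[&str]) -> Polarity {
    let inhibitory = relations.iter().filter(|&&r| is_inhibitory(r)).count();
    if inhibitory % 2 == 0 {
        Polarity::Reinforcing
    } else {
        Polarity::Balancing
    }
}

/// The loop's minimum confidence is set by its weakest edge, which is the
/// numerically largest level (L7 is weakest). None for an empty loop.
pub fn weakest_level(levels: &[u8]) -> Option<u8> {
    levels.iter().copied().max()
}
```

A two-edge loop of "increases" and "degrades" is balancing, while a loop of "traps" and "degrades" is reinforcing because the two inhibitions cancel.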
When quantitative data exists, edges carry stoichiometric coefficients (reactant and product ratios), kinetic parameters (Km, Vmax, IC50), or effect sizes with confidence intervals. The engine traces how much A affects B, turning a qualitative causal map into a semi-quantitative reasoning tool.
[Figure: generic graph tools compared with the mechanistic graph]
Graph Organization
A causal network with hundreds of nodes resists flat layout. Sugiyama handles hierarchy within a single component, but real disease models have semi-independent modules (iron metabolism, neuroinflammation, myelin dynamics) that overlap at their boundaries. Laying everything out in one pass means biological structure disappears into edge-crossing minimization.
Spectral clustering partitions the network before layout touches it. The engine builds the graph Laplacian, decomposes it via power iteration, and uses the eigengap heuristic to detect how many clusters the network naturally contains. Each cluster gets its own Sugiyama layout, then a meta-graph positions clusters relative to each other. This produces a two-level hierarchy that preserves biological modularity as the network grows.
Graph Laplacian
Symmetric Laplacian L = D − A built from the directed graph, decomposed via a shifted-matrix power-iteration trick with Gram–Schmidt orthogonalization.
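The shifted power iteration can be sketched like this. Power iteration finds the largest eigenvalues, but clustering needs the smallest ones of L, so the sketch iterates on shift·I − L, which reverses the order. Everything here is an illustrative assumption rather than the engine's code: the shift of 2·max_degree + 1, the fixed start vectors, the dense matrix storage, and the function names. It also assumes iterations is at least 1 and that no start vector happens to be orthogonal to the eigenvector being sought.

```rust
/// Symmetric Laplacian L = D - A. Each directed edge is treated as undirected;
/// self-loops and duplicate edges are skipped.
pub fn laplacian(n: usize, edges: &[(usize, usize)]) -> Vec<Vec<f64>> {
    let mut l = vec![vec![0.0; n]; n];
    for &(a, b) in edges {
        if a == b || l[a][b] != 0.0 {
            continue;
        }
        l[a][b] = -1.0;
        l[b][a] = -1.0;
        l[a][a] += 1.0;
        l[b][b] += 1.0;
    }
    l
}

fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn mat_vec(m: &[Vec<f64>], v: &[f64]) -> Vec<f64> {
    m.iter().map(|row| dot(row, v)).collect()
}

/// Approximates the `k` smallest eigenvalues of `l`, in ascending order.
pub fn smallest_eigenvalues(l: &[Vec<f64>], k: usize, iterations: usize) -> Vec<f64> {
    let n = l.len();
    // Every Laplacian eigenvalue is at most 2 * max degree, so this shift
    // makes shift*I - L positive definite.
    let max_degree = (0..n).map(|i| l[i][i]).fold(0.0, f64::max);
    let shift = 2.0 * max_degree + 1.0;
    let mut basis: Vec<Vec<f64>> = Vec::new();
    let mut values = Vec::new();
    for j in 0..k.min(n) {
        // Deterministic start vector, different for each eigenvector.
        let mut v: Vec<f64> = (0..n).map(|i| 1.0 / (1.0 + (i + j) as f64)).collect();
        for _ in 0..iterations {
            // Multiply by the shifted matrix: v <- (shift*I - L) v.
            let lv = mat_vec(l, &v);
            for i in 0..n {
                v[i] = shift * v[i] - lv[i];
            }
            // Gram-Schmidt: remove components along eigenvectors already found.
            for u in &basis {
                let projection = dot(&v, u);
                for i in 0..n {
                    v[i] -= projection * u[i];
                }
            }
            let norm = dot(&v, &v).sqrt();
            for x in v.iter_mut() {
                *x /= norm;
            }
        }
        // Rayleigh quotient recovers the eigenvalue of L itself.
        let lv = mat_vec(l, &v);
        values.push(dot(&v, &lv) / dot(&v, &v));
        basis.push(v);
    }
    values
}
```

For the three-node path 0, 1, 2 the Laplacian has eigenvalues 0, 1, and 3, and the sketch recovers them in that order.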
Eigengap Heuristic
Scans consecutive eigenvalue gaps to auto-detect the optimal cluster count k. The largest gap marks where the network’s natural community boundaries lie.
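A minimal sketch of the eigengap scan, assuming the eigenvalues arrive sorted in ascending order. The name eigengap_cluster_count is hypothetical.

```rust
/// Returns the cluster count k at which the gap between the k-th and
/// (k+1)-th smallest eigenvalues is largest. Expects ascending input and
/// returns None when fewer than two eigenvalues are given.
pub fn eigengap_cluster_count(eigenvalues: &[f64]) -> Option<usize> {
    let mut best: Option<(usize, f64)> = None; // (k, gap)
    for (i, pair) in eigenvalues.windows(2).enumerate() {
        let gap = pair[1] - pair[0];
        // Strict comparison keeps the smallest k when gaps tie.
        if best.map_or(true, |(_, g)| gap > g) {
            best = Some((i + 1, gap));
        }
    }
    best.map(|(k, _)| k)
}
```

For the spectrum 0.0, 0.01, 0.02, 0.9, 1.1 the largest jump sits after the third eigenvalue, so the sketch suggests three clusters: three near-zero eigenvalues indicate three loosely coupled groups.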
Two-Level Layout
Sugiyama runs independently per cluster. A meta-graph of inter-cluster edges determines global positioning, and the final layout composites internal coordinates with cluster offsets.
The engine reports diagnostics after each partition: module agreement scores measuring how well spectral clusters align with biologically defined modules, cross-cluster edge ratios, and the full eigenvalue spectrum. Unexpected groupings surface immediately.
Filters & Views
A network of this density is unusable without interaction. The interface layers five independent filter systems that compose freely: examine strong-evidence edges within a single module's upstream pathway while keeping the rest of the network visible but dimmed.
Module Filters
Three-state controls for each biological category (upstream, core, downstream, therapeutic, boundary). Each category can be fully visible, dimmed, or hidden, without destroying graph structure.
Evidence Levels
Three confidence thresholds: strong (L1–L3), moderate (L1–L5), or all (L1–L7). The strictest threshold surfaces only relationships backed by direct experimental manipulation.
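The three thresholds reduce to a cutoff on the numeric level. A minimal sketch, with hypothetical names, that returns the indices of the edges left visible:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EvidenceFilter {
    Strong,   // L1 through L3
    Moderate, // L1 through L5
    All,      // L1 through L7
}

impl EvidenceFilter {
    /// Weakest level (largest number) that still passes the filter.
    pub fn max_level(self) -> u8 {
        match self {
            EvidenceFilter::Strong => 3,
            EvidenceFilter::Moderate => 5,
            EvidenceFilter::All => 7,
        }
    }
}

/// `levels[i]` is the confidence level of edge `i`.
/// Returns the indices of edges that pass the filter, in their original order.
pub fn visible_edges(levels: &[u8], filter: EvidenceFilter) -> Vec<usize> {
    levels
        .iter()
        .enumerate()
        .filter(|&(_, &level)| level <= filter.max_level())
        .map(|(index, _)| index)
        .collect()
}
```

Returning indices rather than a new edge list leaves the underlying graph untouched, so switching thresholds never requires rebuilding it.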
Layout Modes
Flat Sugiyama or hierarchical clustered layout, each in top-to-bottom or left-to-right orientation. Hierarchical mode uses spectral clustering to group related mechanisms.
Edge Visibility
Toggle feedback loops on or off, hide transitive redundancies where a direct edge is implied by a stronger indirect path, and remove orphan nodes left isolated by other filters.
Focus Mode
Click any node to trace its upstream and downstream causal chain. Everything outside the pathway dims to 20% opacity, isolating the relevant path within full network context.
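Focus mode amounts to two reachability searches from the clicked node, one following edges forward and one following them backward. The sketch below is an illustration with hypothetical names; it returns an opacity per node, using the 20% dim value mentioned above, and assumes focus is a valid node index.

```rust
/// Marks every node reachable from `start` by following `adj`.
fn reachable(adj: &[Vec<usize>], start: usize) -> Vec<bool> {
    let mut seen = vec![false; adj.len()];
    seen[start] = true;
    let mut stack = vec![start];
    while let Some(v) = stack.pop() {
        for &w in &adj[v] {
            if !seen[w] {
                seen[w] = true;
                stack.push(w);
            }
        }
    }
    seen
}

/// Opacity for each node: 1.0 on the focused node's upstream or downstream
/// causal chain, 0.2 everywhere else.
pub fn focus_opacity(n: usize, edges: &[(usize, usize)], focus: usize) -> Vec<f64> {
    let mut downstream: Vec<Vec<usize>> = vec![Vec::new(); n];
    let mut upstream: Vec<Vec<usize>> = vec![Vec::new(); n];
    for &(from, to) in edges {
        downstream[from].push(to);
        upstream[to].push(from); // reversed edge for the backward search
    }
    let effects = reachable(&downstream, focus);
    let causes = reachable(&upstream, focus);
    (0..n)
        .map(|v| if effects[v] || causes[v] { 1.0 } else { 0.2 })
        .collect()
}
```

With edges 0 → 1, 1 → 2, 3 → 1, and 0 → 4, focusing node 1 keeps nodes 0, 2, and 3 at full opacity and dims node 4, which shares a cause with node 1 but is not on its causal chain.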
Same network, different patient.
Boundary Parameters
Boundary nodes represent inputs to the disease system: genetic variants, demographic factors, and lifestyle variables that aren't caused by other mechanisms in the graph but shape everything downstream. Unlike internal nodes, each boundary carries a set of parameterized variants drawn from published epidemiological data.
Seven boundary nodes define 25 total variants. Each carries an effect direction (protective, neutral, or risk), a magnitude multiplier, and where available, odds ratios with confidence intervals tied to specific PubMed citations. Selecting a different variant modulates the weights of all outgoing edges from that boundary, propagating the change through the entire causal network.
APOE Genotype
4 variants. ε4 homozygous → OR 12.0 (risk)
TREM2 Variants
4 variants. R47H → OR 2.9, CI 2.2–3.8
Aging
4 variants. 85+ vs <65 age brackets
Biological Sex
2 variants. Female ↔ Male baseline
Familial AD Mutations
4 variants. APP, PSEN1, PSEN2 carriers
Sleep Quality
4 variants. Normal → sleep apnea spectrum
Menopausal Status
3 variants. Pre, peri, post transitions
This turns the static causal map into something closer to a simulator: set the patient's APOE genotype to ε4 homozygous and TREM2 to R47H, and the downstream path strengths shift to reflect that specific genetic profile. Same network, different risk landscape.
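The variant mechanism can be sketched as a multiplier applied to every edge leaving the selected boundary node. How the engine actually combines the magnitude multiplier with the base weight is not specified, so plain multiplication is an assumption, and the struct names, field names, and numbers in the usage note are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    Protective,
    Neutral,
    Risk,
}

/// One selectable variant of a boundary node (hypothetical shape).
#[derive(Debug, Clone)]
pub struct Variant {
    pub name: String,
    pub direction: Direction,
    pub multiplier: f64, // magnitude multiplier; any values used are illustrative
}

#[derive(Debug, Clone)]
pub struct WeightedEdge {
    pub source: usize,
    pub target: usize,
    pub base_weight: f64,
}

/// Returns the effective weight of every edge after selecting `variant` for
/// the boundary node `boundary`. Only edges leaving that node are rescaled;
/// the base weights themselves are left unchanged.
pub fn apply_variant(edges: &[WeightedEdge], boundary: usize, variant: &Variant) -> Vec<f64> {
    edges
        .iter()
        .map(|e| {
            if e.source == boundary {
                e.base_weight * variant.multiplier
            } else {
                e.base_weight
            }
        })
        .collect()
}
```

With a made-up multiplier of 2.0 on boundary node 0, edges 0 → 1 (0.5) and 0 → 2 (0.25) become 1.0 and 0.5 while 1 → 2 (0.8) is untouched. Path analyses rerun on the new weights then show how the change propagates downstream.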
Every edge carries the weight of its evidence.
What I Learned
I learned when to reach for a systems language.
I initially tried the graph algorithms in TypeScript. They worked for small networks but froze the browser above 200 nodes. Moving to Rust/WASM dropped the same computation from seconds to milliseconds. The lesson was knowing when JavaScript is the wrong tool and having the skills to reach for something lower-level. I now evaluate performance requirements early and choose the language accordingly.
I learned why I build custom tools instead of using existing ones.
Generic graph libraries (Cytoscape, D3-force) optimize for general visualization. This project needed domain-specific semantics: causal confidence levels, typed relations, boundary parameters that reshape the entire network. Every time I tried to adapt a general tool, I spent more time fighting its abstractions than building the feature. The custom engine was more work upfront and less work for every feature after.
I learned that the API surface is the user experience.
The React hook that wraps the WASM worker is what other developers would actually use. I spent as much time designing that interface—what it accepts, what it returns, how it handles loading states—as I did on the graph algorithms themselves. A powerful engine behind a confusing API is a product nobody ships.
Built with