Turbo-GNN

Summary

Turbo-GNN is the official implementation released with the paper "On Efficient Scaling of GNNs via IO-Aware Layers Implementations". The repository describes custom CUDA and Triton kernels, cuSPARSE-backed sparse aggregation paths, and Python/PyTorch interfaces for faster GNN layer execution. The covered operations are SpMM aggregation, reduction aggregation, GATv2 aggregation, and Graph Transformer aggregation.

Role In The Wiki

Turbo-GNN is an implementation artifact rather than a model family. Its local role is to raise the baseline floor for graph time-series and graph-control experiments. If a GNN baseline is slow because it materializes edge-wise tensors or relies on unoptimized framework defaults, the slowness reflects the implementation and is not decisive evidence against graph-aware modeling.
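
To make the cost of materialized edge-wise tensors concrete, here is a minimal back-of-the-envelope sketch. The graph sizes, the float32 assumption, and the helper names are illustrative choices, not figures or functions from the Turbo-GNN paper or repository.

```python
def tensor_bytes(num_rows, feat_dim, bytes_per_value=4):
    """Bytes needed for a dense [num_rows, feat_dim] tensor (float32 by default)."""
    return num_rows * feat_dim * bytes_per_value


def to_mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / 2**20


# Hypothetical graph: 100k nodes, 5M edges, 128 features per node.
num_nodes, num_edges, feat_dim = 100_000, 5_000_000, 128

# A layer that materializes one message per edge stores an [E, F] tensor.
edge_bytes = tensor_bytes(num_edges, feat_dim)
# A fused aggregation only needs the [N, F] input and output to exist.
node_bytes = tensor_bytes(num_nodes, feat_dim)

print(f"per-edge messages: {to_mib(edge_bytes):.1f} MiB")
print(f"per-node features: {to_mib(node_bytes):.1f} MiB")
print(f"ratio (average degree): {edge_bytes / node_bytes:.0f}x")
```

The ratio equals the average degree E/N, so on dense graphs the intermediate edge tensor, rather than the model itself, can dominate memory traffic. This is the kind of overhead that should be ruled out before interpreting a slow baseline.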

For time-series and world-model work, Turbo-GNN is useful when the experiment needs direct message passing over topology: service graphs, power-grid topology, graph observability benchmarks, graph neural surrogates, or graph-attention baselines. It is not itself evidence for action-conditioned world modeling, counterfactual prediction, or latent-state maintenance.

Official Artifacts

What It Exposes

  • spmm_aggr for SpMM-style aggregation with cuSPARSE-backed execution.
  • reduction_aggr for min/max-style aggregation with degree-aware heavy-node handling.
  • gatv2_aggr for GATv2-style fused attention aggregation.
  • graph_transformer_aggr for Graph Transformer-style neighborhood attention.
  • Autotuning hooks for the custom kernels, whose performance depends on graph shape and feature shape.
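
As a reading aid for the first two entries, the following dependency-free sketch shows the mathematical result that a sum (SpMM-style) aggregation and a max (reduction-style) aggregation compute over a CSR adjacency. The function names, argument order, and the fill value for isolated nodes are assumptions made for illustration. They are not the signatures of spmm_aggr or reduction_aggr, and the loops say nothing about how the real kernels are implemented.

```python
def sum_aggregate(indptr, indices, features):
    """Reference semantics only: out[i] = sum of features[j] over neighbors j of i.

    The graph is in CSR form: the neighbors of node i are
    indices[indptr[i]:indptr[i + 1]].
    """
    num_nodes = len(indptr) - 1
    feat_dim = len(features[0]) if features else 0
    out = [[0.0] * feat_dim for _ in range(num_nodes)]
    for i in range(num_nodes):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            for f in range(feat_dim):
                out[i][f] += features[j][f]
    return out


def max_aggregate(indptr, indices, features, empty_value=0.0):
    """Reference semantics only: feature-wise max over each node's neighbors.

    Nodes without neighbors receive `empty_value` (an assumption of this sketch).
    """
    num_nodes = len(indptr) - 1
    feat_dim = len(features[0]) if features else 0
    out = []
    for i in range(num_nodes):
        start, end = indptr[i], indptr[i + 1]
        if start == end:
            out.append([empty_value] * feat_dim)
            continue
        out.append([
            max(features[indices[k]][f] for k in range(start, end))
            for f in range(feat_dim)
        ])
    return out


# Toy graph: node 0 aggregates from nodes 1 and 2, node 1 from node 0,
# and node 2 has no neighbors.
indptr = [0, 2, 3, 3]
indices = [1, 2, 0]
features = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]

summed = sum_aggregate(indptr, indices, features)
maxed = max_aggregate(indptr, indices, features)
print(summed)
print(maxed)
```

A reference like this is mainly useful as a correctness oracle: whichever optimized kernel a baseline uses, its output on small graphs should match the plain definition up to floating-point tolerance.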

Practical Use In Our Experiments

Use Turbo-GNN or equivalent modern kernels when a graph baseline is meant to answer a modeling question rather than only demonstrate framework overhead. For example:

  • ChronoGraph-style graph multivariate time-series forecasting should compare graph encoders under matched latency and memory budgets.
  • Kubernetes/OpenTelemetry control experiments should report whether direct message-passing baselines use fused graph attention or materialized edge tensors.
  • Grid2Op/power-grid graph surrogates should distinguish model error from sparse-kernel overhead.
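
One way to make "matched latency and memory budgets" checkable is to record both numbers for every baseline alongside its accuracy. The sketch below is a minimal host-side harness under stated assumptions: profile_callable and within_budget are hypothetical helper names, and tracemalloc only sees Python-level allocations. For GPU kernels the same structure applies, but timing needs a device synchronization before each clock read (for example torch.cuda.synchronize()) and memory should come from a device counter such as torch.cuda.max_memory_allocated().

```python
import statistics
import time
import tracemalloc


def profile_callable(fn, *args, warmup=2, repeats=5):
    """Return median wall-clock latency and peak Python-heap usage of fn(*args)."""
    # Warm-up runs keep one-time costs (caches, lazy init) out of the timings.
    for _ in range(warmup):
        fn(*args)

    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        latencies.append(time.perf_counter() - start)

    # Measure memory in a separate run so tracing overhead does not skew latency.
    tracemalloc.start()
    fn(*args)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"median_s": statistics.median(latencies), "peak_bytes": peak_bytes}


def within_budget(report, max_latency_s, max_peak_bytes):
    """True if a profiled baseline fits both the latency and the memory budget."""
    return (report["median_s"] <= max_latency_s
            and report["peak_bytes"] <= max_peak_bytes)


def toy_layer(n):
    """Stand-in workload; replace with a forward pass of the graph encoder."""
    return [x * 0.5 for x in range(n)]


report = profile_callable(toy_layer, 10_000)
print(report)
print(within_budget(report, max_latency_s=1.0, max_peak_bytes=10 * 2**20))
```

Reporting these two numbers next to forecast or control error lets a reader tell whether a graph baseline lost on modeling quality or simply exceeded its budget because of kernel overhead.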

Evidence