Turbo-GNN

Summary

Turbo-GNN is the official implementation released with the paper On Efficient Scaling of GNNs via IO-Aware Layers Implementations. According to the official repository, it provides custom CUDA and Triton kernels, cuSPARSE-backed sparse aggregation paths, and Python/PyTorch interfaces for faster GNN layer execution, covering SpMM aggregation, reduction aggregation, GATv2 aggregation, and Graph Transformer aggregation.

Role In The Wiki

Turbo-GNN is an implementation artifact rather than a model family. Its local role is to raise the baseline floor for graph time-series and graph-control experiments: if a GNN baseline is slow because it materializes edge-wise tensors or relies on unoptimized framework defaults, that is not decisive evidence against graph-aware modeling.
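To make the edge-wise materialization point concrete, the following is a minimal, dependency-free sketch. It is not Turbo-GNN code, and the function names are hypothetical. It computes the same sum aggregation two ways: one path first builds one message row per edge (an E x F intermediate), the other accumulates straight into the output. The returned counter is only a rough proxy for intermediate memory, assumed here for illustration.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]  # (source node, destination node)


def sum_aggr_materialized(
    edges: Sequence[Edge], x: Sequence[Sequence[float]], num_nodes: int
) -> Tuple[List[List[float]], int]:
    """Sum aggregation that first copies one feature row per edge."""
    feat = len(x[0]) if x else 0
    # E x F intermediate: this is the tensor a naive baseline materializes.
    messages = [list(x[src]) for src, _ in edges]
    out = [[0.0] * feat for _ in range(num_nodes)]
    for (_, dst), msg in zip(edges, messages):
        for f in range(feat):
            out[dst][f] += msg[f]
    return out, len(messages) * feat  # result, intermediate element count


def sum_aggr_streaming(
    edges: Sequence[Edge], x: Sequence[Sequence[float]], num_nodes: int
) -> Tuple[List[List[float]], int]:
    """Same result, accumulated directly with no per-edge intermediate."""
    feat = len(x[0]) if x else 0
    out = [[0.0] * feat for _ in range(num_nodes)]
    for src, dst in edges:
        row = x[src]
        for f in range(feat):
            out[dst][f] += row[f]
    return out, 0  # no per-edge intermediate was allocated
```

Both functions return identical outputs; only the intermediate footprint differs. That is why a slow baseline of the first kind says little about the quality of the graph model itself.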

For time-series and world-model work, Turbo-GNN is useful when the experiment needs direct message passing over topology: service graphs, power-grid topology, graph observability benchmarks, graph neural surrogates, or graph-attention baselines. It is not itself evidence for action-conditioned world modeling, counterfactual prediction, or latent-state maintenance.

Official Artifacts

Paper: On Efficient Scaling of GNNs via IO-Aware Layers Implementations
Code: yandex-research/On-Efficient-Scaling-Of-GNNs
Package name from the official repository: turbo-gnn
Official blog: Yandex Research blog post

What It Exposes

spmm_aggr for SpMM-style aggregation backed by cuSPARSE-oriented execution.
reduction_aggr for min/max-style aggregation with degree-aware heavy-node handling.
gatv2_aggr for GATv2-style fused attention aggregation.
graph_transformer_aggr for Graph Transformer-style neighborhood attention.
Autotuning hooks for custom kernels where graph shape and feature shape change performance.
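As a reading aid for the aggregation names above, here is a plain-Python reference sketch of what SpMM-style and max-reduction aggregation compute over a CSR adjacency. It only illustrates the math the operations are named after. The function names, argument order, and the zero fill for isolated nodes are assumptions made for this sketch and should not be read as the signatures or behavior of the turbo-gnn package.

```python
from typing import List, Sequence


def csr_spmm_reference(
    rowptr: Sequence[int],
    col: Sequence[int],
    val: Sequence[float],
    x: Sequence[Sequence[float]],
) -> List[List[float]]:
    """out[i] = sum over stored entries (i, j) of val * x[j] (sparse @ dense)."""
    feat = len(x[0]) if x else 0
    out = []
    for i in range(len(rowptr) - 1):
        acc = [0.0] * feat
        for k in range(rowptr[i], rowptr[i + 1]):
            neighbor = x[col[k]]
            for f in range(feat):
                acc[f] += val[k] * neighbor[f]
        out.append(acc)
    return out


def csr_max_reference(
    rowptr: Sequence[int],
    col: Sequence[int],
    x: Sequence[Sequence[float]],
) -> List[List[float]]:
    """out[i][f] = max over neighbors j of x[j][f].

    Assumption for this sketch: nodes without neighbors get zeros.
    """
    feat = len(x[0]) if x else 0
    out = []
    for i in range(len(rowptr) - 1):
        start, end = rowptr[i], rowptr[i + 1]
        if start == end:
            out.append([0.0] * feat)  # isolated node: assumed zero fill
            continue
        out.append(
            [max(x[col[k]][f] for k in range(start, end)) for f in range(feat)]
        )
    return out
```

A reference like this is mainly useful as a correctness oracle: outputs of an optimized kernel can be compared against it on small graphs before any timing is reported.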

Practical Use In Our Experiments

Use Turbo-GNN or equivalent modern kernels when a graph baseline is meant to answer a modeling question rather than only demonstrate framework overhead. For example:

ChronoGraph-style graph multivariate time-series forecasting should compare graph encoders under matched latency and memory budgets.
Kubernetes/OpenTelemetry control experiments should report whether direct message-passing baselines use fused graph attention or materialized edge tensors.
Grid2Op/power-grid graph surrogates should distinguish model error from sparse-kernel overhead.
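One way to make "matched latency and memory budgets" operational is a small measurement harness that every baseline passes through before results are compared. The sketch below is an assumption about how such a check could look, not part of Turbo-GNN; `measure` and `within_budget` are hypothetical helper names. It uses only the standard library, so `tracemalloc` captures Python heap allocations only. For GPU runs the framework's own counters (for example `torch.cuda.max_memory_allocated` in PyTorch) would be the relevant memory measurement.

```python
import statistics
import time
import tracemalloc
from typing import Any, Callable, Dict


def measure(fn: Callable[..., Any], *args: Any, repeats: int = 5) -> Dict[str, float]:
    """Return median wall-clock latency and peak traced Python memory for fn(*args)."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # Latency: time several runs and take the median to damp outliers.
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)

    # Memory: one separate traced run, so tracing overhead does not skew timings.
    tracemalloc.start()
    fn(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"median_seconds": statistics.median(timings), "peak_bytes": float(peak)}


def within_budget(report: Dict[str, float], max_seconds: float, max_bytes: float) -> bool:
    """True if a measured baseline fits both the latency and the memory budget."""
    return report["median_seconds"] <= max_seconds and report["peak_bytes"] <= max_bytes
```

Reporting the measured pair alongside model error makes it visible whether a graph baseline lost on accuracy or was simply run with a heavier kernel path than its competitors.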

Evidence

On Efficient Scaling of GNNs via IO-Aware Layers Implementations
