Open Problems and Future Directions in Sheaf Neural Networks
Published:
Open Problem 1: Scalability
The challenge: NSD and PNSD predict restriction maps via per-edge MLPs — one forward pass per edge. For a graph with E = 10⁷ edges, this requires 10⁷ MLP evaluations per layer, which is too slow for large-scale deployment.
Current approaches:
- Mini-batch training with neighbourhood sampling (GraphSAGE-style): sample K neighbours per node, predict maps only for sampled edges
- Shared maps: instead of per-edge maps, predict a per-node transformation that approximates per-edge maps
- Cluster-GCN style: cluster the graph, predict maps within clusters only
Key research question: Can we design a sheaf GNN that:
- Has O(N) map prediction cost (not O(E))
- Maintains the theoretical guarantees of NSD (heterophily handling, oversmoothing avoidance)
- Achieves competitive accuracy on large heterophilic datasets (Amazon, ogbn-arxiv)?
Promising direction: Predict restriction maps from node features alone (not node pairs): F_{v▷e} = f(h_v) — this gives O(N) map prediction, at the cost of losing the inter-node relational information. Combining this with shared global maps per edge type could balance cost and expressiveness.
Open Problem 2: Stochastic Sheaves
| The challenge: Current sheaf GNNs produce deterministic restriction maps. For uncertainty quantification, we want probabilistic restriction maps: F_{v▷e} ~ p(F | h_u, h_v, G), giving a distribution over sheaves rather than a single sheaf. |
Why this matters:
- In drug discovery, uncertain molecular property predictions are as important as accurate ones
- On heterophilic graphs with noisy labels, some edges may have uncertain relational structure
- For out-of-distribution detection, sheaf uncertainty can signal when the model’s relational assumptions break down
Formulation:
The resulting Sheaf Laplacian Δ_F is random — its distribution determines the uncertainty in the diffusion.
Research questions: What is the distribution of the eigenvalues of Δ_F when the maps are random? How does map uncertainty propagate to prediction uncertainty? Can we train a stochastic sheaf GNN via variational inference?
Open Problem 3: Sheaf Autoencoders
The challenge: Sheaf GNNs are discriminative models — they map input graphs to node representations for classification. Can they be used as generative models — generating graphs with specified sheaf structure?
Sheaf autoencoder: Encode a graph G with sheaf F into a latent code z; decode z to reconstruct (G, F). The decoder must generate both:
- The graph topology (nodes, edges) — standard graph generation challenge
- The sheaf structure (restriction maps) — new challenge
Why sheaf autoencoders? For molecular generation (generating molecules with desired properties), the restriction maps encode bond chemistry. A sheaf autoencoder could generate molecules by specifying the desired restriction map structure — the chemical relationships between atoms — and decoding to a compatible graph.
Research questions: What is the right latent space for sheaf structure? How do we train a sheaf decoder (generating restriction maps from latent codes)? How do we evaluate the quality of generated sheaves?
Open Problem 4: Persistent Sheaf Homology
The challenge: Persistent homology (Edelsbrunner & Harer, 2010) tracks how topological features (connected components, loops) appear and disappear as a filtration parameter increases. Can this be extended to sheaves?
| Persistent sheaf homology: Define a filtration on a sheaf: for each threshold ε, consider the sub-sheaf F_ε consisting only of edges with | (F_{u▷e} − F_{v▷e}) | _F < ε. Track how the global section space H⁰(G, F_ε) changes with ε. |
Potential applications:
- Multi-scale sheaf analysis: different ε values reveal relational structure at different scales
- Sheaf barcodes: persistence diagrams for sheaf features — a topological signature of the graph’s relational geometry
- Comparison of sheaves: the bottleneck distance between sheaf barcodes as a sheaf similarity metric
Research questions: Is persistent sheaf homology stable (small changes in F give small changes in barcodes)? Can it be computed efficiently? Does it provide useful features for graph classification tasks?
Open Problem 5: Unified Sheaf Transformer Theory
The challenge: The theoretical framework for sheaf GNNs (heterophily, oversmoothing, expressiveness) is well-developed for diffusion-based models. Can it be extended to sheaf Transformers?
Specific questions:
- Does a Sheaf Transformer with global attention still avoid oversmoothing? (Global attention may or may not preserve the null space structure)
- Can we prove heterophily handling guarantees for sheaf attention mechanisms?
- What is the expressiveness of a Sheaf Transformer relative to 1-WL and higher-order WL tests?
- Does gauge equivariance (SheafAN-style) survive the addition of global attention?
Open Problem 6: Sheaf Interpretability
The challenge: The restriction maps F_{v▷e} encode the relational geometry of each edge. Can we interpret what the learned maps mean in terms of the underlying domain?
For citation networks: Do the learned maps cluster edges by paper discipline? Do they encode citation direction (citing vs cited) or citation strength?
For molecular graphs: Do the maps cluster by bond type (single/double/triple)? Do they encode electronegativity differences between bonded atoms?
Approach: Visualise the maps as elements of O(d) (for orthogonal maps): each map is a rotation in ℝ^d — its rotation angle and axis can be plotted and compared across edges. If edges of the same type cluster in this rotation space, the maps are interpretable.
Research question: Is there a principled sheaf interpretability method analogous to GNN attribution methods (GNN-Explainer, GradCAM for graphs)?
Open Problem 7: Sheaf Normalisation Flows
The challenge: Normalising flows learn a diffeomorphism from a simple distribution to a complex one. On graphs, this requires handling the non-Euclidean structure. Sheaf-based normalising flows could model distributions over graph-valued data.
Sheaf flow: Learn a time-evolving sheaf F_t such that at t=0 the node stalks follow a simple distribution (e.g., Gaussian), and at t=1 they follow the data distribution. The flow is driven by the sheaf ODE:
where b_t is a bias term (the “source” of the flow). This is a sheaf-based continuous normalising flow.
Open Problem 8: Multi-Scale Sheaf Pooling
The challenge: Graph pooling for graph classification requires aggregating node representations into a single graph representation. Standard pooling (mean, sum, max) loses structural information. Sheaf pooling could use the global section space H⁰(G, F) as the graph representation — a topologically meaningful aggregate.
Sheaf pooling: The global sections form a d·dim(H⁰)-dimensional representation of the graph. This is computed as the null space of Δ_F — a principled global summary.
Hierarchical sheaf pooling: Coarsen the graph by contracting nodes into super-nodes; define restriction maps between the fine and coarse sheaves. This gives a multi-scale sheaf — a new architecture for graph classification with topological guarantees.
A Research Roadmap
Near-term (achievable with current tools):
- Scalable sheaf GNNs via node-level map prediction
- Sheaf interpretability via rotation space visualisation
- Sheaf GNNs on large-scale benchmarks (ogbn-arxiv, ogbn-products)
Medium-term (requires new theory):
- Persistent sheaf homology algorithms
- Sheaf Transformer expressiveness theory
- Stochastic sheaf GNNs via variational inference
Long-term (requires new frameworks):
- Sheaf generative models
- Persistent sheaf learning for graph classification
- Sheaf-based multi-scale learning
References
- Bodnar, C., Giovanni, F. D., Chamberlain, B. P., Liò, P., & Bronstein, M. M. (2022). Neural Sheaf Diffusion. NeurIPS 2022 (the core architecture that all open problems build on).
- Edelsbrunner, H., & Harer, J. (2010). Computational Topology: An Introduction. AMS 2010 (persistent homology — the TDA framework extended to sheaves in open problem 4).
- Curry, J. (2014). Sheaves, Cosheaves and Applications. PhD Thesis, Penn 2014 (sheaf homology and cosheaf theory — mathematical foundation for open problems 3 and 4).
