Equivariant Sheaf Neural Networks

TL;DR: When restriction maps are orthogonal matrices, the sheaf defines a "gauge connection" on the graph — a rule for parallel transporting vectors between nodes along edges. The resulting Sheaf Laplacian is the Connection Laplacian, which has rich symmetry properties. This framework unifies sheaf GNNs with geometric deep learning on graphs.

From Sheaves to Connections

A cellular sheaf with orthogonal restriction maps O_{u→e} ∈ O(d) defines a discrete connection on an O(d) bundle over the graph: each node has a “local coordinate frame,” and the edge maps specify how to transform vectors from u’s frame to the edge’s frame (and from there to v’s frame). Composing the two steps gives the transport from u to v along e = (u,v), namely O^T_{v→e} O_{u→e}.

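The holonomy around a cycle in the graph is the composition of the node-to-node transport maps O^T_{v→e} O_{u→e} along the cycle. If the sheaf is flat (holonomy = identity around all cycles), then on a connected graph any vector chosen at one node extends consistently to a global section, so the space of global sections has dimension d. Non-trivial holonomy obstructs this and plays the role of curvature, like parallel transport around a loop on a curved manifold.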
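To make the holonomy concrete, here is a minimal NumPy sketch on a triangle graph. The names `random_orthogonal`, `transport` and `holonomy`, and the dictionary layout `maps[(node, edge)]`, are illustrative assumptions rather than part of any sheaf library. The flat example is built by choosing one frame g_v per node and setting O_{v→e} = g_v^T, which is one simple way (assumed here) to obtain trivial holonomy.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Orthogonal d x d matrix from the QR factorisation of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def transport(maps, u, v):
    """Transport from u's frame to v's frame along edge {u, v}: O_{v->e}^T O_{u->e}."""
    e = frozenset((u, v))
    return maps[(v, e)].T @ maps[(u, e)]

def holonomy(maps, cycle, d):
    """Compose transports along a closed walk given as [v0, v1, ..., v0]."""
    H = np.eye(d)
    for u, v in zip(cycle[:-1], cycle[1:]):
        H = transport(maps, u, v) @ H
    return H

rng = np.random.default_rng(0)
d = 3
edges = [(0, 1), (1, 2), (2, 0)]  # a triangle
cycle = [0, 1, 2, 0]

# Generic sheaf: an independent random orthogonal map for every (node, edge) pair.
generic = {(n, frozenset(e)): random_orthogonal(d, rng) for e in edges for n in e}

# Flat sheaf (assumed construction): O_{v->e} = g_v^T, so the transport u -> v
# is g_v g_u^T and the product telescopes to the identity around any cycle.
frames = {v: random_orthogonal(d, rng) for v in range(3)}
flat = {(n, frozenset(e)): frames[n].T for e in edges for n in e}

H_generic = holonomy(generic, cycle, d)
H_flat = holonomy(flat, cycle, d)
print("generic holonomy is identity:", np.allclose(H_generic, np.eye(d)))
print("flat holonomy is identity:   ", np.allclose(H_flat, np.eye(d)))
```

In the flat case the vectors x_v = g_v c, for any fixed c, form a global section, since O_{v→e} x_v = c on every edge.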

The Connection Laplacian

With orthogonal maps O_{u→e} and O_{v→e}, the Sheaf Laplacian has a special form called the Connection Laplacian:

(L_C)_{uv} = -O^T_{u→e} O_{v→e}   for each edge e = (u,v)
(L_C)_{vv} = deg(v) · I_d

The off-diagonal block -O^T_{u→e} O_{v→e} (also written -O_{u←e} O_{v→e}, with O_{u←e} = O^T_{u→e}) is itself an orthogonal matrix, being a product of orthogonal matrices. Up to the sign, it is the parallel transport map from v’s frame to u’s frame via edge e.

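Key property: the Connection Laplacian is positive semi-definite with eigenvalues in [0, 2·deg_max], where deg_max is the largest node degree. The degree-normalised version D^{-1/2} L_C D^{-1/2}, with D = block-diag(deg(v) · I_d), has eigenvalues in [0, 2]. Neither bound depends on the stalk dimension d. The null space of L_C consists of parallel sections, i.e. vector fields that are “constant” under parallel transport.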
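The block structure can be checked numerically. The sketch below assembles L_C for a small graph and inspects its spectrum; `connection_laplacian`, the example graph, and the `maps[(node, edge)]` layout are assumptions made for illustration. It checks positive semi-definiteness, compares the largest eigenvalue against twice the maximum degree, and counts zero eigenvalues for a flat sheaf built from per-node frames.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Orthogonal d x d matrix from the QR factorisation of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def connection_laplacian(n, d, edges, maps):
    """Assemble the (n*d) x (n*d) Connection Laplacian from maps[(node, edge)]."""
    L = np.zeros((n * d, n * d))
    for e in edges:
        u, v = e
        Ou, Ov = maps[(u, e)], maps[(v, e)]
        bu, bv = slice(u * d, (u + 1) * d), slice(v * d, (v + 1) * d)
        L[bu, bu] += np.eye(d)   # O_u^T O_u = I, one unit of degree for u
        L[bv, bv] += np.eye(d)   # same for v
        L[bu, bv] -= Ou.T @ Ov   # off-diagonal block (u, v)
        L[bv, bu] -= Ov.T @ Ou   # its transpose, block (v, u)
    return L

rng = np.random.default_rng(0)
n, d = 4, 2
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # connected example graph
max_deg = np.bincount(np.ravel(edges), minlength=n).max()

# Generic orthogonal sheaf.
maps = {(v, e): random_orthogonal(d, rng) for e in edges for v in e}
eigs = np.linalg.eigvalsh(connection_laplacian(n, d, edges, maps))
print("min eigenvalue >= 0:        ", eigs.min() >= -1e-9)
print("max eigenvalue <= 2*max_deg:", eigs.max() <= 2 * max_deg + 1e-9)

# Flat sheaf (assumed construction): O_{v->e} = g_v^T for one frame g_v per node.
frames = [random_orthogonal(d, rng) for _ in range(n)]
flat = {(v, e): frames[v].T for e in edges for v in e}
eigs_flat = np.linalg.eigvalsh(connection_laplacian(n, d, edges, flat))
null_dim = int(np.sum(eigs_flat < 1e-9))
print("null space dimension (flat):", null_dim)
```

For the flat sheaf on a connected graph the null space has dimension d, matching the d independent parallel sections.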

Gauge Symmetry

A gauge transformation at node v is a local change of coordinate frame — applying an orthogonal transformation g_v ∈ O(d) to all features at v. Under this transformation:

  • Node features: x_v → g_v x_v
  • Restriction maps: O_{v→e} → O_{v→e} g_v^{-1} (compensate to keep edge consistency)
  • Sheaf Laplacian: L_C → (block-diag g) L_C (block-diag g)^{-1}

The gauge-invariant quantities are independent of the choice of local frame:

  • Holonomies around cycles, up to conjugation by the frame change at the base node (so their traces and eigenvalues are invariant)
  • Eigenvalues of L_C
  • Edge disagreement norms ‖O_{u→e} x_u - O_{v→e} x_v‖
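The transformation rules above can be verified directly. This sketch applies an independent random g_v at every node and checks that the Laplacian is conjugated by block-diag(g), that its eigenvalues are unchanged, and that every edge disagreement norm is unchanged. The helper names and the example graph are hypothetical.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Orthogonal d x d matrix from the QR factorisation of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def connection_laplacian(n, d, edges, maps):
    """Assemble the (n*d) x (n*d) Connection Laplacian from maps[(node, edge)]."""
    L = np.zeros((n * d, n * d))
    for e in edges:
        u, v = e
        Ou, Ov = maps[(u, e)], maps[(v, e)]
        bu, bv = slice(u * d, (u + 1) * d), slice(v * d, (v + 1) * d)
        L[bu, bu] += np.eye(d)
        L[bv, bv] += np.eye(d)
        L[bu, bv] -= Ou.T @ Ov
        L[bv, bu] -= Ov.T @ Ou
    return L

def disagreement(maps, x, e):
    """Norm of O_{u->e} x_u - O_{v->e} x_v for the edge e = (u, v)."""
    u, v = e
    return np.linalg.norm(maps[(u, e)] @ x[u] - maps[(v, e)] @ x[v])

rng = np.random.default_rng(0)
n, d = 4, 2
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
maps = {(v, e): random_orthogonal(d, rng) for e in edges for v in e}
x = rng.standard_normal((n, d))

# Independent gauge transformation g_v at every node.
g = [random_orthogonal(d, rng) for _ in range(n)]
x_g = np.stack([g[v] @ x[v] for v in range(n)])               # x_v -> g_v x_v
maps_g = {(v, e): O @ g[v].T for (v, e), O in maps.items()}   # O -> O g_v^{-1}

# Block-diagonal matrix with g_v in the v-th diagonal block.
G = np.zeros((n * d, n * d))
for v in range(n):
    G[v * d:(v + 1) * d, v * d:(v + 1) * d] = g[v]

L = connection_laplacian(n, d, edges, maps)
L_g = connection_laplacian(n, d, edges, maps_g)

conjugated = np.allclose(L_g, G @ L @ G.T)   # G^{-1} = G^T for orthogonal blocks
same_spectrum = np.allclose(np.linalg.eigvalsh(L), np.linalg.eigvalsh(L_g))
same_norms = all(np.isclose(disagreement(maps, x, e), disagreement(maps_g, x_g, e))
                 for e in edges)
print(conjugated, same_spectrum, same_norms)
```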

A truly equivariant sheaf GNN should produce outputs that are gauge-invariant (for graph-level tasks) or gauge-equivariant (for node-level tasks).

The physics analogy: This is the structure of gauge theories in physics. Electromagnetism is a U(1) gauge theory on spacetime: each point has a local phase, and the electromagnetic potential is the "connection" that relates phases at different points. Neural Sheaf Diffusion (NSD) with orthogonal maps is an O(d) gauge theory on a graph, in the same way that lattice gauge theory places group-valued transports on the links of a lattice. The Sheaf Laplacian is the discrete analogue of the gauge-covariant Laplacian built from the connection.

Equivariant Sheaf GNN Layers

A gauge-equivariant layer should compute message weights only from gauge-invariant quantities, and carry directional information only through gauge-equivariant vectors:

Gauge-invariant quantities at edge (u,v):

  • ‖x_u‖, ‖x_v‖ (norms)
  • x_u^T O_{u→e}^T O_{v→e} x_v (inner product after transport)
  • ‖O_{u→e} x_u - O_{v→e} x_v‖² (disagreement = Sheaf Dirichlet energy at this edge)

Gauge-equivariant output:

  • O_{v→e} x_v (transported feature): expressed in the edge’s frame, so it is unchanged by a gauge transformation at v, since (O_{v→e} g_v^{-1})(g_v x_v) = O_{v→e} x_v
  • O_{u→e}^T O_{v→e} x_v (v’s feature in u’s frame) — gauge-equivariant at u
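A quick numerical check of the edge-level quantities, for a single edge e = (u,v) with random data. The variable names are illustrative, and the gauge transformation uses the rules x → g x and O → O g^{-1} from the Gauge Symmetry section.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Orthogonal d x d matrix from the QR factorisation of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def invariants(O_u, O_v, x_u, x_v):
    """Norms, transported inner product, and squared disagreement for one edge."""
    return np.array([
        np.linalg.norm(x_u),
        np.linalg.norm(x_v),
        x_u @ O_u.T @ O_v @ x_v,
        np.linalg.norm(O_u @ x_u - O_v @ x_v) ** 2,
    ])

rng = np.random.default_rng(0)
d = 3
O_u, O_v = random_orthogonal(d, rng), random_orthogonal(d, rng)  # maps of edge e
x_u, x_v = rng.standard_normal(d), rng.standard_normal(d)        # node features
g_u, g_v = random_orthogonal(d, rng), random_orthogonal(d, rng)  # gauge at u, v

# Gauge-transformed features and restriction maps (g^{-1} = g^T).
x_u2, x_v2 = g_u @ x_u, g_v @ x_v
O_u2, O_v2 = O_u @ g_u.T, O_v @ g_v.T

before = invariants(O_u, O_v, x_u, x_v)
after = invariants(O_u2, O_v2, x_u2, x_v2)
print("scalars unchanged:", np.allclose(before, after))

# v's feature expressed in u's frame picks up exactly g_u.
in_u_frame = O_u.T @ O_v @ x_v
in_u_frame2 = O_u2.T @ O_v2 @ x_v2
print("equivariant at u: ", np.allclose(in_u_frame2, g_u @ in_u_frame))
```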

A complete equivariant sheaf layer:

x_v ← φ( x_v, Σ_{u ∈ N(v)} O^T_{v→e} O_{u→e} x_u )

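Here e denotes the edge joining u and v, and φ must itself commute with O(d): φ(g x, g m) = g φ(x, m) for every g ∈ O(d). An unconstrained MLP applied to the coordinates does not satisfy this. A simple valid choice is φ(x, m) = a·x + b·m, where the scalars a and b are produced by an MLP from gauge-invariant inputs such as ‖x‖, ‖m‖ and x^T m. Both arguments of φ transform by g_v under a gauge transformation at v, so with such a φ the output is gauge-equivariant.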
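As a sketch of such a layer, the code below transports each neighbour feature x_u into v’s frame with O^T_{v→e} O_{u→e}, sums the results into a message m_v, and takes φ(x, m) = a·x + b·m with scalar gates computed from invariants. This form of φ is one assumed choice that is guaranteed to commute with each g_v, not the only one. The function `sheaf_layer`, the weights `W1`, `W2` and the graph are hypothetical.

```python
import numpy as np

def random_orthogonal(d, rng):
    """Orthogonal d x d matrix from the QR factorisation of a Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return q

def sheaf_layer(x, edges, maps, W1, W2):
    """One update x_v <- a_v * x_v + b_v * m_v with gauge-invariant gates a_v, b_v."""
    m = np.zeros_like(x)
    for e in edges:
        u, v = e
        m[v] += maps[(v, e)].T @ maps[(u, e)] @ x[u]  # x_u expressed in v's frame
        m[u] += maps[(u, e)].T @ maps[(v, e)] @ x[v]  # x_v expressed in u's frame
    # Per-node invariants: |x_v|, |m_v| and the inner product <x_v, m_v>.
    s = np.stack([np.linalg.norm(x, axis=1),
                  np.linalg.norm(m, axis=1),
                  np.sum(x * m, axis=1)], axis=1)
    gates = np.tanh(s @ W1) @ W2                      # shape (n, 2): columns a, b
    return gates[:, :1] * x + gates[:, 1:] * m

rng = np.random.default_rng(0)
n, d, hidden = 4, 3, 8
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
maps = {(v, e): random_orthogonal(d, rng) for e in edges for v in e}
x = rng.standard_normal((n, d))
W1 = rng.standard_normal((3, hidden))
W2 = rng.standard_normal((hidden, 2))

# Apply an independent gauge transformation at every node.
g = [random_orthogonal(d, rng) for _ in range(n)]
x_g = np.stack([g[v] @ x[v] for v in range(n)])
maps_g = {(v, e): O @ g[v].T for (v, e), O in maps.items()}

out = sheaf_layer(x, edges, maps, W1, W2)
out_g = sheaf_layer(x_g, edges, maps_g, W1, W2)
expected = np.stack([g[v] @ out[v] for v in range(n)])
print("layer is gauge-equivariant:", np.allclose(out_g, expected))
```

Because the gates depend only on invariants, changing frames before the layer gives the same result as changing frames after it.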

Connection to Equivariant GNNs for 3D Data

The geometric deep learning framework (EGNN, SE(3)-Transformers, TFN) handles E(n)/SE(3) equivariance for 3D point clouds. Sheaf GNNs with O(d) restriction maps handle O(d) gauge equivariance on abstract graphs.

The mathematical structures are parallel:

  • 3D equivariant GNNs: equivariant under the rotation group SO(3) acting globally
  • Sheaf GNNs: equivariant under gauge group O(d) acting locally (different transformation at each node)

Sheaf gauge equivariance is strictly stronger than global equivariance — it requires equivariance under independent transformations at each node, not just a single global rotation.

Applications

Point clouds with local frames: each point has a local coordinate frame (e.g., surface normal + tangent plane). Sheaf GNNs with orthogonal maps can process features in local frames and aggregate them correctly — analogous to gauge-equivariant neural networks on meshes.

Protein structure: each residue has a local frame (N-Cα-C backbone). The sheaf maps encode how to transform between residue frames along peptide bonds.

Graph signal processing: the Connection Laplacian generalises the standard graph Laplacian to vector-valued signals with local frame structure.
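The simplest instance of this generalisation is d = 1, where O(1) = {+1, -1}. The sketch below (helper name and graph assumed for illustration) checks that setting every restriction map to +1 recovers the standard graph Laplacian D - A, and shows how flipping one map changes the sign of the corresponding off-diagonal entry.

```python
import numpy as np

def connection_laplacian(n, d, edges, maps):
    """Assemble the (n*d) x (n*d) Connection Laplacian from maps[(node, edge)]."""
    L = np.zeros((n * d, n * d))
    for e in edges:
        u, v = e
        Ou, Ov = maps[(u, e)], maps[(v, e)]
        bu, bv = slice(u * d, (u + 1) * d), slice(v * d, (v + 1) * d)
        L[bu, bu] += np.eye(d)
        L[bv, bv] += np.eye(d)
        L[bu, bv] -= Ou.T @ Ov
        L[bv, bu] -= Ov.T @ Ou
    return L

n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# d = 1 with every restriction map equal to +1 (the trivial sheaf).
trivial = {(v, e): np.eye(1) for e in edges for v in e}
L_trivial = connection_laplacian(n, 1, edges, trivial)

# Standard graph Laplacian D - A for comparison.
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L_graph = np.diag(A.sum(axis=1)) - A
print("all maps +1 gives D - A:", np.allclose(L_trivial, L_graph))

# Flip the map of node 0 on edge (0, 1) to -1.
signed = dict(trivial)
signed[(0, (0, 1))] = -np.eye(1)
L_signed = connection_laplacian(n, 1, edges, signed)
print("entry (0, 1) after the flip:", L_signed[0, 1])
```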

Summary

| Concept | Sheaf language | Geometry language |
|---|---|---|
| Orthogonal restriction maps | O_{u→e} ∈ O(d) | Parallel transport maps |
| Sheaf Laplacian (orthogonal case) | Δ_F with O(d) maps | Connection Laplacian L_C |
| Global sections | ker(δ₀) | Parallel sections |
| Holonomy | Product of maps around cycle | Curvature of connection |
| Gauge transformation | Local O(d) at each node | Change of local frame |

Equivariant sheaf GNNs sit at the intersection of algebraic topology, differential geometry, and graph learning — providing a principled framework for processing data with local frame structure on graphs.
