What Is a Sheaf? From Topology to Graph Learning

6 minute read

Published: June 02, 2025

TL;DR: A sheaf assigns data (vectors, functions, sets) to the parts of a space, plus restriction maps that say how data on larger pieces relates to data on smaller pieces. A global section is a consistent assignment across the whole space — one where all the local pieces agree. On a graph, nodes and edges are the "pieces", restriction maps encode inter-node relationships, and the Sheaf Laplacian measures how inconsistent a signal is.

The Intuition: Consistent Local Information

Imagine a weather network: temperature sensors at every city (nodes), with the understanding that adjacent cities should have related temperatures. A sheaf formalises this by:

Assigning a data space to every city: the set of possible temperature readings.
Assigning a data space to every road between cities: the pair of readings that are “consistent” for that road.
Specifying a restriction map per road that says: “if city u has reading x_u, what does that imply at the road endpoint?”

A global section is an assignment of readings to all cities such that every road’s restriction is satisfied — the whole network is self-consistent.

This is exactly the structure a Sheaf Neural Network learns: not just node features, but the relational geometry between them.

Sheaves in Classical Topology

Historically, sheaves arose in algebraic geometry and topology (Leray, 1940s; Serre, 1950s). A sheaf on a topological space X assigns:

A set (or vector space) F(U) to each open set U ⊆ X
A restriction map ρ_{UV} : F(U) → F(V) for every inclusion V ⊆ U
Consistency axioms: ρ_{UW} = ρ_{VW} ∘ ρ_{UV} for W ⊆ V ⊆ U

The classic example: the sheaf of continuous functions. F(U) = {continuous functions on U}. For V ⊆ U, ρ_{UV}(f) = f

_V (restrict the function to the smaller set). A global section is a function defined consistently everywhere on X.

From Topology to Graphs: Cellular Sheaves

A graph G = (V, E) is a 1-dimensional CW complex: a topological space built from 0-cells (nodes) and 1-cells (edges). A cellular sheaf F on G assigns:

A vector space F(v) ≅ ℝ^{d_v} to each node v — called the node stalk
A vector space F(e) ≅ ℝ^{d_e} to each edge e — called the edge stalk
A linear map F_{v→e} : F(v) → F(e) for each incidence (v is an endpoint of e) — the restriction map

F = { F(v), F(e), F_{v→e} : v ∈ V, e ∈ E, v incident to e }

No consistency axioms are needed for a 1-dimensional CW complex (they would involve 2-cells, which a graph doesn’t have). The simplicity is what makes cellular sheaves tractable for graph learning.

What the Restriction Maps Encode

The restriction map F_{v→e} : ℝ^d → ℝ^d (assuming equal stalk dimensions) describes how node v’s data “projects onto” the shared edge. The two restriction maps for edge e = (u, v) — namely F_{u→e} and F_{v→e} — describe how u and v respectively relate to the shared edge.

Special cases:

Restriction maps	Interpretation
F_{v→e} = I (identity) for all (v, e)	Standard graph: nodes should be equal
F_{v→e} ∈ {+I, −I}	Signed edges: nodes should agree or disagree
F_{v→e} = diag(d₁,…,d_d)	Feature-wise scaling: different channels scale differently
F_{v→e} ∈ O(d)	Orthogonal rotation: nodes related by a rotation
F_{v→e} ∈ ℝ^{d×d}	General linear: arbitrary learned relationship

Key insight: Standard GCN's message passing is a sheaf neural network with all restriction maps fixed to the identity. The "oversmoothing" problem is exactly the consequence of this assumption: the Sheaf Laplacian with identity maps forces all node features to agree, collapsing to constants. By learning richer restriction maps, sheaf GNNs can represent the relational structure of heterophilic graphs — where connected nodes should differ in a *structured way*, not randomly.

Global Sections: When Everything Agrees

A 0-cochain is a collection of vectors x = (x_v)_{v∈V} with x_v ∈ F(v) — an assignment of data to every node.

A global section is a 0-cochain where every restriction is satisfied: for every edge e = (u, v),

F_{u→e} x_u = F_{v→e} x_v

In words: the data at u, projected to the edge stalk, equals the data at v, projected to the edge stalk. The system is globally consistent.

The space of global sections, denoted H⁰(G, F) = ker(δ₀), is the analogue of the null space of the standard graph Laplacian. For standard GCN, H⁰ consists of constant functions. For a sheaf with non-trivial maps, H⁰ is much richer — it can contain functions that vary across nodes while still satisfying the relational constraints imposed by the restriction maps.

The Coboundary and Inconsistency

The coboundary operator δ₀ : C⁰(G, F) → C¹(G, F) maps node-level data to edge-level disagreement:

(δ₀ x)_e = F_{v→e} x_v − F_{u→e} x_u

for edge e = (u, v) with a chosen orientation. The disagreement (δ₀ x)_e ∈ F(e) measures how inconsistent x is at edge e. The Sheaf Laplacian Δ_F = δ₀ᵀδ₀ accumulates this inconsistency into a node-level operator:

Δ_F = δ₀ᵀ δ₀

This is the central operator for sheaf-based graph learning. Its null space is exactly the space of global sections.

The Three Key Objects

Object	Symbol	Role in graph learning
Node stalk	F(v) ≅ ℝ^d	Feature space at each node
Restriction map	F_{v→e} ∈ ℝ^{d×d}	Relational geometry per edge
Sheaf Laplacian	Δ_F = δ₀ᵀδ₀	Aggregation operator (replaces L)

The Sheaf Laplacian Δ_F is an (Nd) × (Nd) block matrix, where N is the number of nodes. Its (u, v)-block (for an edge e = (u,v)) is:

[Δ_F]_{uv} = −F_{u→e}ᵀ F_{v→e} [Δ_F]_{uu} = Σ_{e incident to u} F_{u→e}ᵀ F_{u→e}

When all restriction maps are the identity: [Δ_F]{uv} = −I (if u∼v), [Δ_F]{uu} = deg(u)·I — exactly d copies of the standard graph Laplacian.

Summary

A sheaf is not exotic mathematics — it is a principled framework for attaching structured data to a space and asking when that data is globally consistent. On a graph:

Node stalks hold feature vectors
Restriction maps encode the relational geometry between adjacent nodes
Global sections are the self-consistent configurations — the null space of the Sheaf Laplacian
The Sheaf Laplacian measures inconsistency and drives diffusion

Everything else in this series — architectures, theory, applications — is a consequence of this foundational structure.

References

Hansen, J., & Gebhart, T. (2020). Sheaf Neural Networks. NeurIPS 2020 GRL+ Workshop (the first paper to apply cellular sheaves to GNNs).
Ghrist, R. (2014). Elementary Applied Topology. Createspace 2014 (ch. 5 covers cellular sheaves with graph examples and is freely available).
Curry, J. (2014). Sheaves, Cosheaves and Applications. PhD Thesis, Penn 2014 (mathematical foundation; ch. 2–3 are the relevant sections for graph learning).

Share on

Bluesky Facebook LinkedIn X (formerly Twitter)

Alessio Borgi

What Is a Sheaf? From Topology to Graph Learning

The Intuition: Consistent Local Information

Sheaves in Classical Topology

From Topology to Graphs: Cellular Sheaves

What the Restriction Maps Encode

Global Sections: When Everything Agrees

The Coboundary and Inconsistency

The Three Key Objects

Summary

References

Share on

You May Also Enjoy

Open Problems and Future Directions in Sheaf Neural Networks

Sheaf GNNs for Molecular Property Prediction

Sheaf GNNs for Heterophilic Node Classification: Benchmarks and Results

Sheaves Meet Attention: Transformer-Inspired Sheaf Models