Polynomial Neural Sheaf Diffusion
Published:
From Fixed Diffusion to Polynomial Filters
NSD’s diffusion step:
This is a first-order polynomial in Δ_F with fixed coefficients (1 for identity, -1 for Laplacian). In spectral terms, this applies the filter h(λ) = 1 - λ — a low-pass filter that attenuates high-frequency components.
For homophilic graphs: low-pass filtering is appropriate (smooth out noise, preserve class-consistent low-frequency signal).
For heterophilic graphs: high-frequency components (class-discriminative) need to be amplified, not attenuated. A fixed low-pass filter is wrong.
PNSD’s approach: learn the filter coefficients from data.
The Polynomial Filter
Define the spectral filter as a polynomial of degree K:
Where {a_k} are learnable coefficients. The propagation step:
This can represent:
- Low-pass (homophily): a_0 ≈ 1, a_1 ≈ -small, higher terms ≈ 0
- High-pass (heterophily): a_0 ≈ 0, a_1 ≈ -large (or alternating signs)
- Band-pass (intermediate): arbitrary polynomial shape
Connection to Existing Methods
The polynomial filter framework unifies many GNN architectures:
| Architecture | Filter | Polynomial |
|---|---|---|
| GCN | h(λ) = 1 - λ/2 | Linear polynomial, fixed |
| APPNP | h(λ) = α(I - (1-α)Δ)^{-1} | Geometric series (infinite) |
| ChebNet | Chebyshev polynomial | K-degree, learnable |
| GPRGNN | General polynomial | K-degree, learnable |
| PNSD | Polynomial of Δ_F | K-degree, learnable, sheaf |
PNSD = GPRGNN applied to the Sheaf Laplacian instead of the standard graph Laplacian.
Computing the Polynomial
Direct computation of Δ_F^k requires repeated matrix multiplication — expensive for large Δ_F (which is Nd × Nd). Instead, use the recurrence:
Each Z^{(k)} requires one sparse matrix-vector product with Δ_F — total cost O(K E d²) (same as K rounds of NSD).
Chebyshev Polynomials for Sheaves
Chebyshev polynomials are numerically stable and form an orthogonal basis for functions on [-1, 1]. Using them as the polynomial basis (rescaling eigenvalues to [-1, 1]):
Where T_k are Chebyshev polynomials and Δ_F^{norm} is normalised to have eigenvalues in [-1, 1]. This is the sheaf generalisation of ChebNet.
Benefits: numerically stable, easily interpretable (each θ_k controls contribution of degree-k spectral component), efficient K-hop aggregation.
Training and Regularisation
Learning the polynomial coefficients {a_k}:
- Too many coefficients (K large) → overfitting
- Typical choice: K = 3 to 10
- Optional constraint: a_k ≥ 0 for homophilic tasks (enforce low-pass behaviour)
Layer-wise vs shared coefficients:
- Shared across all nodes (standard)
- Node-specific: each node learns its own polynomial — expensive but more flexible
- Group-specific: different polynomials for different node types (heterogeneous graphs)
Empirical Advantage
On heterophilic benchmarks, the polynomial filter provides additional improvement over fixed NSD:
| Method | Chameleon | Squirrel | Cornell |
|---|---|---|---|
| NSD (diag) | 71.6% | 56.7% | 88.9% |
| NSD (general) | 76.2% | 61.9% | 91.4% |
| PNSD (diag) | 74.1% | 60.3% | 90.5% |
| PNSD (general) | 78.4% | 64.8% | 93.2% |
The polynomial filter provides ~2-3% improvement over fixed diffusion on each benchmark.
Summary
| Property | NSD | PNSD |
|---|---|---|
| Sheaf maps | Learned | Learned |
| Diffusion filter | Fixed (1 - λ) | Learnable polynomial |
| Spectral profile | Low-pass only | Any (low/high/band-pass) |
| Extra parameters | None | K coefficients per layer |
| Heterophily handling | Structural (sheaf maps) | Structural + spectral |
PNSD is the current strongest sheaf-based architecture for node classification on heterophilic graphs. It combines the topological richness of cellular sheaves with the spectral flexibility of polynomial graph filters — addressing heterophily from both angles simultaneously.
References
- Bodnar, C., Giovanni, F. D., Chamberlain, B. P., Liò, P., & Bronstein, M. M. (2022). Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs. NeurIPS 2022 (NSD: the base architecture that PNSD extends with a learnable polynomial diffusion filter).
- Zaghen, O., Quak, M., & Bronstein, M. M. (2024). Polynomial Neural Sheaf Diffusion. ICLR 2024 (PNSD: replaces the fixed (I - Δ_F) diffusion step with a learnable polynomial p(Δ_F) for spectral flexibility).
- He, M., Wei, Z., Huang, Z., & Xu, H. (2021). BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation. NeurIPS 2021 (BernNet: polynomial spectral filters using Bernstein basis — the homogeneous-graph precursor to the polynomial sheaf filter in PNSD).
