Multi-Relational Sheaves for Heterogeneous and Knowledge Graphs
Published:
The Multi-Relational Setting
A knowledge graph is a multi-relational graph: nodes are entities, edges are (entity, relation, entity) triples. Each edge e has a relation type r(e) ∈ R.
Standard approaches handle relation types by:
- R-GCN: separate weight W_r per relation in aggregation
- TransE/RotatE: relation as vector translation or rotation in entity embedding space
- CompGCN: composition of entity and relation embeddings in message passing
None of these have a sheaf-theoretic interpretation. Multi-relational sheaves provide one.
The Multi-Relational Sheaf
A multi-relational cellular sheaf F on a knowledge graph G assigns:
- A stalk F(v) ≅ ℝ^d to each entity v
- A stalk F^r(e) ≅ ℝ^d to each edge e of type r
- Relation-specific restriction maps: F^r_{u▷e} and F^r_{v▷e} for each (u, r, v) triple
The coboundary operator sums over all relation types:
The multi-relational Sheaf Laplacian:
where Δ_{F^r} is the Sheaf Laplacian restricted to edges of type r.
Multi-relational sheaf diffusion:
with learnable per-relation weights α_r ∈ ℝ.
Connection to R-GCN
R-GCN (Schlichtkrull et al., 2018) aggregates as:
where W_r is a relation-specific weight matrix.
R-GCN as a special multi-relational sheaf: Set F^r_{u▷e} = I and F^r_{v▷e} = W_r for all edges of type r. Then the Sheaf Laplacian aggregation becomes:
This is exactly R-GCN’s relation-specific aggregation. Multi-relational sheaves generalise R-GCN by allowing arbitrary restriction maps (not just shared W_r).
Connection to TransE
TransE (Bordes et al., 2013) models relations as translations: for a valid triple (u, r, v), the embedding constraint is:
In sheaf terms: F^r_{u▷e} = I, F^r_{v▷e} = I (identity), with edge stalk shifted by w_r. The consistency condition F^r_{u▷e} x_u = F^r_{v▷e} x_v becomes x_u = x_v (equal entity embeddings) — which doesn’t capture TransE’s translation.
A better sheaf encoding of TransE: use the affine restriction map F^r_{u▷e} x = x + w_r/2 and F^r_{v▷e} x = x − w_r/2. The consistency condition becomes (x_u + w_r/2) = (x_v − w_r/2), i.e., x_v = x_u + w_r — exactly TransE!
Multi-Relational Sheaf GNN Architecture
The full architecture for knowledge graph link prediction:
1. Entity stalk initialisation: h_v^{(0)} = x_v (entity features or learned embeddings)
2. Multi-relational sheaf predictor: For each relation type r, predict relation-specific restriction maps:
3. Multi-relational sheaf Laplacian:
4. Sheaf diffusion: H ← (I − Δ_F^{norm}) H W
5. Link prediction decoder: Score triple (u, r, v) using entity embeddings h_u^{(K)}, h_v^{(K)} and relation embedding w_r:
Relation Type Embeddings
In multi-relational sheaves, relation types have two levels of representation:
- The restriction maps F^r_{v▷e} — encoding the geometric relationship
- The relation embedding w_r ∈ ℝ^d — used in the decoder
The restriction maps are learned by the sheaf predictor MLP and can be different for each (u, v, r) triple — instance-specific relation geometry.
In contrast, standard R-GCN uses a shared W_r for all edges of type r — type-specific but instance-invariant. Multi-relational sheaves subsume R-GCN by allowing instance-specific maps within each relation type.
Inverse Relations in Sheaves
Knowledge graphs often have inverse relations: if (u, r, v) is a triple, then (v, r⁻¹, u) should also be represented. In sheaf terms:
For forward relation r: F^r_{u▷e} = A, F^r_{v▷e} = B
For inverse relation r⁻¹ (the reverse edge): F^{r⁻¹}{v▷e} = B, F^{r⁻¹}{u▷e} = A (same maps, reversed role)
This ensures that the sheaf Laplacian is symmetric — a necessary condition for the positive-semidefinite Sheaf Laplacian construction.
Alternatively: add inverse edges explicitly with separate maps F^{r⁻¹}{v▷e} = (F^r{v▷e})⁻ᵀ (the inverse-transpose), which preserves gauge-equivariance under O(d).
Relation to HGT and HAN
Heterogeneous Graph Transformers (HGT) and HAN use relation-type-specific attention:
- HAN: separate GAT per meta-path
- HGT: type-specific linear projections for Q, K, V in attention
Multi-relational sheaves provide a unified alternative: replace the per-meta-path attention with sheaf diffusion using relation-specific restriction maps. The sheaf framework gives a principled motivation (minimise sheaf Dirichlet energy) that HGT/HAN lack.
References
- Schlichtkrull, M., Kipf, T. N., Bloem, P., van den Berg, R., Titov, I., & Welling, M. (2018). Modeling Relational Data with Graph Convolutional Networks. ESWC 2018 (R-GCN: relation-specific weight matrices — the special case of multi-relational sheaves with shared type-level maps).
- Vashishth, S., Sanyal, S., Nitin, V., & Talukdar, P. (2020). Composition-based Multi-Relational Graph Convolutional Networks. ICLR 2020 (CompGCN: composition of entity and relation embeddings — a different approach to multi-relational GNNs that sheaves subsume).
- Bordes, A., Usunier, N., Garcia-Durán, A., Weston, J., & Yakhnenko, O. (2013). Translating Embeddings for Modeling Multi-relational Data. NeurIPS 2013 (TransE: the canonical KG embedding method re-interpreted as an affine sheaf above).
