R-GCN: Relational Graph Convolutional Networks
The Problem: Multi-Relational Graphs
A knowledge graph (KG) is a directed multigraph where edges have types (relations). For example, Freebase contains entities (John_Lennon, Beatles, UK) connected by relations (member_of, born_in, from_country).
A standard GCN cannot handle this: it applies one shared weight matrix to every edge in a layer, so it cannot distinguish “member_of” from “born_in” edges.
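As a minimal sketch of what "typed edges" means in practice, the snippet below stores a toy KG as (subject, relation, object) triples and groups the edges by relation. The specific facts and the helper name `group_by_relation` are illustrative assumptions, not part of any dataset or library.

```python
from collections import defaultdict

# Toy triples (subject, relation, object). The entity and relation names follow
# the example in the text; the facts themselves are illustrative only.
triples = [
    ("John_Lennon", "member_of", "Beatles"),
    ("John_Lennon", "born_in", "UK"),
    ("Beatles", "from_country", "UK"),
]

def group_by_relation(triples):
    """Map each relation to its list of (subject, object) edges."""
    edges = defaultdict(list)
    for s, r, o in triples:
        edges[r].append((s, o))
    return dict(edges)

edges_by_relation = group_by_relation(triples)
```

Each relation now has its own edge list, which is the structure a relation-aware model needs and a single-weight GCN would collapse into one undifferentiated edge set.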
The R-GCN Update Rule
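$$h^{(k+1)}_v = \sigma\left( W^{(k)}_0 h^{(k)}_v + \sum_{r \in R} \sum_{u \in N_r(v)} \frac{1}{c_{v,r}} W^{(k)}_r h^{(k)}_u \right)$$
Where:
- h^{(k)}_v = representation of node v at layer k, and σ = an element-wise nonlinearity such as ReLU
- N_r(v) = neighbours of v connected via relation r
- c_{v,r} = normalisation constant (typically |N_r(v)|)
- W^{(k)}_0 = self-loop weight (own representation)
- W^{(k)}_r = relation-specific weight matrix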
Interpretation: for each relation r, node v collects messages from all neighbours connected by r, transforms them by W_r, and sums. All relation-specific sums are then added together with the self-loop.
Within a single layer, this is equivalent to running a GCN-style propagation on each relation’s adjacency subgraph and summing the results before the nonlinearity is applied.
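The update rule can be sketched in NumPy as follows. This is an illustrative, unoptimised version under stated assumptions: edges are given as integer pairs (u, v) with messages flowing from u to v, c_{v,r} is taken to be |N_r(v)|, the nonlinearity is ReLU, and the names `rgcn_layer`, `W_rel`, and `W_self` are hypothetical.

```python
import numpy as np

def rgcn_layer(H, edges_by_relation, W_rel, W_self):
    """One R-GCN layer with ReLU (illustrative sketch).

    H: (num_nodes, d_in) float array of node representations.
    edges_by_relation: {relation: [(u, v), ...]}, message flows u -> v.
    W_rel: {relation: (d_out, d_in) array}; W_self: (d_out, d_in) array.
    """
    num_nodes, d_in = H.shape
    out = H @ W_self.T  # self-loop term W_0 h_v
    for r, edges in edges_by_relation.items():
        agg = np.zeros((num_nodes, d_in))
        deg = np.zeros(num_nodes)
        for u, v in edges:
            agg[v] += H[u]  # sum neighbour representations under relation r
            deg[v] += 1
        # Assumed normalisation c_{v,r} = |N_r(v)|; nodes without
        # r-neighbours keep a zero message.
        agg = agg / np.maximum(deg, 1)[:, None]
        out += agg @ W_rel[r].T  # relation-specific transform W_r
    return np.maximum(out, 0.0)

# Usage on a 3-node toy graph with one-hot input features.
rng = np.random.default_rng(0)
H = np.eye(3)
edges = {"member_of": [(0, 1)], "born_in": [(0, 2)]}
W_rel = {r: rng.normal(size=(4, 3)) for r in edges}
W_self = rng.normal(size=(4, 3))
H_next = rgcn_layer(H, edges, W_rel, W_self)
```

Because W_r is linear, normalising the summed neighbour vectors first and transforming once gives the same result as transforming each message separately. The original paper also adds inverse edges as separate relation types, which this sketch leaves out.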
The Parameter Problem: Basis Decomposition
For a KG with |R| = 100 relations and embedding dimension d = 200, the relation weight matrices have 100 × 200 × 200 = 4 million parameters in a single layer alone. This is prone to overfitting, especially when many relations have few training examples.
Basis decomposition (the R-GCN solution): express each W_r as a linear combination of B shared basis matrices:
$$W_r = \sum_{b=1}^{B} a_{rb} V_b$$
Where V_b ∈ ℝ^{d×d} are shared bases and a_{rb} are relation-specific scalars. This reduces parameters from |R|×d² to B×d² + |R|×B, which is much smaller when B ≪ |R|.
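A short sketch of the basis decomposition and the resulting parameter counts. The function names `basis_weights` and `param_counts` are hypothetical, and B = 10 is an illustrative choice rather than a recommended setting.

```python
import numpy as np

def basis_weights(V, A):
    """Compose W_r = sum_b A[r, b] * V[b] for every relation r.

    V: (B, d, d) shared basis matrices.
    A: (R, B) relation-specific coefficients a_{rb}.
    Returns an (R, d, d) stack of relation weight matrices.
    """
    return np.einsum("rb,bij->rij", A, V)

def param_counts(num_relations, d, num_bases):
    """Relation parameters in one layer: (full, basis-decomposed)."""
    full = num_relations * d * d
    basis = num_bases * d * d + num_relations * num_bases
    return full, basis

# Small shapes for the composition itself.
rng = np.random.default_rng(0)
V = rng.normal(size=(2, 3, 3))   # B = 2 bases, d = 3
A = rng.normal(size=(5, 2))      # R = 5 relations
W = basis_weights(V, A)          # shape (5, 3, 3)

# Parameter counts for the setting in the text, with an assumed B = 10.
full, basis = param_counts(num_relations=100, d=200, num_bases=10)
```

With these numbers the relation weights drop from 4,000,000 parameters to 401,000, since the d × d cost is paid once per basis instead of once per relation.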
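Block-diagonal decomposition (alternative): partition the d dimensions into blocks and constrain each relation’s weight matrix to be block-diagonal, W_r = diag(Q_{1r}, …, Q_{Br}). This cuts parameters and computation per relation through sparsity, rather than by sharing bases across relations.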
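The block-diagonal variant can be sketched as applying each small block to its own slice of the input, without ever building the full d × d matrix. The function name and the array layout are assumptions for illustration, and d is assumed to be divisible by the number of blocks.

```python
import numpy as np

def block_diag_apply(blocks, h):
    """Compute W_r h where W_r is block-diagonal.

    blocks: (num_blocks, b, b) diagonal blocks for one relation.
    h: (num_blocks * b,) input vector.
    """
    num_blocks, b, _ = blocks.shape
    h_blocks = h.reshape(num_blocks, b)            # split h into slices
    out = np.einsum("nij,nj->ni", blocks, h_blocks)  # one small matvec per block
    return out.reshape(-1)

rng = np.random.default_rng(0)
blocks = rng.normal(size=(4, 2, 2))  # 4 blocks of size 2, so d = 8
h = rng.normal(size=8)
y = block_diag_apply(blocks, h)
```

Per relation this stores num_blocks × b² values instead of d², so with 4 blocks and d = 8 the count falls from 64 to 16.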
R-GCN for Entity Classification
Task: given a KG with some labelled entities, predict labels for unlabelled entities.
Example: Freebase entity type classification — is this entity a Person, Organisation, or Location?
Setup:
- Entity features: one-hot or learned embeddings
- 2-layer R-GCN
- Final h^{(K)}_v fed to softmax classifier
- Trained with cross-entropy on labelled entities
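The last two steps of this setup, softmax over the final representations and cross-entropy restricted to labelled entities, can be sketched as below. The function names are hypothetical, and `logits` stands in for the output h^{(K)}_v of the final R-GCN layer.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax."""
    z = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def masked_cross_entropy(logits, labels, labelled_idx):
    """Mean cross-entropy over the labelled nodes only.

    logits: (num_nodes, num_classes) final-layer outputs.
    labels: (num_nodes,) integer class ids (ignored for unlabelled nodes).
    labelled_idx: integer array of nodes that have a label.
    """
    probs = softmax(logits[labelled_idx])
    picked = probs[np.arange(len(labelled_idx)), labels[labelled_idx]]
    return -np.log(picked).mean()

# Four entities, three classes (e.g. Person, Organisation, Location);
# only entities 0 and 2 are labelled.
logits = np.zeros((4, 3))
labels = np.array([0, 0, 2, 0])
labelled_idx = np.array([0, 2])
loss = masked_cross_entropy(logits, labels, labelled_idx)
```

Unlabelled entities still take part in message passing; they are only excluded from the loss.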
In the original paper, R-GCN reports the best accuracy on the AIFB and AM entity classification benchmarks, while existing baselines such as the Weisfeiler-Lehman kernel remain stronger on MUTAG and BGS.
R-GCN for Link Prediction
Task: predict missing triples (subject, relation, object) — i.e., “does this relation exist between these two entities?”
Setup: R-GCN as an encoder, DistMult as a decoder.
- Encoder: run R-GCN to get entity embeddings e_s, e_o
- Decoder (DistMult): score the triple (s, r, o) as f(s, r, o) = e_s^⊤ R_r e_o, where R_r is a diagonal matrix learned per relation
- Training: binary cross-entropy with negative sampling
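The decoder and training objective above can be sketched as follows, assuming entity embeddings have already been produced by the encoder. The helper names are hypothetical, and corrupting only the object is a simplifying assumption; corrupting either the subject or the object is also common.

```python
import numpy as np

def distmult_score(e_s, r_diag, e_o):
    """DistMult score e_s^T diag(r) e_o, computed along the last axis."""
    return np.sum(e_s * r_diag * e_o, axis=-1)

def corrupt_objects(triples, num_entities, rng):
    """Negative samples: replace each object id with a random entity id."""
    neg = triples.copy()
    neg[:, 2] = rng.integers(0, num_entities, size=len(triples))
    return neg

def bce_loss(pos_scores, neg_scores):
    """Binary cross-entropy with label 1 for positives, 0 for negatives."""
    # -log(sigmoid(x)) = log(1 + exp(-x)); logaddexp keeps this stable.
    pos = np.logaddexp(0.0, -pos_scores)
    # -log(1 - sigmoid(x)) = log(1 + exp(x)).
    neg = np.logaddexp(0.0, neg_scores)
    return np.concatenate([pos, neg]).mean()

rng = np.random.default_rng(0)
E = rng.normal(size=(5, 4))       # stand-in for encoder output, 5 entities
R_diag = rng.normal(size=(2, 4))  # one diagonal per relation
triples = np.array([[0, 0, 1], [2, 1, 3]])  # (subject, relation, object) ids
neg = corrupt_objects(triples, num_entities=5, rng=rng)

pos_scores = distmult_score(E[triples[:, 0]], R_diag[triples[:, 1]], E[triples[:, 2]])
neg_scores = distmult_score(E[neg[:, 0]], R_diag[neg[:, 1]], E[neg[:, 2]])
loss = bce_loss(pos_scores, neg_scores)
```

Random corruption can occasionally produce a triple that is actually true; this sketch does not filter such cases.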
This encoder-decoder split (GNN encodes structure, shallow decoder scores triples) is a common pattern for KG link prediction.
R-GCN vs DistMult / TransE
Before R-GCN, KG link prediction used shallow embedding methods:
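- TransE: models a relation as a translation in embedding space, e_s + r ≈ e_o
- DistMult: bilinear score e_s^⊤ R_r e_o with a diagonal relation matrix R_r
- ComplEx: extends DistMult to complex-valued embeddings, which lets it model asymmetric relations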
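For comparison, here is a minimal sketch of the TransE and ComplEx scoring functions. The function names are hypothetical, and the L2 norm for TransE is an assumption (L1 is also used).

```python
import numpy as np

def transe_score(e_s, r, e_o):
    """Negative L2 distance between e_s + r and e_o (higher is better)."""
    return -np.linalg.norm(e_s + r - e_o, axis=-1)

def complex_score(e_s, r, e_o):
    """Re(<e_s, r, conj(e_o)>) for complex-valued embeddings."""
    return np.real(np.sum(e_s * r * np.conj(e_o), axis=-1))

# A perfect translation scores 0, the maximum for TransE.
perfect = transe_score(np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0]))

# Swapping subject and object changes the ComplEx score,
# whereas DistMult would give the same value both ways.
a, rel, b = np.array([1 + 1j]), np.array([1j]), np.array([1 + 0j])
forward = complex_score(a, rel, b)
backward = complex_score(b, rel, a)
```

All of these functions score a triple from the embeddings of its own subject, relation, and object only.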
These methods learn one vector per entity and relation and score each triple in isolation, so graph structure enters only implicitly through training. R-GCN adds explicit structural context: entity embeddings are informed by their graph neighbourhood, not just their own learned vectors.
Summary
| Property | R-GCN |
|---|---|
| Handles typed edges | Yes — separate W_r per relation |
| Handles typed nodes | Partial — via features; no type-specific architecture |
| Scales to many relations | Via basis decomposition |
| Tasks | Entity classification, link prediction |
| Key hyperparameter | Number of bases B (usually 2–10) |
| Limitation | Does not use attention; neighbours under the same relation are weighted equally (up to the normalisation c_{v,r}) |
R-GCN is a foundational model for applying GNNs to knowledge graphs. It introduced the relation-specific weight matrix pattern that many subsequent heterogeneous GNN architectures build on.
References
- Schlichtkrull, M., Kipf, T. N., Bloem, P., van den Berg, R., Titov, I., & Welling, M. (2018). Modeling Relational Data with Graph Convolutional Networks. ESWC 2018.
- Kipf, T. N., & Welling, M. (2016). Variational Graph Auto-Encoders. arXiv preprint (graph autoencoder for link prediction).
- Yang, B., Yih, W.-T., He, X., Gao, J., & Deng, L. (2015). Embedding Entities and Relations for Learning and Inference in Knowledge Bases. ICLR 2015 (DistMult decoder).
