Random Walk Positional Encodings
Published:
The Idea
The random walk matrix is P = D⁻¹A (row-stochastic adjacency). P^k[u,v] is the probability of reaching v from u in exactly k steps.
The diagonal entries P^k[v,v] — the probability of returning to v in k steps — encode v’s local structural role:
- Nodes in dense cliques have high return probabilities (many short cycles)
- Nodes on long paths have low return probabilities (random walks escape quickly)
- Nodes in k-cycles have elevated P^k[v,v]
Why RWPE Works as a Structural Encoding
Each entry P^k[v,v] measures “how often does a k-step random walk starting from v return to v?” This is directly related to:
- Cycle membership: v in a triangle → P³[v,v] > 0
- Clique membership: v in a clique-k → high P^k[v,v]
- Degree: high-degree nodes are returned to more often
RWPE is a structural encoding: it encodes the local structural role of v. Two nodes with identical local structure get the same RWPE — intended behaviour. Nodes with different cycle membership, degree patterns, or connectivity get different PEs.
Comparison with LapPE
| Property | LapPE | RWPE | ||||
|---|---|---|---|---|---|---|
| Type | Positional (global position) | Structural (local role) | ||||
| Sign ambiguity | Yes — requires fix | No (probabilities are positive) | ||||
| Computation | O(k· | E | ·T) eigenvector | O(K· | E | ) sparse power iteration |
| Captures | Global graph geometry | Local structural patterns | ||||
| Distinguishes | Globally unique positions | Structurally unique roles | ||||
| Used by | SAN, GPS | GPS, CRaWL, GRIT |
Computing RWPE Efficiently
You only need the diagonal of P^k. Using the recursion P^k = P · P^{k-1}, you can compute the full sequence iteratively:
P = D_inv @ A # row-stochastic: P[v,:] sums to 1
Pk = torch.eye(N) # P^0 = I
rwpe = []
for k in range(1, K+1):
Pk = Pk @ P
rwpe.append(Pk.diagonal()) # landing probs at step k
pe = torch.stack(rwpe, dim=-1) # [N, K]
| Cost: K sparse matrix multiplications, each O( | E | ). Total: O(K · | E | ). Much cheaper than eigendecomposition for large K. |
RWPE in Practice
GPS (Rampášek et al., 2022) found RWPE competitive with LapPE and often preferable:
- No sign ambiguity → more stable training
- Cheaper to compute → scales better
- Captures structural roles (cycles, cliques) rather than global positions
For tasks where structural role matters more than global position (molecular property prediction, graph classification), RWPE often outperforms LapPE.
Summary
RWPE is the practical, sign-free alternative to LapPE. It encodes local structural fingerprints — cycle membership, clique participation, escape probabilities — through simple sparse power iterations. When global position matters, use LapPE; when local structural role matters, use RWPE; when both matter, use both (GPS does this).
References
- Dwivedi, V. P., Lim, A. T., Beaini, D., & Lió, P. (2021). Graph Neural Networks with Learnable Structural and Positional Representations. ICLR 2022.
- Rampasek, L., Galkin, M., Dwivedi, V. P., Lim, A. T., Wolf, G., & Beaini, D. (2022). Recipe for a General, Powerful, Scalable Graph Transformer. NeurIPS 2022 (GPS — uses RWPE as default positional encoding).
