Random Walk Positional Encodings

3 minute read

Published:

TL;DR: RWPE encodes node v as a vector [RW^1[v,v], RW^2[v,v], ..., RW^k[v,v]] — the probability of landing back on v after 1, 2, ..., k random walk steps. This captures local structural roles (cycle membership, connectivity patterns) without sign ambiguity or expensive eigendecomposition.

The Idea

The random walk matrix is P = D⁻¹A (row-stochastic adjacency). P^k[u,v] is the probability of reaching v from u in exactly k steps.

The diagonal entries P^k[v,v] — the probability of returning to v in k steps — encode v’s local structural role:

  • Nodes in dense cliques have high return probabilities (many short cycles)
  • Nodes on long paths have low return probabilities (random walks escape quickly)
  • Nodes in k-cycles have elevated P^k[v,v]
pe_v = [P^1[v,v], P^2[v,v], ..., P^K[v,v]] ∈ ℝᴷ

Why RWPE Works as a Structural Encoding

Each entry P^k[v,v] measures “how often does a k-step random walk starting from v return to v?” This is directly related to:

  • Cycle membership: v in a triangle → P³[v,v] > 0
  • Clique membership: v in a clique-k → high P^k[v,v]
  • Degree: high-degree nodes are returned to more often

RWPE is a structural encoding: it encodes the local structural role of v. Two nodes with identical local structure get the same RWPE — intended behaviour. Nodes with different cycle membership, degree patterns, or connectivity get different PEs.

Comparison with LapPE

PropertyLapPERWPE    
TypePositional (global position)Structural (local role)    
Sign ambiguityYes — requires fixNo (probabilities are positive)    
ComputationO(k·E·T) eigenvectorO(K·E) sparse power iteration
CapturesGlobal graph geometryLocal structural patterns    
DistinguishesGlobally unique positionsStructurally unique roles    
Used bySAN, GPSGPS, CRaWL, GRIT    

Computing RWPE Efficiently

pe_v[k] = (D⁻¹A)^k [v,v] = e_vᵀ P^k e_v

You only need the diagonal of P^k. Using the recursion P^k = P · P^{k-1}, you can compute the full sequence iteratively:

P = D_inv @ A          # row-stochastic: P[v,:] sums to 1
Pk = torch.eye(N)      # P^0 = I
rwpe = []
for k in range(1, K+1):
    Pk = Pk @ P
    rwpe.append(Pk.diagonal())  # landing probs at step k
pe = torch.stack(rwpe, dim=-1)  # [N, K]
Cost: K sparse matrix multiplications, each O(E). Total: O(K ·E). Much cheaper than eigendecomposition for large K.

RWPE in Practice

GPS (Rampášek et al., 2022) found RWPE competitive with LapPE and often preferable:

  • No sign ambiguity → more stable training
  • Cheaper to compute → scales better
  • Captures structural roles (cycles, cliques) rather than global positions

For tasks where structural role matters more than global position (molecular property prediction, graph classification), RWPE often outperforms LapPE.

Summary

RWPE is the practical, sign-free alternative to LapPE. It encodes local structural fingerprints — cycle membership, clique participation, escape probabilities — through simple sparse power iterations. When global position matters, use LapPE; when local structural role matters, use RWPE; when both matter, use both (GPS does this).

References