Equivariance: What It Means and Why It Matters

TL;DR: A function f is G-equivariant if f(g · x) = g · f(x) for all transformations g in group G. Invariance is the special case where f(g · x) = f(x). Geometric deep learning builds equivariance into model architecture by design — this is more sample-efficient than learning it from augmented data.

Groups and Symmetry

A group G is a set of transformations {g} that is closed under an associative composition rule, contains an identity element, and contains an inverse for every element. Symmetry groups relevant to 3D geometry:

  • SE(3): rotations + translations in 3D (rigid body motions). SE = Special Euclidean.
  • E(3): rotations + translations + reflections. E = Euclidean.
  • E(n): rotations + translations + reflections in n-dimensional space.
  • SO(3): rotations only (no reflections, no translations).

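For molecular tasks, the relevant groups are SE(3) and E(3). The choice depends on whether the target should distinguish a molecule from its mirror image: E(3) includes reflections, SE(3) does not.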
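To make the group properties concrete, here is a minimal NumPy sketch. The helper names (`rotation_z`, `rotation_x`, `is_orthogonal`, `apply_se3`) are illustrative, not taken from any library. It checks closure and inverses for rotations, shows that a reflection has determinant -1 (so it lies outside SO(3)), and applies a rigid motion to a set of points.

```python
import numpy as np

def rotation_z(theta):
    """Rotation by angle theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def rotation_x(theta):
    """Rotation by angle theta (radians) about the x-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s,  c]])

def is_orthogonal(M, tol=1e-10):
    """True if M preserves lengths and angles, i.e. M^T M = I."""
    return np.allclose(M.T @ M, np.eye(3), atol=tol)

def apply_se3(R, t, x):
    """Apply the rigid motion (R, t) to points x of shape (n, 3)."""
    return x @ R.T + t

R1, R2 = rotation_z(0.7), rotation_x(-1.2)
composed = R1 @ R2                       # closure: the product is again a rotation
inverse = R1.T                           # the inverse of a rotation is its transpose
reflection = np.diag([1.0, 1.0, -1.0])   # mirror through the xy-plane

print(is_orthogonal(composed), np.linalg.det(composed))  # orthogonal, det close to +1
print(np.allclose(R1 @ inverse, np.eye(3)))              # R1 composed with its inverse is the identity
print(np.linalg.det(reflection))                         # det is -1: in E(3), not in SE(3)

# A rigid motion preserves the distance between any two points.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
moved = apply_se3(composed, np.array([5.0, -1.0, 2.0]), pts)
print(np.isclose(np.linalg.norm(pts[0] - pts[1]),
                 np.linalg.norm(moved[0] - moved[1])))
```

The sign of the determinant is what separates SO(3)/SE(3) from the groups that include reflections.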

Invariance vs Equivariance

Let ρ_in and ρ_out be the representations of G on the input and output spaces respectively (i.e., how transformations act on inputs/outputs).

G-invariant: f(ρ_in(g) · x) = f(x). Output does not change when input is transformed.

f(R · x) = f(x) for all rotations R ∈ SO(3)

Example: molecular potential energy. Rotating the molecule doesn’t change its energy.

G-equivariant: f(ρ_in(g) · x) = ρ_out(g) · f(x). Output transforms consistently with input.

f(R · x) = R · f(x) for all rotations R ∈ SO(3)

Example: atomic forces. If we rotate the molecule, the forces rotate the same way.

Note: invariance is a special case of equivariance where ρ_out is the trivial representation (all g map to the identity).
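Both definitions can be checked numerically. The sketch below assumes a toy harmonic pair potential (not a real force field) and uses illustrative helper names. It verifies that the energy is unchanged by a rotation while the analytic forces rotate with the molecule. Positions are stored as rows, so rotating by R is written `x @ R.T`.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:       # flip one axis so that det(Q) = +1
        Q[:, 0] = -Q[:, 0]
    return Q

def energy(x, r0=1.0):
    """Toy pair energy: sum over pairs i<j of (|x_i - x_j| - r0)^2."""
    diff = x[:, None, :] - x[None, :, :]        # (n, n, 3)
    dist = np.linalg.norm(diff, axis=-1)        # (n, n)
    iu = np.triu_indices(len(x), k=1)           # count each pair once
    return np.sum((dist[iu] - r0) ** 2)

def forces(x, r0=1.0):
    """Analytic forces F_i = -dE/dx_i for the toy energy above."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)                 # avoid 0/0 on the diagonal
    coeff = -2.0 * (dist - r0) / dist           # scalar weight per pair
    np.fill_diagonal(coeff, 0.0)                # no self-interaction
    return np.sum(coeff[:, :, None] * diff, axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
R = random_rotation(rng)
x_rot = x @ R.T

print(np.isclose(energy(x_rot), energy(x)))           # invariance: f(Rx) = f(x)
print(np.allclose(forces(x_rot), forces(x) @ R.T))    # equivariance: f(Rx) = R f(x)
```

The energy depends on positions only through distances, which is why it is invariant. The forces are scalar weights times difference vectors, which is why they are equivariant.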

Why Equivariance Is Better Than Augmentation

Data augmentation approach: train on random rotations of the molecule, hoping the model learns rotational invariance from data.

Problems:

  1. Requires many rotations per sample → expensive
  2. The model might learn approximate invariance, not exact invariance
  3. Generalisation to unseen orientations is not guaranteed

Equivariant approach: build the constraint into the architecture. The model is exactly equivariant by design — for any input orientation, the output transforms correctly. No augmentation needed.

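Practical advantage: sample efficiency. On molecular benchmarks, equivariant models can reach comparable accuracy with roughly 10× fewer training samples than augmentation-based approaches, although the exact factor depends on the dataset and the models being compared.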

The CNN analogy: a convolutional layer is equivariant to translations. Shifting the image shifts the feature maps by the same amount (exactly for stride-1 convolutions away from the image borders; striding, pooling, and padding make it approximate in practice). This is baked into the convolution operation (shared weights + sliding window). We don't augment with all possible image shifts; instead, the architecture encodes translation equivariance. Geometric GNNs do the same for rotations and reflections.
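The translation case is easy to verify in one dimension. This is a minimal sketch that assumes periodic (wrap-around) boundaries so that the identity holds exactly; `circular_conv1d` is an illustrative helper, not a library function.

```python
import numpy as np

def circular_conv1d(signal, kernel):
    """Cross-correlate signal with kernel using wrap-around (periodic) padding."""
    n, k = len(signal), len(kernel)
    out = np.zeros(n)
    for i in range(n):
        for j in range(k):
            out[i] += kernel[j] * signal[(i + j) % n]   # same weights at every position
    return out

rng = np.random.default_rng(0)
signal = rng.normal(size=16)
kernel = rng.normal(size=3)
shift = 5

shift_then_conv = circular_conv1d(np.roll(signal, shift), kernel)
conv_then_shift = np.roll(circular_conv1d(signal, kernel), shift)

# Equivariance: convolving a shifted signal equals shifting the convolved signal.
print(np.allclose(shift_then_conv, conv_then_shift))
```

The check passes for any kernel because the same weights are applied at every position. No training is involved.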

Representations: Scalars, Vectors, Tensors

The representation ρ_out determines how the output transforms:

Scalar (l=0 / invariant): a single number. Energy, charge, mass. Unchanged by rotation: ρ(R) = 1.

Vector (l=1 / equivariant): a 3D vector. Forces, velocities, dipole moment. Rotates with the molecule: ρ(R) = R.

Rank-2 tensor: a 3×3 matrix. Stress tensor, polarisability. Transforms as ρ(R) = R ⊗ R, i.e. T ↦ R T Rᵀ.

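Irreducible representations (irreps) of SO(3): characterised by degree l, with dimension 2l+1. l=0 is a scalar, l=1 is a vector, and l=2 is the symmetric traceless part of a rank-2 tensor (5 components). A general 3×3 tensor has 9 components and decomposes into l=0, l=1, and l=2 parts. Higher l captures finer geometric information at increasing computational cost.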
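The rank-2 case can be sketched in NumPy. The example builds a tensor as the outer product of two vectors, checks that rotating the vectors is the same as applying R ⊗ R, and splits the tensor into its trace, antisymmetric, and symmetric traceless parts (1 + 3 + 5 = 9 components). The function names are illustrative assumptions.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def decompose(T):
    """Split a 3x3 tensor into parts that transform as l=0, l=1 and l=2."""
    trace_part = np.trace(T) / 3.0 * np.eye(3)       # 1 independent component
    antisym = 0.5 * (T - T.T)                        # 3 independent components
    sym_traceless = 0.5 * (T + T.T) - trace_part     # 5 independent components
    return trace_part, antisym, sym_traceless

rng = np.random.default_rng(1)
R = random_rotation(rng)
u, v = rng.normal(size=3), rng.normal(size=3)

T = np.outer(u, v)                    # rank-2 tensor built from two vectors
T_rot = np.outer(R @ u, R @ v)        # rotate the vectors, then rebuild the tensor

print(np.allclose(T_rot, R @ T @ R.T))                      # T -> R T R^T
print(np.allclose(T_rot.ravel(), np.kron(R, R) @ T.ravel()))  # same map written as R (x) R

# Each part transforms within itself, so the three subspaces never mix.
for part, part_rot in zip(decompose(T), decompose(T_rot)):
    print(np.allclose(part_rot, R @ part @ R.T))

print(np.isclose(np.trace(T_rot), np.trace(T)))             # the l=0 part is invariant
```

The trace behaves like a scalar, the antisymmetric part carries the same information as the cross product u × v (a vector-like quantity), and the remaining five components form the l=2 irrep.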

Types of Equivariant Models

Type 1: Distance-based invariance. Features: only interatomic distances and angles. Output: scalar only. Architectures: SchNet, DimeNet. Limitation: cannot predict vectors directly; forces are instead obtained as the negative gradient of the predicted energy, which is equivariant by construction.

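Type 2: Vector-based equivariance (E(3)/SE(3)). Features: positions as vectors, combined with scalar features. Output: scalars + vectors. Architectures: EGNN, PaiNN. (NequIP is often listed here too, but it uses higher-degree irreps and tensor products, so it sits closer to Type 3.)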

Type 3: Tensor field networks (full irreps). Features: spherical harmonics up to degree L. Output: arbitrary tensor fields. Architectures: TFN, SE(3)-Transformers, MACE. Limitation: expensive. Each feature channel carries (L+1)² components, and the tensor products that combine them grow more steeply still with L.
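As a small illustration of the distance-based end of this spectrum, the sketch below computes pairwise distances and checks that they are unchanged by a rotation, a translation, and a reflection, that is, by all of E(3). The `distance_features` helper is a hypothetical stand-in for the input features of a Type 1 model, not code from SchNet or DimeNet.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def distance_features(x):
    """All pairwise distances |x_i - x_j| for i < j, in a fixed atom order."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(x), k=1)
    return dist[iu]

rng = np.random.default_rng(2)
x = rng.normal(size=(6, 3))
R = random_rotation(rng)
t = rng.normal(size=3)
mirror = np.diag([1.0, 1.0, -1.0])

rigid = x @ R.T + t          # an SE(3) transformation
reflected = x @ mirror.T     # a reflection: in E(3) but not in SE(3)

print(np.allclose(distance_features(rigid), distance_features(x)))
print(np.allclose(distance_features(reflected), distance_features(x)))
```

Because the reflected structure produces identical features, a model that sees only distances cannot tell a structure from its mirror image.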

Building Equivariant Layers

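Any layer that combines inputs only through:

  1. Equivariant linear maps (learned weights mix feature channels, but treat the x, y, z components of every vector identically)
  2. Invariant scalars (distances, norms, dot products)
  3. Tensor products (combining irreps)

is itself equivariant, and so is any composition of such layers. The key constraint: never feed raw coordinate components into an arbitrary MLP or apply a nonlinearity to them component by component, since the result then depends on the choice of axes and equivariance is lost. Nonlinearities belong on invariant scalars, which can then be used to scale vectors.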
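These rules can be sketched as a simplified EGNN-style layer. This is an illustration under assumptions, not the published architecture: the weight names, sizes, and the `tiny_mlp` helper are hypothetical. MLPs see only invariant inputs (node features and squared distances), and coordinates are updated by scaling difference vectors with the resulting scalars. A naive layer that applies a nonlinearity to raw coordinates is included as a counterexample.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def tiny_mlp(z, W1, W2):
    """Two-layer MLP on the last axis; only ever applied to invariant inputs."""
    return np.tanh(z @ W1) @ W2

def init_params(rng, f=4, hidden=8):
    """Random weights for the sketch (hypothetical shapes)."""
    return {"We1": 0.3 * rng.normal(size=(2 * f + 1, hidden)),
            "We2": 0.3 * rng.normal(size=(hidden, f)),
            "Wx1": 0.3 * rng.normal(size=(f, hidden)),
            "Wx2": 0.3 * rng.normal(size=(hidden, 1))}

def egnn_style_layer(h, x, p):
    """h: (n, f) invariant node features, x: (n, 3) coordinates."""
    n, f = h.shape
    diff = x[:, None, :] - x[None, :, :]                  # (n, n, 3), equivariant
    d2 = np.sum(diff ** 2, axis=-1, keepdims=True)        # (n, n, 1), invariant
    hi = np.broadcast_to(h[:, None, :], (n, n, f))
    hj = np.broadcast_to(h[None, :, :], (n, n, f))
    edge_in = np.concatenate([hi, hj, d2], axis=-1)       # invariants only
    m = tiny_mlp(edge_in, p["We1"], p["We2"])             # (n, n, f) messages
    w = tiny_mlp(m, p["Wx1"], p["Wx2"])                   # (n, n, 1) scalar weights
    mask = 1.0 - np.eye(n)[:, :, None]                    # drop self-interactions
    x_new = x + np.sum(mask * w * diff, axis=1) / (n - 1) # scalars scale vectors
    h_new = h + np.sum(mask * m, axis=1)
    return h_new, x_new

def naive_layer(x, W):
    """Counterexample: nonlinearity applied to raw coordinate components."""
    return np.tanh(x @ W)

rng = np.random.default_rng(3)
h, x = rng.normal(size=(5, 4)), rng.normal(size=(5, 3))
p = init_params(rng)
R, t = random_rotation(rng), rng.normal(size=3)

h1, x1 = egnn_style_layer(h, x, p)
h2, x2 = egnn_style_layer(h, x @ R.T + t, p)
print(np.allclose(h2, h1))              # node features are invariant
print(np.allclose(x2, x1 @ R.T + t))    # coordinates transform with the input

W = rng.normal(size=(3, 3))
print(np.allclose(naive_layer(x @ R.T, W), naive_layer(x, W) @ R.T))  # not equivariant in general
```

The equivariant layer passes both checks for any weights, trained or not, because the property comes from the structure of the computation rather than from the parameter values.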

Summary

| Concept | Definition | Example / note |
|---|---|---|
| Invariant | f(Rx) = f(x) | Potential energy |
| Equivariant | f(Rx) = R f(x) | Forces |
| Augmentation | Learn symmetry from data | Expensive, approximate |
| Architectural equivariance | Baked-in symmetry | Exact, sample-efficient |
| Scalar (l=0) | Unchanged by rotation | Energy, charge |
| Vector (l=1) | Rotates with molecule | Force, velocity |

Equivariance is the mathematical foundation of geometric deep learning. Every architecture in the next posts — EGNN, SE(3)-Transformers, TFN — is a concrete instantiation of these principles.
