Witness Complexes: Sparse Topology at Scale

2 minute read

Published: September 07, 2025

TL;DR: The witness complex (de Silva & Carlsson, 2004) uses a small set of landmarks L ⊂ P and lets all other data points "witness" the inclusion of simplices between landmarks. A simplex on landmarks is included if it has a witness — a data point closer to those landmarks than to any others. This keeps the complex size O(|L|^k) rather than O(|P|^k), enabling TDA on millions of points.

Motivation: Scalability of Vietoris-Rips

The Vietoris-Rips complex on $n$ points has up to $O(n^k)$ simplices of dimension $k$. For $n = 10^5$ and $k = 2$, this is $10^{10}$ triangles — computationally infeasible. We need a way to capture the same topology with fewer simplices.

The key insight: most data points carry redundant topological information. If we pick a small representative set of landmarks $L \subset P$ ($$

\ll

$), we can define a complex on$L$that captures the topology of$P$$ by using the remaining points as witnesses.

Lazy Witness Complex

Choose landmarks $L = \{l_1, \ldots, l_m\} \subset P$. For a data point $z \in P$, let $m_k(z)$ be its $(k+1)$-th nearest landmark distance (the distance to the $(k+1)$-th closest landmark, counting from 0).

The lazy witness complex $W(P, L, \nu)$ at parameter $\nu \geq 0$ includes a simplex $\sigma = [l_{i_0}, \ldots, l_{i_k}]$ (a set of landmarks) if there exists a witness $z \in P$ such that:

$$\max_{j} d(z, l_{i_j}) \leq \nu + m_k(z)$$

As $\nu$ increases from 0 to ∞, we get the witness filtration, whose persistent homology approximates the topology of $P$.

Landmark Selection

The choice of landmarks affects the quality of the approximation. Common strategies:

Random selection: simple, works for large $n$.
Maxmin sequential: iteratively select the point farthest from all currently selected landmarks — provides a $\varepsilon$-net guarantee.
k-means centroids: landmarks represent cluster centres.

The number of landmarks $m$ typically ranges from $\sqrt{n}$ to $n/10$. With $m = 300$ landmarks, a complex with millions of points becomes tractable.

Key Insight: The witness complex is not guaranteed to be homotopy equivalent to $$P$$ in general, but under mild conditions (when the data is well-sampled relative to its reach), the persistent homology of the witness filtration approximates that of the Čech filtration. For large-scale TDA in practice, witness complexes are the primary tool when Vietoris-Rips is too large.

References

V. de Silva & G. Carlsson, “Topological Estimation Using Witness Complexes,” SPBG 2004. link.
G. Carlsson, “Topology and Data,” Bulletin of the AMS, 2009.

Share on

Bluesky Facebook LinkedIn X (formerly Twitter)

Alessio Borgi

Witness Complexes: Sparse Topology at Scale

Motivation: Scalability of Vietoris-Rips

Lazy Witness Complex

Landmark Selection

References

Share on

You May Also Enjoy

GAPE: Remember to Forget — Gated Adaptive Positional Encoding

PolyNSD: Polynomial Neural Sheaf Diffusion

TDA in Materials Science: Topology of Structure and Phase

TDA in Drug Discovery: Molecular Topology