
# On Filtrations


Filtrations are one of the aspects of the development of Persistent Homology that have gone through a sequence of different options - both for algebraic abstractions and for algorithmic implementations. On this page we'll try to give a partial overview of the historical landscape.

## The Origins

Filtration as a concept is quite a bit more entrenched in abstract algebra and algebraic geometry, where you would study a graded ring and a graded module, with compatible gradings, and be interested in how these gradings split up an object. One useful example to keep in mind is the polynomial ring in some number of variables, graded by total degree. The degree 0 part would be all constants from the underlying ring (or field), degree 1 would be pure linear forms (without constant part), degree 2 would be pure square forms, etc.

It can be quite useful here to study "all degrees up to and including the one I'm interested in", so that "degree" 0 would contain just constants, "degree" 1 would contain constants and linear terms - possibly mixed - and "degree" 2 would contain all polynomials, possibly mixed, of total degree at most 2, etc.
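
In symbols, if $R_d$ denotes the part of pure degree $d$, the corresponding filtration collects everything up to degree $d$:

$$F_d = R_0 \oplus R_1 \oplus \dots \oplus R_d, \qquad F_0 \subseteq F_1 \subseteq F_2 \subseteq \dots$$

For instance, in $k[x, y]$ the pure degree 2 part is spanned by $x^2, xy, y^2$, while $F_2$ additionally contains $1, x, y$.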

In geometric terms, the case where you isolate pure degrees is closely connected to projective algebraic geometry, while the case where you allow mixed terms connects to affine algebraic geometry.

These notions show up naturally in topology as well, where you can, for instance, define a cell complex by letting degree $k$ be the part that contains only cells of dimension $\leq k$ (the $k$-skeleton) and go up in dimension by attaching $(k+1)$-dimensional cells to the rest of the skeleton wherever needed.

In each of these contexts, the notion of a filtered object emerges, where a filtered object (i.e. filtered ring, filtered module, filtered cell complex, ...) is an increasing chain of sub-objects. So a filtered module is a module $M$ together with a sequence of submodules, each contained in the next, $M_0 \subseteq M_1 \subseteq \dots \subseteq M$.

## Enter persistent homology

This was the setting many researchers had in mind in the early 2000s, when Edelsbrunner-Letscher-Zomorodian[^1], and later Zomorodian-Carlsson[^2], first created the persistence algorithm, and then gave it an algebraic interpretation and foundation. The original persistence algorithm in Edelsbrunner-Letscher-Zomorodian[^1] assumes the simplices come with a total order, which they call the "filter" - and the simplices appear to the algorithm in this order. Taking "all simplices that appeared up until time $t$" for each $t$ then (with mild assumptions on the total order, such as every face preceding its cofaces) produces a filtered simplicial complex in the traditional sense.

In Zomorodian-Carlsson[^2], the manipulations inherent in the persistence algorithm were given an algebraic interpretation: the filter was encoded in the degrees of a graded module structure over the polynomial ring $k[t]$ in one variable. Spelling this out: at each filtration step, the chain complex (and its boundaries, cycles, and homology) is a vector space, and the action of multiplying a vector in this vector space by a power $t^d$ corresponds to carrying the corresponding chain forward into the complex where $d$ additional simplices from the filter sequence have been added.
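
Written out as a formula (this is the correspondence from Zomorodian-Carlsson[^2], up to notational choices), a filtered chain complex $C_*^{(0)} \subseteq C_*^{(1)} \subseteq \dots$ becomes the graded $k[t]$-module

$$\alpha(C_*) = \bigoplus_{i \geq 0} C_*^{(i)},$$

where multiplication by $t$ maps the degree $i$ summand into the degree $i+1$ summand along the inclusion $C_*^{(i)} \hookrightarrow C_*^{(i+1)}$.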

This formulation, using graded $k[t]$-modules, is particularly useful for theoretical justifications: $k[t]$ is a PID (principal ideal domain), and the (not necessarily obvious) existence of barcodes representing persistent homology emerges as an immediate consequence of the classical structure theorem from undergraduate abstract algebra classifying and decomposing finitely generated modules over a PID.

A fundamental drawback, however, is that the typical algorithms for generating filtered simplicial complexes do not tend to work in terms of powers of a discrete "do something to the system" operation that we could call $t$ - they work in terms of some continuous notion of distance that passes through events where the underlying encoded topology changes. The Zomorodian-Carlsson approach is to pick a starting value $\epsilon_0$ and a step size $\delta$ and work with values $t^d \sim \epsilon_0 + d\delta$ that the filtration moves through. Alternatively, you can simply instruct the user (or the proof) to pick distance thresholds spread out with one value between each pair of event transition times, and use those. In either case, the action of $t$ moves the construction forward from one $\epsilon$-value to the next.

Example: It is probably good at this point to illustrate these notions with a complete worked example. Take the following points and their Vietoris-Rips complex:

| point | x | y |
|-------|-------------|-------------|
| 0 | 0.98918612 | -0.01292867 |
| 1 | 0.79879243 | 0.96093161 |
| 2 | 0.31920789 | 0.59161701 |
| 3 | -0.33133186 | -0.59300152 |
| 4 | -0.82346418 | -0.9510965 |
| 5 | -1.00378432 | -0.01407998 |
| 6 | -0.81023642 | 0.94364748 |
| 7 | -0.31922949 | 0.56465365 |
| 8 | 0.32364409 | -0.57126628 |
| 9 | 0.80481295 | -0.95461111 |
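
As a sanity check, here is a minimal Python sketch (the function and variable names are ours, not from any of the libraries discussed below) that reproduces the filtration values in the table that follows: a Vietoris-Rips simplex enters at the largest pairwise distance among its vertices.

```python
import itertools
import math

points = [
    ( 0.98918612, -0.01292867), ( 0.79879243,  0.96093161),
    ( 0.31920789,  0.59161701), (-0.33133186, -0.59300152),
    (-0.82346418, -0.9510965 ), (-1.00378432, -0.01407998),
    (-0.81023642,  0.94364748), (-0.31922949,  0.56465365),
    ( 0.32364409, -0.57126628), ( 0.80481295, -0.95461111),
]

def vietoris_rips(points, max_value):
    """All simplices up to dimension 2 of the Vietoris-Rips filtration,
    as (vertex tuple, filtration value) pairs sorted in filter order."""
    n = len(points)
    dist = {(i, j): math.dist(points[i], points[j])
            for i, j in itertools.combinations(range(n), 2)}
    simplices = [((i,), 0.0) for i in range(n)]
    simplices += [(e, d) for e, d in dist.items() if d <= max_value]
    for t in itertools.combinations(range(n), 3):
        d = max(dist[e] for e in itertools.combinations(t, 2))
        if d <= max_value:
            simplices.append((t, d))
    # Break ties by dimension so that every face precedes its cofaces.
    return sorted(simplices, key=lambda s: (s[1], len(s[0])))
```

Calling `vietoris_rips(points, 1.0)` returns the 28 simplices of the table below, in the order listed.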

Going up to a Vietoris-Rips filtration value of 1.0, the following simplices occur, with degrees computed either with one value between each transition point (method 1) or with $\epsilon_0 = 0$ and $\delta = 0.01$ (method 2):

| simplex | filtration value | degree (method 1) | degree (method 2) |
|---------|------------------|-------------------|-------------------|
| $[0]$ | 0 | 0 | 0 |
| $[1]$ | 0 | 0 | 0 |
| $[2]$ | 0 | 0 | 0 |
| $[3]$ | 0 | 0 | 0 |
| $[4]$ | 0 | 0 | 0 |
| $[5]$ | 0 | 0 | 0 |
| $[6]$ | 0 | 0 | 0 |
| $[7]$ | 0 | 0 | 0 |
| $[8]$ | 0 | 0 | 0 |
| $[9]$ | 0 | 0 | 0 |
| $[1,2]$ | 0.605305 | 1 | 60 |
| $[3,4]$ | 0.608627 | 2 | 60 |
| $[8,9]$ | 0.615205 | 3 | 61 |
| $[6,7]$ | 0.620261 | 4 | 62 |
| $[2,7]$ | 0.639006 | 5 | 63 |
| $[3,8]$ | 0.655336 | 6 | 65 |
| $[0,8]$ | 0.868727 | 7 | 86 |
| $[3,5]$ | 0.887323 | 8 | 88 |
| $[5,7]$ | 0.896408 | 9 | 89 |
| $[0,2]$ | 0.902411 | 10 | 90 |
| $[4,5]$ | 0.954209 | 11 | 95 |
| $[3,4,5]$ | 0.954209 | 12 | 95 |
| $[0,9]$ | 0.959562 | 13 | 95 |
| $[0,8,9]$ | 0.959562 | 14 | 95 |
| $[5,6]$ | 0.977089 | 15 | 97 |
| $[5,6,7]$ | 0.977089 | 16 | 97 |
| $[0,1]$ | 0.992297 | 17 | 99 |
| $[0,1,2]$ | 0.992297 | 18 | 99 |
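
The two degree columns can be reproduced mechanically. A minimal sketch, under the reading that method 1 hands out one new degree per simplex added after the (filtration value 0) vertices, and that method 2 snaps each value to the grid $\epsilon_0 + d\delta$:

```python
import math

def degrees_method1(filtration_values):
    """One new degree per simplex appearing after the value-0 vertices;
    assumes the values arrive sorted in filter order."""
    degrees, next_degree = [], 1
    for value in filtration_values:
        if value == 0:
            degrees.append(0)
        else:
            degrees.append(next_degree)
            next_degree += 1
    return degrees

def degrees_method2(filtration_values, eps0=0.0, delta=0.01):
    """The largest d with eps0 + d * delta <= the filtration value."""
    return [math.floor((v - eps0) / delta) for v in filtration_values]
```

For instance, `degrees_method2([0.605305])` returns `[60]`, matching the row for $[1,2]$.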

## Blurring the line between Filtration and Parameterization

As research has developed, connecting to the algebra of $k[t]$ has gone somewhat out of fashion, in favor of approaches that treat $\mathbb{R}$ as a total order and create a representation of that total order - as a module over the "order ring" $k[\mathbb{R}]$. This corresponds largely to allowing fractional values for the exponent in $k[t]$, and to rejecting the notion that advancing the parameter value is fixed at stepping to a "next" value.

To handle this setup, various abstraction approaches have emerged - some rooted in sheaf theory, some in representation theory. Common to them is viewing the filtration value as a defining quantity, and studying the structure of the transitions between specific filtration values. We still get a total order on the simplices by consuming them in order of filtration value, but we no longer tie ourselves to a particular discretization of $\mathbb{R}$.
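
Concretely, such a representation assigns a vector space to each parameter value and a linear map to each comparable pair of values:

$$M = (M_\epsilon)_{\epsilon \in \mathbb{R}}, \qquad \varphi_{\epsilon,\epsilon'} \colon M_\epsilon \to M_{\epsilon'} \text{ for } \epsilon \leq \epsilon', \qquad \varphi_{\epsilon',\epsilon''} \circ \varphi_{\epsilon,\epsilon'} = \varphi_{\epsilon,\epsilon''}.$$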

Arguably, these formulations are no longer really filtrations in the classical sense, but rather parameterizations that can be used to create filtrations - but the choice of using "filtration" to refer to all of these has stuck in the community.

## Representation in Data Structures

To actually represent these filtration steps and use them in algorithms, a variety of approaches have been in use.

### JPlex

Maintain parallel arrays / streams: one with simplices, and one with filtration values. Each simplex carefully tracks its filtration index `findex` through all computations, so that at the end the corresponding filtration value can be looked up and substituted into the persistence barcode endpoint values.
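
A toy illustration of the parallel-arrays idea in Python (the names here are illustrative, not JPlex's actual Java API):

```python
# Parallel arrays: position k holds the k-th simplex and its value.
simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
filtration_values = [0.0, 0.0, 0.0, 0.3, 0.5, 0.7, 0.7]

# The algorithm computes a persistence pair purely in terms of findices...
birth_findex, death_findex = 3, 6

# ...and only at the very end looks the filtration values back up.
barcode_interval = (filtration_values[birth_findex],
                    filtration_values[death_findex])
```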

### JavaPlex

Create a parallel FiltrationConverter instance that handles the conversion - this can either be lazy (computing the filtration value on request) or maintain an array of filtration values. The standard choice uses the $\epsilon_0 + k\delta$ approach described above.
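
Schematically, such a converter only needs to translate between indices and values; an illustrative sketch (not the actual JavaPlex class signature):

```python
import math

class GridConverter:
    """Illustrative stand-in for the FiltrationConverter idea, using
    the eps0 + k * delta grid from the discussion above."""
    def __init__(self, eps0=0.0, delta=0.01):
        self.eps0, self.delta = eps0, delta

    def index_to_value(self, k):
        # Lazy: the filtration value is computed on request.
        return self.eps0 + k * self.delta

    def value_to_index(self, value):
        return math.floor((value - self.eps0) / self.delta)
```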

### Dionysus

Each simplex has a field for carrying around arbitrary data, where the filtration value is deposited.
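
In Python terms the design looks roughly as follows (a sketch of the idea, not Dionysus's actual classes):

```python
class Simplex:
    """Each simplex carries a free-form data slot; the filtration
    value is simply deposited there."""
    def __init__(self, vertices, data=0.0):
        self.vertices = tuple(vertices)
        self.data = data

# Sorting by the deposited value (ties broken by dimension) recovers
# the filter order.
filtration = sorted(
    [Simplex([0]), Simplex([1]), Simplex([0, 1], data=0.3)],
    key=lambda s: (s.data, len(s.vertices)))
```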

### Ripser

Recomputes filtration values on demand, so as not to have to carry the data around any more than absolutely necessary.
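
Sketched in Python (Ripser itself is C++ and works with compact combinatorial encodings; this only illustrates the recompute-on-demand principle):

```python
import itertools
import math

def simplex_value(vertices, points):
    """Recompute a Vietoris-Rips filtration value on demand: the
    largest pairwise distance among the simplex's vertices."""
    return max((math.dist(points[i], points[j])
                for i, j in itertools.combinations(vertices, 2)),
               default=0.0)  # a lone vertex enters at 0
```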

## References

[^1]: H. Edelsbrunner, D. Letscher, and A. Zomorodian, "Topological Persistence and Simplification", Proceedings of the 41st Annual Symposium on Foundations of Computer Science, 2000.

[^2]: A. Zomorodian and G. Carlsson, "Computing Persistent Homology", Proceedings of the Twentieth Annual Symposium on Computational Geometry, 2004, pp. 347-356.