Density-based Riemannian metrics and Persistent Homology

XIMENA FERNANDEZ

Durham University

Dagstuhl Seminar

Fernandez, Borghini, Mindlin and Groisman. 'Intrinsic persistent homology via density-based metric learning.' JMLR (2023)

Motivation

Homology inference

Metric space: $(\mathbb X_n, d_E)\sim (\mathcal M, d_E)$

$\bullet ~ ~\mathrm{Rips}_\epsilon(\mathcal{M}, d_E)\simeq \mathcal{M}$ for $\epsilon < 2 \sqrt{\frac{D+1}{2D}}\mathrm{rch}(\mathcal{M})~~$ (Kim, Shin, Chazal, Rinaldo & Wasserman, 2020)

Homology inference

Metric space: $(\mathbb X_n, d_{kNN})\sim (\mathcal M, d_\mathcal{M})~~~$
(Bernstein, De Silva, Langford & Tenenbaum, 2000)

$\bullet ~ ~\mathrm{Rips}_\epsilon(\mathcal{M}, d_\mathcal{M})\simeq \mathcal{M}$ for $\epsilon < \mathrm{conv}(\mathcal{M}, d_{\mathcal{M}})~~$ (Hausmann, 1995; Latschev, 2001)

Density-based metric learning

Fermat distance

(McKenzie & Damelin, 2019) (Groisman, Jonckheere & Sapienza, 2022)

Let $\mathbb{X}_n = \{x_1,...,x_n\}\subseteq \mathbb{R}^D$ be a finite sample.

For $p> 1$, the Fermat distance between $x,y\in \mathbb{R}^D$ is defined by \[ d_{\mathbb{X}_n, p}(x,y) = \inf_{\gamma} \sum_{i=0}^{r}|x_{i+1}-x_i|^{p} \] over all paths $\gamma=(x_0, \dots, x_{r+1})$ of finite length with $x_0=x$, $x_{r+1} = y$ and $\{x_1, x_2, \dots, x_{r}\}\subseteq \mathbb{X}_n$.
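
The definition above can be read as a shortest-path problem on the complete graph over the sample, with edge weights $|x_i - x_j|^p$. The following is a minimal sketch under that reading, restricted to the case where both endpoints are sample points. The function name `fermat_distances` is hypothetical and this is not the API of the `fermat` library; it uses plain Floyd-Warshall, so it is only meant for small inputs.

```python
import math

def fermat_distances(points, p):
    """All-pairs sample Fermat distance on a finite point set.

    The edge weight between two sample points is |x - y|**p, and the
    Fermat distance is the shortest-path length in the complete graph.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) ** p for j in range(n)]
            for i in range(n)]
    # Floyd-Warshall: allow each sample point k as an intermediate stop.
    for k in range(n):
        for i in range(n):
            dik = dist[i][k]
            for j in range(n):
                candidate = dik + dist[k][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist

# Three collinear points: for p > 1, passing through the midpoint is cheaper.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
d = fermat_distances(pts, p=2)
print(d[0][2])  # 2.0 (= 1**2 + 1**2), versus 4.0 for the direct jump
```

The example shows the effect of $p>1$: many short hops through sample points are favoured over one long jump, so paths tend to follow regions where the sample is dense.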

Density-based geometry

(Hwang, Damelin & Hero, 2016)

Let $\mathcal M \subseteq \mathbb{R}^D$ be a manifold and let $f\colon\mathcal{M}\to \mathbb{R}_{>0}$ be a smooth density.

For $q>0$, the deformed Riemannian distance* in $\mathcal{M}$ is \[d_{f,q}(x,y) = \inf_{\gamma} \int_{I}\frac{1}{f(\gamma_t)^{q}}\,\|\dot{\gamma}_t\|\, dt \] where the infimum is taken over all curves $\gamma\colon I\to \mathcal{M}$ with $\gamma_0 = x$ and $\gamma_1=y$.

* Here, if $g$ is the inherited Riemannian tensor, then $d_{f,q}$ is the Riemannian distance induced by $g_q= f^{-2q} g$.
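
As a quick sanity check of the definition (a worked special case, not taken from the slides): if the density is constant, $f\equiv c>0$, the weight factors out of the integral and \[ d_{f,q}(x,y) = c^{-q}\,\inf_{\gamma}\int_I \|\dot{\gamma}_t\|\,dt = c^{-q}\, d_{\mathcal M}(x,y), \] so the deformed distance is just a rescaling of the geodesic distance $d_{\mathcal M}$. For non-constant $f$, curves through high-density regions are cheaper than curves of the same length through low-density regions.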

Convergence results

(F., Borghini, Mindlin & Groisman, 2023)

Let $\mathcal{M}$ be a closed smooth $d$-dimensional manifold embedded in $\mathbb{R}^D$. Let $\mathbb{X}_n$ be a set of $n$ points sampled from $\mathcal M$ according to a density $f\colon \mathcal M\to \mathbb R$.

\[\big(\mathbb{X}_n, C(n,p,d) d_{\mathbb{X}_n,p}\big)\xrightarrow[n\to \infty]{GH}\big(\mathcal{M}, d_{f,q}\big) ~~~ \text{ for } q = (p-1)/d\]

Theorem (F., Borghini, Mindlin, Groisman, 2023)

Given $p>1$ and $q=(p-1)/d$, there exists a constant $\mu = \mu(p,d)$ such that for every $\lambda \in \big((p-1)/(pd), 1/d\big)$ and $\varepsilon>0$ there exists $\theta>0$ satisfying \[ \mathbb{P}\left( d_{GH}\left(\big(\mathcal{M}, d_{f,q}\big), \big(\mathbb{X}_n, {\scriptstyle \frac{n^{q}}{\mu}} d_{\mathbb{X}_n, p}\big)\right) > \varepsilon \right) \leq \exp{\left(-\theta n^{(1 - \lambda d) /(d+2p)}\right)} \] for $n$ large enough.
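
To make the exponents concrete (an illustrative choice of parameters, not from the slides), take $d=1$ and $p=2$. Then \[ q = \frac{p-1}{d} = 1, \qquad \lambda\in\Big(\tfrac{1}{2},\, 1\Big), \qquad \frac{1-\lambda d}{d+2p} = \frac{1-\lambda}{5}, \] so the sample distance is rescaled by $n/\mu$ and, for instance with $\lambda = 3/4$, the bound decays like $\exp\big(-\theta n^{1/20}\big)$.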

Convergence results

(F., Borghini, Mindlin & Groisman, 2023)

\[\mathrm{dgm}(\mathrm{Filt}(\mathbb{X}_n, {C(n,p,d)} d_{\mathbb{X}_n,p}))\xrightarrow[n\to \infty]{B}\mathrm{dgm}(\mathrm{Filt}(\mathcal{M}, d_{f,q})) ~~~ \text{ for } q = (p-1)/d\]

Theorem (F., Borghini, Mindlin, Groisman, 2023)

Given $p>1$ and $q=(p-1)/d$, there exists a constant $\mu = \mu(p,d)$ such that for every $\lambda \in \big((p-1)/(pd), 1/d\big)$ and $\varepsilon>0$ there exists $\theta>0$ satisfying \[ \mathbb{P}\Big( d_B\big(\mathrm{dgm}(\mathrm{Filt}(\mathcal{M}, d_{f,q})),\mathrm{dgm}(\mathrm{Filt}(\mathbb{X}_n, {\scriptstyle \frac{n^{q}}{\mu}} d_{\mathbb{X}_n,p}))\big)>\varepsilon\Big) \leq \exp{\big(-\theta n^{(1 - \lambda d)/(d+2p)}\big)}\] for $n$ large enough.

Fermat-based persistence diagrams

Intrinsic reconstruction

$\bullet ~ ~\mathrm{Rips}_\epsilon(\mathcal{M}, d_{f,q})\simeq \mathcal{M}$ for $\epsilon < \mathrm{conv}(\mathcal{M}, d_{f,q})$

Fermat-based persistence diagrams

Robustness to outliers

Prop (F., Borghini, Mindlin, Groisman, 2023)

Let $\mathbb{X}_n$ be a sample of $\mathcal{M}$ and let $Y\subseteq \mathbb{R}^D\smallsetminus \mathcal{M}$ be a finite set of outliers.
Let $\delta = \displaystyle \min\Big\{\min_{y\in Y} d_E(y, Y\smallsetminus \{y\}), ~d_E(\mathbb X_n, Y)\Big\}$.
Then, for all $k>0$ and $p>1$, \[ \mathrm{dgm}_k(\mathrm{Rips}_{<\delta^p}(\mathbb{X}_n \cup Y, d_{\mathbb{X}_n\cup Y, p})) = \mathrm{dgm}_k(\mathrm{Rips}_{<\delta^p}(\mathbb{X}_n, d_{\mathbb{X}_n, p})) \] where $\mathrm{Rips}_{<\delta^p}$ denotes the Rips filtration up to parameter $\delta^{p}$ and $\mathrm{dgm}_k$ denotes the persistence diagram in homological degree $k$.
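
One way to see why the statement is plausible: every edge touching an outlier has Euclidean length at least $\delta$, hence Fermat weight at least $\delta^p$, so below the threshold $\delta^p$ the outliers are isolated vertices and the distances between sample points are unchanged. The sketch below checks this numerically on a toy example. It is an illustration of that reading, not the argument from the paper; the helper `fermat_distances` and the chosen points are hypothetical.

```python
import math

def fermat_distances(points, p):
    """All-pairs sample Fermat distance via Floyd-Warshall (small inputs only)."""
    n = len(points)
    dist = [[math.dist(points[i], points[j]) ** p for j in range(n)]
            for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                candidate = dist[i][k] + dist[k][j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist

# Toy sample: 12 evenly spaced points on the unit circle, plus two outliers.
n = 12
X = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
     for i in range(n)]
Y = [(0.0, 0.0), (3.0, 0.0)]
p = 2

# delta as in the proposition: outliers are delta-separated from each other
# and from the sample.
delta = min(
    min(math.dist(y, z) for y in Y for z in Y if y is not z),
    min(math.dist(x, y) for x in X for y in Y),
)
threshold = delta ** p

d_X = fermat_distances(X, p)
d_XY = fermat_distances(X + Y, p)  # sample points keep indices 0..n-1

# Below the threshold, distances between sample points are unchanged...
same_below = all(
    math.isclose(min(d_X[i][j], threshold), min(d_XY[i][j], threshold))
    for i in range(n) for j in range(n)
)
# ...and every outlier is at Fermat distance >= threshold from everything else.
outliers_isolated = all(
    d_XY[i][j] >= threshold
    for i in range(n, n + len(Y)) for j in range(n + len(Y)) if i != j
)
print(same_below, outliers_isolated)
```

Since isolated vertices only contribute to degree 0, this is consistent with the restriction to $k>0$ in the statement.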

Fermat-based persistence diagrams

Computational implementation

Complexity: $O(n^3)$, reducible to $O(n^2 k \log n)$ using the $k$-NN graph (for $k = O(\log n)$, the geodesics belong to the $k$-NN graph with high probability).
Python library: fermat
Tool in Giotto-TDA: in progress
Computational experiments: ximenafernandez/intrinsicPH
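
The $k$-NN reduction can be sketched as follows: restrict the graph to each point's $k$ nearest neighbours and run Dijkstra from every source, which costs roughly $O(n k \log n)$ per source. This is a minimal standard-library sketch under those assumptions, not the interface of the `fermat` library; the names `knn_graph`, `dijkstra` and `fermat_knn` are hypothetical, and the neighbour search here is brute force.

```python
import heapq
import math

def knn_graph(points, k, p):
    """Symmetrised k-NN graph with edge weights |x - y|**p."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        nearest = sorted((math.dist(points[i], points[j]), j)
                         for j in range(n) if j != i)[:k]
        for dij, j in nearest:
            w = dij ** p
            adj[i][j] = w
            adj[j][i] = w
    return adj

def dijkstra(adj, source):
    """Single-source shortest paths with a binary heap."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def fermat_knn(points, k, p):
    """Approximate all-pairs Fermat distance restricted to the k-NN graph."""
    adj = knn_graph(points, k, p)
    return [dijkstra(adj, s) for s in range(len(points))]

# Six unit-spaced points on a line: the optimal path takes five unit steps.
pts = [(float(i), 0.0) for i in range(6)]
d = fermat_knn(pts, k=2, p=2)
print(d[0][5])  # 5.0
```

If $k$ is too small the $k$-NN graph can be disconnected, in which case some entries remain `math.inf`.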

Applications to time series analysis

Parameter selection in Takens' embeddings

Embedding dimension $D$

Parameter selection in Takens' embeddings

Time delay $T$
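
For reference, the two parameters enter the delay embedding as follows: a scalar series $s$ is mapped to the vectors $(s_t, s_{t+T}, \dots, s_{t+(D-1)T})$. A minimal sketch, with the hypothetical name `delay_embedding` and a toy input series:

```python
def delay_embedding(series, dim, delay):
    """Map a scalar series to vectors (s[t], s[t+delay], ..., s[t+(dim-1)*delay])."""
    span = (dim - 1) * delay
    return [tuple(series[t + i * delay] for i in range(dim))
            for t in range(len(series) - span)]

signal = [0, 1, 2, 3, 4, 5, 6, 7]
cloud = delay_embedding(signal, dim=3, delay=2)
print(cloud[0], len(cloud))  # (0, 2, 4) 4
```

The resulting point cloud in $\mathbb{R}^D$ is the input on which distances and persistence diagrams are then computed.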

Topology of embeddings

Electrocardiogram

Source data: PhysioNet Database https://physionet.org/about/database/

* We use Fermat distance with $p=2$.

Topology of embeddings

Birdsongs

Source data: Private experiments. Laboratory of Dynamical Systems, University of Buenos Aires.

Questions

Results in the noisy case
Let $\widetilde{\mathbb{X}}_n = \{\widetilde {x}_1, \widetilde {x}_2, \dots, \widetilde {x}_n\}$ with $\widetilde{x}_i = x_i + \xi_i$, where $x_i \in \mathcal M$ and $\xi_i\in \mathbb R^D$ is noise. \[\text{Study }~~~ \mathbb{P}\Big( d_B\big(\mathrm{dgm}(\mathrm{Filt}(\mathcal{M}, d_{f,q})),\mathrm{dgm}(\mathrm{Filt}(\widetilde{\mathbb{X}}_n, {\scriptstyle \frac{n^{q}}{\mu}} d_{\widetilde{\mathbb{X}}_n,p}))\big)>\varepsilon\Big)\]

The parameter $p$

Study the theoretical behaviour of the distances for different values of $p$ (McKenzie & Damelin, 2019).
Is there a 'best choice' of $p$?

References

Source: X. F., E. Borghini, G. Mindlin, P. Groisman, Intrinsic persistent homology via density-based metric learning. Journal of Machine Learning Research 24 (2023) 1-42.

Github Repository: ximenafernandez/intrinsicPH

Tutorial: Intrinsic persistent homology. AATRN Youtube Channel

Thanks!
