CurveSimilarities documentation#

A collection of curve similarity measures.

There are tons of similar packages published on PyPI, but this one aims to be the most NumPy-friendly and the easiest to use.

API reference#

To reproduce the examples, run the following code first:

import numpy as np
from curvesimilarities import *

Warning

The similarity functions do not sanitize the vertices of input polygonal curves. Make sure your curves do not have consecutive duplicate vertices.

Fréchet distance#

Continuous and discrete Fréchet distances.

curvesimilarities.frechet.fd(P, Q, *, rel_tol=0.0, abs_tol=2.220446049250313e-16)[source]#

(Continuous) Fréchet distance between two open polygonal curves.

Let \(f: [0, 1] \to \Omega\) and \(g: [0, 1] \to \Omega\) be curves where \(\Omega\) is a metric space. The Fréchet distance between \(f\) and \(g\) is defined as

\[\inf_{\alpha, \beta} \max_{t \in [0, 1]} \lVert f(\alpha(t)) - g(\beta(t)) \rVert,\]

where \(\alpha, \beta: [0, 1] \to [0, 1]\) are continuous non-decreasing surjections and \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

rel_tol, abs_toldouble

Relative and absolute tolerances for parametric search of the Fréchet distance. The search is terminated if the upper boundary a and the lower boundary b satisfy: a - b <= max(rel_tol * a, abs_tol).

Returns:
distdouble

The (continuous) Fréchet distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Notes

This function implements Alt and Godau’s algorithm [1].

References

Examples

>>> fd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]])
1.0...
curvesimilarities.frechet.dfd(P, Q)[source]#

Discrete Fréchet distance between two open polygonal curves.

Let \(\{P_0, P_1, ..., P_n\}\) and \(\{Q_0, Q_1, ..., Q_m\}\) be polyline vertices in metric space. The discrete Fréchet distance between two polylines is defined as

\[\min_{C} \max_{(i, j) \in C} \lVert P_i - Q_j \rVert,\]

where \(C\) is a nondecreasing coupling over \(\{0, ..., n\} \times \{0, ..., m\}\), starting from \((0, 0)\) and ending with \((n, m)\). \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

An \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

An \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The discrete Fréchet distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Notes

This function implements Eiter and Mannila’s algorithm [2].

References

Examples

>>> dfd([[0, 0], [1, 1], [2, 0]], [[0, 1], [2, -4]])
4.0

Dynamic time warping distance#

Dynamic time warping distance.

This module implements only the basic algorithm. If you need advanced features, use dedicated packages such as dtw-python.

curvesimilarities.dtw.dtw(P, Q)[source]#

Dynamic time warping distance.

Let \(\{P_0, P_1, ..., P_n\}\) and \(\{Q_0, Q_1, ..., Q_m\}\) be polyline vertices in metric space. The dynamic time warping distance between two polylines is defined as

\[\min_{C} \sum_{(i, j) \in C} \lVert P_i - Q_j \rVert,\]

where \(C\) is a nondecreasing coupling over \(\{0, ..., n\} \times \{0, ..., m\}\), starting from \((0, 0)\) and ending with \((n, m)\). \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The dynamic time warping distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

dtw_owp

Dynamic time warping distance with optimal warping path.

Notes

This function implements the algorithm described by Senin [3].

References

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dtw(P, Q)
20.0...
curvesimilarities.dtw.dtw_owp(P, Q)[source]#

Dynamic time warping distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The dynamic time warping distance between P and Q, NaN if any vertice is empty.

owpndarray

Indices of P and Q for optimal warping path, empty if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dist, path = dtw_owp(P, Q)
>>> dist / len(path)  # averaged dynamic time warping
1.00...
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T, "x")  
_images/index-1.png
curvesimilarities.dtw.sdtw(P, Q)[source]#

Squared dynamic time warping distance.

The squared dynamic time warping distance is defined as

\[\min_{C} \sum_{(i, j) \in C} \lVert P_i - Q_j \rVert^2,\]

with symbols explained in dtw().

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The squared dynamic time warping distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

dtw

Dynamic time warping distance.

sdtw_owp

Squared dynamic time warping distance with optimal warping path.

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> sdtw(P, Q)
20.0...
curvesimilarities.dtw.sdtw_owp(P, Q)[source]#

Squared dynamic time warping distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The squared dynamic time warping distance between P and Q, NaN if any vertice is empty.

owpndarray

Indices of P and Q for optimal warping path, empty if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dist, path = sdtw_owp(P, Q)
>>> (dist / len(path))**0.5  # quadratic mean dynamic time warping
1.00...
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T, "x")  
_images/index-2.png

Integral Fréchet distance#

Integral Fréchet distance.

curvesimilarities.integfrechet.ifd(P, Q, delta)[source]#

Integral Fréchet distance between two open polygonal curves.

Let \(f, g: [0, 1] \to \Omega\) be curves defined in a metric space \(\Omega\). Let \(\alpha, \beta: [0, 1] \to [0, 1]\) be continuous non-decreasing surjections, and define \(\pi: [0, 1] \to [0, 1] \times [0, 1]\) such that \(\pi(t) = \left(\alpha(t), \beta(t)\right)\). The integral Fréchet distance between \(f\) and \(g\) is defined as

\[\inf_{\pi} \int_0^1 \delta\left(\pi(t)\right) \cdot \lVert \pi'(t) \rVert \mathrm{d}t,\]

where \(\delta\left(\pi(t)\right)\) is a distance between \(f\left(\alpha(t)\right)\) and \(g\left(\beta(t)\right)\) in \(\Omega\) and \(\lVert \cdot \rVert\) is a norm in \([0, 1] \times [0, 1]\). In this implementation, we use the Euclidean distance for \(\delta\) and the Manhattan norm for \(\lVert \cdot \rVert\).

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The integral Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

ifd_owp

Integral Fréchet distance with optimal warping path.

Notes

This function implements the algorithm of Brankovic et al [4].

References

Examples

>>> ifd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]], 0.1)
2.0
curvesimilarities.integfrechet.ifd_owp(P, Q, delta)[source]#

Integral Fréchet distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The integral Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

owpndarray

Optimal warping path, empty if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> dist, path = ifd_owp([[0, 0], [0.5, 0], [1, 0]], [[0.5, 1], [1.5, 1]], 0.1)
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T)  
_images/index-3.png

Average Fréchet distance#

Average Fréchet distance and its variants.

curvesimilarities.averagefrechet.afd(P, Q, delta)[source]#

Average Fréchet distance between two open polygonal curves.

The average Fréchet distance is defined as [5]

\[\inf_{\pi} \frac{ \int_0^1 \delta\left(\pi(t)\right) \cdot \lVert \pi'(t) \rVert \mathrm{d}t }{ \int_0^1 \lVert \pi'(t) \rVert \mathrm{d}t },\]

with symbols explained in ifd(). We use the Manhattan norm for \(\lVert \cdot \rVert\), so the formula can be reduced to

\[\frac{1}{\lvert f \rvert + \lvert g \rvert} \inf_{\pi} \int_0^1 \delta\left(\pi(t)\right) \cdot \lVert \pi'(t) \rVert \mathrm{d}t,\]

where \(\lvert \cdot \rvert\) denotes the length of a polygonal curve.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The average Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

ifd

Integral Fréchet distance.

afd_owp

Average Fréchet distance with optimal warping path.

Notes

Using this function is marginally faster than calling ifd() and then dividing.

References

Examples

>>> afd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]], 0.1)
1.0
curvesimilarities.averagefrechet.afd_owp(P, Q, delta)[source]#

Average Fréchet distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The average Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

owpndarray

Optimal warping path, empty if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> dist, path = afd_owp([[0, 0], [0.5, 0], [1, 0]], [[0.5, 1], [1.5, 1]], 0.1)
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T)  
_images/index-4.png
curvesimilarities.averagefrechet.qafd(P, Q, delta)[source]#

Quadratic average Fréchet distance between two open polygonal curves.

The quadratic average Fréchet distance is defined as

\[\inf_{\pi} \sqrt{ \frac{ \int_0^1 \delta\left(\pi(t)\right)^2 \cdot \lVert \pi'(t) \rVert \mathrm{d}t }{ \int_0^1 \lVert \pi'(t) \rVert \mathrm{d}t } },\]

with symbols explained in ifd(). We use the Manhattan norm for \(\lVert \cdot \rVert\), so the formula can be reduced to

\[\frac{1}{\sqrt{\lvert f \rvert + \lvert g \rvert}} \inf_{\pi} \sqrt{ \int_0^1 \delta\left(\pi(t)\right)^2 \cdot \lVert \pi'(t) \rVert \mathrm{d}t },\]

where \(\lvert \cdot \rvert\) denotes the length of a polygonal curve.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The quadratic average Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

afd

Average Fréchet distance.

qafd_owp

Quadratic average Fréchet distance with optimal warping path.

Examples

>>> qafd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]], 0.1)
1.0
curvesimilarities.averagefrechet.qafd_owp(P, Q, delta)[source]#

Quadratic average Fréchet distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The quadratic average Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

owpndarray

Optimal warping path, empty if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> dist, path = qafd_owp([[0, 0], [0.5, 0], [1, 0]], [[0.5, 1], [1.5, 1]], 0.1)
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T)  
_images/index-5.png