CurveSimilarities documentation#

A collection of curve similarity measures.

There are tons of similar packages published on PyPI, but this one aims to be the most NumPy-friendly and the easiest to use.

API reference#

To reproduce the examples, run the following code first:

import numpy as np
from curvesimilarities import *
from curvesimilarities.util import *

Warning

The similarity functions do not sanitize the vertices of input polygonal curves. Make sure your curves do not have consecutive duplicate vertices.

Fréchet distance#

Continuous and discrete Fréchet distances.

curvesimilarities.frechet.fd(P, Q, *, rel_tol=0.0, abs_tol=2.220446049250313e-16)[source]#

(Continuous) Fréchet distance between two open polygonal curves.

Let \(f: [0, 1] \to \Omega\) and \(g: [0, 1] \to \Omega\) be curves where \(\Omega\) is a metric space. The Fréchet distance between \(f\) and \(g\) is defined as

\[\inf_{\alpha, \beta} \max_{t \in [0, 1]} \lVert f(\alpha(t)) - g(\beta(t)) \rVert,\]

where \(\alpha, \beta: [0, 1] \to [0, 1]\) are continuous non-decreasing surjections and \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

rel_tol, abs_toldouble

Relative and absolute tolerances for parametric search of the Fréchet distance. The search is terminated if the upper boundary a and the lower boundary b satisfy: a - b <= max(rel_tol * a, abs_tol).

Returns:
distdouble

The (continuous) Fréchet distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Notes

This function implements Alt and Godau’s algorithm [1].

References

Examples

>>> fd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]])
1.0...
curvesimilarities.frechet.dfd(P, Q)[source]#

Discrete Fréchet distance between two open polygonal curves.

Let \(\{P_0, P_1, ..., P_n\}\) and \(\{Q_0, Q_1, ..., Q_m\}\) be polyline vertices in metric space. The discrete Fréchet distance between two polylines is defined as

\[\min_{C} \max_{(i, j) \in C} \lVert P_i - Q_j \rVert,\]

where \(C\) is a nondecreasing coupling over \(\{0, ..., n\} \times \{0, ..., m\}\), starting from \((0, 0)\) and ending with \((n, m)\). \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

An \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

An \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The discrete Fréchet distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Notes

This function implements Eiter and Mannila’s algorithm [2].

References

Examples

>>> dfd([[0, 0], [1, 1], [2, 0]], [[0, 1], [2, -4]])
4.0

Dynamic time warping distance#

Dynamic time warping distance.

This module implements only the basic algorithm. If you need advanced features, use dedicated packages such as dtw-python.

curvesimilarities.dtw.dtw(P, Q)[source]#

Dynamic time warping distance.

Let \(\{P_0, P_1, ..., P_n\}\) and \(\{Q_0, Q_1, ..., Q_m\}\) be polyline vertices in metric space. The dynamic time warping distance between two polylines is defined as

\[\min_{C} \sum_{(i, j) \in C} \lVert P_i - Q_j \rVert,\]

where \(C\) is a nondecreasing coupling over \(\{0, ..., n\} \times \{0, ..., m\}\), starting from \((0, 0)\) and ending with \((n, m)\). \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The dynamic time warping distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

dtw_owp

Dynamic time warping distance with optimal warping path.

Notes

This function implements the algorithm described by Senin [3].

References

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dtw(P, Q)
20.0...
curvesimilarities.dtw.dtw_owp(P, Q)[source]#

Dynamic time warping distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The dynamic time warping distance between P and Q, NaN if any vertice is empty.

owpndarray

Indices of P and Q for optimal warping path, empty if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dist, path = dtw_owp(P, Q)
>>> dist / len(path)  # averaged dynamic time warping
1.00...
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T, "x")  
_images/index-1.png
curvesimilarities.dtw.sdtw(P, Q)[source]#

Squared dynamic time warping distance.

The squared dynamic time warping distance is defined as

\[\min_{C} \sum_{(i, j) \in C} \lVert P_i - Q_j \rVert^2,\]

with symbols explained in dtw().

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The squared dynamic time warping distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

dtw

Dynamic time warping distance.

sdtw_owp

Squared dynamic time warping distance with optimal warping path.

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> sdtw(P, Q)
20.0...
curvesimilarities.dtw.sdtw_owp(P, Q)[source]#

Squared dynamic time warping distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The squared dynamic time warping distance between P and Q, NaN if any vertice is empty.

owpndarray

Indices of P and Q for optimal warping path, empty if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dist, path = sdtw_owp(P, Q)
>>> (dist / len(path))**0.5  # quadratic mean dynamic time warping
1.00...
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T, "x")  
_images/index-2.png

Integral Fréchet distance#

Integral Fréchet distance.

curvesimilarities.integfrechet.ifd(P, Q, delta)[source]#

Integral Fréchet distance between two open polygonal curves.

Let \(f, g: [0, 1] \to \Omega\) be curves defined in a metric space \(\Omega\). Let \(\alpha, \beta: [0, 1] \to [0, 1]\) be continuous non-decreasing surjections, and define \(\pi: [0, 1] \to [0, 1] \times [0, 1]\) such that \(\pi(t) = \left(\alpha(t), \beta(t)\right)\). The integral Fréchet distance between \(f\) and \(g\) is defined as

\[\inf_{\pi} \int_0^1 \delta\left(\pi(t)\right) \cdot \lVert \pi'(t) \rVert \mathrm{d}t,\]

where \(\delta\left(\pi(t)\right)\) is a distance between \(f\left(\alpha(t)\right)\) and \(g\left(\beta(t)\right)\) in \(\Omega\) and \(\lVert \cdot \rVert\) is a norm in \([0, 1] \times [0, 1]\). In this implementation, we use the squared Euclidean distance for \(\delta\) and the Manhattan norm for \(\lVert \cdot \rVert\).

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The integral Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

ifd_owp

Integral Fréchet distance with optimal warping path.

Notes

This function implements the algorithm of Brankovic et al [4].

References

Examples

>>> ifd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]], 0.1)
2.0
curvesimilarities.integfrechet.ifd_owp(P, Q, delta)[source]#

Integral Fréchet distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The integral Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

owpndarray

Optimal warping path, empty if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> dist, path = ifd_owp([[0, 0], [0.5, 0], [1, 0]], [[0.5, 1], [1.5, 1]], 0.1)
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T)  
_images/index-3.png

Average Fréchet distance#

Average Fréchet distance and its variants.

curvesimilarities.averagefrechet.qafd(P, Q, delta)[source]#

Quadratic average Fréchet distance between two open polygonal curves.

The quadratic average Fréchet distance is defined as

\[\inf_{\pi} \sqrt{ \frac{ \int_0^1 \delta\left(\pi(t)\right)^2 \cdot \lVert \pi'(t) \rVert \mathrm{d}t }{ \int_0^1 \lVert \pi'(t) \rVert \mathrm{d}t } },\]

where \(\delta\) is the Euclidean distance and \(\pi\) is the continuous nondecreasing path in the parameter space. We use the Manhattan norm for \(\lVert \cdot \rVert\), so the formula can be reduced to

\[\frac{1}{\sqrt{\lvert f \rvert + \lvert g \rvert}} \inf_{\pi} \sqrt{ \int_0^1 \delta\left(\pi(t)\right)^2 \cdot \lVert \pi'(t) \rVert \mathrm{d}t },\]

where \(\lvert \cdot \rvert\) denotes the length of a polygonal curve.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The quadratic average Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

afd

Average Fréchet distance.

qafd_owp

Quadratic average Fréchet distance with optimal warping path.

Examples

>>> qafd([[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]], 0.1)
1.0
curvesimilarities.averagefrechet.qafd_owp(P, Q, delta)[source]#

Quadratic average Fréchet distance and its optimal warping path.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points.

Returns:
distdouble

The quadratic average Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

owpndarray

Optimal warping path, empty if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> dist, path = qafd_owp([[0, 0], [0.5, 0], [1, 0]], [[0.5, 1], [1.5, 1]], 0.1)
>>> import matplotlib.pyplot as plt 
>>> plt.plot(*path.T)  
_images/index-4.png

Utilities#

Utility functions.

curvesimilarities.util.parameter_space(P, Q, p_num, q_num)[source]#

Parameter space betwee two polylines.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

p_num, q_numint

Number of sample points in P and Q, respectively.

Returns:
weightndarray

A p_num by q_num array containing the distance between the points in P and Q.

p_coord, q_coordndarray

Axis coordinates for the parameter space.

p_vert, q_vertndarray

Coordinates for the vertices of polylines.

Examples

Curve space:

>>> P = [[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]]
>>> Q = [[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]]
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*np.array(P).T)  
>>> plt.plot(*np.array(Q).T)  
_images/index-5.png

Parameter space with vertices as dashed lines:

>>> weight, p, q, p_vert, q_vert = parameter_space(P, Q, 200, 100)
>>> plt.pcolormesh(p, q, weight.T)  
>>> plt.vlines(p_vert, 0, q[-1], "k", "--")  
>>> plt.hlines(q_vert, 0, p[-1], "k", "--")  
_images/index-6.png

Free space diagram with Fréchet distance as \(\epsilon\):

>>> eps = fd(P, Q)
>>> plt.pcolormesh(p, q, weight.T < eps, cmap="gray")  
_images/index-7.png

Optimal warping path with integral Fréchet distance:

>>> _, owp = ifd_owp(P, Q, 0.1)
>>> plt.pcolormesh(p, q, weight.T)  
>>> plt.plot(*owp.T, "k")  
_images/index-8.png
curvesimilarities.util.curvespace_path(P, Q, path, sample_num)[source]#

Return point pairs in curve space defined by a path in parameter space.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

pathndarray

A \(N\) by \(2\) array of \(N\) vertices of polyline in curve space.

sample_numint

Number of sample points to be uniformly taken from path.

Returns:
ndarray

A \(n\) by \(2\) by sample_num array of point pairs in curve space.

Examples

>>> P = [[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]]
>>> Q = [[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]]
>>> _, path = ifd_owp(P, Q, 0.1)
>>> pairs = curvespace_path(P, Q, path, 100)
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*np.array(P).T)  
>>> plt.plot(*np.array(Q).T)  
>>> plt.plot(*pairs, "--", color="gray")  
_images/index-9.png
curvesimilarities.util.sample_polyline(vert, param)[source]#

Sample points from a polyline.

Parameters:
vertarray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

paramarray_like

An 1-D array of \(q\) parameters for sampled points. Natural parametrization is used, i.e., the polygonal curve is parametrized by its arc length. Parameters smaller than \(0\) or larger than the total arc length are clipped to the nearest valid value.

Returns:
array_like

A \(q\) by \(n\) array of sampled points.

Examples

>>> vert = [[0, 0], [0.5, 0.5], [1, 0]]
>>> param = np.linspace(0, 2, 10)
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*np.array(vert).T)  
>>> plt.plot(*sample_polyline(vert, param).T, "x")  
_images/index-10.png
curvesimilarities.util.refine_polyline(vert)[source]#

Remove degenerate vertices from a polyline.

Consecutive duplicate vertices, and three or more vertices on a same line are removed.

Parameters:
vertarray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Returns:
array_like

A \(q\) by \(n\) array of \(q \leq p\) vertices in an \(n\)-dimensional space.

Examples

>>> vert = [[0, 0], [0, 0], [1, 0], [2, 0]]
>>> refine_polyline(vert)
array([[0, 0],
       [2, 0]])