CurveSimilarities documentation#

_images/plot-header.png

A collection of curve similarity measures.

There are tons of similar packages published on PyPI, but this one aims to be the most NumPy-friendly and the easiest to use.

API reference#

To reproduce the examples, run the following code first:

import numpy as np
from curvesimilarities import *
from curvesimilarities.util import *

Warning

The similarity functions do not sanitize the vertices of input polygonal curves. Make sure your curves do not have consecutive duplicate vertices.

Fréchet distance#

Continuous and discrete Fréchet distances.

curvesimilarities.frechet.fd(P, Q, rel_tol=0.0, abs_tol=2.220446049250313e-16)[source]#

(Continuous) Fréchet distance between two open polygonal curves.

Let \(f: [0, 1] \to \Omega\) and \(g: [0, 1] \to \Omega\) be curves where \(\Omega\) is a metric space. The Fréchet distance between \(f\) and \(g\) is defined as

\[\inf_{\alpha, \beta} \max_{t \in [0, 1]} \lVert f(\alpha(t)) - g(\beta(t)) \rVert,\]

where \(\alpha, \beta: [0, 1] \to [0, 1]\) are continuous non-decreasing surjections and \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

rel_tol, abs_toldouble

Relative and absolute tolerances for parametric search of the Fréchet distance. The search is terminated if the upper boundary a and the lower boundary b satisfy: a - b <= max(rel_tol * a, abs_tol).

Returns:
distdouble

The (continuous) Fréchet distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Notes

This function implements Alt and Godau’s algorithm [1].

References

Examples

>>> P, Q = [[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]]
>>> fd(np.asarray(P), np.asarray(Q))
1.0...
curvesimilarities.frechet.fd_params(P, Q, rel_tol=0.0, abs_tol=2.220446049250313e-16)[source]#

(Continuous) Fréchet distance and its parameters in curve space.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

rel_tol, abs_toldouble

Relative and absolute tolerances for parametric search of the Fréchet distance. The search is terminated if the upper boundary a and the lower boundary b satisfy: a - b <= max(rel_tol * a, abs_tol).

Returns:
distdouble

The (continuous) Fréchet distance between P and Q, NaN if any vertice is empty.

paramndarray

Parameters of critical points contributing to Fréchet distance.

Notes

The resulting parameters adopt arc-length parametrization [2].

References

Examples

>>> P = np.array([[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]])
>>> Q = np.array([[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]])
>>> _, params = fd_params(P, Q)
>>> from curvesimilarities.util import sample_polyline
>>> pts = [sample_polyline(P, params[:, 0]), sample_polyline(Q, params[:, 1])]
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*P.T); plt.plot(*Q.T)  
>>> plt.plot(*np.array(pts).transpose(2, 0, 1), "--", color="k")  
_images/index-1.png
curvesimilarities.frechet.dfd(P, Q)[source]#

Discrete Fréchet distance between two two ordered sets of points.

Let \(\{P_0, P_1, ..., P_n\}\) and \(\{Q_0, Q_1, ..., Q_m\}\) be ordered sets of points in metric space. The discrete Fréchet distance between two sets is defined as

\[\min_{C} \max_{(i, j) \in C} \lVert P_i - Q_j \rVert,\]

where \(C\) is a nondecreasing coupling over \(\{0, ..., n\} \times \{0, ..., m\}\), starting from \((0, 0)\) and ending with \((n, m)\). \(\lVert \cdot \rVert\) is the underlying metric, which is the Euclidean metric in this implementation.

Parameters:
Pndarray

An \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qndarray

An \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
distdouble

The discrete Fréchet distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Notes

This function implements Eiter and Mannila’s algorithm [3].

References

Examples

>>> P, Q = [[0, 0], [1, 1], [2, 0]], [[0, 1], [2, -4]]
>>> dfd(np.asarray(P), np.asarray(Q))
4.0
curvesimilarities.frechet.dfd_idxs(P, Q)[source]#

Discrete Fréchet distance and its indices in curve space.

Parameters:
Pndarray

An \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qndarray

An \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

Returns:
ddouble

The discrete Fréchet distance between P and Q, NaN if any vertice is empty.

index_1int

Index of point contributing to discrete Fréchet distance in P.

index_2int

Index of point contributing to discrete Fréchet distance in Q.

Examples

>>> P = np.array([[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]])
>>> Q = np.array([[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]])
>>> from curvesimilarities.util import sample_polyline
>>> P_len = np.sum(np.linalg.norm(np.diff(P, axis=0), axis=-1))
>>> P_pts = sample_polyline(P, np.linspace(P_len, 0, 30))
>>> Q_len = np.sum(np.linalg.norm(np.diff(Q, axis=0), axis=-1))
>>> Q_pts = sample_polyline(Q, np.linspace(Q_len, 0, 30))
>>> _, idx0, idx1 = dfd_idxs(P_pts, Q_pts)
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*P_pts.T, "x"); plt.plot(*Q_pts.T, "x")  
>>> plt.plot(*np.array([P_pts[idx0], Q_pts[idx1]]).T, "--")  
_images/index-2.png

Dynamic time warping distance#

Dynamic time warping distance.

This module implements only the basic algorithm. If you need advanced features, use dedicated packages such as dtw-python.

curvesimilarities.dtw.dtw(P, Q, dist='euclidean')[source]#

Dynamic time warping distance between two ordered sets of points.

Let \(\{P_0, P_1, ..., P_n\}\) and \(\{Q_0, Q_1, ..., Q_m\}\) be ordered sets of points in metric space. The dynamic time warping distance between two sets is defined as

\[\min_{C} \sum_{(i, j) \in C} dist\left(P_i, Q_j\right),\]

where \(C\) is a nondecreasing coupling over \(\{0, ..., n\} \times \{0, ..., m\}\), starting from \((0, 0)\) and ending with \((n, m)\).

Parameters:
Pndarray

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qndarray

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

dist{‘euclidean’, ‘squared_euclidean’}

Type of \(dist\). Refer to the Notes section for more information.

Returns:
double

The dynamic time warping distance between P and Q, NaN if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

dtw_owp

Dynamic time warping distance with optimal warping path.

Notes

This function implements the algorithm described by Senin [4].

The following functions are available for \(dist\):

  1. Euclidean distance
    \[dist\left(p, q\right) = \lVert p - q \rVert_2\]
  2. Squared Euclidean distance
    \[dist\left(p, q\right) = \lVert p - q \rVert_2^2\]

References

Examples

>>> P = np.linspace([0, 0], [1, 0], 10)
>>> Q = np.linspace([0, 1], [1, 1], 20)
>>> dtw(P, Q)
20.0...
curvesimilarities.dtw.dtw_owp(P, Q, dist='euclidean')[source]#

Dynamic time warping distance and its optimal warping path.

Parameters:
Pndarray

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qndarray

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

dist{‘euclidean’, ‘squared_euclidean’}

Type of \(dist\). Refer to dtw().

Returns:
dtwdouble

The dynamic time warping distance between P and Q, NaN if any vertice is empty.

owpndarray

Indices of P and Q for optimal warping path, empty if any vertice is empty.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> P = np.array([[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]])
>>> Q = np.array([[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]])
>>> from curvesimilarities.util import sample_polyline
>>> P_len = np.sum(np.linalg.norm(np.diff(P, axis=0), axis=-1))
>>> P_pts = sample_polyline(P, np.linspace(P_len, 0, 30))
>>> Q_len = np.sum(np.linalg.norm(np.diff(Q, axis=0), axis=-1))
>>> Q_pts = sample_polyline(Q, np.linspace(Q_len, 0, 30))
>>> _, owp = dtw_owp(P_pts, Q_pts)
>>> lines = np.array([P_pts[owp[:, 0]], Q_pts[owp[:, 1]]])
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*P_pts.T, "x"); plt.plot(*Q_pts.T, "x")  
>>> plt.plot(*lines.transpose(2, 0, 1), "--", color="gray")  
_images/index-3.png

Integral Fréchet distance#

Integral Fréchet distance.

curvesimilarities.integfrechet.ifd(P, Q, delta, dist='euclidean')[source]#

Integral Fréchet distance between two open polygonal curves.

Let \(f, g: [0, 1] \to \Omega\) be curves defined in a metric space \(\Omega\). Let \(\alpha, \beta: [0, 1] \to [0, 1]\) be continuous non-decreasing surjections, and define \(\pi: [0, 1] \to [0, 1] \times [0, 1]\) such that \(\pi(t) = \left(\alpha(t), \beta(t)\right)\). The integral Fréchet distance between \(f\) and \(g\) is defined as

\[\inf_{\pi} \int_0^1 dist\left(\pi(t)\right) \cdot \lVert \pi'(t) \rVert_1 \mathrm{d}t,\]

where \(dist\left(\pi(t)\right)\) is a distance between \(f\left(\alpha(t)\right)\) and \(g\left(\beta(t)\right)\) and \(\lVert \cdot \rVert_1\) is the Manhattan norm.

Parameters:
Pndarray

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qndarray

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points. Refer to the Reference section for more information.

dist{‘euclidean’, ‘squared_euclidean’}

Type of \(dist\). Refer to the Notes section for more information.

Returns:
double

The integral Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

See also

ifd_owp

Integral Fréchet distance with optimal warping path.

Notes

This function implements the algorithm of Brankovic et al [5].

The following functions are available for \(dist\):

  1. Euclidean distance
    \[dist\left(p, q\right) = \lVert p - q \rVert_2\]

    Note

    This distance is not implemented yet.

  2. Squared Euclidean distance
    \[dist\left(p, q\right) = \lVert p - q \rVert_2^2\]

References

Examples

>>> P, Q = [[0, 0], [0.5, 0], [1, 0]], [[0, 1], [1, 1]]
>>> ifd(np.asarray(P), np.asarray(Q), 0.1, "squared_euclidean")
2.0
curvesimilarities.integfrechet.ifd_owp(P, Q, delta, dist='euclidean')[source]#

Integral Fréchet distance and its optimal warping path.

Parameters:
Pndarray

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qndarray

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

deltadouble

Maximum length of edges between Steiner points. Refer to ifd().

dist{‘euclidean’, ‘squared_euclidean’}

Type of \(dist\). Refer to ifd().

Returns:
ifddouble

The integral Fréchet distance between P and Q, NaN if any vertice is empty or both vertices consist of a single point.

owpndarray

Optimal warping path, empty if any vertice is empty or both vertices consist of a single point.

Raises:
ValueError

If P and Q are not 2-dimensional arrays with same number of columns.

Examples

>>> P = np.array([[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]])
>>> Q = np.array([[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]])
>>> _, path = ifd_owp(P, Q, 0.1, "squared_euclidean")
>>> from curvesimilarities.util import curve_matching
>>> pairs = curve_matching(P, Q, path, 100)
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*P.T); plt.plot(*Q.T)  
>>> plt.plot(*pairs, "--", color="gray")  
_images/index-4.png

Utilities#

Utility functions.

curvesimilarities.util.parameter_space(P, Q, p_num, q_num)[source]#

Parameter space betwee two polylines.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

p_num, q_numint

Number of sample points in P and Q, respectively.

Returns:
weightndarray

A p_num by q_num array containing the distance between the points in P and Q.

p_coord, q_coordndarray

Axis coordinates for the parameter space.

p_vert, q_vertndarray

Coordinates for the vertices of polylines.

Examples

Curve space:

>>> P = np.array([[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]])
>>> Q = np.array([[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]])
>>> import matplotlib.pyplot as plt  
>>> plt.plot(*P.T)  
>>> plt.plot(*Q.T)  
_images/index-5.png

Parameter space with vertices as dashed lines:

>>> weight, p, q, p_vert, q_vert = parameter_space(P, Q, 200, 100)
>>> plt.pcolormesh(p, q, weight.T)  
>>> plt.vlines(p_vert, 0, q[-1], "k", "--")  
>>> plt.hlines(q_vert, 0, p[-1], "k", "--")  
_images/index-6.png

Free space diagram with Fréchet distance as \(\epsilon\):

>>> eps = fd(P, Q)
>>> plt.pcolormesh(p, q, weight.T < eps, cmap="gray")  
>>> plt.vlines(p_vert, 0, q[-1], "k", "--")  
>>> plt.hlines(q_vert, 0, p[-1], "k", "--")  
_images/index-7.png

Optimal warping path with integral Fréchet distance:

>>> _, owp = ifd_owp(P, Q, 0.1, "squared_euclidean")
>>> plt.pcolormesh(p, q, weight.T)  
>>> plt.plot(*owp.T, "k")  
_images/index-8.png
curvesimilarities.util.curve_matching(P, Q, path, sample_num)[source]#

Return pairs of points in curve space defined by a path in parameter space.

Parameters:
Parray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Qarray_like

A \(q\) by \(n\) array of \(q\) vertices in an \(n\)-dimensional space.

pathndarray

A \(N\) by \(2\) array of \(N\) vertices of polyline in curve space.

sample_numint

Number of sample points to be uniformly taken from path.

Returns:
ndarray

A \(n\) by \(2\) by sample_num array of point pairs in curve space.

Examples

>>> P = np.array([[0, 0], [2, 2], [4, 2], [4, 4], [2, 1], [5, 1], [7, 2]])
>>> Q = np.array([[2, 0], [1, 3], [5, 3], [5, 2], [7, 3]])
>>> _, path = ifd_owp(P, Q, 0.1, "squared_euclidean")
>>> pairs = curve_matching(P, Q, path, 100)
curvesimilarities.util.sample_polyline(vert, param)[source]#

Sample points from a polyline.

Parameters:
vertarray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

paramarray_like

An 1-D array of \(q\) parameters for sampled points. Natural parametrization is used, i.e., the polygonal curve is parametrized by its arc length. Parameters smaller than \(0\) or larger than the total arc length are clipped to the nearest valid value.

Returns:
array_like

A \(q\) by \(n\) array of sampled points.

Examples

>>> vert = [[0, 0], [0.5, 0.5], [1, 0]]
>>> param = np.linspace(0, 2, 10)
curvesimilarities.util.refine_polyline(vert)[source]#

Remove degenerate vertices from a polyline.

Consecutive duplicate vertices, and three or more vertices on a same line are removed.

Parameters:
vertarray_like

A \(p\) by \(n\) array of \(p\) vertices in an \(n\)-dimensional space.

Returns:
array_like

A \(q\) by \(n\) array of \(q \leq p\) vertices in an \(n\)-dimensional space.

Examples

>>> vert = [[0, 0], [0, 0], [1, 0], [2, 0]]
>>> refine_polyline(vert)
array([[0, 0],
       [2, 0]])