GID

Simple mathematical tricks in Python

· Georgios Is. Detorakis · 10 minutes read

In this post, you can find some helpful mathematical tips and tricks in the Python programming language.

Intersection points of curves

We often need to compute the points where two lines intersect or where a function intersects the x-axis (zero crossing). Here, we provide a Python script that does precisely that. Computes the points where two curves (lines, time series, etc.) intersect.

Hence, we provide a simple Python code that can compute the points $(x, y)$ where two curves, $ f(t) $ and g(t), intersect. The first step, and the easy one, is to take two points, one from each curve, $ (t_1, Y_f) $, and $ (t_1, Y_g) $, compute their distance, and check if it is smaller than an $ \epsilon $, where $ \epsilon $ is a tiny number close to zero. If that’s the case, then we are done and have found the intersection point being $ (t_1, Y_f) $. However, if the distance is larger than $ \epsilon $, we have to check for any crossing point. We first consider a new function $ h(t) = f(t) - g(t $, and we evaluate that function on the points $ (t_0, Y_f^0) $, $ (t_0, Y_g^0) $, $ (t_1, Y_f^1) $, and $ (t_1, Y_g^1) $. Thus, we obtain $ \delta_0 = Y_f^0 - Y_g^0 $ and $ \delta_1 = Y_f^1 - Y_g^1 $. If the product $ \delta_0 \delta_1 < 0 $, then from Bolzano’s theorem, we know that the two lines must intersect, and the intersection point $ (t^{\ast}, y^{\ast}) $ is:

$$ (t^{\ast}, y^{\ast}) = \Big( \frac{-t_0 \delta_1 + t_1 \delta_0}{\Delta}, \frac{Y_f^0 Y_g^1 - Y_f^1 Y_g^0}{\Delta} \Big), $$ where $ \Delta = Y_f^0 - Y_g^0 - Y_f^1 + Y_g^1 $. The coordinates of the point $ (t^{\ast}, y^{\ast}) $ are derived by computing the following determinant(s) (which gives us the intersection point):

When one runs the Python code provided in the above snippet, one obtains the results shown in the Figure below.

import numpy as np
import matplotlib.pylab as plt
import matplotlib.style as style

style.use('tableau-colorblind10')

params = {'font.size': 14,
          }
plt.rcParams.update(params)

eps = 1e-12


def findCrossings(X, Y, t=None):
    if len(X) != len(Y):
        raise ValueError

    if t is None:
        t = np.arange(len(X))

    crossings = []
    deltaOld = 0
    tPrev, xOld, yOld = None, None, None
    for tCurr, xNew, yNew in zip(t, X, Y):
        deltaNew = yNew - xNew
        # When the difference of two points is virtually zero then accept that
        # as a crossing point (x, y)
        if np.abs(deltaNew) < eps:
            crossings.append((tCurr, xNew))
        # Otherwise check if the product of the differences is negative and then 
        # from Bolzano's theorem we know that there should be a crossing. To find
        # the intersection we compute the determinant as it is defined in the text
        # above and compute the crossing point (x, y)
        elif deltaNew * deltaOld < 0:
            denom = deltaOld - deltaNew
            crossings.append(((-deltaNew * tPrev + deltaOld * tCurr) / denom,
                              (xNew * yOld - yNew * xOld) / denom))
        tPrev, xOld, yOld, deltaOld = tCurr, xNew, yNew, deltaNew
    return crossings


def nicePlot(t, x, y, crossing_pts, ax):
    """ A simple plotting function
    """
    ax.plot(t, x, 'b-', alpha=0.7, lw=2)
    ax.plot(t, y, 'k-', alpha=0.7, lw=2)
    ax.plot(*zip(*crossing_pts),
            ls="",
            color='orange',
            marker='o',
            alpha=1.,
            ms=10)
    ax.set_xticks([])
    ax.set_yticks([])


if __name__ == "__main__":
    N = 20

    fig = plt.figure(figsize=(19, 5))
    ax = fig.add_subplot(131)
    # Two curves that do not intesect thus no crossing points
    t = np.linspace(-np.pi, np.pi, N)
    x = np.sin(t * .08+1.4)*np.random.uniform(0.5, 0.9) + 1
    y = -np.cos(t * .07+.1)*np.random.uniform(0.7, 1.0) + 1
    crossing_pts = findCrossings(x, y, t=t)
    nicePlot(t, x, y, crossing_pts, ax)

    # Two time series intesecting at multiple points
    ax = fig.add_subplot(132)
    t = np.arange(N)
    x = np.random.normal(0, 1, (N,))
    y = np.random.normal(0, 1, (N,))
    crossing_pts = findCrossings(x, y, t=t)
    nicePlot(t, x, y, crossing_pts, ax)

    # x^2 intersecting at two points with the line x
    ax = fig.add_subplot(133)
    t = np.linspace(-1, 1, N)
    x = t**2
    y = t
    crossing_pts = findCrossings(x, y, t=t)
    nicePlot(t, x, y, crossing_pts, ax)

    plt.show()


When one runs the Python code provided in the above snippet then they obtain the results shown in the Figure below.

Figure 1. Three exampels of identifying crossing points between two curves. The orange discs indicate the crossing points.

Figure 1. Three exampels of identifying crossing points between two curves. The orange discs indicate the crossing points.

Eigenvalue Level Repulsion

Imaging we have two matrices ${\bf A}, {\bf B}$ $\in \mathbb{R}^n$, that are both symmetric. Then, we know that both matrices have $n$ eigenvalues (multiplicity). Let’s start now, gradually morphing ${\bf A}$ to ${\bf B}$ using the following formula

$$ {\bf A}[t] = (1 - t) {\bf A} + t {\bf B}, $$

where $t \in [0, 1]$. What will happen is that matrix ${\bf A}$ will have multiple eigenvalues with probability zero, given we have chosen the matrices properly from the set of all real symmetric matrices. In other words, there will be no time $t$ when matrix ${\bf A}$ won’t have $n$ eigenvalues. We can see an example in Figure $2$ below.

Figure 2. Eigenvalue level repulsion. Notice that as we move from $t=0$ to $t=1$ the eigenvalues of matrix ${\bf A}$ morph to those of ${\bf B}$ and at any instance matrix ${\bf A}$ has $10$ eigenvalues (see the dimension of the matrix ${\bf A}$ in the code snippet below, which is $10$).

Figure 2. Eigenvalue level repulsion. Notice that as we move from $t=0$ to $t=1$ the eigenvalues of matrix ${\bf A}$ morph to those of ${\bf B}$ and at any instance matrix ${\bf A}$ has $10$ eigenvalues (see the dimension of the matrix ${\bf A}$ in the code snippet below, which is $10$).

The Python code below shows how one can compute the eigenvalue level repulsion.

import numpy as np
import matplotlib.pylab as plt
import matplotlib.style as style

style.use('tableau-colorblind10')

params = {'font.size': 14,
          }
plt.rcParams.update(params)


def keig(A, k):
    # Compute the eigenvalues of a matrix, sort them and then choose the k fisrt of them
    kth_eig = np.sort(np.linalg.eigvals(A))[k]
    return kth_eig


if __name__ == '__main__':
    np.random.seed(1)

    # Randomly select two matrices A and B and make sure they are symmetric
    n, nnodes = 10, 50
    A = np.random.randn(n, n)
    A = A + A.conj().T
    B = np.random.randn(n, n)
    B = B + B.conj().T

    # Define the discretization of interval [0, 1]
    t = np.linspace(0, 1, nnodes)

    # Run over t and compute the eigenvalues
    res = []
    for k in range(n):
        tmp_mat = []
        for i in range(nnodes):
            tmp_mat.append(keig((1 - t[i]) * A + t[i] * B, k))
        res.append(tmp_mat)

    # Plot the results
    min_ = int(min(min(res)))
    max_ = int(max(max(res)))

    fig = plt.figure(figsize=(13, 6))
    ax = fig.add_subplot(111)
    for i in range(n):
        # label numbering should start from 1 not 0   
        ax.plot(t, res[i], '-x', label="Eigenvalue "+str(i+1))
    ax.set_xlabel(r"$t$", fontsize=20, weight="bold")
    ax.set_ylabel(r"$\lambda_{A(t)}$", fontsize=20, weight="bold")
    ax.set_xticks([i/10 for i in range(11)])
    ticks = ax.get_xticks()
    ax.set_xticklabels(ticks, fontsize=14, weight='bold')
    ax.set_yticks([i for i in range(min_, max_)])
    ticks = ax.get_yticks()
    ax.set_yticklabels(ticks, fontsize=14, weight='bold')

    pos = ax.get_position()
    ax.set_position([pos.x0, pos.y0, pos.width * 0.9, pos.height])
    ax.legend(loc='center right', bbox_to_anchor=(1.25, 0.5))

    ax.set_title("Eigenvalue Repulsion")
    plt.show()

Linear separability

Two sets $ A $ and $ B $ in an $ n $ dimensional Euclidean space are linear separable if there exist $ n + 1 $ numbers $ w_i \in \mathbb{R} $ such that every point $ a \in A $ satisfies

$$ \sum_{i=1}^{n}w_i a_i > k, $$

and every point $ b \in B $ satisfies

$$ \sum_{i=1}^{n}w_i b_i < k, $$

where $ k \in \mathbb{R} $ defines a linear border (e.g., a line) between data points of the two sets.

In layperson’s terms, let’s say we have two two-dimensional data sets (e.g., each data point is described by two coordinates, x and y). These two sets are linearly separable if we can draw at least one line that will separate the points of set A from those of set B.

Many times we have to solve a classification or clustering problem. If we could know a priori if the involved sets are linearly separable, we could choose the appropriate classification algorithm. For instance, if the data sets are not linearly separable, we won’t be able to use a linear classifier.

Therefore, one way to know if the sets at hand are linear separable is to compute the convex hull of each set and check if those convex hulls intersect or one contains the other, or they overlap. If any of those three conditions is true, then we know that the two sets are not linearly separable. In Python, we can quickly check that using the function ConvexHull of Scipy. Here is an example:

import matplotlib.pylab as plt

from sklearn.datasets import make_moons, make_blobs
from scipy.spatial import ConvexHull


if __name__ == '__main__':
    S = 8       # size of scatter plot point
    blobs = make_blobs(n_samples=100, centers=2, random_state=13)
    moons = make_moons(n_samples=100)

    fig = plt.figure(figsize=(13, 11))
    ax1 = fig.add_subplot(221)
    Xb, Yb = blobs
    x1b = Xb[Yb == 0]
    x2b = Xb[Yb == 1]
    ax1.scatter(x1b[:, 0], x1b[:, 1], s=S)
    ax1.scatter(x2b[:, 0], x2b[:, 1], c="orange", s=S)
    ax1.set_xticks([])
    ax1.set_yticks([])

    ax2 = fig.add_subplot(222)
    X, Y = moons
    x1 = X[Y == 0]
    x2 = X[Y == 1]
    ax2.scatter(x1[:, 0], x1[:, 1], s=S)
    ax2.scatter(x2[:, 0], x2[:, 1], c="orange", s=S)
    ax2.set_xticks([])
    ax2.set_yticks([])

    ax3 = fig.add_subplot(223)
    ch1 = ConvexHull(x1b)			# Compute the convex hull
    ax3.scatter(x1b[:, 0], x1b[:, 1], s=S)
    ax3.plot(x1b[ch1.vertices, 0], x1b[ch1.vertices, 1], lw=2, c='k')
    ch2 = ConvexHull(x2b)
    ax3.scatter(x2b[:, 0], x2b[:, 1], s=S)
    ax3.plot(x2b[ch2.vertices, 0], x2b[ch2.vertices, 1], lw=2, c='k')
    ax3.set_xticks([])
    ax3.set_yticks([])

    ax4 = fig.add_subplot(224)
    ch1 = ConvexHull(x1)
    ax4.scatter(x1[:, 0], x1[:, 1], s=S)
    ax4.plot(x1[ch1.vertices, 0], x1[ch1.vertices, 1], lw=2, c='k')
    ch2 = ConvexHull(x2)
    ax4.scatter(x2[:, 0], x2[:, 1], s=S)
    ax4.plot(x2[ch2.vertices, 0], x2[ch2.vertices, 1], lw=2, c='k')
    ax4.set_xticks([])
    ax4.set_yticks([])

    plt.savefig("convec_hulls.png")
    plt.show()


Figure 3. Blobs and moons data sets along with their convex hulls. The blobs are linear separable since the convex hull of the blue set does not intersect or overlap with the convex hull of the orange data set. On the other hand, the moons data set is not linear separable.

Figure 3. Blobs and moons data sets along with their convex hulls. The blobs are linear separable since the convex hull of the blue set does not intersect or overlap with the convex hull of the orange data set. On the other hand, the moons data set is not linear separable.

Positive definite matrix

Check if a given matrix $ \bf{A} $ is positive definite. If all the eigenvalues of matrix $ \bf{A} $ are positive then the matrix is positive definite.

$ A = np.array([[1, 2], [2, 1]])
$ print(A)
[[1 2]
 [2 1]]

$ np.all(np.linalg.eigvals(A) > 0)
False       # A is not a positive definite matrix

$ A = np.array([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])
print(A)
[[ 2 -1  0]
 [-1  2 -1]
 [ 0 -1  2]]

$ np.all(np.linalg.eigvals(A) > 0)
True        # A is positive definite

Random matrix with predetermined condition number

You can generate a random matrix with predetermined condition number by following method:

$ cond = 3.0
$ n, m = 3, 2
$ A = np.random.normal(0, 1, (m, n))
$ print(A)

[[ 0.24143692 -0.61944458]
 [ 0.49427012  1.34003024]
 [-1.08271826  0.91021725]]

$ k = min(m, n)
$ U, S, V = np.linalg.svd(A)

$ S = S[0] * (1.0 - ((cond - 1.0) / cond) * (S[0] - S) / (S[0] - S[-1]))

$ SMAT = np.zeros((m, n), dtype=complex) + 1e-9
$ smat[:k, :k] = np.diag(S)

$ B = U @ (SMAT @ V.T)

$ print("Desired condition number: ", cond)
Desired condition number: 3.0
$ print("Actual condition number", np.linalg.cond(B))
Actual condition number: 2.9999999999999973

Integer operations

Fast integer division by two (rounded down). In this case, we have to perform a bit shift to the right by $ k $. $ k $ indicates the power of two $ 2^k $.

# n >> k

$ 6 >> 1
3
$ 6 >> 2
0

Fast integer multiplication with two by left-bit-shift.

# n << k

$ 6 << 1
12
$ 6 << 24

Check if an integer $ n $ is even or odd by performing a binary and operation. If the result of the following operation is 0, then n is even, otherwise is an odd number.

# n & 1

$ 6 & 1
0
$ 5 & 1
1

You can find the maximum power-of-two that divides an integer $ n $ by performing an and operation between the integers $ n $ and its additive inverse $ -n $.

# -n & n

$ -5 & 5
1
$ -6 & 6
2
$ -12 & 12
4