np.linalg.lstsq: The Complete Guide

NumPy is a numerical library for Python that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions that operate on these arrays.

np.linalg.lstsq

np.linalg.lstsq() is a NumPy function that returns the least-squares solution to a linear matrix equation. It solves the equation ax = b by computing a vector x that minimizes the squared Euclidean 2-norm || b - ax ||^2.

The equation may be under-, well-, or over-determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns).

If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation. Otherwise, x minimizes the Euclidean 2-norm ||b - ax||. If several solutions minimize it equally well, the one with the smallest 2-norm ||x|| is returned.
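The two cases above can be checked with a short sketch. The matrices and the variable names here are made up for illustration and are not part of the tutorial's example.

```python
import numpy as np

# Square, full-rank system: lstsq should agree with the exact solver.
a_square = np.array([[2.0, 1.0],
                     [1.0, 3.0]])
b_square = np.array([3.0, 5.0])
x_lstsq = np.linalg.lstsq(a_square, b_square, rcond=None)[0]
x_exact = np.linalg.solve(a_square, b_square)
print(np.allclose(x_lstsq, x_exact))

# Over-determined system (3 equations, 2 unknowns): no exact solution
# exists, so lstsq returns the x that minimizes ||b - a @ x||.
a_over = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
b_over = np.array([1.0, 1.0, 0.0])
x_over = np.linalg.lstsq(a_over, b_over, rcond=None)[0]
print(x_over)
print(np.linalg.norm(b_over - a_over @ x_over))  # the remaining misfit
```

In the second case the misfit cannot be driven to zero, which is why the result is a best fit rather than an exact solution.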

Syntax

numpy.linalg.lstsq(a, b, rcond='warn')

Parameters

  1. a: The coefficient matrix, of shape (M, N).
  2. b: The ordinate or “dependent variable” values, of shape (M,) or (M, K). If b is two-dimensional, the least-squares solution is calculated for each of its K columns.
  3. rcond: An optional float. It is the cut-off ratio for small singular values of a. For rank determination, singular values are treated as zero if they are smaller than rcond times the largest singular value of a. Passing rcond=None uses machine precision times max(M, N) as the cut-off; from NumPy 2.0 onward, None is also the default.
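As a rough illustration of the b and rcond parameters described above, the sketch below solves for two right-hand sides in one call and then shows how rcond changes the reported rank. All of the data is invented for this example.

```python
import numpy as np

# Coefficient matrix a with shape (M, N) = (4, 2): columns are x and 1.
a = np.array([[0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])

# Two right-hand sides stacked as columns, so b has shape (M, K) = (4, 2).
b = np.column_stack([
    [1.0, 3.0, 5.0, 7.0],     # generated from y = 2x + 1
    [0.0, -1.0, -2.0, -3.0],  # generated from y = -x
])

x = np.linalg.lstsq(a, b, rcond=None)[0]
print(x.shape)  # one solution column per column of b
print(x)

# rcond: a singular value below rcond * (largest singular value)
# is treated as zero when the rank is determined.
a_small = np.diag([1.0, 1e-10])
rank_default = np.linalg.lstsq(a_small, np.ones(2), rcond=None)[2]
rank_strict = np.linalg.lstsq(a_small, np.ones(2), rcond=1e-6)[2]
print(rank_default, rank_strict)
```

With the looser cut-off of 1e-6, the tiny singular value 1e-10 is discarded and the matrix is reported as having a lower rank.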

Return Value

  1. x: The least-squares solution. If b is two-dimensional, the solutions are in the K columns of x.
  2. residuals: The sums of squared residuals, i.e., the squared Euclidean 2-norm of each column of b - a*x. If the rank of a is < N or M <= N, this is an empty array. If b is 1-dimensional, this is an array of shape (1,). Otherwise, the shape is (K,).
  3. rank: An int giving the rank of the matrix a.
  4. s: The singular values of a.
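The four return values can be unpacked together. The sketch below uses made-up data to show the shape of residuals in the full-rank case and the empty array that comes back when a is rank-deficient.

```python
import numpy as np

b = np.array([1.0, 2.0, 2.0, 4.0])

# Over-determined and full column rank: M = 4 > N = 2.
a_full = np.array([[0.0, 1.0],
                   [1.0, 1.0],
                   [2.0, 1.0],
                   [3.0, 1.0]])
x, residuals, rank, s = np.linalg.lstsq(a_full, b, rcond=None)
print(rank, s.shape, residuals.shape)

# residuals holds the squared 2-norm of b - a @ x.
manual = np.sum((b - a_full @ x) ** 2)
print(np.isclose(residuals[0], manual))

# Rank-deficient: the second column is twice the first, so rank < N
# and residuals comes back empty.
a_deficient = np.array([[1.0, 2.0],
                        [2.0, 4.0],
                        [3.0, 6.0],
                        [4.0, 8.0]])
x_d, residuals_d, rank_d, s_d = np.linalg.lstsq(a_deficient, b, rcond=None)
print(rank_d, residuals_d.shape)
```

When residuals is empty, the misfit can still be computed by hand from b - a @ x, as in the manual line above.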

Note

If b is a matrix, then all array results are returned as matrices.

Examples

To work with the following example, you must have matplotlib installed on your system. If you do not, run the following command to install the library.

python3 -m pip install -U matplotlib

Now, write the following code.

import numpy as np
import matplotlib.pyplot as plt

# x coordinates
x = np.arange(0, 9)
# stack x and a row of ones; A has shape (2, 9)
A = np.array([x, np.ones(9)])

# observed y values, roughly linear in x
y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]

# fit y = w[0]*x + w[1]; A.T has shape (9, 2) with rows [x_i, 1]
w = np.linalg.lstsq(A.T, y, rcond=None)[0]
print(w)

Output

[ 0.71666667 19.18888889]
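One way to sanity-check the parameters above is to compare them with np.polyfit, which fits a degree-1 polynomial to the same data and returns the coefficients as [slope, intercept]. This cross-check is an addition for illustration; the name w_polyfit is not part of the original example.

```python
import numpy as np

x = np.arange(0, 9)
y = np.array([19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24])

# Same design matrix as in the example, built directly with shape (9, 2).
A = np.column_stack([x, np.ones(len(x))])
w = np.linalg.lstsq(A, y, rcond=None)[0]

# np.polyfit with degree 1 returns [slope, intercept].
w_polyfit = np.polyfit(x, y, 1)
print(w)
print(w_polyfit)
print(np.allclose(w, w_polyfit))
```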

Plotting the regression line

Extend the above file as follows.

import numpy as np
import matplotlib.pyplot as plt

# x coordinates
x = np.arange(0, 9)
# stack x and a row of ones; A has shape (2, 9)
A = np.array([x, np.ones(9)])

# observed y values, roughly linear in x
y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]

# fit y = w[0]*x + w[1]; A.T has shape (9, 2) with rows [x_i, 1]
w = np.linalg.lstsq(A.T, y, rcond=None)[0]
print(w)

line = w[0]*x + w[1]  # regression line
plt.plot(x, line, 'r-')
plt.plot(x, y, 'o')
plt.show()

Output

[Figure: Numpy linalg lstsq(), showing the data points as dots and the fitted regression line in red]

Explanation

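Here, we created an array A that holds the x coordinates and a row of ones, and then passed its transpose, together with the y values, to np.linalg.lstsq() to obtain the regression parameters by solving Ax = b in the least-squares sense. The first value in the result is the slope of the line and the second is its intercept.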

We plotted the same in a graphical format to make it easier to understand.
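The shape of the matrix passed to lstsq() in the example is easy to get wrong, so here is a small sketch of how A is laid out. The names A_rows and A_cols and the values of m and c are chosen only for this illustration.

```python
import numpy as np

x = np.arange(0, 9)

# Form used in the example: stack two rows, then transpose to shape (9, 2).
A_rows = np.array([x, np.ones(9)])
print(A_rows.shape, A_rows.T.shape)

# Equivalent form that builds the (9, 2) matrix directly.
A_cols = np.column_stack([x, np.ones(9)])
print(np.array_equal(A_rows.T, A_cols))

# Each row is [x_i, 1], so multiplying by [m, c] evaluates m * x_i + c.
m, c = 2.0, 1.0  # hypothetical slope and intercept
print(np.allclose(A_cols @ np.array([m, c]), m * x + c))
```

This is why the column of ones is needed: without it, the fitted line would be forced through the origin.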

That’s it for this tutorial.
