Data Science | Solving linear equations



Linear algebra is a fundamental part of data science. Data is usually presented in the form of a matrix, so data representation is an important aspect of the field. If the data contains several variables of interest, two questions naturally arise: how many of those variables are actually important, and if there are relationships between them, how can those relationships be revealed? Linear algebraic tools allow us to answer these questions. A data science enthusiast therefore needs to understand these concepts well before moving on to complex machine learning algorithms.

Matrices and Linear Algebra
There are many ways to represent data; matrices provide a convenient way to organize it.

  • Matrices can be used to represent samples with multiple attributes in a compact form.
  • Matrices can also be used to represent linear equations in a compact and simple form (a small sketch follows this list).
  • Linear algebra provides tools for understanding and manipulating matrices, to extract useful knowledge from data.
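
As a minimal sketch of that second point (using NumPy; the numbers form an illustrative system, the same one used in Example 1.1 below), a system of linear equations can be stored as a coefficient matrix and a right-hand-side vector:

import numpy as np

# The system   x1 + 3*x2 = 7
#              2*x1 + 4*x2 = 10
# is stored compactly as a coefficient matrix A and an RHS vector b.
A = np.array([[1, 3],
              [2, 4]])
b = np.array([7, 10])

# Each row of A holds the coefficients of one equation;
# A.shape gives (m, n): m equations, n variables.
print(A.shape)  # (2, 2)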

Identifying linear relationships between attributes
We identify linear relationships between attributes using the concepts of null space and nullity. Throughout, m and n denote the number of equations and variables respectively, and b is the right-hand side (RHS) of the general system $Ax = b$.

In general, there are three cases to understand, depending on how the number of equations m compares with the number of variables n:

  • Case 1: m = n
  • Case 2: m > n
  • Case 3: m < n

We will look at these three cases independently.

Full row rank and full column rank
For a matrix A (m × n):

  • Full row rank: all the rows of the matrix are linearly independent. In data terms, the samples do not exhibit a linear relationship; the samples are independent.
  • Full column rank: all the columns of the matrix are linearly independent; the attributes are linearly independent.

Note: regardless of the size of the matrix, the row rank always equals the column rank. This means that for a matrix of any size, if we have a certain number of independent rows, we will have the same number of independent columns.
In general, for an m × n matrix with m less than n, the maximum possible rank is m. Thus, the rank is always at most the smaller of the two numbers m and n.
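
A quick numerical check of both facts (a sketch using NumPy; the 2 × 3 matrix below is an arbitrary example):

import numpy as np
from numpy.linalg import matrix_rank

# A 2 x 3 matrix: m = 2 < n = 3, so the rank can be at most 2
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Row rank equals column rank: transposing does not change the rank
print(matrix_rank(A))    # 2
print(matrix_rank(A.T))  # 2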

Case 1: m = n

Example 1.1:

Consider the given matrix equation:

(1)   $\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 7 \\ 10 \end{bmatrix}$

$|A| = -2 \neq 0$, so rank(A) = 2 = number of columns. This implies that A is full rank. Therefore, the solution for the given example is $x = A^{-1}b = (1, 2)$.

Python program to find the rank of a matrix, invert it, and solve the matrix equation:

# First, import matrix_rank, inv and solve
# from numpy.linalg
from numpy.linalg import matrix_rank, inv, solve

# 2 x 2 matrix A and right-hand-side vector b
A = [[1, 3],
     [2, 4]]
b = [7, 10]

# Rank of matrix A
print("Rank of the matrix is:", matrix_rank(A))

# Inverse of matrix A
print("Inverse of A:", inv(A))

# Solving the matrix equation Ax = b
print("Solution of linear equations:", solve(A, b))

Output:

Rank of the matrix is: 2
Inverse of A: [[-2.   1.5]
 [ 1.  -0.5]]
Solution of linear equations: [1. 2.]

You can refer to the related NumPy article for more on these functions.

Example 1.2:

Consider the given matrix equation:

(2)

$|A| = 0$, rank(A) = 1, nullity = 1. Checking consistency: Row(2) = 2 × Row(1), so the equations are consistent, with only one linearly independent equation. The solution set for $(x_1, x_2)$ is infinite, because we have only one linearly independent equation and two variables.

Explanation: In the above example we have only one linearly independent equation. If we fix a value for $x_2$, that equation determines the corresponding $x_1$; a different choice of $x_2$ gives a different $x_1$. In this way we can produce as many solutions as we like: $x_2$ can take any value (an infinite choice), and for each value we get exactly one $x_1$. Therefore, this system has infinitely many solutions.
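
To see this numerically, here is a sketch with a hypothetical rank-1 consistent system (the numbers below are illustrative, not the ones from the original example): Row(2) is twice Row(1), so every choice of x2 yields a valid x1.

import numpy as np
from numpy.linalg import matrix_rank

# Hypothetical consistent system with Row(2) = 2 * Row(1):
#   x1 + 2*x2 = 4
#   2*x1 + 4*x2 = 8
A = np.array([[1, 2],
              [2, 4]])
b = np.array([4, 8])

print(matrix_rank(A))  # 1: only one linearly independent equation

# Pick any x2; the first equation then fixes x1 = 4 - 2*x2
for x2 in [0.0, 1.0, 2.0]:
    x1 = 4 - 2 * x2
    print((x1, x2), np.allclose(A @ [x1, x2], b))  # True every time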

Example 1.3:

 Consider the given matrix equation: 

(3)  

$|A| = 0$, rank(A) = 1, nullity = 1. Checking consistency: 2 × Row(1) equals Row(2) on the coefficient side, but the right-hand sides do not satisfy the same relation. Therefore the equations are inconsistent, and we cannot find a solution for $(x_1, x_2)$.
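
Consistency can be checked numerically by comparing rank(A) with the rank of the augmented matrix [A | b]: if augmenting raises the rank, the system has no solution. A sketch with hypothetical values:

import numpy as np
from numpy.linalg import matrix_rank

# Hypothetical inconsistent system: the coefficient rows satisfy
# Row(2) = 2 * Row(1), but the right-hand sides do not
A = np.array([[1, 2],
              [2, 4]])
b = np.array([4, 10])

# Augment A with b as an extra column
Ab = np.column_stack([A, b])

print(matrix_rank(A))   # 1
print(matrix_rank(Ab))  # 2 > rank(A), so the system is inconsistent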

Case 2: m > n

  • In this case, the number of variables or attributes is less than the number of equations.
  • Not all of the equations can be satisfied here.
  • Thus, this is sometimes referred to as the no-solution case.
  • However, we can try to find a suitable approximate solution by looking at this case from an optimization perspective.

Optimization perspective

— Rather than finding an exact solution to $Ax = b$, we can find an $x$ such that $(Ax - b)$ is minimized.
— Here, $Ax - b$ is a vector.
— There will be as many error terms as the number of equations.
— Denote $Ax - b = e$ (an $m \times 1$ vector); there are m errors $e_i$, $i = 1, \dots, m$.
— We can minimize all the errors collectively by minimizing $\sum_{i=1}^{m} e_i^2$.
— This is the same as minimizing $e^T e = (Ax - b)^T (Ax - b)$.

So the optimization problem becomes

$\min_x f(x) = (Ax - b)^T (Ax - b)$
$= x^T A^T A x - 2 x^T A^T b + b^T b$

Here we can notice that the optimization problem is a function of x. Solving this optimization problem gives us a solution for x. We can obtain it by differentiating f(x) with respect to x and setting the derivative to zero.

— Now, differentiating f(x) and setting the derivative to zero results in
$2 A^T A x - 2 A^T b = 0$, i.e., $A^T A x = A^T b$

— Assuming all columns of A are linearly independent, $A^T A$ is invertible and
$x = (A^T A)^{-1} A^T b$

Note: although this solution x may not satisfy all the equations, it minimizes the total squared error across them.
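
Here is a sketch of this least-squares recipe on a hypothetical overdetermined system (m = 3 equations, n = 2 variables; the numbers are illustrative), computing x both from the normal equations above and with NumPy's built-in lstsq:

import numpy as np

# Hypothetical overdetermined system (m = 3 > n = 2)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: x = (A^T A)^(-1) A^T b
x_normal = np.linalg.inv(A.T @ A) @ A.T @ b

# NumPy's least-squares routine gives the same answer
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]

print(x_normal)                        # least-squares solution
print(np.allclose(x_normal, x_lstsq))  # True
print(A @ x_normal - b)                # residuals e: small but nonzero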

Example 2.1:

 Consider the given matrix equation: 

(4)  

m = 3, n = 2. Using the optimization concept, the least-squares solution for the given system is $x = (A^T A)^{-1} A^T b$. Substituting this $x$ back into the equations shows how closely each of them is satisfied.

Example 2.2:

 Consider the given matrix equation: 

(5)  

m = 3, n = 2. Using the optimization concept, the least-squares solution for the given system is again $x = (A^T A)^{-1} A^T b$. Substituting this $x$ back into the equations shows how closely each of them is satisfied.

So, the important point to note in Case 2 is that whenever we have more equations than variables, we can always use the least-squares solution $x = (A^T A)^{-1} A^T b$. Bear in mind that $(A^T A)^{-1}$ exists only if the columns of A are linearly independent.

Case 3: m < n

  • This case deals with more attributes or variables than equations.
  • Here we can obtain multiple solutions for the attributes.
  • This is the infinitely-many-solutions case.
  • Let's see how we can choose one solution from the infinitely many possible ones.
  • In this case too, we take an optimization perspective, using a Lagrangian (a sketch in code follows the derivation below).
    — Consider the optimization problem below:

    $\min_x \; x^T x$
    such that $Ax = b$

    — We can define the Lagrangian
    $L(x, \lambda) = x^T x + \lambda^T (Ax - b)$

    — Differentiating the Lagrangian with respect to x and setting it to zero, we get
    $2x + A^T \lambda = 0 \;\Rightarrow\; x = -\tfrac{1}{2} A^T \lambda$

    — Pre-multiplying by A:
    $Ax = -\tfrac{1}{2} A A^T \lambda = b \;\Rightarrow\; \lambda = -2 (A A^T)^{-1} b$

    — From the above we get
    $x = A^T (A A^T)^{-1} b$,
    provided that all the rows of A are linearly independent, so that $A A^T$ is invertible.
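
Here is a sketch of this minimum-norm recipe on a hypothetical underdetermined system (m = 2 equations, n = 3 variables; the numbers are illustrative), checking the closed form against NumPy's pseudoinverse:

import numpy as np

# Hypothetical underdetermined system (m = 2 < n = 3)
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 2.0])

# Minimum-norm solution: x = A^T (A A^T)^(-1) b
# (requires the rows of A to be linearly independent)
x = A.T @ np.linalg.inv(A @ A.T) @ b

print(x)                                      # one particular solution
print(np.allclose(A @ x, b))                  # True: it satisfies Ax = b
print(np.allclose(x, np.linalg.pinv(A) @ b))  # True: matches the pseudoinverse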

Example 3.1:

Consider the given matrix equation:

(6)

m = 2, n = 3. Using the optimization concept, $x = A^T (A A^T)^{-1} b$. The solution for the given system is $(x_1, x_2, x_3) = (-0.2, -0.4, 1)$. You can easily verify that this solution satisfies $Ax = b$.

Generalization

  • The above cases cover all possible scenarios one might encounter when solving linear equations.
  • The concept used to generalize the solutions for all of the above cases is called the Moore-Penrose pseudoinverse.
  • Singular value decomposition (SVD) can be used to compute the pseudoinverse, or generalized inverse ($A^+$).
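
As a sketch of that last point, the pseudoinverse can be assembled directly from the SVD (the 2 × 3 matrix below is an arbitrary full-rank example) and compared against NumPy's np.linalg.pinv:

import numpy as np

# Arbitrary example matrix
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

# Thin SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Pseudoinverse A+ = V @ diag(1/s) @ U^T,
# inverting only the nonzero singular values (all nonzero here)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T

print(np.allclose(A_pinv, np.linalg.pinv(A)))  # True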