Linear algebra is a fundamental part of data science, where data is usually presented in the form of a matrix. Two questions then become important: if the data contains several variables of interest, how many of them are truly important? And if there are relationships between these variables, how can those relationships be revealed? Linear algebraic tools allow us to answer these questions about the data, so a data science enthusiast needs to understand these concepts well before taking on complex machine learning algorithms.
Matrices and Linear Algebra
There are many ways to represent data; matrices provide a convenient way to organize it.
- Matrices can be used to represent collections of samples with multiple attributes in a compact form.
- Matrices can also be used to represent systems of linear equations in a compact and simple form.
- Linear algebra provides tools for understanding and manipulating matrices to extract useful knowledge from data.
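As a quick illustration (all numbers below are made up), both a small dataset and a system of equations can be stored as NumPy matrices:

```python
import numpy as np

# Each row is a sample, each column an attribute
# (e.g. height, weight, age -- hypothetical values).
data = np.array([[170, 65, 30],
                 [160, 55, 25],
                 [180, 80, 35]])

# The system  x1 + 2*x2 = 5,  3*x1 + 4*x2 = 6  in compact form A x = b.
A = np.array([[1, 2],
              [3, 4]])
b = np.array([5, 6])

print(data.shape)        # (3, 3): 3 samples, 3 attributes
print(A.shape, b.shape)  # (2, 2) (2,)
```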
Identifying linear relationships between attributes
We identify linear relationships between attributes using the concepts of null space and nullity. Generalized linear equations are represented as

Ax = b

where A is an m×n matrix of coefficients, x is an n×1 vector of variables, and b is an m×1 vector of constants.
In general, there are three cases to understand: m = n, m > n, and m < n.
We will look at these three cases independently.
Full row rank and full column rank
For a matrix A (m×n):

Full Row Rank: all the rows of the matrix are linearly independent. In data terms, the samples present no linear relationship among them, i.e. the samples are independent.
Full Column Rank: all the columns of the matrix are linearly independent. In data terms, the attributes are linearly independent.
Note: regardless of the size of the matrix, the row rank is always equal to the column rank. This means that for a matrix of any size, if we have a certain number of independent rows, we will have the same number of independent columns.
In general, if we have an m×n matrix with m less than n, then the rank of the matrix can be at most m. Thus, the maximum possible rank is always the smaller of the two numbers m and n.
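This bound is easy to check numerically; a small sketch with an arbitrary 2×3 matrix:

```python
import numpy as np

# A 2x3 matrix: m = 2 < n = 3, so rank(A) can be at most 2.
A = np.array([[1, 2, 3],
              [4, 5, 6]])

rank = np.linalg.matrix_rank(A)
print(rank)  # the two rows are independent, so the rank is 2
```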
Case 1: m = n
Example 1.1:
Consider the given matrix equation:

(1)    -x1 + 3x2 = 5
        2x1 - 4x2 = -6

i.e. Ax = b with A = [[-1, 3], [2, -4]] and b = (5, -6).

|A| = (-1)(-4) - (3)(2) = -2, which is not equal to zero, and rank(A) = 2 = the number of columns. This implies that A is full rank, so a unique solution exists. Therefore, the solution for the given example is x = A⁻¹b = (1, 2).
A program for finding the rank, computing the inverse, and solving the matrix equation in Python:
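A minimal NumPy sketch, assuming the system -x1 + 3x2 = 5, 2x1 - 4x2 = -6 from Example 1.1:

```python
import numpy as np

# Coefficient matrix and right-hand side (assumed from Example 1.1).
A = np.array([[-1.0, 3.0],
              [2.0, -4.0]])
b = np.array([5.0, -6.0])

print("Rank of the matrix is:", np.linalg.matrix_rank(A))
print("Inverse of A:\n", np.linalg.inv(A))
print("Solution of linear equation:", np.linalg.solve(A, b))
```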

Output:

Rank of the matrix is: 2
Inverse of A:
[[2.  1.5]
 [1.  0.5]]
Solution of linear equation: [1. 2.]
Example 1.2:

Consider the given matrix equation: (2)

|A| = 0, rank(A) = 1, nullity = 1. Checking consistency: the second row of the augmented matrix is a multiple of the first, so the system is consistent.

Explanation: in the above example we have only one linearly independent equation. If we fix a value for x2, we obtain the corresponding x1; a different choice of x2 gives a different x1. In this way we can generate many solutions: we can pick any value for x2 (we have an infinite choice), and for each value we get one x1. Therefore, this equation has infinitely many solutions.
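The infinite-solution case can be demonstrated numerically; a sketch with a hypothetical rank-1 system x1 + 2x2 = 5, 2x1 + 4x2 = 10:

```python
import numpy as np

# Hypothetical rank-1 system: the second equation is 2x the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
b = np.array([5.0, 10.0])

# rank(A) == rank([A | b]) < n  =>  consistent, infinitely many solutions.
aug = np.column_stack([A, b])
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(aug))  # 1 1

# Any choice of x2 yields a valid x1 = 5 - 2*x2.
for x2 in (0.0, 1.0, 2.5):
    x = np.array([5 - 2 * x2, x2])
    print(x, A @ x)  # A @ x reproduces b for every choice
```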
Example 1.3:
Consider the given matrix equation: (3)

|A| = 0, rank(A) = 1, nullity = 1. Checking consistency: the left-hand side of Row (2) is 2 × Row (1), but the right-hand sides do not match. Therefore, the equations are inconsistent, and we cannot find a solution to (3).
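Inconsistency can be detected by comparing rank(A) with the rank of the augmented matrix [A | b]; a sketch with a hypothetical inconsistent system:

```python
import numpy as np

# Hypothetical system: the left-hand sides are proportional,
# but the right-hand sides are not (12 != 2 * 5).
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
b = np.array([5.0, 12.0])

aug = np.column_stack([A, b])
rank_A = np.linalg.matrix_rank(A)
rank_aug = np.linalg.matrix_rank(aug)
print(rank_A, rank_aug)  # 1 2  =>  rank_aug > rank_A, so no solution exists
```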
Case 2: m > n
- In this case, the number of variables or attributes is less than the number of equations.
- Not all equations can be satisfied here.
- Thus, this is sometimes referred to as the no-solution case.
- But we can try to find a suitable approximate solution by looking at this case from an optimization perspective.
Optimization perspective
- Rather than finding an exact solution to Ax = b, we can find an x such that the error Ax - b is minimized.
- Here, Ax - b is a vector.
- There will be as many error terms as the number of equations.
- Denote Ax - b = e (m×1); there are m errors e_i, i = 1..m.
- We can minimize all the errors collectively by minimizing the sum of their squares, Σ e_i².
- This is the same as minimizing eᵀe = (Ax - b)ᵀ(Ax - b).
So the optimization problem becomes

min over x:  f(x) = (Ax - b)ᵀ(Ax - b)
                  = xᵀAᵀAx - 2xᵀAᵀb + bᵀb
Here we can notice that the optimization problem is a function of x, and solving it gives us the desired x. We obtain the solution by differentiating f(x) with respect to x and setting the derivative to zero.
Now differentiating f(x) and setting the derivative to zero results in

2AᵀAx - 2Aᵀb = 0, i.e. AᵀAx = Aᵀb

Assuming all the columns of A are linearly independent, AᵀA is invertible, and

x = (AᵀA)⁻¹Aᵀb
Note: although this solution x may not satisfy all the equations, it minimizes the total squared error in them.
Example 2.1:
Consider the given matrix equation: (4)
m = 3, n = 2. Using the optimization concept, x = (AᵀA)⁻¹Aᵀb. Therefore, the solution for the given linear equation is the least-squares solution; substituting it back into (4) shows the residual error in each equation.
Example 2.2:
Consider the given matrix equation: (5)
m = 3, n = 2. Using the optimization concept, x = (AᵀA)⁻¹Aᵀb. Therefore, the solution for the given linear equation is the least-squares solution; substituting it back into (5) shows the residual error in each equation.
So, the important point to note in Case 2 is that if we have more equations than variables, we can always use the least-squares solution x = (AᵀA)⁻¹Aᵀb. It should be borne in mind that (AᵀA)⁻¹ exists only if the columns of A are linearly independent.
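The least-squares formula can be sketched on a hypothetical overdetermined system (3 equations, 2 unknowns) and cross-checked against NumPy's built-in solver:

```python
import numpy as np

# Hypothetical 3x2 system: more equations than unknowns.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Least-squares solution x = (A^T A)^{-1} A^T b;
# requires linearly independent columns.
x = np.linalg.inv(A.T @ A) @ A.T @ b

# Cross-check against NumPy's built-in least-squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x, x_lstsq)  # both give the same minimizer of ||Ax - b||
```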
Case 3: m < n
In this case, the number of equations is less than the number of variables, so Ax = b has infinitely many solutions. We again take an optimization perspective and pick the solution of minimum norm, using a Lagrangian here.
Consider the optimization problem

min over x:  xᵀx
such that Ax = b

We can define the Lagrangian

L(x, λ) = xᵀx + λᵀ(Ax - b)

Differentiating the Lagrangian with respect to x and setting it to zero, we get

2x + Aᵀλ = 0, so x = -(1/2)Aᵀλ

Premultiplying by A:

Ax = -(1/2)AAᵀλ = b, so λ = -2(AAᵀ)⁻¹b

From the above we get

x = Aᵀ(AAᵀ)⁻¹b

provided that all the rows of A are linearly independent (so that AAᵀ is invertible).
Example 3.1:
Consider the given matrix equation: (6)
m = 2, n = 3. Using the optimization concept, x = Aᵀ(AAᵀ)⁻¹b. The solution for the given system is x = (0.2, 0.4, 1). You can easily verify that Ax = b.
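The minimum-norm formula can be sketched with NumPy on an assumed 2×3 system (chosen for illustration; not necessarily the matrix from Example 3.1):

```python
import numpy as np

# Assumed underdetermined system: 2 equations, 3 unknowns.
A = np.array([[1.0, 2.0, 5.0],
              [0.0, 0.0, 1.0]])
b = np.array([6.0, 1.0])

# Minimum-norm solution; requires linearly independent rows
# so that A @ A.T is invertible.
x = A.T @ np.linalg.inv(A @ A.T) @ b
print(x)      # approximately [0.2, 0.4, 1]
print(A @ x)  # reproduces b
```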
Generalization