What is dimension reduction?
Dimension reduction (also called dimensionality reduction) is a method of representing n-dimensional data, that is, data with many features, in a lower-dimensional space, typically 2 or 3 dimensions.
As an example, consider a classification problem: predicting whether a student will play football, given the temperature and the humidity. If the two features are highly correlated, they can be summarized by a single underlying feature, so the number of features in such a task can be reduced. A three-dimensional classification problem is hard to visualize, whereas a two-dimensional one maps onto a plane and a one-dimensional one onto a simple line.
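The temperature and humidity example can be sketched in a few lines. The data below is synthetic and the reduction uses PCA (via NumPy's SVD), which is only one possible technique and is not named in the text above; the variable names are illustrative assumptions.

```python
import numpy as np

# Synthetic data (an assumption for illustration): humidity is made to
# depend strongly on temperature, so the two features are highly correlated.
rng = np.random.default_rng(0)
temperature = rng.uniform(10.0, 35.0, size=200)
humidity = 90.0 - 1.5 * temperature + rng.normal(0.0, 2.0, size=200)

features = np.column_stack([temperature, humidity])  # shape (200, 2)
centered = features - features.mean(axis=0)

# PCA via SVD: the rows of vt are the principal directions.
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
explained = singular_values ** 2 / np.sum(singular_values ** 2)

# Project both features onto the first principal direction: 2-D becomes 1-D.
combined = centered @ vt[0]

print("reduced shape:", combined.shape)
print("variance kept by one feature:", round(float(explained[0]), 3))
```

Because the two features move together, a single combined feature keeps almost all of the variance in this synthetic data.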
How does t-SNE work?
t-SNE is a nonlinear dimensionality reduction algorithm that finds patterns in the data based on how similar the data points are across their features. The similarity of point B to point A is computed as the conditional probability that A would choose B as its neighbor.
It then tries to minimize the difference between these conditional probabilities (the similarities) in the high-dimensional and the low-dimensional space, so that the low-dimensional points represent the original data as faithfully as possible.
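To make the conditional probability concrete, here is a minimal NumPy sketch. It assumes a Gaussian kernel with one fixed bandwidth `sigma` for all points; actual t-SNE implementations choose a separate bandwidth per point from the perplexity setting. The function name `conditional_probabilities` is hypothetical.

```python
import numpy as np

def conditional_probabilities(points, sigma=1.0):
    """Return a matrix p where p[i, j] is the probability that point i
    picks point j as its neighbor (Gaussian kernel, fixed sigma)."""
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)

    logits = -sq_dists / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # a point never picks itself
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Points 0 and 1 are close together, point 2 is far away.
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
p = conditional_probabilities(points)
print(p.round(3))
```

Each row sums to 1, and nearby points receive almost all of the probability mass.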
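The "difference" being minimized can be written down explicitly. The formulas below follow the standard formulation of t-SNE; the symbols (x for the original points, y for their low-dimensional images, n for the number of points, and a per-point bandwidth) are notation introduced here, not taken from the text above.

```latex
% High-dimensional similarities (Gaussian kernel, per-point bandwidth \sigma_i)
p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

% Low-dimensional similarities (Student-t kernel with one degree of freedom)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Cost: Kullback-Leibler divergence between the two distributions
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

The low-dimensional coordinates are adjusted by gradient descent until the cost C stops decreasing.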
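Space and time complexity
Because the exact algorithm computes a similarity for every pair of data points, its time and space complexity are quadratic in the number of data points.
Applying t-SNE to the MNIST dataset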
Code # 1: Reading data
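The original listing for this step is not available, so what follows is only a sketch of one way to read the data. It assumes a CSV layout with a header row, the label in the first column and the pixel values in the remaining columns; the helper `load_labeled_csv` and the file name in the last comment are hypothetical. A tiny in-memory sample stands in for the real file.

```python
import io
import numpy as np

def load_labeled_csv(source):
    """Load a CSV whose first column is the label and whose remaining
    columns are pixel values. `source` is a path or a file-like object."""
    # skiprows=1 skips the header row; the body is assumed to be numeric.
    table = np.atleast_2d(np.loadtxt(source, delimiter=",", skiprows=1))
    labels = table[:, 0].astype(int)
    pixels = table[:, 1:]
    return pixels, labels

# In-memory stand-in: 2 rows with 4 "pixels" each (MNIST rows have 784).
sample = io.StringIO(
    "label,pixel0,pixel1,pixel2,pixel3\n"
    "5,0,12,255,0\n"
    "0,0,0,128,64\n"
)
pixels, labels = load_labeled_csv(sample)
print(pixels.shape, labels)

# With a real file (hypothetical name):
# pixels, labels = load_labeled_csv("mnist_train.csv")
```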
Code # 2: Data preprocessing
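The original listing is missing here as well. A common preprocessing step before t-SNE is to standardize each feature to zero mean and unit variance, and the sketch below assumes that is what is wanted. The function name `standardize` is hypothetical, and the random array is a stand-in for the pixel matrix.

```python
import numpy as np

def standardize(features):
    """Scale each column to zero mean and unit variance."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    # Constant columns (e.g. always-black border pixels) have zero
    # standard deviation; divide by 1 instead so they become all zeros.
    std[std == 0] = 1.0
    return (features - mean) / std

# Stand-in data: 100 samples, 5 features, the last one constant.
rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 255.0, size=(100, 5))
raw[:, 4] = 0.0

scaled = standardize(raw)
print(scaled.mean(axis=0).round(6))
print(scaled.std(axis=0).round(6))
```

Handling constant columns explicitly matters for image data, where some pixels never change across the dataset.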
Code # 3:
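The listing for this step is also missing. Given the section title, it presumably applied t-SNE to the prepared data, so here is a hedged sketch using scikit-learn's `TSNE`. The choice of scikit-learn is an assumption, and so is the data: its bundled 8x8 digits dataset is used as a small stand-in, not the 28x28 MNIST images. The subset size and perplexity value are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Stand-in data: 8x8 digit images (64 features per image), not MNIST.
images, labels = load_digits(return_X_y=True)
images, labels = images[:300], labels[:300]  # a small subset keeps the run fast

# Reduce 64 dimensions to 2. The perplexity must be smaller than the
# number of samples; random_state makes the run repeatable.
tsne = TSNE(n_components=2, perplexity=30.0, random_state=0)
embedding = tsne.fit_transform(images)

print(embedding.shape)
```

The two columns of `embedding` can then be drawn as a scatter plot colored by `labels` to see how well the digit classes separate.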