# ML | T-distributed stochastic neighbor embedding (t-SNE) algorithm

### What is dimension reduction?

Dimension reduction is a method of representing n-dimensional data (data with many features) in fewer dimensions, typically 2 or 3 when the goal is visualization.

As an example of dimensionality reduction, consider a classification problem: whether a student will play football or not, which depends on both temperature and humidity. Because the two features are highly correlated, they can be summarized by a single underlying feature, so the number of features in such a task can be reduced. A three-dimensional classification problem is hard to visualize, whereas a two-dimensional one maps to a simple 2D plane and a one-dimensional one to a simple line.
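The temperature and humidity example can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic and made up, the helper names `standardize` and `combine_correlated` are hypothetical, and the article does not prescribe this particular method. The sketch standardizes both features and projects them onto the direction (1, 1) / sqrt(2), which is the first principal component for two standardized, positively correlated features.

```python
import math
import random


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def combine_correlated(a, b):
    """Merge two positively correlated features into a single feature."""
    za, zb = standardize(a), standardize(b)
    # Projection onto (1, 1) / sqrt(2).
    combined = [(x + y) / math.sqrt(2) for x, y in zip(za, zb)]
    # Pearson correlation of the two original features.
    r = sum(x * y for x, y in zip(za, zb)) / len(za)
    # Share of the total variance (2.0 after standardizing) kept by `combined`.
    retained = (1 + r) / 2
    return combined, r, retained


# Synthetic, made-up data: humidity rises with temperature plus small noise.
rng = random.Random(0)
temperature = [rng.uniform(10, 35) for _ in range(200)]
humidity = [2.0 * t + rng.gauss(0, 3) for t in temperature]

combined, r, retained = combine_correlated(temperature, humidity)
print(f"correlation = {r:.3f}, variance retained = {retained:.3f}")
```

The closer the correlation is to 1, the less information is lost by keeping only the combined feature.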

### How does t-SNE work?

t-SNE is a nonlinear dimensionality reduction algorithm that finds patterns in the data based on the similarity between data points. The similarity of point A to point B is calculated as the conditional probability that A would choose B as its neighbor.
It then tries to minimize the difference between these conditional probabilities (or similarities) in the high-dimensional and the low-dimensional space, so that the low-dimensional points represent the original data as faithfully as possible.
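The two ideas above, neighbor probabilities and the mismatch between spaces, can be sketched as follows. This is a simplified illustration, not the full algorithm: it uses one fixed Gaussian width `sigma` and the same Gaussian kernel in both spaces, whereas t-SNE itself tunes the width per point from the perplexity, symmetrizes the probabilities, and uses a Student-t kernel in the low-dimensional space. The function names and the toy points are made up for the example.

```python
import math


def conditional_probabilities(points, sigma=1.0):
    """Return P with P[i][j] = p(j|i): how likely point i is to pick j as a
    neighbor under a Gaussian centred on i (sigma is fixed for brevity)."""
    n = len(points)
    P = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)  # a point never picks itself
            else:
                sq_dist = sum((a - b) ** 2
                              for a, b in zip(points[i], points[j]))
                weights.append(math.exp(-sq_dist / (2 * sigma ** 2)))
        total = sum(weights)
        P.append([w / total for w in weights])
    return P


def mismatch(P, Q, eps=1e-300):
    """Kullback-Leibler divergence summed over all points: sum p * log(p / q)."""
    return sum(p * math.log(p / max(q, eps))
               for row_p, row_q in zip(P, Q)
               for p, q in zip(row_p, row_q) if p > 0)


# Made-up toy data: two tight pairs of points in 3-D.
high_dim = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.2), (5.0, 5.0, 5.0), (5.2, 5.0, 4.8)]
good_1d = [(0.0,), (0.5,), (7.0,), (7.3,)]  # keeps each pair together
bad_1d = [(0.0,), (7.0,), (0.5,), (7.3,)]   # splits both pairs

P = conditional_probabilities(high_dim)
cost_good = mismatch(P, conditional_probabilities(good_1d))
cost_bad = mismatch(P, conditional_probabilities(bad_1d))
print(f"good layout: {cost_good:.4f}, bad layout: {cost_bad:.4f}")
```

A layout that keeps neighbors together yields a much smaller mismatch than one that separates them, and this mismatch is the quantity the optimization drives down.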

### Space and time complexity
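Because t-SNE works with pairwise similarities, the exact algorithm needs on the order of n² values for n data points, so both time and memory grow quadratically with the size of the dataset. A back-of-the-envelope sketch, assuming a single dense matrix of 8-byte floats (the helper name `pairwise_matrix_bytes` is hypothetical):

```python
def pairwise_matrix_bytes(n_points, bytes_per_value=8):
    """Memory needed for one dense n x n matrix of pairwise similarities."""
    return n_points * n_points * bytes_per_value


for n in (1_000, 15_000):
    gigabytes = pairwise_matrix_bytes(n) / 1e9
    print(f"{n:>6} points -> {gigabytes:.3f} GB per pairwise matrix")
```

Going from 1,000 to 15,000 points multiplies the matrix size by 225, which is why the example below works on a 1,000-point subset. Approximate variants such as Barnes-Hut t-SNE reduce this cost.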

### Applying t-SNE to the MNIST dataset

```python
# Import required modules.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler
```


```python
# Read the data using pandas
df = pd.read_csv('mnist_train.csv')

# Print the first four rows of df
print(df.head(4))

# Save the labels in the variable l
l = df['label']

# Drop the label column and store the pixel data in d
d = df.drop("label", axis=1)
```

Code #2: Data preprocessing

```python
# Data preprocessing: standardize the data
from sklearn.preprocessing import StandardScaler

# d holds the pixel columns (the data frame without the label column)
standardized_data = StandardScaler().fit_transform(d)

print(standardized_data.shape)
```
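As a rough picture of what the standardization step computes, the sketch below rescales every column to zero mean and unit variance. It is a plain-Python illustration, not scikit-learn's implementation; the function `standardize_columns` and the tiny `pixels` table are made up for the example. Like `StandardScaler`, it divides by the population standard deviation and leaves constant columns unscaled, which matters for image data where some pixels never change.

```python
import math


def standardize_columns(rows):
    """Rescale each column to zero mean and unit variance.

    Constant columns are only centred (their scale is treated as 1).
    """
    n = len(rows)
    scaled_columns = []
    for col in zip(*rows):  # iterate over columns
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)  # population std
        scale = std if std > 0 else 1.0
        scaled_columns.append([(v - mean) / scale for v in col])
    # Transpose back to the original row layout.
    return [list(row) for row in zip(*scaled_columns)]


# Made-up rows of three "pixels"; the last column is constant,
# like a blank border pixel.
pixels = [[0, 10, 0], [128, 20, 0], [255, 30, 0]]
scaled = standardize_columns(pixels)
for row in scaled:
    print([round(v, 3) for v in row])
```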

Code #3:

```python
# t-SNE
import seaborn as sn  # needed for FacetGrid below

# Take the first 1000 points, since t-SNE
# takes a long time for 15K points
data_1000 = standardized_data[0:1000, :]
labels_1000 = l[0:1000]

# Setting parameters:
#   number of components = 2
#   default perplexity = 30
#   default learning rate = 200 (newer scikit-learn releases default to 'auto')
#   default maximum number of iterations for optimization = 1000
model = TSNE(n_components=2, random_state=0)

tsne_data = model.fit_transform(data_1000)

# Create a new data frame that
# helps us plot the result data
tsne_data = np.vstack((tsne_data.T, labels_1000)).T
tsne_df = pd.DataFrame(data=tsne_data,
                       columns=("Dim_1", "Dim_2", "label"))

# Plot the t-SNE result
# (height replaces the size argument used by older seaborn versions)
sn.FacetGrid(tsne_df, hue="label", height=6).map(
    plt.scatter, 'Dim_1', 'Dim_2').add_legend()

plt.show()
```
