# ML | Variational Bayesian inference for a Gaussian mixture

The Gaussian mixture model assumes that the data are divided into clusters such that each data point in a given cluster is drawn from a particular multivariate Gaussian distribution, and the Gaussian distributions of the clusters are independent of one another. To cluster data under this model, the posterior probability that a data point belongs to a given cluster must be computed from the observed data. The standard tool for this is Bayes' rule, but for large datasets computing the marginal probabilities it requires is very expensive. Since we only need the most likely cluster for each point, approximation methods can be used to reduce this work. One of the best approximate methods is variational Bayesian inference, which builds on two concepts: KL divergence and the mean-field approximation.
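The idea behind variational inference can be summarized in two equations (a standard sketch, not specific to this dataset). The log evidence decomposes into an evidence lower bound (ELBO) plus a KL term, so maximizing the ELBO over a restricted family of distributions $q$ implicitly minimizes the KL divergence to the intractable posterior:

```latex
\log p(X) \;=\; \underbrace{\mathbb{E}_{q(Z)}\!\left[\log \frac{p(X, Z)}{q(Z)}\right]}_{\text{ELBO}(q)} \;+\; \mathrm{KL}\!\left(q(Z)\,\big\|\,p(Z \mid X)\right)
```

The mean-field approximation makes the optimization tractable by assuming $q$ factorizes over the latent variables; for a Gaussian mixture with assignments $Z$, mixing weights $\pi$, means $\mu_k$, and precisions $\Lambda_k$:

```latex
q(Z, \pi, \mu, \Lambda) \;=\; q(Z)\, q(\pi) \prod_{k=1}^{K} q(\mu_k, \Lambda_k)
```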

The following steps demonstrate how to apply variational Bayesian inference to a Gaussian mixture model using scikit-learn. The data used are credit card details, which can be downloaded from Kaggle.

Step 1: Import required libraries

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.mixture import BayesianGaussianMixture
from sklearn.preprocessing import normalize, StandardScaler
from sklearn.decomposition import PCA
```

Step 2: Load and clean the data

```python
# Change the working directory to the data location
# (IPython magic; adjust the path to where you saved the file)
cd "C:\Users\Dev\Desktop\Kaggle\Credit_Card"

# Load the data
X = pd.read_csv('CC_GENERAL.csv')

# Drop the CUST_ID column from the data
X = X.drop('CUST_ID', axis=1)

# Handle missing values with a forward fill
X.fillna(method='ffill', inplace=True)

X.head()
```

Step 3: Data preprocessing

```python
# Scale the data to bring all features to a comparable level
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Normalize the data so that it approximately
# follows a Gaussian distribution
X_normalized = normalize(X_scaled)

# Convert the numpy array back into a pandas DataFrame
X_normalized = pd.DataFrame(X_normalized)

# Restore the column names
X_normalized.columns = X.columns

X_normalized.head()
```

Step 4: Reduce the data to two dimensions so it can be visualized

```python
# Reduce the dimensionality of the data
pca = PCA(n_components=2)
X_principal = pca.fit_transform(X_normalized)

# Convert the reduced data into a pandas DataFrame
X_principal = pd.DataFrame(X_principal)

# Rename the columns
X_principal.columns = ['P1', 'P2']

X_principal.head()
```
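It can be worth checking how much of the variance the two principal components actually retain before trusting the 2-D plots. A minimal sketch, using random synthetic data as a stand-in since the Kaggle file may not be available (`X_demo` is a hypothetical placeholder for the normalized features):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic stand-in: 500 rows, 17 features, like the normalized credit-card data
X_demo = rng.normal(size=(500, 17))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_demo)

print(X_2d.shape)
# Fraction of total variance captured by each of the two components
print(pca.explained_variance_ratio_)
```

If the two ratios sum to a small value, the scatter plots below show only a rough projection of the cluster structure, not the full picture.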

The two primary parameters of the BayesianGaussianMixture class are n_components and covariance_type.

1. n_components: determines the maximum number of clusters in the data.
2. covariance_type: describes the type of covariance parameters to use.

You can read about all the other attributes in the documentation.

In the steps that follow, the n_components parameter is fixed at 5, while the covariance_type parameter is varied over all possible values to visualize the effect of this parameter on clustering.
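Note that n_components is only an upper bound: the variational inference in BayesianGaussianMixture drives the weights of unneeded components toward zero, so fewer than 5 labels may appear in the output (which is why the legends below list different label sets). A minimal sketch of this pruning on synthetic data with 3 true clusters:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import BayesianGaussianMixture

# Three well-separated clusters, but allow up to five components
X_demo, _ = make_blobs(n_samples=600, centers=3, cluster_std=0.5,
                       random_state=42)

model = BayesianGaussianMixture(n_components=5, covariance_type='full',
                                random_state=42, max_iter=500)
model.fit(X_demo)

# Weights of surplus components are shrunk toward zero
print(np.round(model.weights_, 3))
print("effective components:", np.sum(model.weights_ > 0.01))
```

The weights always sum to 1; components whose weight is negligible receive essentially no data points in predict().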

Step 5: Build clustering models for different values of covariance_type and visualize the results

a) covariance_type = 'full'

```python
# Build and train the model
vbgm_model_full = BayesianGaussianMixture(n_components=5,
                                          covariance_type='full')
vbgm_model_full.fit(X_normalized)

# Store the labels (predict on the same data the model was fitted on)
labels_full = vbgm_model_full.predict(X_normalized)
print(set(labels_full))
```

```python
colors = {}
colors[0] = 'r'
colors[1] = 'g'
colors[2] = 'b'
colors[3] = 'k'

# Build a color vector for each data point
cvec = [colors[label] for label in labels_full]

# Define a scatter plot for each color
r = plt.scatter(X_principal['P1'], X_principal['P2'], color='r')
g = plt.scatter(X_principal['P1'], X_principal['P2'], color='g')
b = plt.scatter(X_principal['P1'], X_principal['P2'], color='b')
k = plt.scatter(X_principal['P1'], X_principal['P2'], color='k')

# Plot the clustered data
plt.figure(figsize=(9, 9))
plt.scatter(X_principal['P1'], X_principal['P2'], c=cvec)
plt.legend((r, g, b, k),
           ('Label 0', 'Label 1', 'Label 2', 'Label 3'))
plt.show()
```
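One caveat with the hard-coded color dictionary above: it raises a KeyError whenever predict() returns a label that is not in the dictionary, which can easily happen since the surviving labels depend on the run. A label-agnostic alternative sketch that builds the mapping from whatever labels actually appear (the `labels` array here is a made-up example, not output from this dataset):

```python
import numpy as np
import matplotlib.pyplot as plt

labels = np.array([0, 0, 2, 4, 4, 1])   # whatever predict() returned
unique = np.unique(labels)

# Map each observed label to a distinct entry of a qualitative colormap
cmap = plt.cm.tab10
color_of = {lab: cmap(i) for i, lab in enumerate(unique)}
cvec = [color_of[lab] for lab in labels]
```

The resulting `cvec` of RGBA tuples can be passed to `plt.scatter(..., c=cvec)` exactly like the list of color letters.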

b) covariance_type = 'tied'

```python
# Build and train the model
vbgm_model_tied = BayesianGaussianMixture(n_components=5,
                                          covariance_type='tied')
vbgm_model_tied.fit(X_normalized)

# Store the labels
labels_tied = vbgm_model_tied.predict(X_normalized)
print(set(labels_tied))
```

```python
colors = {}
colors[0] = 'r'
colors[2] = 'g'
colors[3] = 'b'
colors[4] = 'k'

# Build a color vector for each data point
cvec = [colors[label] for label in labels_tied]

# Define a scatter plot for each color
r = plt.scatter(X_principal['P1'], X_principal['P2'], color='r')
g = plt.scatter(X_principal['P1'], X_principal['P2'], color='g')
b = plt.scatter(X_principal['P1'], X_principal['P2'], color='b')
k = plt.scatter(X_principal['P1'], X_principal['P2'], color='k')

# Plot the clustered data
plt.figure(figsize=(9, 9))
plt.scatter(X_principal['P1'], X_principal['P2'], c=cvec)
plt.legend((r, g, b, k),
           ('Label 0', 'Label 2', 'Label 3', 'Label 4'))
plt.show()
```

c) covariance_type = 'diag'

```python
# Build and train the model
vbgm_model_diag = BayesianGaussianMixture(n_components=5,
                                          covariance_type='diag')
vbgm_model_diag.fit(X_normalized)

# Store the labels
labels_diag = vbgm_model_diag.predict(X_normalized)
print(set(labels_diag))
```

```python
colors = {}
colors[0] = 'r'
colors[2] = 'g'
colors[4] = 'k'

# Build a color vector for each data point
cvec = [colors[label] for label in labels_diag]

# Define a scatter plot for each color
r = plt.scatter(X_principal['P1'], X_principal['P2'], color='r')
g = plt.scatter(X_principal['P1'], X_principal['P2'], color='g')
k = plt.scatter(X_principal['P1'], X_principal['P2'], color='k')

# Plot the clustered data
plt.figure(figsize=(9, 9))
plt.scatter(X_principal['P1'], X_principal['P2'], c=cvec)
plt.legend((r, g, k), ('Label 0', 'Label 2', 'Label 4'))
plt.show()
```

d) covariance_type = 'spherical'

```python
# Build and train the model
vbgm_model_spherical = BayesianGaussianMixture(n_components=5,
                                               covariance_type='spherical')
vbgm_model_spherical.fit(X_normalized)

# Store the labels
labels_spherical = vbgm_model_spherical.predict(X_normalized)
print(set(labels_spherical))
```

```python
colors = {}
colors[2] = 'r'
colors[3] = 'b'

# Build a color vector for each data point
cvec = [colors[label] for label in labels_spherical]

# Define a scatter plot for each color
r = plt.scatter(X_principal['P1'], X_principal['P2'], color='r')
b = plt.scatter(X_principal['P1'], X_principal['P2'], color='b')

# Plot the clustered data
plt.figure(figsize=(9, 9))
plt.scatter(X_principal['P1'], X_principal['P2'], c=cvec)
plt.legend((r, b), ('Label 2', 'Label 3'))
plt.show()
```