Seaborn | Distribution Plots

Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. This article discusses the Seaborn distribution plots, which are used to examine univariate (1D) and bivariate (2D) distributions. We will discuss 4 types of distribution plots, namely:

  1. jointplot
  2. distplot
  3. pairplot
  4. rugplot

In addition to providing various kinds of plots, Seaborn also ships with several built-in datasets. We will be using the tips dataset in this article. The "tips" dataset records restaurant meals: the total bill, the tip, the sex of the person paying, whether they smoke, the day and time of the meal, and the size of the party. Let's take a look at it.

Code :

# import the required libraries
import seaborn as sns
import matplotlib.pyplot as plt

# Jupyter magic for inline figures; omit it outside a notebook
%matplotlib inline

# ignore warnings
from warnings import filterwarnings
filterwarnings("ignore")

# load dataset
df = sns.load_dataset("tips")

# first five records of the dataset
df.head()

Now let's move on to the plots.

Distplot

It is used mainly for univariate data, i.e. observations of a single variable, and renders them as a histogram. Hence we select one specific column of the dataset.
Syntax :

 distplot (a [, bins, hist, kde, rug, fit, ...]) 

Example:

# set the plot background style
sns.set_style("whitegrid")

sns.distplot(df["total_bill"], kde=False, color="red", bins=30)

Output: a histogram of total_bill drawn with 30 red bins.

Explanation :

  • kde stands for "Kernel Density Estimation", which is another type of plot in Seaborn. Passing kde = False leaves the density curve off, so only the histogram is drawn.
  • bins sets the number of bins in the histogram. A good value depends on your dataset.
  • color sets the color of the plot.
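To make the idea of kernel density estimation concrete, here is a minimal NumPy sketch. It is not Seaborn's own implementation: the helper name gaussian_kde_1d, the made-up gamma-distributed sample standing in for total_bill, and the bandwidth chosen by Scott's rule of thumb are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    samples = np.asarray(samples, dtype=float)
    # z[i, j] = distance from grid point i to sample j, in bandwidth units
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)


# Made-up stand-in for df["total_bill"] so the sketch runs without a download
rng = np.random.default_rng(0)
total_bill = rng.gamma(shape=4.0, scale=5.0, size=200)

# Scott's rule of thumb for the bandwidth (an assumption, not a tuned value)
bandwidth = total_bill.std(ddof=1) * len(total_bill) ** (-1 / 5)

# Evaluate the estimate a little beyond the data range so the tails are covered
grid = np.linspace(total_bill.min() - 4 * bandwidth,
                   total_bill.max() + 4 * bandwidth, 400)
density = gaussian_kde_1d(total_bill, grid, bandwidth)

# A density should integrate to roughly 1 (simple Riemann sum here)
area = density.sum() * (grid[1] - grid[0])
print(f"area under the estimated density: {area:.3f}")
```

A smaller bandwidth gives a bumpier curve that follows individual observations, while a larger one gives a smoother curve.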

Looking at this plot, we can say that most of the total bills lie between 10 and 20.

Jointplot

It is used to plot two variables together, combining a bivariate (2D) plot with a univariate (1D) plot of each variable along the margins. It basically brings two different plots together.
Syntax :

 jointplot (x, y [, data, kind, stat_func, ...]) 

Example:

sns.jointplot(x="total_bill", y="tip", data=df)

Output: a scatter plot of tip against total_bill, with a histogram of each variable along the margins.

# KDE shows where the points are most densely concentrated
sns.jointplot(x="total_bill", y="tip", data=df, kind="kde")

Explanation :

  • kind controls how the data is visualized inside the jointplot. The default is scatter; other options include hex, reg (regression) and kde.
  • x and y are two strings naming columns. The values of those columns are looked up in the DataFrame passed as the data parameter.
  • Here tip is on the y-axis and total_bill is on the x-axis, and there is a roughly linear relationship between them: the tip tends to increase as the total bill increases.
Pairplot

It plots pairwise relationships across the entire DataFrame and supports an additional argument called hue for categorical separation. It basically creates a joint plot for every possible pair of numeric columns, so it can take a while if the DataFrame is really large.

Syntax :

 pairplot (data [, hue, hue_order, palette, ...]) 

Example:

sns.pairplot(df, hue="sex", palette="coolwarm")

Output: a grid of pairwise plots of the numeric columns, with the points colored by sex.

Explanation :

  • hue splits the records in the dataset by a categorical column and colors each category differently.
  • palette sets the color palette used for the plots.

Rugplot

It draws each data point in an array as a small stick on the axis. Like distplot, it takes a single column. Instead of drawing a histogram, it draws a dash for every observation along the axis. If you compare it to the jointplot, you can see that the jointplot's marginal histograms count these dashes and show the counts as bins.

Syntax :

 rugplot (a [, height, axis, ax]) 

Example:

sns.rugplot(df["total_bill"])

Output: one short dash on the x-axis for every total_bill value.
