
# Seaborn | Categorical plots


Besides being a statistical plotting library, Seaborn also provides some built-in datasets. We will use one of them, called "tips". The tips dataset contains information about people who had a meal at a restaurant: their total bill, how much they tipped, their gender, whether they smoke, and so on.

Let's take a look at the tips dataset.

Code

    # import the seaborn library
    import seaborn as sns

    # import used to suppress warnings
    from warnings import filterwarnings
    filterwarnings('ignore')

    # read the dataset
    df = sns.load_dataset('tips')

    # first five records of the dataset
    df.head()

Now let’s move on to the graphs so we can see how we can visualize these categorical variables.

## Barplot

A bar plot is mainly used to aggregate categorical data according to some estimator, which defaults to the mean. It can also be understood as a visualization of a group-by operation. To use this plot, we choose a categorical column for the x-axis and a numerical column for the y-axis, and the result is a plot showing the mean of the numerical column for each category.

Syntax:

barplot ([x, y, hue, data, order, hue_order,…])

Example:

    # set the plot background style
    sns.set_style('darkgrid')

    # plot using the default estimator (mean)
    sns.barplot(x='sex', y='total_bill', data=df, palette='plasma')

    # or
    import numpy as np

    # change the estimator from mean to standard deviation
    sns.barplot(x='sex', y='total_bill', data=df, palette='plasma', estimator=np.std)

Output:

Explanation / Analysis
Looking at the plot, we can say that the average total_bill for men is higher than for women.

• palette is used to set the colors of the plot
• estimator is the statistical function used to compute the value shown for each categorical bin.
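To make the group-by idea concrete, here is a minimal sketch in plain Python of the aggregation a bar plot displays. The records and the helper name `aggregate` are made up for illustration and are not part of seaborn or the tips dataset.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records (sex, total_bill); not taken from the real tips dataset.
records = [
    ("Male", 20.0), ("Male", 30.0), ("Male", 25.0),
    ("Female", 18.0), ("Female", 22.0),
]

def aggregate(rows, estimator=mean):
    """Group values by category and reduce each group with `estimator`."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return {category: estimator(values) for category, values in groups.items()}

bar_heights = aggregate(records)        # default estimator: mean per category
bar_spread = aggregate(records, stdev)  # a different estimator: sample std. deviation

print(bar_heights)
print(bar_spread)
```

Swapping the function passed as `estimator` changes what each bar represents, which is the role the estimator parameter plays in the example above.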
## Countplot

A count plot simply counts the observations in each category and shows those counts as bars. This is one of the simplest plots provided by the Seaborn library.

Syntax :

countplot ([x, y, hue, data, order,…])

Example :

    sns.countplot(x='sex', data=df)

Output:

Explanation / Analysis
Looking at the graph, we can say that there are more males than females in the dataset. Since the plot only shows a count for each value of a categorical column, we only need to specify the x parameter.
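The counting step itself can be sketched without any plotting library. The column values below are invented for illustration; the tally is the quantity a count plot would draw as bar heights.

```python
from collections import Counter

# Hypothetical values of a categorical column; not the real tips data.
sex_column = ["Male", "Female", "Male", "Male", "Female"]

# Counter tallies how many times each category occurs.
counts = Counter(sex_column)

for category, n in counts.items():
    print(category, n)
```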

## Boxplot

A box plot is sometimes called a box-and-whisker plot. It shows the distribution of quantitative data in a way that makes comparisons between variables easier. The box shows the quartiles of the dataset, while the whiskers extend to show the rest of the distribution, except for points that are flagged as outliers, which are drawn individually.

Syntax :

boxplot ([x, y, hue, data, order, hue_order,…])

Example :

    sns.boxplot(x='day', y='total_bill', data=df, hue='smoker')

Output:

Explanation / Analysis
x takes a categorical column and y a numerical column, so we see the distribution of the total bill for each day. The hue parameter adds a further categorical split. Looking at the plot, we can say that people who do not smoke had a higher bill on Friday than people who smoked.
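As a rough sketch of the numbers behind a box plot, the snippet below computes the quartiles, whisker ends, and outliers for a small made-up sample. It assumes the common 1.5 × IQR rule for deciding how far the whiskers reach; the data and variable names are illustrative only.

```python
from statistics import quantiles

# Hypothetical bill amounts; the last value is deliberately extreme.
bills = [10.0, 12.0, 14.0, 15.0, 16.0, 18.0, 20.0, 50.0]

# Quartiles with linear interpolation between data points.
q1, median, q3 = quantiles(bills, n=4, method="inclusive")
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the box (assumed rule, see the text above).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme points that are still inside the fences.
inside = [b for b in bills if lower_fence <= b <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Anything outside the fences would be drawn as an individual outlier point.
outliers = [b for b in bills if b < lower_fence or b > upper_fence]

print(q1, median, q3, whisker_low, whisker_high, outliers)
```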

## Violinplot

It is similar to a box plot, except that it provides a richer visualization and uses a kernel density estimate to give a better description of the distribution of the data.

Syntax :

violinplot ([x, y, hue, data, order,…])

Example :

    sns.violinplot(x='day', y='total_bill', data=df, hue='sex', split=True)

Output:

Explanation / Analysis

• hue is used to further split the data by the sex column
• Setting split = True draws half a violin for each level of hue. This can make it easier to compare the distributions directly.
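The shape of each violin comes from a kernel density estimate. Below is a minimal hand-written Gaussian KDE to show the idea; the sample values and the bandwidth of 2.0 are arbitrary assumptions, and a plotting library would normally choose the bandwidth automatically.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function built from one Gaussian bump per sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical bill amounts; not taken from the real tips dataset.
samples = [10.0, 12.0, 15.0, 15.0, 18.0, 30.0]
density = gaussian_kde(samples, bandwidth=2.0)

# The density is high where samples cluster and low in sparse regions.
for x in (12.0, 15.0, 24.0, 30.0):
    print(x, round(density(x), 4))
```

The width of a violin at a given bill amount is proportional to this density, so clusters of observations appear as bulges.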
## Stripplot

This basically creates a scatter plot in which one of the variables is categorical.

Syntax:

stripplot ([x, y, hue, data, order,…])

Example :

    sns.stripplot(x='day', y='total_bill', data=df, jitter=True, hue='smoker', dodge=True)

Output:

Explanation / Analysis

• One problem with a strip plot is that you cannot tell for sure which points are stacked on top of each other, so we use the jitter parameter to add random noise.
• The jitter parameter adds an amount of jitter (along the categorical axis only), which makes the distribution easier to see when you have many overlapping points.
• hue is used to provide additional categorical separation
• The dodge = True parameter (named split in older seaborn releases) draws separate strips for each level of the hue variable instead of overlaying them.
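The effect of jitter can be sketched as follows: each point keeps its numerical value but gets a small random offset along the categorical axis. The helper `jitter_positions`, the jitter width of 0.1, and the data are assumptions made for this illustration, not seaborn's actual implementation.

```python
import random

def jitter_positions(category_indices, jitter=0.1, seed=0):
    """Shift each categorical position by uniform noise in [-jitter, jitter]."""
    rng = random.Random(seed)
    return [idx + rng.uniform(-jitter, jitter) for idx in category_indices]

# Hypothetical points: category index (0 = first day, 1 = second day) and bill.
category_indices = [0, 0, 0, 1, 1]
bills = [15.0, 15.0, 15.2, 22.0, 22.0]

x_positions = jitter_positions(category_indices)
points = list(zip(x_positions, bills))

# Points with identical bills no longer share the same x coordinate.
for x, y in points:
    print(round(x, 3), y)
```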

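## Factorplot (catplot)

This is the most general of all these plots. It provides a parameter called kind to select the type of plot we want, which saves us from having to write these plots separately. The kind parameter can be bar, violin, swarm, etc. Note that factorplot was renamed to catplot in seaborn 0.9 and has since been removed, so recent versions require sns.catplot, which accepts the arguments used here.

Syntax:

sns.catplot ([x, y, hue, data, row, col,…])

Example:

    # catplot is the current name of factorplot (seaborn 0.9 and later)
    sns.catplot(x='day', y='total_bill', data=df, kind='bar')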

Output:
