Data visualization with various graphs in Python


Consider this dataset, for which we will build various charts:

Different types of graphs for data analysis and presentation

1. Histogram:
A histogram represents how often values occur within consecutive, fixed-width intervals (bins) that together cover the range of the data.
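
To make the idea of binning concrete, here is a minimal sketch of the counting a histogram is built on, using the Age values from the dataset. The helper name `bin_counts` and the choice of 3 bins are assumptions made for illustration; plotting libraries choose their own defaults and may treat bin edges differently.

```python
# Ages taken from the article's dataset
ages = [34, 40, 37, 30, 44, 36, 32, 26, 32, 36]

def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals.

    Hypothetical helper; assumes the values are not all identical.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Each interval is [left, right) except the last, which also includes hi
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return counts, edges

counts, edges = bin_counts(ages, n_bins=3)  # 3 bins is an arbitrary choice
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:g} to {right:g}: {count}")
```

Each printed line corresponds to one bar of the histogram: the interval is the bar's position and width, and the count is its height.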

The code below draws a histogram for each of Age, Income, and Sales. Each graph in the output shows how many values of that attribute fall into each interval.

# import pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# create a 2D list from the table above
data = [['E001', 'M', 34, 123, 'Normal', 350],
        ['E002', 'F', 40, 114, 'Overweight', 450],
        ['E003', 'F', 37, 135, 'Obesity', 169],
        ['E004', 'M', 30, 139, 'Underweight', 189],
        ['E005', 'F', 44, 117, 'Underweight', 183],
        ['E006', 'M', 36, 121, 'Normal', 80],
        ['E007', 'M', 32, 133, 'Obesity', 166],
        ['E008', 'F', 26, 140, 'Normal', 120],
        ['E009', 'M', 32, 133, 'Normal', 75],
        ['E010', 'M', 36, 133, 'Underweight', 40]]

# create a DataFrame from the dataset above
df = pd.DataFrame(data, columns=['EMPID', 'Gender', 'Age', 'Sales',
                                 'BMI', 'Income'])

# create a histogram for each numeric column
df.hist()

# show plot
plt.show()

Output:

2. Bar Chart:
A Bar Chart is used to show comparisons between different attributes, or it can show comparisons of elements over time.
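
As a sketch of the kind of comparison a bar chart displays, the snippet below computes one value per category: the mean Income for each Gender in the dataset. Grouping by Gender is an assumption chosen for illustration and is not something the article itself plots.

```python
from collections import defaultdict

# (Gender, Income) pairs taken from the article's dataset
rows = [('M', 350), ('F', 450), ('F', 169), ('M', 189), ('F', 183),
        ('M', 80), ('M', 166), ('F', 120), ('M', 75), ('M', 40)]

# Collect the incomes that belong to each category
incomes_by_gender = defaultdict(list)
for gender, income in rows:
    incomes_by_gender[gender].append(income)

# One bar per category; its height would be the category's mean income
mean_income = {g: sum(v) / len(v) for g, v in incomes_by_gender.items()}
print(mean_income)
```

Passing the keys and values to `plt.bar(list(mean_income), list(mean_income.values()))` would then draw one bar per gender.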

# This uses the DataFrame from the previous code

# Build a bar chart of the numeric columns;
# Age, Sales, and Income are shown
# side by side for every row
df.plot.bar()
plt.show()

# bar chart between two attributes
plt.bar(df['Age'], df['Sales'])
plt.xlabel("Age")
plt.ylabel("Sales")
plt.show()

Output:

3. Box Plot:
A box plot is a graphical representation of a dataset's five-number summary: minimum, first quartile, median, third quartile, and maximum. The term "box plot" comes from the fact that the plot looks like a rectangle with lines extending from its top and bottom. Because of these extending lines, this type of plot is sometimes referred to as a box-and-whisker plot.
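
The five statistics can be computed directly. The following is a minimal sketch for the Income column using Python's standard `statistics` module; the `method="inclusive"` setting is an assumption chosen because it interpolates linearly between the sorted data points, and other quartile conventions give slightly different values.

```python
import statistics

# Income values taken from the article's dataset
income = [350, 450, 169, 189, 183, 80, 166, 120, 75, 40]

# Quartiles with linear interpolation between the sorted data points
q1, median, q3 = statistics.quantiles(income, n=4, method="inclusive")

summary = {
    "min": min(income),
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": max(income),
}
print(summary)

# Interquartile range: the height of the box
iqr = q3 - q1
print("IQR:", iqr)
```

Note that Matplotlib's `boxplot` by default ends each whisker at the furthest data point within 1.5 times the interquartile range from the box and draws anything beyond that as an individual point, so the drawn whiskers need not reach the minimum and maximum.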

# For each numeric DataFrame attribute
df.plot.box()
plt.show()

# box plot of a single attribute
plt.boxplot(df['Income'])
plt.show()

Output:

4. Pie Chart:
A pie chart shows how categories make up parts of a whole at a single point in time. Each segment is expressed as a percentage of the total, and the segments must sum to 100%.
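
The percentage behind each segment is simply that value's share of the column total. Below is a minimal sketch for the Sales column, assuming one segment per employee row labeled "A" to "J".

```python
# Sales values taken from the article's dataset, one per employee
sales = [123, 114, 135, 139, 117, 121, 133, 140, 133, 133]
labels = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]

total = sum(sales)

# Each segment's share of the whole, in percent
shares = [100 * value / total for value in sales]

for label, share in zip(labels, shares):
    print(f"{label}: {share:.1f}%")

# The shares add up to 100 (up to floating-point rounding)
print("Total:", round(sum(shares), 6))
```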

# one label per row of the DataFrame (a list keeps the order fixed)
labels = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]

# pie chart of Age
plt.pie(df['Age'], labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()

# pie chart of Income
plt.pie(df['Income'], labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()

# pie chart of Sales
plt.pie(df['Sales'], labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()

Output:

 

5. Scatter Plot:
A scatter plot shows the relationship between two different variables and can reveal distribution trends. It should be used when there are many different data points and you want to highlight the similarities in the dataset. This is useful when looking for outliers and understanding the distribution of your data.
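
One way to put a number on the relationship that a scatter plot shows visually is the Pearson correlation coefficient. The sketch below computes it for Income and Age; the helper `pearson_r` is hypothetical, written out for illustration, and the coefficient only captures linear relationships.

```python
from math import sqrt

# Income and Age columns taken from the article's dataset
income = [350, 450, 169, 189, 183, 80, 166, 120, 75, 40]
age = [34, 40, 37, 30, 44, 36, 32, 26, 32, 36]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson_r(income, age)
print(f"r = {r:.2f}")
```

A value near 1 or -1 suggests the points lie close to a straight line, while a value near 0 suggests little linear relationship. With only ten rows, a single outlier can shift the result considerably.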

# scatter plot of income against age
plt.scatter(df['Income'], df['Age'])
plt.show()

# scatter plot of income against sales
plt.scatter(df['Income'], df['Sales'])
plt.show()

# scatter plot of sales against age
plt.scatter(df['Sales'], df['Age'])
plt.show()

Output: