  Data visualization with various graphs in Python

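Consider a small dataset of ten employees with the columns EMPID, Gender, Age, Sales, BMI, and Income. It is defined in the code of the first example, and all the charts below are built from it.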

Different types of graphs for data analysis and presentation

1. Histogram:
A histogram shows how often values occur within consecutive, fixed-width intervals (bins) that together cover the range of the data.
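To make the idea of fixed intervals concrete, here is a minimal sketch in plain Python that counts how many Age values fall into each bin. The helper name `bin_counts` and the choice of three bins are assumptions made for illustration; plotting libraries pick their own default bin count.

```python
# Age values copied from the dataset used in this article.
ages = [34, 40, 37, 30, 44, 36, 32, 26, 32, 36]


def bin_counts(values, n_bins):
    """Count values in n_bins equal-width intervals (hypothetical helper).

    Assumes the values are not all identical, so the bin width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin (right edge inclusive).
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


edges, counts = bin_counts(ages, 3)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count}")
```

The counts are exactly the bar heights a histogram with the same bin edges would draw.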

The code below draws a histogram for each numeric attribute: Age, Sales, and Income. Each plot in the output shows how many rows fall into each bin of that attribute.

    # Import pandas and matplotlib
    import pandas as pd
    import matplotlib.pyplot as plt

    # Rows of the employee table: EMPID, Gender, Age, Sales, BMI, Income
    data = [['E001', 'M', 34, 123, 'Normal', 350],
            ['E002', 'F', 40, 114, 'Overweight', 450],
            ['E003', 'F', 37, 135, 'Obesity', 169],
            ['E004', 'M', 30, 139, 'Underweight', 189],
            ['E005', 'F', 44, 117, 'Underweight', 183],
            ['E006', 'M', 36, 121, 'Normal', 80],
            ['E007', 'M', 32, 133, 'Obesity', 166],
            ['E008', 'F', 26, 140, 'Normal', 120],
            ['E009', 'M', 32, 133, 'Normal', 75],
            ['E010', 'M', 36, 133, 'Underweight', 40]]

    # Create the DataFrame from the rows above
    df = pd.DataFrame(data, columns=['EMPID', 'Gender', 'Age',
                                     'Sales', 'BMI', 'Income'])

    # Draw one histogram per numeric column (Age, Sales, Income)
    df.hist()

    # Show the plot
    plt.show()

2. Bar Chart:
A Bar Chart is used to show comparisons between different attributes, or it can show comparisons of elements over time.
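As a small illustration of comparing categories, the sketch below plots the average Sales for each Gender value. It assumes pandas and matplotlib are installed; the grouping by Gender and the name `sales_df` are choices made for this example and are not part of the original code.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Gender and Sales columns copied from the article's dataset.
sales_df = pd.DataFrame({
    "Gender": ["M", "F", "F", "M", "F", "M", "M", "F", "M", "M"],
    "Sales": [123, 114, 135, 139, 117, 121, 133, 140, 133, 133],
})

# One value per category: the average Sales for each Gender.
mean_sales = sales_df.groupby("Gender")["Sales"].mean()

# One bar per category; rot=0 keeps the category labels horizontal.
ax = mean_sales.plot.bar(rot=0)
ax.set_xlabel("Gender")
ax.set_ylabel("Average Sales")
plt.show()
```

Aggregating first and plotting second keeps the chart to one bar per category, which is usually easier to compare than one bar per row.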

    # Uses the DataFrame df created in the histogram example

    # Bar chart of all numeric columns (Age, Sales, Income),
    # with one group of bars per row
    df.plot.bar()
    plt.show()

    # Bar chart of Sales against Age
    plt.bar(df['Age'], df['Sales'])
    plt.xlabel("Age")
    plt.ylabel("Sales")
    plt.show()

3. Box Plot:
A box plot is a graphical summary of a distribution based on five statistics: the minimum, first quartile, median, third quartile, and maximum. The name comes from the rectangle (the box) that spans the first to the third quartile, with lines extending from it toward the minimum and maximum. Because of these lines (the whiskers), this type of plot is also called a box-and-whisker plot.
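The five statistics can be computed directly, which helps when reading the plot. The sketch below uses only the standard library on the Income column. The linear-interpolation quartile method and the 1.5 × IQR outlier rule are assumptions chosen here; 1.5 is also the default whisker length (`whis`) in matplotlib's `boxplot`.

```python
from statistics import quantiles

# Income values copied from the article's dataset.
income = [350, 450, 169, 189, 183, 80, 166, 120, 75, 40]

# method="inclusive" interpolates linearly between the sorted values,
# treating the minimum and maximum as the 0th and 100th percentiles.
q1, median, q3 = quantiles(income, n=4, method="inclusive")

five_numbers = (min(income), q1, median, q3, max(income))
print("min, Q1, median, Q3, max:", five_numbers)

# Points further than 1.5 * IQR from the box are treated as outliers.
iqr = q3 - q1
low_limit = q1 - 1.5 * iqr
high_limit = q3 + 1.5 * iqr
outliers = sorted(v for v in income if v < low_limit or v > high_limit)
print("outliers:", outliers)
```

With this rule the whiskers stop at the most extreme values inside the limits, and anything beyond them is drawn as a separate point.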

    # Box plot for every numeric column of the DataFrame
    df.plot.box()
    plt.show()

    # Box plot for a single column
    plt.boxplot(df['Income'])
    plt.show()

4. Pie Chart:
A pie chart shows how categories make up a whole at a single point in time. Each segment is drawn as a percentage of the total, so the segments always sum to 100%.
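A pie chart is easiest to read when each segment is a category count. As a hedged sketch, the example below counts the BMI categories from the dataset and plots their shares. It assumes pandas and matplotlib are installed; using the BMI column and the chart title are choices made for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# BMI column copied from the article's dataset.
bmi = pd.Series(["Normal", "Overweight", "Obesity", "Underweight",
                 "Underweight", "Normal", "Obesity", "Normal",
                 "Normal", "Underweight"])

# Number of employees in each category.
counts = bmi.value_counts()

# Share of each category in percent; the shares add up to 100.
shares = counts * 100 / counts.sum()
print(shares)

# autopct prints the percentage on each segment.
plt.pie(counts, labels=counts.index, autopct="%1.1f%%")
plt.title("Employees by BMI category")
plt.show()
```

Passing `counts.index` as the labels keeps every label aligned with its own segment.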

    # One label per row; a list (unlike a set) keeps the labels in order
    labels = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]

    # Share of each row in the total Age
    plt.pie(df['Age'], labels=labels, autopct='%1.1f%%', shadow=True)
    plt.show()

    # Share of each row in the total Income
    plt.pie(df['Income'], labels=labels, autopct='%1.1f%%', shadow=True)
    plt.show()

    # Share of each row in the total Sales
    plt.pie(df['Sales'], labels=labels, autopct='%1.1f%%', shadow=True)
    plt.show()

5. Scatter Plot:
A scatter plot shows the relationship between two different variables and can reveal distribution trends. It should be used when there are many different data points and you want to highlight the similarities in the dataset. This is useful when looking for outliers and understanding the distribution of your data.
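The relationship that a scatter plot shows visually can also be summarized as a single number. The sketch below computes the Pearson correlation coefficient between Age and Sales in plain Python. The helper `pearson_r` is hypothetical and written out only for illustration; it assumes neither column is constant.

```python
from math import sqrt

# Age and Sales columns copied from the article's dataset.
age = [34, 40, 37, 30, 44, 36, 32, 26, 32, 36]
sales = [123, 114, 135, 139, 117, 121, 133, 140, 133, 133]


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(age, sales)
print(f"Pearson r between Age and Sales: {r:.2f}")
```

A value near 1 or -1 means the points lie close to a straight line, and a value near 0 means no linear trend. For this dataset the coefficient is negative, so higher Age tends to go with lower Sales.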

    # Scatter plot of Income against Age
    plt.scatter(df['Income'], df['Age'])
    plt.show()

    # Scatter plot of Income against Sales
    plt.scatter(df['Income'], df['Sales'])
    plt.show()

    # Scatter plot of Sales against Age
    plt.scatter(df['Sales'], df['Age'])
    plt.show()
