# Data visualization with various graphs in Python


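Consider this dataset, for which we will build various charts:

| EMPID | Gender | Age | Sales | BMI | Income |
|-------|--------|-----|-------|-------------|--------|
| E001 | M | 34 | 123 | Normal | 350 |
| E002 | F | 40 | 114 | Overweight | 450 |
| E003 | F | 37 | 135 | Obesity | 169 |
| E004 | M | 30 | 139 | Underweight | 189 |
| E005 | F | 44 | 117 | Underweight | 183 |
| E006 | M | 36 | 121 | Normal | 80 |
| E007 | M | 32 | 133 | Obesity | 166 |
| E008 | F | 26 | 140 | Normal | 120 |
| E009 | M | 32 | 133 | Normal | 75 |
| E010 | M | 36 | 133 | Underweight | 40 |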

## Different types of graphs for data analysis and presentation

1. Histogram:
A histogram shows how often values occur within each of a series of consecutive, fixed-width intervals (bins).

The code below plots histograms for `Age`, `Income`, and `Sales`. Each plot in the output shows how many values of that attribute fall into each bin.

```python
# import pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# create a 2D array of the table above
data = [['E001', 'M', 34, 123, 'Normal', 350],
        ['E002', 'F', 40, 114, 'Overweight', 450],
        ['E003', 'F', 37, 135, 'Obesity', 169],
        ['E004', 'M', 30, 139, 'Underweight', 189],
        ['E005', 'F', 44, 117, 'Underweight', 183],
        ['E006', 'M', 36, 121, 'Normal', 80],
        ['E007', 'M', 32, 133, 'Obesity', 166],
        ['E008', 'F', 26, 140, 'Normal', 120],
        ['E009', 'M', 32, 133, 'Normal', 75],
        ['E010', 'M', 36, 133, 'Underweight', 40]]

# create the dataframe from the above dataset
df = pd.DataFrame(data, columns=['EMPID', 'Gender', 'Age',
                                 'Sales', 'BMI', 'Income'])

# create a histogram for each numeric column
df.hist()

# show plot
plt.show()
```
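
To make the binning behind a histogram concrete, here is a minimal sketch in plain Python that counts how many `Age` values fall into each of five equal-width bins. The helper name `bin_counts` and the choice of five bins are assumptions made for illustration; `df.hist()` applies its own default number of bins.

```python
# Age column from the dataset above.
ages = [34, 40, 37, 30, 44, 36, 32, 26, 32, 36]


def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    if width == 0:
        # All values are identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value belongs to the last bin, so clamp the index.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


counts = bin_counts(ages, 5)
print(counts)  # one count per bin, from the lowest ages to the highest
```

The heights of the bars in a histogram are exactly these per-bin counts, so changing the number of bins changes the shape of the plot.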


2. Bar Chart:
A bar chart compares values across different attributes or categories, and it can also show how a value changes over time.

```python
# This uses the dataframe from the previous code

# Build a bar chart for the numeric values;
# the comparison is shown between
# all 3 attributes: age, income, sales
df.plot.bar()

# bar chart between 2 attributes
plt.bar(df['Age'], df['Sales'])
plt.xlabel("Age")
plt.ylabel("Sales")
plt.show()
```
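
A bar chart tends to read most clearly when each category gets exactly one bar. As a minimal sketch, assuming we want to compare average sales between genders (a comparison the code above does not make), the data can be aggregated with `groupby` before plotting. The output file name is a hypothetical placeholder.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Subset of the dataset above: only the columns needed here.
df_sales = pd.DataFrame({
    "Gender": ["M", "F", "F", "M", "F", "M", "M", "F", "M", "M"],
    "Sales": [123, 114, 135, 139, 117, 121, 133, 140, 133, 133],
})

# Aggregate first so that each category is drawn as a single bar.
mean_sales = df_sales.groupby("Gender")["Sales"].mean()

fig, ax = plt.subplots()
ax.bar(mean_sales.index, mean_sales.values)
ax.set_xlabel("Gender")
ax.set_ylabel("Mean sales")
ax.set_title("Mean sales by gender")

# Hypothetical file name; use plt.show() instead for an interactive window.
fig.savefig("mean_sales_by_gender.png")
```

Aggregating before plotting avoids overlapping bars when several rows share the same x value, as happens with repeated ages in this dataset.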


3. Box Plot:
A box plot is a graphical summary of a distribution based on five statistics: minimum, first quartile, median, third quartile, and maximum. The name comes from the rectangle (the box) drawn between the first and third quartiles, with lines extending above and below it. Because of these extending lines (the whiskers), this type of plot is also called a box-and-whisker plot.

```python
# box plot for each numeric dataframe attribute
df.plot.box()

# box plot for a single attribute
plt.boxplot(df['Income'])
plt.show()
```
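
The five statistics behind the box for `Income` can also be computed directly. This is a minimal sketch using only the standard library; the `method="inclusive"` setting is an assumption about how quartiles are interpolated, and the 1.5 × IQR fences follow a common box plot convention for flagging outliers, which may differ from the settings of a particular plotting library.

```python
import statistics

# Income column from the dataset above.
income = [350, 450, 169, 189, 183, 80, 166, 120, 75, 40]

# Quartiles with linear interpolation between neighbouring data points.
q1, median, q3 = statistics.quantiles(income, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: the height of the box

# Common convention: points more than 1.5 * IQR beyond the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in income if x < lower_fence or x > upper_fence]

print("min:", min(income), "Q1:", q1, "median:", median,
      "Q3:", q3, "max:", max(income))
print("outliers:", outliers)
```

Under this convention the whiskers stop at the most extreme values inside the fences, and anything beyond them is drawn as an individual point.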


4. Pie Chart:
A pie chart shows how a total is divided among categories, that is, the composition of a whole. Each slice represents its share as a percentage, and all slices together sum to 100%.

```python
# one label per row; a list (unlike a set) keeps the labels in order
labels = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]

# pie chart of the Age column
plt.pie(df['Age'], labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()

# pie chart of the Income column
plt.pie(df['Income'], labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()

# pie chart of the Sales column
plt.pie(df['Sales'], labels=labels, autopct='%1.1f%%', shadow=True)
plt.show()
```
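
Because a pie chart shows shares of a whole, it is often applied to category counts rather than raw values. The sketch below, an illustration rather than part of the original code, computes the percentage of employees in each `BMI` category; these are the numbers the slices would display.

```python
from collections import Counter

# BMI column from the dataset above.
bmi = ["Normal", "Overweight", "Obesity", "Underweight", "Underweight",
       "Normal", "Obesity", "Normal", "Normal", "Underweight"]

counts = Counter(bmi)
total = sum(counts.values())

# Share of each category as a percentage of the whole.
shares = {category: 100 * n / total for category, n in counts.items()}

for category, pct in shares.items():
    print(f"{category}: {pct:.1f}%")
```

Passing `list(shares.values())` as the data and `list(shares.keys())` as `labels` to `plt.pie` would draw one slice per category.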


5. Scatter Plot:
A scatter plot shows the relationship between two variables and can reveal trends in how the points are distributed. It works best when there are many data points and you want to see how they cluster. It is also useful for spotting outliers and understanding the distribution of your data.

```python
# scatter plot between income and age
# (column names are case-sensitive and must match the dataframe)
plt.scatter(df['Income'], df['Age'])
plt.show()

# scatter plot between income and sales
plt.scatter(df['Income'], df['Sales'])
plt.show()

# scatter plot between sales and age
plt.scatter(df['Sales'], df['Age'])
plt.show()
```
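
A scatter plot suggests a relationship visually; a correlation coefficient puts a number on it. As a minimal sketch (the helper `pearson` is written here for illustration and is not part of the original code), the following computes the Pearson correlation between `Income` and `Sales`. Values near 1 or -1 indicate a strong linear relationship, and values near 0 a weak one.

```python
import math

# Income and Sales columns from the dataset above.
income = [350, 450, 169, 189, 183, 80, 166, 120, 75, 40]
sales = [123, 114, 135, 139, 117, 121, 133, 140, 133, 133]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(income, sales)
print(f"Income vs. Sales correlation: {r:.2f}")
```

With only ten rows, a coefficient like this is a rough indication at best, so it is worth reading it alongside the plot rather than instead of it.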
