
Data Analysis and Visualization with Python


Installation
The easiest way to install pandas is with pip:

 pip install pandas 

or download it from here
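To confirm that the installation worked, one quick check is to import the library and print its version. This is only a minimal sketch; the version string printed depends on what was installed.

```python
import pandas as pd

# Read the installed pandas version to confirm that the import works
version = pd.__version__
print(version)
```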

Creating a DataFrame in Pandas

A DataFrame can be created by passing multiple Series objects, each built with pd.Series, to the pd.DataFrame constructor. Here two Series objects are passed in: s1 becomes the first row and s2 the second row.
Example:

# import pandas under the conventional alias pd
import pandas as pd

# create two Series, s1 and s2
s1 = pd.Series([1, 2])
s2 = pd.Series(["Ashish", "Sid"])

# combine the Series objects into a DataFrame, one row per Series
df = pd.DataFrame([s1, s2])

# show the DataFrame
df

 
# build the DataFrame a different way:
# pass the values together with index (row) and column labels
dframe = pd.DataFrame([[1, 2], ["Ashish", "Sid"]],
                      index=["r1", "r2"],
                      columns=["c1", "c2"])

dframe

 
# build the DataFrame from a dict-like container:
# each key becomes a column label
dframe = pd.DataFrame({
    "c1": [1, "Ashish"],
    "c2": [2, "Sid"]})

dframe
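The constructions above differ in orientation: a list of Series is laid out row by row, while a dict is laid out column by column. The following self-contained sketch, which reuses the same values, checks that both routes give the same 2x2 layout. The variable names by_rows and by_cols are illustrative only.

```python
import pandas as pd

s1 = pd.Series([1, 2])
s2 = pd.Series(["Ashish", "Sid"])

# A list of Series: each Series becomes one row
by_rows = pd.DataFrame([s1, s2])

# A dict of lists: each key becomes one column
by_cols = pd.DataFrame({"c1": [1, "Ashish"], "c2": [2, "Sid"]})

print(by_rows.shape)              # (2, 2)
print(by_rows.iloc[0].tolist())   # first row holds the values of s1
print(by_cols["c1"].tolist())     # first column holds 1 and "Ashish"
```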


Importing data using pandas

The first step is to read the data. The data is stored as comma-separated values in a CSV file, with each row on its own line and the columns within a row separated by commas (,). To work with the data in Python, read the CSV file into a pandas DataFrame. A DataFrame is a way of representing and working with tabular data, that is, data with rows and columns, just like this CSV file.
Example:

# Import the pandas library under the alias pd
import pandas as pd

# Read IND_data.csv into a DataFrame assigned to df
df = pd.read_csv("IND_data.csv")

# Show the first 5 rows of the DataFrame (the default for head)
df.head()

# Number of rows and columns in the DataFrame
df.shape

Output:

(29, 10)
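For readers who do not have IND_data.csv at hand, the same steps can be tried on an in-memory CSV. The sketch below is an assumption-laden stand-in: the CSV text and its values are made up for illustration, and io.StringIO is used in place of a file path, which read_csv also accepts.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk
csv_text = """Time period,Observation Value
2010,12.5
2011,13.1
2012,11.8
"""

# read_csv accepts a path or any file-like object
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(2))          # first two rows
print(sample.shape)            # (3, 2): three rows, two columns
print(list(sample.columns))    # column labels taken from the header line
```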

Indexing data frames with pandas

Positional indexing is done with the pandas.DataFrame.iloc indexer, which selects rows and columns by their integer position.
Examples:

# first 5 rows and every column, equivalent to df.head()
df.iloc[0:5, :]

# all rows and all columns
df.iloc[:, :]

# rows from position 5 onward and the first 5 columns
df.iloc[5:, :5]
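A small self-contained sketch of positional indexing follows. The frame and its column names are hypothetical, chosen only so the results are easy to check by eye.

```python
import pandas as pd

# Hypothetical 4x3 frame used only for illustration
frame = pd.DataFrame(
    {"a": [10, 20, 30, 40], "b": [1, 2, 3, 4], "c": [5, 6, 7, 8]}
)

first_two = frame.iloc[0:2, :]   # rows at positions 0 and 1, all columns
tail_part = frame.iloc[2:, :2]   # rows from position 2 on, first two columns
single = frame.iloc[1, 2]        # row position 1, column position 2

print(first_two.shape)   # (2, 3)
print(tail_part.shape)   # (2, 2)
print(single)            # 6
```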

Indexing using labels in Pandas

Label-based indexing is done with the pandas.DataFrame.loc indexer, which selects rows and columns by their labels instead of their positions.
Examples:

# rows labeled 0 through 5 (the end label is included) and all columns
df.loc[0:5, :]

# rows from label 5 onward and all columns
df.loc[5:, :]

The above does not differ much from df.iloc[0:5, :]. Row labels can be anything, but here they correspond exactly to the row positions; the one visible difference is that loc includes the end label, whereas iloc excludes the end position. Column labels, however, can make working with data a lot easier. Example:

# rows labeled 0 through 5 of the "Time period" column
df.loc[:5, "Time period"]
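The difference between label-based and position-based slicing is easiest to see on a tiny frame. The sketch below uses made-up values and hypothetical row labels ("r1" to "r4"); it is meant only to illustrate that loc includes the end label while iloc excludes the end position.

```python
import pandas as pd

# Hypothetical frame with the default integer row labels 0..3
frame = pd.DataFrame({"Time period": [2010, 2011, 2012, 2013]})

by_position = frame.iloc[0:2]   # positions 0 and 1; end position excluded
by_label = frame.loc[0:2]       # labels 0, 1 and 2; end label included

print(len(by_position))   # 2
print(len(by_label))      # 3

# The same data with string row labels: loc slices by those labels
relabeled = pd.DataFrame(
    {"Time period": [2010, 2011, 2012, 2013]},
    index=["r1", "r2", "r3", "r4"],
)
middle = relabeled.loc["r2":"r3", "Time period"].tolist()
print(middle)             # [2011, 2012]
```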

     

DataFrame Math with pandas

Summary computations on a DataFrame can be done with the statistical methods that pandas provides.
Examples:

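# calculates various summary statistics, excluding NaN values
df.describe()
# calculates pairwise correlations between columns
# (in pandas 2.0 and later, pass numeric_only=True if df has non-numeric columns)
df.corr()
# calculates the rank of each value within its column
df.rank()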
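To see what these three methods return, the following sketch applies them to a small numeric frame. The data is hypothetical, with y chosen as exactly twice x so that the correlation is easy to predict.

```python
import pandas as pd

# Hypothetical numeric data used only for illustration
frame = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, 6.0, 8.0]})

summary = frame.describe()   # count, mean, std, min, quartiles, max per column
corr = frame.corr()          # pairwise Pearson correlation by default
ranks = frame.rank()         # rank of each value within its column (1 = smallest)

print(summary.loc["mean", "x"])   # 2.5
print(corr.loc["x", "y"])         # 1.0 up to floating-point rounding, since y = 2 * x
print(ranks["x"].tolist())        # [1.0, 2.0, 3.0, 4.0]
```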

           

Pandas Plotting

The plots in these examples are created using the standard convention for referencing the matplotlib API (import matplotlib.pyplot as plt). pandas builds its plotting methods on matplotlib, which makes it easy to create decent-looking graphs.
Examples:

# import the required module
import matplotlib.pyplot as plt

# plot a histogram
df['Observation Value'].hist(bins=10)

# box plot grouped by time period; shows outliers / extremes
df.boxplot(column='Observation Value', by='Time period')

# draw the points as a scatter plot
x = df["Observation Value"]
y = df["Time period"]
plt.scatter(x, y, label="stars", color="m",
            marker="*", s=30)

# x-axis label
plt.xlabel('Observation Value')

# y-axis label
plt.ylabel('Time period')

# display the plot
plt.show()
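The plotting calls above can also be tried without the original dataset or a display. The sketch below makes several assumptions that are not part of the original example: the data values are made up, the non-interactive "Agg" backend is selected, and the figure is rendered into an in-memory buffer instead of being shown with plt.show().

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data reusing the column names from the examples above
frame = pd.DataFrame({
    "Time period": [2010, 2011, 2012, 2013, 2014],
    "Observation Value": [12.5, 13.1, 11.8, 14.2, 13.7],
})

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram of the observed values, drawn on the first axes
frame["Observation Value"].hist(bins=5, ax=ax_hist)
ax_hist.set_xlabel("Observation Value")

# Scatter plot of value against time period, drawn on the second axes
ax_scatter.scatter(frame["Time period"], frame["Observation Value"],
                   marker="*", color="m", s=30)
ax_scatter.set_xlabel("Time period")
ax_scatter.set_ylabel("Observation Value")

# Render the figure to PNG bytes in memory instead of opening a window
fig.tight_layout()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Passing ax= to hist keeps both plots on one figure, which is convenient when comparing a distribution with its trend over time.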

             
