Conversion Functions in Pandas DataFrame

Python is a great language for data analysis, primarily because of its fantastic ecosystem of data-centric packages. Pandas is one such package, and it makes importing and analyzing data much easier. This article uses the nba.csv file as sample data.

Cast a pandas object to a specified type

The DataFrame.astype() function casts a pandas object to the specified dtype. astype() can also convert any suitable existing column to a categorical type.
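As a minimal sketch of both uses of astype(), the snippet below casts one column to int64 and another to a categorical type. The data frame and its values are made up for illustration and are not taken from nba.csv.

```python
import pandas as pd

# Hypothetical sample data standing in for a real dataset
df = pd.DataFrame({
    "Team": ["Celtics", "Nets", "Celtics", "Nets"],
    "Weight": [180.0, 235.0, 185.0, 231.0],
})

# Cast a float column to a 64-bit integer column
df["Weight"] = df["Weight"].astype("int64")

# Cast a column with few distinct values to a categorical type
df["Team"] = df["Team"].astype("category")

# Show the resulting dtype of each column
print(df.dtypes)
```

A categorical column stores each distinct value once, which usually helps when a column repeats a small set of labels.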

Code #1: Convert the data type of the Weight column.

# Import pandas
import pandas as pd

# Create the data frame from the CSV file
df = pd.read_csv("nba.csv")

# Show the first 10 rows of the data frame
df[:10]

Since the data contains some NaN values, we drop every row that contains a NaN value to avoid errors during the cast.

# Drop every row that contains a NaN value
df.dropna(inplace=True)

# Check the data type of the Weight column before the cast
before = type(df.Weight.iloc[0])

# Convert the column to int64
df.Weight = df.Weight.astype("int64")

# Check the data type after the cast
after = type(df.Weight.iloc[0])

# Show the type before the cast
before

# Show the type after the cast
after

Output:

# Print the data frame to see
# how it looks after the change
df
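To illustrate why the rows with NaN are dropped before the cast, here is a small sketch that does not depend on nba.csv. The three-row Weight column is a hypothetical stand-in for the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Weight column of nba.csv
df = pd.DataFrame({"Weight": [180.0, np.nan, 235.0]})

# Casting a column that still holds NaN to int64 raises an error
try:
    df["Weight"].astype("int64")
    cast_failed = False
except ValueError:
    cast_failed = True

# Dropping the incomplete rows first makes the cast succeed
clean = df.dropna()
clean = clean.astype({"Weight": "int64"})

print(cast_failed)
print(clean.dtypes)
```

Passing a dictionary to astype() casts only the named columns, which is handy when the other columns should keep their current types.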

Infer a better data type for object columns

# Import pandas
import pandas as pd

# Create a data frame
df = pd.DataFrame({"A": ["sofia", 5, 8, 11, 100],
                   "B": [2, 8, 77, 4, 11],
                   "C": ["amy", 11, 4, 6, 9]})

# Print the data frame
print(df)

Output:

Let’s see the dtype (data type) of each column in the data frame.

# Print basic information about the data frame
df.info()

As we can see in the output, the first and third columns are of type object, whereas the second column is of type int64. Now slice the data frame and create a new data frame from it.

# Slice from the second row to the end
df_new = df[1:]

# Print the new data frame
df_new

# Print the data type of each column
df_new.info()

Output:

As we can see in the output, columns "A" and "C" are still of type object even though they now contain only integer values. So let’s try infer_objects().

# Apply the infer_objects() function
df_new = df_new.infer_objects()

# Print the dtypes after the function is applied
df_new.info()

Output:

Now if we look at the dtype of each column, we can see that columns "A" and "C" are of type int64.
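One point worth checking is that infer_objects() only performs a soft conversion: it recognizes values that are already Python numbers, but it does not parse strings. The sketch below uses a made-up data frame to show the difference, and uses pd.to_numeric() as one possible way to parse digit strings.

```python
import pandas as pd

# Object columns: one holds Python integers, the other holds digit strings
df = pd.DataFrame({"ints": [1, 2, 3], "digits": ["1", "2", "3"]}, dtype=object)

# Soft conversion: "ints" becomes an integer column, "digits" is not parsed
inferred = df.infer_objects()
print(inferred.dtypes)

# Parsing strings into numbers needs an explicit conversion
converted = pd.to_numeric(df["digits"])
print(converted.dtype)
```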

Detect missing values

The DataFrame.isna() function is used to detect missing values. It returns a boolean object of the same size, indicating whether each value is NA. NA values such as None or numpy.nan are mapped to True; everything else is mapped to False. Values such as empty strings "" or numpy.inf are not considered NA (unless you have set pandas.options.mode.use_inf_as_na = True).

Code #1: Use the isna() function to detect missing values in the data frame.

# Import pandas
import pandas as pd

# Create the data frame
df = pd.read_csv("nba.csv")

# Print the data frame
df

Let’s use the isna() function to find missing values.

# Detect missing values
df.isna()

Output:

In the output, cells that correspond to missing values are True; all other cells are False.
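The rules for what counts as NA can be checked on a small data frame. The sketch below uses made-up values, and assumes the default pandas options, to show that None and NaN are flagged while an empty string and infinity are not.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each column ends with a genuinely missing value
df = pd.DataFrame({
    "name": ["amy", "", None],          # empty string is not NA, None is
    "score": [1.5, np.inf, np.nan],     # infinity is not NA, NaN is
})

# Boolean mask with the same shape as df
mask = df.isna()
print(mask)

# Summing the mask counts the missing values in each column
missing_per_column = mask.sum()
print(missing_per_column)
```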

Find existing (non-missing) values

The DataFrame.notna() function detects existing (non-missing) values in the data frame. It returns a boolean object that has the same size as the object it is applied to, indicating whether each individual value is NA or not. Non-missing values are mapped to True and missing values are mapped to False.

Code #1: Use notna() to find all non-missing values in the data frame.

# Import pandas
import pandas as pd

# Create the data frame
# (the second value of column "D" is missing from the source; 3 is a placeholder)
df = pd.DataFrame({"A": [14, 4, 5, 4, 1],
                   "B": [5, 2, 54, 3, 2],
                   "C": [20, 20, 7, 3, 8],
                   "D": [14, 3, 6, 2, 6]})

# Print the data frame
print(df)

Let’s use the DataFrame.notna() function to find all non-missing values in the data frame.

# Find the non-missing values
df.notna()

Output:

As we can see in the output, every value in the data frame has been mapped to True. No value is False because the data frame has no missing values.
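Because the data frame above has no missing values, it never produces a False. The following sketch uses a smaller, hypothetical data frame with one NaN to show the False case and one common use of the mask, which is keeping only the complete rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame with one missing value in column "D"
df = pd.DataFrame({"A": [14, 4, 5], "D": [14.0, np.nan, 6.0]})

# True where a value is present, False where it is missing
present = df.notna()
print(present)

# Keep only the rows in which every column has a value
complete_rows = df[present.all(axis=1)]
print(complete_rows)
```

notna() is the exact inverse of isna(), so either one can be used depending on which reads more naturally in the filter.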

DataFrame conversion methods

Function                      Description
DataFrame.convert_objects()   Attempt to infer a better dtype for object columns (removed in pandas 1.0; use DataFrame.infer_objects() instead).
DataFrame.copy()              Return a copy of this object’s indices and data.
DataFrame.bool()              Return the bool of a single-element pandas object.
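As a brief illustration of DataFrame.copy() from the table, the sketch below uses a made-up single-column data frame to show that changes to the copy leave the original untouched.

```python
import pandas as pd

# Hypothetical data frame used only for this illustration
original = pd.DataFrame({"A": [1, 2, 3]})

# copy() makes a deep copy by default (deep=True)
duplicate = original.copy()

# Modify the copy only
duplicate.loc[0, "A"] = 99

print(original)
print(duplicate)
```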
