In Pandas, missing data is represented by two values:
None: a Python singleton object that is often used for missing data in Python code.
NaN: short for Not a Number, a special floating-point value recognized by all systems that use the standard IEEE floating-point representation.
Pandas treats None and NaN as essentially interchangeable for indicating missing or null values. To support this convention, Pandas DataFrames provide several useful functions for detecting, removing, and replacing null values: isnull(), notnull(), dropna(), fillna(), replace(), and interpolate().
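A minimal sketch of this convention, assuming small made-up Series (the values are illustrative and do not come from the article's CSV file):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one None and one NaN among real values.
s = pd.Series([1.0, None, np.nan, 4.0])

# In a numeric Series, None is converted to NaN on construction,
# so both entries are reported as missing.
mask = s.isnull()
print(mask.tolist())

# None in a Series of strings is also treated as missing.
names = pd.Series(["Ann", None, "Bob"])
print(names.isnull().tolist())
```

Because both markers are detected by the same call, downstream code rarely needs to care which one was used when the data was created.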
The examples in this article work with data loaded from a CSV file.
Check missing values using isnull() and notnull()
To check for missing values in a Pandas DataFrame, we use the isnull() and notnull() functions. Both report, element by element, whether a value is NaN or not. These functions can also be called on a Pandas Series to find null values in that Series.
Check for missing values with isnull()
To check for null values in a Pandas DataFrame, we use isnull(). This function returns a DataFrame of Boolean values that are True wherever the original value is NaN.
# filter the data: build a Boolean mask that is True where Gender is NaN
bool_series = data["Gender"].isnull()
# display only the rows where Gender is NaN
data[bool_series]
Output: As shown in the output image, only the rows where Gender is NaN are displayed.
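The filtering step can be sketched end to end as follows. The DataFrame here is a hypothetical stand-in for the article's CSV data; only the Gender column name is taken from the text, and the Name column and all values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's CSV data.
data = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy", "Dee"],
    "Gender": ["Female", np.nan, "Male", None],
})

# Boolean Series: True where Gender is missing (NaN or None).
bool_series = data["Gender"].isnull()

# Boolean indexing keeps only the rows where the mask is True.
missing_gender = data[bool_series]
print(missing_gender)
```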
Check for missing values using notnull()
To check for non-null values in a Pandas DataFrame, we use the notnull() function. It returns a DataFrame of Boolean values that are False for NaN values and True otherwise.
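# filter the data: build a Boolean mask that is True where Gender is not NaN
bool_series = data["Gender"].notnull()
# display only the rows where Gender is not NaN
data[bool_series]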
Output: As shown in the output image, only the rows where Gender is not NaN are displayed.
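A self-contained sketch of the same idea with notnull(), again on hypothetical data (the Name column and all values are assumptions, not the article's CSV contents). It also checks that notnull() is the element-wise inverse of isnull().

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's CSV data.
data = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cy"],
    "Gender": ["Female", np.nan, "Male"],
})

# Boolean Series: True where Gender is present.
bool_series = data["Gender"].notnull()

# Keep only the rows that have a Gender value.
present_gender = data[bool_series]
print(present_gender)

# notnull() should mirror isnull() with every value flipped.
is_inverse = data.notnull().equals(~data.isnull())
print(is_inverse)
```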
Filling in missing values with fillna(), replace() and interpolate()
To fill in null values in a dataset, we use the fillna(), replace(), and interpolate() functions, which replace NaN values with some other value. All of these functions help fill in null values in a DataFrame. The interpolate() function is also used to fill in NA values in a DataFrame, but it applies an interpolation technique to compute the missing values rather than hardcoding a value.
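As a rough illustration of the two hardcoded-value approaches, the sketch below applies fillna() and replace() to a small made-up DataFrame. The column names, the 0 and -99 fill values, and the "No Gender" label are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({
    "Score": [90.0, np.nan, 70.0],
    "Gender": ["Female", None, "Male"],
})

# fillna() with a single value fills every NaN in the Series.
filled_score = df["Score"].fillna(0)

# fillna() with a dict applies a different fill value per column.
filled_df = df.fillna({"Score": 0, "Gender": "No Gender"})

# replace() names the value to replace explicitly, here NaN.
replaced = df["Score"].replace(to_replace=np.nan, value=-99)

print(filled_df)
print(replaced.tolist())
```

None of these calls modifies df; each returns a new object, so the result has to be assigned if it should be kept.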
As we can see in the output, the values in the first row cannot be filled, since the filling direction is forward and there is no previous value to interpolate from.
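The behavior described above can be reproduced with a minimal sketch. It assumes linear interpolation with limit_direction="forward", inferred from the remark about the filling direction, and uses made-up data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column B starts with a missing value.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 2.0, 4.0],
})

# Linear interpolation, filling forward only.
result = df.interpolate(method="linear", limit_direction="forward")
print(result)

# A's gap lies between 1.0 and 3.0, so it becomes 2.0.
# B's first value has nothing before it, so it stays NaN.
```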
Remove missing values with dropna()
To remove null values from a DataFrame, we use the dropna() function. It drops the rows or columns of a dataset that contain null values.
Code #1: Dropping rows with at least one null value.
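A minimal sketch of this step, assuming a small made-up DataFrame rather than the article's CSV data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 1 and 2 each contain one missing value.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [4.0, 5.0, np.nan],
    "C": [7.0, 8.0, 9.0],
})

# Defaults (axis=0, how="any"): drop every row with at least one null.
rows_dropped = df.dropna()
print(rows_dropped)

# axis=1 drops columns instead of rows.
cols_dropped = df.dropna(axis=1)
print(cols_dropped)
```

dropna() returns a new DataFrame and leaves df unchanged, so comparing len(df) with len(rows_dropped) shows how many rows were removed.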