Pandas describe()
is used to view basic statistics such as percentiles, mean and standard deviation of a data frame or a series of numeric values. When this method is applied to a series of strings, it returns a different output, as shown in the examples below.
Syntax: DataFrame.describe(percentiles=None, include=None, exclude=None)
Parameters:
percentiles: list-like of numbers between 0 and 1; the corresponding percentiles are returned
include: list of data types to include when describing the data frame. Default is None
exclude: list of data types to exclude when describing the data frame. Default is None
Return type: statistical summary of the data frame.
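As a rough illustration of how include and exclude change which columns are summarized, here is a minimal sketch. The small data frame below is made up for this purpose (it is not the NBA dataset), and the Name column is given the object dtype explicitly so the behavior does not depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only (not the NBA dataset)
df = pd.DataFrame({
    "Name": pd.Series(["Ann", "Bob", "Cid", "Bob"], dtype=object),
    "Age": [25, 31, 28, 31],
    "Salary": [5000.0, 7200.0, 6100.0, 7200.0],
})

# Default call: only the numeric columns are summarized
numeric_only = df.describe()

# include restricts the summary to the listed data types
objects_only = df.describe(include=["object"])

# exclude drops the listed data types from the summary
no_numbers = df.describe(exclude=[np.number])

print(numeric_only)
print(objects_only)
print(no_numbers)
```

With this data, the default call covers Age and Salary, while the other two calls cover only Name.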
In the following examples, the data frame used contains data for some NBA players. An image of the data frame before any operations is attached below.
Example #1: Describing a data frame with both object and numeric data types
This example describes a data frame and includes 'object' in the list passed to the include parameter, so that the object (string) columns are described as well. [.20, .40, .60, .80] is passed to the percentiles parameter to view the corresponding percentiles of the numeric columns.
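The call described above might look like the following sketch. The column names and values are placeholders standing in for the NBA data, and the exact include list is an assumption: it combines 'object' with numeric types so that both kinds of column appear in the result.

```python
import numpy as np
import pandas as pd

# Placeholder rows standing in for the NBA data; names and values are made up
data = pd.DataFrame({
    "Name": pd.Series(["Player A", "Player B", "Player C", "Player D"], dtype=object),
    "Age": [25, 27, 22, 29],
    "Salary": [1000.0, 2000.0, 3000.0, 4000.0],
})

# Percentiles to report
perc = [.20, .40, .60, .80]

# Data types to include (assumed: object plus all numeric types)
include = ["object", np.number]

# Describe both the string and the numeric columns
desc = data.describe(percentiles=perc, include=include)
print(desc)
```

Rows such as mean and std hold NaN in the Name column, and rows such as unique and top hold NaN in the numeric columns.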
Output:
As shown in the output image, a statistical description of the data frame is returned with the requested percentiles. For columns containing strings, NaN is returned for the numeric statistics.
Example #2: Describing a series of strings
This example calls the describe() method on the Name column to see its behavior with the object data type.
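A minimal sketch of this call is given here. The names are placeholders rather than rows from the NBA data, and the column is created with the object dtype explicitly.

```python
import pandas as pd

# Placeholder names standing in for the Name column of the NBA data
data = pd.DataFrame({
    "Name": pd.Series(["Player A", "Player B", "Player A", "Player C"], dtype=object),
})

# Describe a single column of strings
desc = data["Name"].describe()
print(desc)
```

For these four values the summary reports a count of 4, 3 unique values, "Player A" as the top value and a frequency of 2.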
Output:
As shown in the output image, describe() behaves differently for a series of strings.
In this case, different statistics are returned: the number of values (count), the number of unique values (unique), the most common value (top) and its frequency of occurrence (freq).