How to randomly select rows from Pandas DataFrame


Create a simple data frame with a dictionary of lists.

Method #1: Using the sample() method

# Import pandas package
import pandas as pd

# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj', 'Geeku'],
        'Age': [27, 24, 22, 32, 15],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj', 'Noida'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd', '10th']}

# Convert dictionary to DataFrame
df = pd.DataFrame(data)

# Display the whole DataFrame
df

Example 1: Using sample() without parameters, which selects one random row.

# Pick one random row using sample()
# without specifying any parameters
df.sample()

Example 2: Using the n parameter, which randomly selects n rows.

Select n rows at random using sample(n) or sample(n=n). Because the draw is random, each run usually returns a different set of n rows.

# Get 3 random rows;
# the rows usually differ on each run
# df.sample(3) or
df.sample(n=3)
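As a quick check of the behavior described above, the sketch below draws three rows and confirms that they are distinct rows of the original frame. The `employees` frame is a hypothetical stand-in that mirrors part of the data used in this article.

```python
import pandas as pd

# Hypothetical stand-in for the employee DataFrame built earlier.
employees = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# Draw 3 rows without replacement (the default).
picked = employees.sample(n=3)

# Sampled rows keep their original index labels,
# so they can be traced back to the source frame.
print(picked)
print("distinct rows:", picked.index.is_unique)
```

Because the original index labels are preserved, a call such as `picked.reset_index(drop=True)` can be used when a fresh 0-based index is preferred.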

Example 3: Using the frac parameter.

The frac parameter returns a fraction of the rows instead of a fixed number of rows. For example, if frac=0.5, sample() returns 50% of the rows.

# Fraction of rows:
# here you get 50% of the rows
df.sample(frac=0.5)
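To make the effect of frac concrete, here is a minimal sketch on a hypothetical 10-row frame (`numbers` is not part of the original example; 10 rows were chosen so the fractions give whole row counts).

```python
import pandas as pd

# Hypothetical 10-row frame, chosen so the fractions give whole numbers.
numbers = pd.DataFrame({"value": range(10)})

half = numbers.sample(frac=0.5)    # 50% of 10 rows
fifth = numbers.sample(frac=0.2)   # 20% of 10 rows
print(len(half), len(fifth))

# frac=1 returns every row in random order, a common way to shuffle.
shuffled = numbers.sample(frac=1)
print(sorted(shuffled["value"]) == list(range(10)))
```

When the fraction does not give a whole number of rows, pandas rounds the result, so the exact count on small frames is worth checking with `len()`.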

Example 4: Chaining frac.

First, 70% of the rows of the DataFrame df are sampled and stored in another DataFrame, df1. Then 50% of the rows of df1 are sampled.

# Fraction of rows:
# take 70% of the rows from df
# and store them in another DataFrame, df1
df1 = df.sample(frac=0.7)

# Now select 50% of the rows from df1
df1.sample(frac=0.5)
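The chained sampling can be sketched on a hypothetical 20-row frame to show that the second sample is always drawn from the first. The names `numbers`, `most`, and `part` are illustrative only.

```python
import pandas as pd

# Hypothetical 20-row frame, used only for illustration.
numbers = pd.DataFrame({"value": range(20)})

# First draw: 70% of 20 rows.
most = numbers.sample(frac=0.7)

# Second draw: 50% of the rows that survived the first draw.
part = most.sample(frac=0.5)

print(len(most), len(part))

# Every row of the second sample also appears in the first.
print(set(part.index) <= set(most.index))
```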

Example 5: Select multiple rows at random with replace=False.

The replace parameter controls whether the same row may be selected more than once. Its default value in sample() is False, so you cannot select more rows than the DataFrame contains.

# DataFrame df1 has only 4 rows.
# If we try to select more than 4 rows, an error is raised:
# Cannot take a larger sample than population when 'replace=False'
df1.sample(n=3, replace=False)

Example 6: Select more than n rows, where n is the total number of rows, using replace.

# Select more rows than df1 contains by sampling with replacement
# (the default for replace is False)
df1.sample(n=6, replace=True)
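The two cases can be compared side by side. The sketch below uses a hypothetical 4-row frame called `small`, assumed to match the four rows that df1 has in these examples.

```python
import pandas as pd

# Hypothetical 4-row frame, the same size as df1 in the examples.
small = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"]})

# Without replacement, asking for more rows than exist fails.
try:
    small.sample(n=6, replace=False)
    oversample_failed = False
except ValueError as err:
    oversample_failed = True
    print("ValueError:", err)

# With replacement, the same request succeeds and repeats some rows.
repeated = small.sample(n=6, replace=True)
print(repeated)
print("has repeated rows:", repeated.index.has_duplicates)
```

Six draws from four rows must repeat at least one row, which is why the duplicate check is always true here.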

Example 7: Using weights.

# One weight per row of df1; rows with larger weights are more likely to be drawn.
# Weights that do not sum to 1 are normalized automatically.
test_weights = [0.2, 0.2, 0.2, 0.4]

df1.sample(n=3, weights=test_weights)
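A small sketch of how weights behave, assuming a hypothetical 4-row frame `people` in place of df1. The weights are deliberately not normalized, and one is zero to show that such a row is never selected; the seed value is arbitrary.

```python
import pandas as pd

# Hypothetical 4-row frame standing in for df1.
people = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"]})

# One weight per row. These sum to 10, not 1, so pandas rescales them.
# The zero weight means "Anuj" can never be drawn.
row_weights = [5, 3, 2, 0]

drawn = people.sample(n=2, weights=row_weights, random_state=0)
print(drawn)
print("Anuj drawn:", "Anuj" in drawn["Name"].values)
```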

Example 8: Using axis

The axis parameter accepts an axis number or name. With it, the sample() method can select columns instead of rows.

# axis accepts an axis number or name.
# axis=0 (the default) samples rows;
# axis=1 samples columns instead of rows.
df1.sample(axis=0)

Example 9: Using random_state

With a fixed random_state, sample() always returns the same rows for a given DataFrame. If random_state is None (the default), the draw is not seeded and the result varies between runs.

# With this seed the sample always draws the same rows.
df1.sample(n=2, random_state=2)
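Both parameters can be tried on a frame with a few columns. This is a minimal sketch using a hypothetical `employees` frame that mirrors part of the article's data; the seed value 2 is taken from the example above.

```python
import pandas as pd

# Hypothetical stand-in for the employee DataFrame built earlier.
employees = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
    "Address": ["Delhi", "Kanpur", "Allahabad", "Kannauj", "Noida"],
})

# axis=1 (or axis="columns") samples columns; every row is kept.
two_columns = employees.sample(n=2, axis=1)
print(two_columns.columns.tolist())

# The same seed on the same frame gives the same rows every time.
first = employees.sample(n=2, random_state=2)
second = employees.sample(n=2, random_state=2)
print(first.equals(second))
```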

Method #2: Using NumPy

With NumPy, np.random.choice picks which row positions to include in the random selection, and replacement can be allowed.

# Import the NumPy and pandas packages
import numpy as np
import pandas as pd

# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj', 'Geeku'],
        'Age': [27, 24, 22, 32, 15],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj', 'Noida'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd', '10th']}

# Convert dictionary to DataFrame
df = pd.DataFrame(data)

# Draw 6 row positions, with replacement, from 0..3 (the first four rows).
# Use len(df) instead of 4 to draw from every row.
chosen_idx = np.random.choice(4, replace=True, size=6)

# Select the rows at the chosen positions
df2 = df.iloc[chosen_idx]

df2
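A variant of the same idea is sketched below using NumPy's newer Generator interface, `np.random.default_rng`, in place of `np.random.choice`. That is a substitution, not the call used above. The `employees` frame is a hypothetical stand-in and the seed is arbitrary.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the employee DataFrame built earlier.
employees = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# A seeded Generator makes the draw reproducible (the seed is arbitrary).
rng = np.random.default_rng(seed=42)

# Draw 6 row positions from 0..len(employees)-1, allowing repeats.
positions = rng.choice(len(employees), size=6, replace=True)

# iloc selects by position, so repeated positions repeat rows.
resampled = employees.iloc[positions]
print(resampled)
```

Passing `len(employees)` rather than a hard-coded number keeps every row eligible if the frame later grows or shrinks.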

