Replace values in Pandas dataframe with regular expressions


In the previous article, we discussed how to replace known string values in a DataFrame. In this post, we use regular expressions to replace strings that follow a pattern.

Problem #1: You are given a DataFrame that contains details about various events in different cities. For cities whose names start with the keyword "New" or "new", change that keyword to "New_".

Solution: We will use a regular expression to detect such names, and then use DataFrame.replace() to replace them.

# Import pandas
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({'City': ['New York', 'Parague', 'New Delhi', 'Venice', 'new Orleans'],
                   'Event': ['Music', 'Poetry', 'Theater', 'Comedy', 'Tech_Summit'],
                   'Cost': [10000, 5000, 15000, 2000, 12000]})

# Create the index
index_ = [pd.Period('02-2018'), pd.Period('04-2018'),
          pd.Period('06-2018'), pd.Period('10-2018'), pd.Period('12-2018')]

# Set the index
df.index = index_

# Print the DataFrame
print(df)

Output:

We will now write a regular expression that matches both spellings and pass it to DataFrame.replace() with regex=True.

# Replace every match of the pattern; [nN] accepts either capitalisation
df_updated = df.replace(to_replace='[nN]ew', value='New_', regex=True)

# Print the updated DataFrame
print(df_updated)

Output:

As the output shows, the matching strings were replaced: 'New York', 'New Delhi' and 'new Orleans' now begin with 'New_', while the other values are unchanged.
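The pattern above matches "new" anywhere in any string column, which may be broader than the problem statement ("cities that start with"). A minimal sketch of a stricter variant, assuming only the City column should change and only at the start of the name: the pattern is anchored with ^ and both to_replace and value are given as per-column dicts. The sample frame and the 'Renewal' event are hypothetical, added only to show why the anchor and column scope matter.

```python
import pandas as pd

# Hypothetical cut-down sample; 'Renewal' contains "new" but must stay untouched
df = pd.DataFrame({'City': ['New York', 'new Orleans', 'Venice'],
                   'Event': ['Music', 'Renewal', 'Comedy']})

# Per-column dicts restrict the regex to 'City'; ^ anchors it to the start
df_updated = df.replace({'City': r'^[nN]ew'}, {'City': 'New_'}, regex=True)

print(df_updated)
```

Because the replacement is scoped to one column, values in Event are left alone even when they contain the same letters.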

Problem #2: You are given a DataFrame that contains details about various events in different cities. Certain city names include additional details in parentheses. Find such names and remove the additional details.

Solution: For this task, we will write our own custom function that uses a regular expression to detect and clean these city names. We will then call apply() on the City column to run the custom function on each value.
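As a quick illustration of the idea before building the DataFrame, the sketch below runs the pattern on plain strings. It assumes the opening parenthesis is escaped as \( so the regex engine treats it literally; the sample values are hypothetical.

```python
import re

# Raw string so the backslash reaches the regex engine; \( is a literal parenthesis
pattern = re.compile(r'\(.*')

samples = ['New York (City)', 'Venice']  # hypothetical sample values
cleaned = []
for name in samples:
    match = pattern.search(name)
    # Slice off everything from the parenthesis onward when there is a match
    cleaned.append(name[:match.start()] if match else name)

print(cleaned)
```

Note that slicing at the parenthesis leaves the space that preceded it, so the first result still ends with a trailing space unless it is stripped.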

# Import pandas
import pandas as pd

# Create the DataFrame
df = pd.DataFrame({'City': ['New York (City)', 'Parague', 'New Delhi (Delhi)', 'Venice', 'new Orleans'],
                   'Event': ['Music', 'Poetry', 'Theater', 'Comedy', 'Tech_Summit'],
                   'Cost': [10000, 5000, 15000, 2000, 12000]})

# Create the index
index_ = [pd.Period('02-2018'), pd.Period('04-2018'),
          pd.Period('06-2018'), pd.Period('10-2018'), pd.Period('12-2018')]

# Set the index
df.index = index_

# Print the DataFrame
print(df)

Output:

We will now write our own customized function to match the description in city names.

# Import the re module to use regular expressions
import re

# Function to clean up the names
def Clean_names(City_name):
    # Search for a literal opening parenthesis followed by
    # any characters repeated any number of times
    match = re.search(r'\(.*', City_name)
    if match:
        # Keep only the text before the start of the match and
        # drop the space left in front of the parenthesis
        return City_name[:match.start()].rstrip()
    # If no cleanup is required, return the same name
    return City_name

# Update the City column
df['City'] = df['City'].apply(Clean_names)

# Print the updated DataFrame
print(df)

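Output:

The parenthesised details are removed: 'New York (City)' and 'New Delhi (Delhi)' now read 'New York' and 'New Delhi', while the other city names are unchanged.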
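The same cleanup can also be done without a custom function. The sketch below is an alternative to the apply() approach, not the method used above: it relies on the vectorised Series.str.replace with regex=True, and it assumes the details always sit in one parenthesised group. The cut-down sample frame is hypothetical.

```python
import pandas as pd

# Hypothetical cut-down sample of the City column
df = pd.DataFrame({'City': ['New York (City)', 'Parague', 'New Delhi (Delhi)']})

# \s* also removes the space before the parenthesis;
# \(.*\) matches the parenthesised details themselves
df['City'] = df['City'].str.replace(r'\s*\(.*\)', '', regex=True)

print(df['City'].tolist())
```

Passing regex=True explicitly avoids depending on the default, which has differed between pandas versions.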

