Python | Pandas Series.str.extractall()


Series.str can be used to access the values of a Series as strings and apply several string methods to them. Pandas Series.str.extractall() extracts the capture groups of a regular expression as columns in a DataFrame. For each subject string in the Series, it extracts the groups from all matches of the regular expression pat. When each subject string in the Series has exactly one match, extractall(pat).xs(0, level='match') is the same as extract(pat).
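The relationship with extract() can be checked on a small example. The sketch below is only illustrative: the series s and the pattern with the named groups letter and digit are assumptions made up for this purpose and are not part of the examples in this article.

```python
import pandas as pd

# Hypothetical data: every string contains exactly one match
s = pd.Series(['a1', 'b2', 'c3'])
pat = r'(?P<letter>[a-z])(?P<digit>\d)'

# All matches, indexed by (original index, match number)
all_matches = s.str.extractall(pat)

# Keep only match number 0 of each string and drop the 'match' level
first_only = all_matches.xs(0, level='match')

# extract() returns the first match of each string directly
via_extract = s.str.extract(pat)

print(first_only)
print(via_extract)
```

Both printed frames should show the same letter and digit columns, indexed by the original labels of s.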

Syntax: Series.str.extractall(pat, flags=0)

Parameters:
pat: Regular expression pattern with capturing groups.
flags: A re module flag, for example re.IGNORECASE.

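Returns: DataFrame with one row for each match and one column for each capture group. The rows have a MultiIndex whose last level is named 'match'.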
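To see what the flags parameter changes, here is a minimal sketch. The series s and the single-letter pattern are hypothetical and chosen only to make the effect of re.IGNORECASE visible.

```python
import re
import pandas as pd

# Hypothetical data with an upper-case 'A' and several lower-case 'a'
s = pd.Series(['Apple pie', 'banana'])

# Default flags=0: only lower-case 'a' is matched
case_sensitive = s.str.extractall(r'(a)')

# With re.IGNORECASE the upper-case 'A' in 'Apple pie' is matched as well
case_insensitive = s.str.extractall(r'(a)', flags=re.IGNORECASE)

print(len(case_sensitive))
print(len(case_insensitive))
```

The second call finds one more match than the first, because the 'A' in 'Apple pie' only matches when case is ignored.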

Example #1: Use Series.str.extractall() to extract all groups from the strings in the underlying data of the given Series object.

# import pandas as pd
import pandas as pd

# import re for regular expressions
import re

# Create the series
sr = pd.Series(['New_York', 'Lisbon', 'Tokyo', 'Paris', 'Munich'])

# Create the index
idx = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']

# Set the index
sr.index = idx

# Print the series
print(sr)

Output:

Now we will use Series.str.extractall() to extract all groups from the strings in the given Series object.

# extract every lower-case vowel together with
# the character that follows it
result = sr.str.extractall(pat='([aeiou].)')

# Print the result
print(result)

Output:

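As we can see in the output, Series.str.extractall() returned a DataFrame with one column for the capture group and one row for every match. The rows are indexed by the original label and by the match number within each string.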
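The structure of this result can be explored a little further. The following sketch rebuilds the series from Example #1 so that it runs on its own; the names matches_per_city and wide are introduced here only for illustration.

```python
import pandas as pd

# Rebuild the series from Example #1
sr = pd.Series(['New_York', 'Lisbon', 'Tokyo', 'Paris', 'Munich'],
               index=['City 1', 'City 2', 'City 3', 'City 4', 'City 5'])
result = sr.str.extractall(pat='([aeiou].)')

# The last index level is named 'match' and numbers the matches per string
print(result.index.names)

# Number of matches found in each string
matches_per_city = result.groupby(level=0).size()
print(matches_per_city)

# One row per city and one column per match number;
# cities with fewer matches get missing values
wide = result[0].unstack(level='match')
print(wide)
```

Grouping on the first index level is one way to count matches per string, and unstack() is one way to turn the long result into a table with a row per original label.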

Example #2: Use Series.str.extractall() to extract all groups from the strings in the underlying data of the given Series object.

# import pandas as pd
import pandas as pd

# import re for regular expressions
import re

# Create the series
sr = pd.Series(['Mike', 'Alessa', 'Nick', 'Kim', 'Britney'])

# Create the index
idx = ['Name 1', 'Name 2', 'Name 3', 'Name 4', 'Name 5']

# Set the index
sr.index = idx

# Print the series
print(sr)

Output:

Now we will use Series.str.extractall() to extract all groups from the strings in the given Series object.

# extract every capital letter that is
# followed by 'i' and any other character
result = sr.str.extractall(pat='([A-Z]i.)')

# Print the result
print(result)

Output:

As we can see in the output, Series.str.extractall() returned a DataFrame with one column for the capture group and one row for every match. Strings that do not match the pattern do not appear in the result.
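Because strings without a match produce no rows, it can be useful to find out which labels were dropped. The sketch below rebuilds the series from Example #2 and assumes the pattern '([A-Z]i.)'; the names matched and unmatched are introduced only for illustration.

```python
import pandas as pd

# Rebuild the series from Example #2
sr = pd.Series(['Mike', 'Alessa', 'Nick', 'Kim', 'Britney'],
               index=['Name 1', 'Name 2', 'Name 3', 'Name 4', 'Name 5'])
result = sr.str.extractall(pat='([A-Z]i.)')

# Labels that produced at least one match
matched = result.index.get_level_values(0).unique()

# Labels of the original series that are missing from the result
unmatched = sr.index.difference(matched)

print(list(matched))
print(list(unmatched))
```

Here 'Alessa' and 'Britney' contain no capital letter directly followed by 'i', so their labels end up in unmatched.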




