We can use the pandas .str accessor, which performs vectorized string operations on a Series or Index and returns a new object of the same length holding the results. The accessor has a number of useful methods, one of which is str.split. The accessor can also be indexed, and combining the two gives you the part of the string you want. To get the nth part of a string, first split the column by the delimiter, then apply .str[n-1] to the returned object, i.e. DataFrame.columnName.str.split().str[n-1].
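As a minimal sketch of that pattern, the split-then-index step can be wrapped in a small helper. The function name `nth_part`, its parameters, and the sample paths are hypothetical and used only for illustration; the only pandas calls involved are `Series.str.split` and indexing on the `.str` accessor.

```python
import pandas as pd


def nth_part(series: pd.Series, delimiter: str, n: int) -> pd.Series:
    """Return the n-th (1-based) delimiter-separated part of each string."""
    if n < 1:
        raise ValueError("n is 1-based and must be >= 1")
    # split() yields a Series of lists; .str[n - 1] picks one item per list
    return series.str.split(delimiter).str[n - 1]


# Hypothetical sample data, not taken from the article
paths = pd.Series(["usr/local/bin", "home/user/docs"])

second = nth_part(paths, "/", 2)
print(second.tolist())  # ['local', 'user']
```

The helper takes a 1-based `n` to match the wording "nth part", so the subtraction by one happens in exactly one place.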
Let’s clarify this with examples.
Code #1: Print the first part of each split value as a Series.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id', 'Geek4_id', 'Geek5_id'],
                       'Geek_A': [1, 1, 3, 2, 4],
                       'Geek_B': [1, 2, 3, 4, 6],
                       'Geek_R': np.random.randn(5)})

    #     Geek_ID  Geek_A  Geek_B         Geek_R
    # 0  Geek1_id       1       1  random number
    # 1  Geek2_id       1       2  random number
    # 2  Geek3_id       3       3  random number
    # 3  Geek4_id       2       4  random number
    # 4  Geek5_id       4       6  random number

    # Split each ID on '_' and keep the first part (index 0)
    print(df.Geek_ID.str.split('_').str[0])
Output:

    0    Geek1
    1    Geek2
    2    Geek3
    3    Geek4
    4    Geek5
    Name: Geek_ID, dtype: object
Code #2: Print the returned data object as a list.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id', 'Geek4_id', 'Geek5_id'],
                       'Geek_A': [1, 1, 3, 2, 4],
                       'Geek_B': [1, 2, 3, 4, 6],
                       'Geek_R': np.random.randn(5)})

    #     Geek_ID  Geek_A  Geek_B         Geek_R
    # 0  Geek1_id       1       1  random number
    # 1  Geek2_id       1       2  random number
    # 2  Geek3_id       3       3  random number
    # 3  Geek4_id       2       4  random number
    # 4  Geek5_id       4       6  random number

    # Same selection as before, converted to a plain Python list
    print(df.Geek_ID.str.split('_').str[0].tolist())
Output:

    ['Geek1', 'Geek2', 'Geek3', 'Geek4', 'Geek5']
Code #3: Print the second part of each split value as a list.

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id', 'Geek4_id', 'Geek5_id'],
                       'Geek_A': [1, 1, 3, 2, 4],
                       'Geek_B': [1, 2, 3, 4, 6],
                       'Geek_R': np.random.randn(5)})

    #     Geek_ID  Geek_A  Geek_B         Geek_R
    # 0  Geek1_id       1       1  random number
    # 1  Geek2_id       1       2  random number
    # 2  Geek3_id       3       3  random number
    # 3  Geek4_id       2       4  random number
    # 4  Geek5_id       4       6  random number

    # Index 1 selects the second part; note the method is tolist(), not Tolist()
    print(df.Geek_ID.str.split('_').str[1].tolist())
Output:

    ['id', 'id', 'id', 'id', 'id']
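The examples above all use IDs with exactly one delimiter. The following sketch, which uses hypothetical IDs that are not part of the article's DataFrame, illustrates what the same pattern does when a row has fewer or more parts than expected. It also shows the `n` and `expand` parameters of `Series.str.split` as possible variations.

```python
import pandas as pd

# Hypothetical IDs: one without a delimiter, one with two delimiters
ids = pd.Series(["Geek1_id", "Geek2", "Geek3_id_extra"])

# Rows that have no second part give a missing value instead of an error
second = ids.str.split("_").str[1]
print(second.tolist())

# n=1 limits the split to the first delimiter, so the remainder stays intact
rest = ids.str.split("_", n=1).str[1]
print(rest.tolist())

# expand=True returns a DataFrame with one column per part
table = ids.str.split("_", expand=True)
print(table)
```

Because out-of-range positions come back as missing values, it may be worth checking the result with `isna()` before relying on it when the data is not uniformly delimited.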