Split a column in a Pandas DataFrame and get part of it



We can use the Pandas .str accessor, which performs vectorized string operations on a Series or Index and returns a new Series or Index. It is not available on a whole DataFrame, so select a column first. One of its most useful methods is str.split, which splits each string on a delimiter and returns a Series of lists. To get the n-th part of each string, split the column by the delimiter and then apply .str[n-1] to the result, i.e. DataFrame.columnName.str.split(delimiter).str[n-1].

Let's clarify this with examples.

Code #1: Print the first part of the split column as a Series.

import pandas as pd
import numpy as np

# Build a sample DataFrame; Geek_R holds five random numbers
df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id',
                               'Geek4_id', 'Geek5_id'],
                   'Geek_A': [1, 1, 3, 2, 4],
                   'Geek_B': [1, 2, 3, 4, 6],
                   'Geek_R': np.random.randn(5)})

#     Geek_ID  Geek_A  Geek_B         Geek_R
# 0  Geek1_id       1       1  random number
# 1  Geek2_id       1       2  random number
# 2  Geek3_id       3       3  random number
# 3  Geek4_id       2       4  random number
# 4  Geek5_id       4       6  random number

# Split Geek_ID on '_' and keep the first part of each value
print(df.Geek_ID.str.split('_').str[0])

Output:

0    Geek1
1    Geek2
2    Geek3
3    Geek4
4    Geek5
Name: Geek_ID, dtype: object
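To see why the second .str step works, it can help to look at the intermediate result of str.split. The following is a minimal sketch using a small hypothetical Series (the name ids and its values are assumptions that mirror the Geek_ID column above, not part of the original example):

```python
import pandas as pd

# Hypothetical sample data, mirroring the Geek_ID column above
ids = pd.Series(["Geek1_id", "Geek2_id", "Geek3_id"])

# Step 1: split each string on "_"; every element becomes a list
parts = ids.str.split("_")
print(parts.tolist())  # [['Geek1', 'id'], ['Geek2', 'id'], ['Geek3', 'id']]

# Step 2: .str[0] indexes into each list and takes its first element
first = parts.str[0]
print(first.tolist())  # ['Geek1', 'Geek2', 'Geek3']
```

Because the split result is a Series of lists, the .str accessor can index into each list just as it would index into a string.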

Code #2: Print the first part of each value as a Python list.

import pandas as pd
import numpy as np

# Build a sample DataFrame; Geek_R holds five random numbers
df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id',
                               'Geek4_id', 'Geek5_id'],
                   'Geek_A': [1, 1, 3, 2, 4],
                   'Geek_B': [1, 2, 3, 4, 6],
                   'Geek_R': np.random.randn(5)})

#     Geek_ID  Geek_A  Geek_B         Geek_R
# 0  Geek1_id       1       1  random number
# 1  Geek2_id       1       2  random number
# 2  Geek3_id       3       3  random number
# 3  Geek4_id       2       4  random number
# 4  Geek5_id       4       6  random number

# Take the first part of each value and convert the Series to a list
print(df.Geek_ID.str.split('_').str[0].tolist())

Output:

['Geek1', 'Geek2', 'Geek3', 'Geek4', 'Geek5']

Code #3: Print the second part of each value as a Python list.

import pandas as pd
import numpy as np

# Build a sample DataFrame; Geek_R holds five random numbers
df = pd.DataFrame({'Geek_ID': ['Geek1_id', 'Geek2_id', 'Geek3_id',
                               'Geek4_id', 'Geek5_id'],
                   'Geek_A': [1, 1, 3, 2, 4],
                   'Geek_B': [1, 2, 3, 4, 6],
                   'Geek_R': np.random.randn(5)})

#     Geek_ID  Geek_A  Geek_B         Geek_R
# 0  Geek1_id       1       1  random number
# 1  Geek2_id       1       2  random number
# 2  Geek3_id       3       3  random number
# 3  Geek4_id       2       4  random number
# 4  Geek5_id       4       6  random number

# Take the second part of each value (index 1) and convert it to a list
print(df.Geek_ID.str.split('_').str[1].tolist())

Output:

['id', 'id', 'id', 'id', 'id']
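The split(...).str[n-1] pattern used in the three examples can be wrapped in a small function. This is only a sketch: nth_part is a hypothetical helper name, not a pandas function, and the sample Series is made up to show what happens when a value has fewer parts than requested.

```python
import pandas as pd


def nth_part(column, delimiter, n):
    """Return the n-th (1-based) part of each string in `column`.

    `nth_part` is a hypothetical helper for illustration only.
    """
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # Split on the delimiter, then index into each resulting list
    return column.str.split(delimiter).str[n - 1]


# Hypothetical data: the last value contains no "_" delimiter
ids = pd.Series(["Geek1_id", "Geek2_id", "Geek3"])

first = nth_part(ids, "_", 1)
print(first.tolist())  # ['Geek1', 'Geek2', 'Geek3']

# Values with no second part come back as missing (NaN), not an error
second = nth_part(ids, "_", 2)
print(second.isna().tolist())  # [False, False, True]
```

Rows that lack the requested part yield a missing value instead of raising, so it may be worth checking the result with isna() before using it further.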