Python | Grouping similar substrings in a list



Method #1: Using lambda + itertools.groupby() + split()
Combining these three pieces achieves the goal. The lambda uses split() to extract the part of each string before the delimiter, and that prefix serves as the grouping key. groupby() then collects consecutive elements that share a key, which is why the list is sorted first.

# Python3 demo code
# group similar substrings
# using lambda + itertools.groupby() + split()

from itertools import groupby

# initializing list
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']

# sort the list so that items sharing a prefix are adjacent
# (groupby only groups consecutive elements)
test_list.sort()

# print the list (now sorted)
print("The original list is: " + str(test_list))

# using lambda + itertools.groupby() + split()
# group by the text before the underscore
res = [list(i) for j, i in groupby(test_list, lambda a: a.split('_')[0])]

# print result
print("The grouped list is: " + str(res))

Output:

The original list is: ['coder_2', 'coder_3', 'geek_1', 'geek_4', 'pro_3']
The grouped list is: [['coder_2', 'coder_3'], ['geek_1', 'geek_4'], ['pro_3']]
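
The sort step matters because groupby() only merges adjacent items that share a key. The sketch below contrasts unsorted and sorted input. The helper name group_by_prefix and its sep parameter are hypothetical, introduced here only for illustration.

```python
from itertools import groupby


def group_by_prefix(items, sep="_"):
    """Group strings by the text before the first `sep` (hypothetical helper)."""
    def prefix(s):
        return s.split(sep)[0]

    # Sort by the same key used for grouping so equal prefixes are adjacent.
    ordered = sorted(items, key=prefix)
    return [list(group) for _, group in groupby(ordered, key=prefix)]


test_list = ["geek_1", "coder_2", "geek_4", "coder_3", "pro_3"]

# Without sorting, groupby starts a new group every time the prefix changes.
unsorted_groups = [
    list(g) for _, g in groupby(test_list, key=lambda s: s.split("_")[0])
]
print(unsorted_groups)

# With sorting, each prefix ends up in exactly one group.
print(group_by_prefix(test_list))
```

On the unsorted list every element lands in its own group, because no two neighbours share a prefix. Sorting by the grouping key, rather than by the full string, also keeps the ordering and the grouping consistent if the prefixes ever sort differently from the whole strings.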

Method #2: Using lambda + itertools.groupby() + partition()
The same task can be accomplished by replacing split() with partition(). partition() splits only at the first occurrence of the separator and always returns a three-element tuple, so it does not look for further delimiters and can be slightly cheaper than split() when strings contain several of them.

# Python3 demo code
# group similar substrings
# using lambda + itertools.groupby() + partition()

from itertools import groupby

# initializing list
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']

# sort the list so that items sharing a prefix are adjacent
# (groupby only groups consecutive elements)
test_list.sort()

# print the list (now sorted)
print("The original list is: " + str(test_list))

# using lambda + itertools.groupby() + partition()
# partition('_') returns (head, '_', tail); the head is the grouping key
res = [list(i) for j, i in groupby(test_list, lambda a: a.partition('_')[0])]

# print result
print("The grouped list is: " + str(res))

Output:

The original list is: ['coder_2', 'coder_3', 'geek_1', 'geek_4', 'pro_3']
The grouped list is: [['coder_2', 'coder_3'], ['geek_1', 'geek_4'], ['pro_3']]
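
To make the difference between the two methods concrete, the following sketch compares what split() and partition() return for a few sample strings. The sample values are made up for illustration and are not part of the original list.

```python
# Illustrative inputs: one delimiter, several delimiters, and no delimiter.
samples = ["geek_1", "geek_1_extra", "geek"]

for s in samples:
    # split() returns a list whose length depends on how many separators appear.
    # partition() always returns (head, separator, tail), cutting at the first
    # separator only; if the separator is absent, separator and tail are empty.
    print(s, s.split("_"), s.partition("_"))

# Both approaches yield the same grouping key (the text before the first "_").
split_keys = [s.split("_")[0] for s in samples]
partition_keys = [s.partition("_")[0] for s in samples]
print(split_keys == partition_keys)
```

Since index 0 is the same in both cases, the grouped result is identical. A third option with the same effect is split("_", 1), which limits split() to a single cut.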