
NLP | Filtering out irrelevant words


Code #1: filter_insignificant() function to filter irrelevant words

def filter_insignificant(chunk, tag_suffixes=['DT', 'CC']):
    # Collect the tagged words that are considered significant.
    good = []

    for word, tag in chunk:
        ok = True

        # Reject the word if its tag ends with any of the given suffixes.
        for suffix in tag_suffixes:
            if tag.endswith(suffix):
                ok = False
                break

        if ok:
            good.append((word, tag))

    return good

filter_insignificant() iterates over the tagged words in the chunk and checks whether each tag ends with any of the suffixes in tag_suffixes. A tagged word is skipped if its tag ends with one of those suffixes. Otherwise the tagged word is appended to a new list of significant words, which is returned. By default the suffixes are DT (determiners) and CC (coordinating conjunctions).
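The suffix check described above can also be written more compactly. The following is only a sketch, not the article's own code: the name filter_insignificant_compact is hypothetical, and it relies on the fact that str.endswith accepts a tuple of suffixes. It also shows a side effect of matching by suffix: a tag such as WDT ends with DT, so it is filtered by the default suffixes as well.

```python
def filter_insignificant_compact(chunk, tag_suffixes=("DT", "CC")):
    """Hypothetical compact variant: keep (word, tag) pairs whose tag
    does not end with any of the given suffixes."""
    suffixes = tuple(tag_suffixes)  # str.endswith accepts a tuple of suffixes
    return [(word, tag) for word, tag in chunk if not tag.endswith(suffixes)]


# 'DT' is filtered by the default suffixes.
print(filter_insignificant_compact(
    [("the", "DT"), ("terrible", "JJ"), ("movie", "NN")]))

# 'WDT' also ends with 'DT', so it is filtered too.
print(filter_insignificant_compact([("which", "WDT"), ("movie", "NN")]))
```

Matching on suffixes rather than exact tags is what lets a single entry cover several related tags, so it is worth checking which tags in your tagset share an ending before choosing the suffix list.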

Code #2: Using filter_insignificant() on a phrase

# Assumes the function from Code #1 is saved in a local transforms.py module.
from transforms import filter_insignificant

print("Significant words:",
      filter_insignificant([('the', 'DT'),
                            ('terrible', 'JJ'), ('movie', 'NN')]))

Output:

Significant words: [('terrible', 'JJ'), ('movie', 'NN')]

We can pass different tag suffixes to filter_insignificant(). In the code below, pronouns and possessive words such as "you", "your", "them" and "their" are treated as insignificant, while DT and CC words are kept. The tag suffixes are therefore PRP and PRP$:

Code #3: Passing custom tag suffixes to filter_insignificant()

from transforms import filter_insignificant

# Pass custom tag_suffixes to filter out pronouns and possessives.
print("Significant words:",
      filter_insignificant([('your', 'PRP$'),
                            ('book', 'NN'), ('is', 'VBZ'),
                            ('great', 'JJ')],
                           tag_suffixes=['PRP', 'PRP$']))

Output:

Significant words: [('book', 'NN'), ('is', 'VBZ'), ('great', 'JJ')]
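Passing tag_suffixes replaces the defaults rather than extending them, so DT and CC words pass through when only PRP and PRP$ are given. The sketch below illustrates this and shows one way to filter both groups by listing all four suffixes. The function is restated in condensed form only so the snippet runs on its own, and the sample chunk is made up for illustration.

```python
def filter_insignificant(chunk, tag_suffixes=("DT", "CC")):
    # Condensed restatement of the article's function, included so this
    # snippet is self-contained.
    return [(word, tag) for word, tag in chunk
            if not any(tag.endswith(suffix) for suffix in tag_suffixes)]


# Illustrative chunk (not from the article).
chunk = [("your", "PRP$"), ("book", "NN"), ("and", "CC"),
         ("the", "DT"), ("pen", "NN")]

# Only pronouns and possessives are removed; 'and' and 'the' remain.
pronouns_only = filter_insignificant(chunk, tag_suffixes=["PRP", "PRP$"])
print("Pronouns filtered:", pronouns_only)

# Listing all four suffixes removes determiners and conjunctions as well.
combined = filter_insignificant(
    chunk, tag_suffixes=["DT", "CC", "PRP", "PRP$"])
print("All filtered:", combined)
```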
