NLP | Chunking and Chinking with RegEx


Defining Chunk Patterns:
Chunk patterns are regular expressions that have been modified to match sequences of part-of-speech tags. Angle brackets delimit an individual tag: for example, <NN> matches the noun tag. Multiple tags can be specified in the same way.

Code #1: Converting a chunk pattern to a RegEx pattern.

# Load the conversion helper from NLTK
from nltk.chunk.regexp import tag_pattern2re_pattern

# Convert a chunk pattern to a RegEx pattern
print("Chunk Pattern:", tag_pattern2re_pattern('<DT>?<NN.*>+'))

Output:

Chunk Pattern: (<(DT)>)?(<(NN[^\{\}]*)>)+
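
To see what a converted pattern of this kind does, the following minimal sketch applies a hand-written approximation of it with Python's built-in re module. It assumes the tag sequence is encoded as a single string of <TAG> tokens, and it uses the character class [^<>] in place of the one printed above; both choices are simplifications for illustration, not NLTK's internal code. The names tag_string, pattern and matches are made up for this example.

```python
import re

# Tag sequence for "the book has many chapters", encoded as one string
# (assumption: tags are written back to back as <TAG> tokens)
tag_string = "<DT><NN><VBZ><JJ><NNS>"

# Hand-written approximation of the chunk pattern <DT>?<NN.*>+
# (assumption: '.' inside a tag means "any character except < or >")
pattern = re.compile(r"(<DT>)?(<NN[^<>]*>)+")

# Collect every stretch of tags that the pattern matches
matches = [m.group(0) for m in pattern.finditer(tag_string)]
print(matches)  # ['<DT><NN>', '<NNS>']
```

The optional determiner followed by one or more noun tags is picked out twice: once for "the book" and once for "chapters".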

Curly braces delimit a chunk rule, written {pattern}. To define a chink rule, which removes matching tags from an existing chunk, simply reverse the braces: }pattern{. For a particular type of phrase, these chunk and chink rules can be combined into a grammar.

Code #2: Parsing a sentence with RegexpParser.

from nltk.chunk import RegexpParser

# Define the grammar: one chunk rule followed by one chink rule
chunker = RegexpParser(r'''
NP:
{<DT><NN.*><.*>*<NN.*>}
}<VB.*>{
''')

# Parse a sentence that is already tagged with parts of speech
chunker.parse([('the', 'DT'), ('book', 'NN'),
               ('has', 'VBZ'), ('many', 'JJ'), ('chapters', 'NNS')])

Output:

Tree('S', [Tree('NP', [('the', 'DT'), ('book', 'NN')]), ('has', 'VBZ'), Tree('NP', [('many', 'JJ'), ('chapters', 'NNS')])])
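
The output contains two NP chunks because the chunk rule first covers the whole sentence and the chink rule then removes the verb from the middle. The sketch below replays those two steps with plain Python regular expressions. It is a simplified illustration under stated assumptions, not NLTK's implementation: the regexes are hand-written equivalents of the two rules, the helper covered is hypothetical, and adjacent chunks would be merged by the final grouping loop.

```python
import re

tagged = [('the', 'DT'), ('book', 'NN'), ('has', 'VBZ'),
          ('many', 'JJ'), ('chapters', 'NNS')]

# Encode the tag sequence as one string: "<DT><NN><VBZ><JJ><NNS>"
tag_string = "".join(f"<{tag}>" for _, tag in tagged)

# Start offset of each token's tag inside tag_string
starts, pos = [], 0
for _, tag in tagged:
    starts.append(pos)
    pos += len(tag) + 2  # tag text plus the two angle brackets

def covered(regex):
    """Indices of tokens whose tag starts inside a match of regex."""
    hit = set()
    for m in re.finditer(regex, tag_string):
        hit.update(i for i, s in enumerate(starts) if m.start() <= s < m.end())
    return hit

# Step 1: chunk rule {<DT><NN.*><.*>*<NN.*>} (hand-written equivalent)
in_chunk = covered(r"<DT><NN[^<>]*>(<[^<>]*>)*<NN[^<>]*>")
after_chunking = sorted(in_chunk)

# Step 2: chink rule }<VB.*>{ removes verbs from the chunk
in_chunk -= covered(r"<VB[^<>]*>")

# Group consecutive tokens that are still inside a chunk
chunks, current = [], []
for i, (word, _) in enumerate(tagged):
    if i in in_chunk:
        current.append(word)
    elif current:
        chunks.append(current)
        current = []
if current:
    chunks.append(current)

print(after_chunking)  # [0, 1, 2, 3, 4]
print(chunks)          # [['the', 'book'], ['many', 'chapters']]
```

After step 1 all five tokens sit in a single chunk; removing 'has' in step 2 splits it into the two noun phrases seen in the parse tree.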
