NLP | Part of Speech: Default Tagging



What is part-of-speech (POS) tagging?
Part-of-speech tagging is the process of converting a sentence, given as a list of words, into a list of tuples, where each tuple has the form (word, tag). The tag is a part-of-speech tag and indicates whether the word is a noun, adjective, verb, and so on.

Default tagging is the baseline step for part-of-speech tagging. It is performed with the DefaultTagger class, which takes a single argument: the tag to assign. NN is the tag for a singular noun. DefaultTagger is most useful when it assigns the most common part-of-speech tag, which is why a noun tag is recommended.

Code # 1: How does it work?

# Loading libraries
from nltk.tag import DefaultTagger

# Defining the tagger with NN as the default tag
tagging = DefaultTagger('NN')

# Tagging a list of tokens
tagging.tag(['Hello', 'Geeks'])

Output:

[('Hello', 'NN'), ('Geeks', 'NN')]

Each tagger has a tag() method that accepts a list of tokens (usually a list of words produced by a word tokenizer), where each token is a single word. tag() returns a list of tagged tokens, where each tagged token is a (word, tag) tuple.
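The tokenizer-to-tagger flow described above can be sketched as follows. The choice of wordpunct_tokenize and the sample sentence are assumptions made for illustration (the article does not name a tokenizer); this tokenizer is regex-based, so it should not need any extra NLTK data downloads.

```python
# A minimal sketch: tokenize a raw string, then pass the tokens to tag().
# The tokenizer choice and the sample text are illustrative assumptions.
from nltk.tag import DefaultTagger
from nltk.tokenize import wordpunct_tokenize

text = "Hello Geeks, welcome!"

# Split the raw string into word and punctuation tokens
tokens = wordpunct_tokenize(text)

# Every token receives the same default tag
tagger = DefaultTagger('NN')
tagged = tagger.tag(tokens)

print(tokens)
print(tagged)
```

Note that punctuation tokens are tagged NN as well, because DefaultTagger never looks at the token itself.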

How does DefaultTagger work?
DefaultTagger is a subclass of SequentialBackoffTagger and implements a choose_tag() method that takes three arguments:

  • The list of tokens
  • The index of the current token, for which a tag is being chosen
  • The list of previous tags
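To make the three arguments concrete, here is a rough sketch that calls choose_tag() directly and then defines a small custom subclass of SequentialBackoffTagger. The DigitTagger class, its CD rule, and the sample tokens are hypothetical and exist only for this illustration; they are not part of NLTK or of the original article.

```python
# A sketch of choose_tag(tokens, index, history) under stated assumptions.
# DigitTagger is a hypothetical class written only for this example.
from nltk.tag import DefaultTagger, SequentialBackoffTagger

tokens = ['I', 'have', '2', 'cats']

# DefaultTagger ignores all three arguments and returns its fixed tag
default_choice = DefaultTagger('NN').choose_tag(tokens, 0, [])


class DigitTagger(SequentialBackoffTagger):
    """Tags tokens made only of digits as CD; defers everything else."""

    def choose_tag(self, tokens, index, history):
        if tokens[index].isdigit():
            return 'CD'
        # Returning None hands the token to the backoff tagger
        return None


# Tokens that DigitTagger cannot tag fall back to DefaultTagger
tagger = DigitTagger(backoff=DefaultTagger('NN'))
result = tagger.tag(tokens)

print(default_choice)
print(result)
```

This is also why DefaultTagger is typically placed at the end of a backoff chain: it always returns a tag, so no token is left untagged.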

Code # 2: Tagging sentences

# Loading libraries
from nltk.tag import DefaultTagger

# Defining the tagger with NN as the default tag
tagging = DefaultTagger('NN')

# Tagging a list of tokenized sentences
tagging.tag_sents([['welcome', 'to', '.'], ['Geeks', 'for', 'Geeks']])

Output:

[[('welcome', 'NN'), ('to', 'NN'), ('.', 'NN')], [('Geeks', 'NN'), ('for', 'NN'), ('Geeks', 'NN')]]

Note: Every tag in the tagged sentences above is NN because we used the DefaultTagger class.

Code # 3: Illustrating how to untag.

# Loading libraries
from nltk.tag import untag

# Removing the tags from a tagged sentence
untag([('Geeks', 'NN'), ('for', 'NN'), ('Geeks', 'NN')])

Output:

['Geeks', 'for', 'Geeks']
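One place where untag() comes in handy is checking a tagger against hand-tagged data: strip the tags, re-tag the bare words, and compare. The sketch below does this by hand for DefaultTagger. The gold sentence and its tags are invented for illustration and are not taken from any corpus, so the resulting number only shows the mechanics.

```python
# A minimal sketch of measuring a default tagger against gold tags.
# The gold sentence is a made-up example, not real corpus data.
from nltk.tag import DefaultTagger, untag

gold = [('the', 'DT'), ('cat', 'NN'), ('sat', 'VBD'),
        ('on', 'IN'), ('the', 'DT'), ('mat', 'NN')]

# Strip the gold tags to recover the plain tokens
tokens = untag(gold)

# Re-tag the tokens with the default tag
predicted = DefaultTagger('NN').tag(tokens)

# Count the positions where the predicted pair matches the gold pair
correct = sum(1 for p, g in zip(predicted, gold) if p == g)
accuracy = correct / len(gold)

print(tokens)
print(accuracy)
```

Since every token is tagged NN, the score is simply the share of NN tags in the gold data, which is why choosing the most common tag as the default gives the best baseline.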