What is part-of-speech (POS) tagging?
It is the process of converting a sentence into another form: a list of words or a list of tuples, where each tuple has the form (word, tag). The tag is a part-of-speech tag and indicates whether the word is a noun, adjective, verb, and so on.
Default tagging is the basic step of part-of-speech tagging. It is performed with the DefaultTagger class. The DefaultTagger class takes a tag as its only argument; NN is the tag for a singular noun. DefaultTagger is most useful when it is given the most common part-of-speech tag, which is why the noun tag is recommended.
Code # 1: How does it work?
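A minimal sketch of such a call, assuming NLTK is installed. The variable names tagger and tagged are chosen here for illustration only.

```python
from nltk.tag import DefaultTagger

# Every token receives the tag passed to the constructor.
tagger = DefaultTagger('NN')

# tag() takes a list of words and returns a list of (word, tag) tuples.
tagged = tagger.tag(['Hello', 'Geeks'])
print(tagged)
```

Printing the result gives the following list: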
[('Hello', 'NN'), ('Geeks', 'NN')]
Each tagger has a tag() method that accepts a list of tokens (usually a list of words produced by a word tokenizer), where each token is a single word. tag() returns a list of tagged tokens, each of which is a (word, tag) tuple.
How does DefaultTagger work?
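It is a subclass of SequentialBackoffTagger and implements a choose_tag() method that takes three arguments: the list of tokens, the index of the current token, and the list of tags already assigned to the previous tokens.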
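The relationship can be checked directly. The sketch below assumes NLTK is installed and calls choose_tag() by hand, which is normally done internally by tag(). The names is_backoff, tokens, and chosen are illustrative.

```python
from nltk.tag.sequential import DefaultTagger, SequentialBackoffTagger

tagger = DefaultTagger('NN')

# DefaultTagger inherits from SequentialBackoffTagger.
is_backoff = issubclass(DefaultTagger, SequentialBackoffTagger)

# choose_tag(tokens, index, history): DefaultTagger ignores all three
# arguments and returns the tag it was constructed with.
tokens = ['Hello', 'Geeks']
chosen = tagger.choose_tag(tokens, 0, [])

print(is_backoff, chosen)
```

Because the arguments are ignored, the same tag comes back for any token position and any history.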
Code # 2: Tagging sentences
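A minimal sketch of tagging several sentences at once, assuming NLTK is installed. It uses tag_sents(), which applies tag() to each sentence in turn. The names sentences and tagged_sents are illustrative.

```python
from nltk.tag import DefaultTagger

tagger = DefaultTagger('NN')

# Each sentence is itself a list of tokens.
sentences = [['welcome', 'to', '.'], ['Geeks', 'for', 'Geeks']]

# tag_sents() returns one list of (word, tag) tuples per sentence.
tagged_sents = tagger.tag_sents(sentences)
print(tagged_sents)
```

Printing the result gives the following nested list: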
[[('welcome', 'NN'), ('to', 'NN'), ('.', 'NN')], [('Geeks', 'NN'), ('for', 'NN'), ('Geeks', 'NN')]]
Note: every tag in the list of tagged sentences (in the code above) is NN, because we used the DefaultTagger class.
Code # 3: Illustrating how to untag.
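A minimal sketch of removing the tags again, assuming NLTK is installed. It uses the untag() helper from nltk.tag. The names tagged and words are illustrative.

```python
from nltk.tag import untag

# A tagged sentence: a list of (word, tag) tuples.
tagged = [('Geeks', 'NN'), ('for', 'NN'), ('Geeks', 'NN')]

# untag() drops the tags and keeps only the words.
words = untag(tagged)
print(words)
```

Printing the result gives the plain list of words: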
['Geeks', 'for', 'Geeks']