
NLP | Trigrams'n'Tags (TnT)


TnT tagger: a statistical tagger based on second-order Markov models.

  • It is a very efficient part-of-speech tagger that can be trained on different languages and on virtually any tagset.
  • The component is trained on tagged corpora to generate its parameters. It incorporates several methods of smoothing and of handling unknown words.
  • Smoothing uses linear interpolation; the respective weights are determined by deleted interpolation.
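
The interpolation step can be sketched in a few lines. This is a minimal illustration, not NLTK's implementation: the function name interpolated_prob, the toy tag sequence and the fixed weights are all assumptions made for the example. In TnT itself the weights are estimated from the training corpus by deleted interpolation rather than set by hand.

```python
from collections import Counter

# Toy tag sequence standing in for a labeled corpus (made-up data).
tags = ["DT", "NN", "VBZ", "DT", "JJ", "NN", "VBZ", "DT", "NN"]

unigrams = Counter(tags)
bigrams = Counter(zip(tags, tags[1:]))
trigrams = Counter(zip(tags, tags[1:], tags[2:]))


def interpolated_prob(t1, t2, t3, lambdas=(0.2, 0.3, 0.5)):
    """Estimate P(t3 | t1, t2) as a weighted mix of unigram, bigram and trigram estimates.

    The weights are hypothetical and hard-coded; they only need to sum to 1.
    """
    l1, l2, l3 = lambdas
    p_uni = unigrams[t3] / len(tags)
    p_bi = bigrams[(t2, t3)] / unigrams[t2] if unigrams[t2] else 0.0
    p_tri = trigrams[(t1, t2, t3)] / bigrams[(t1, t2)] if bigrams[(t1, t2)] else 0.0
    return l1 * p_uni + l2 * p_bi + l3 * p_tri


# A trigram that occurs in the toy data.
seen = interpolated_prob("DT", "NN", "VBZ")
# A trigram that never occurs: the unigram term keeps the estimate above zero.
unseen = interpolated_prob("JJ", "VBZ", "NN")
print(seen, unseen)
```

The point of mixing the three estimates is that a tag sequence never seen in training still receives a small non-zero probability from the lower-order terms.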

The TnT tagger has a different API from the other taggers: you must explicitly call the train() method after creating it.

Code #1: Using the train() method

from nltk.tag import tnt
from nltk.corpus import treebank

# split the corpus into training and test sets
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# create the tagger
tnt_tagging = tnt.TnT()

# train it explicitly
tnt_tagging.train(train_data)

# evaluate on the held-out sentences
# (recent NLTK releases deprecate evaluate() in favor of accuracy())
a = tnt_tagging.evaluate(test_data)

print("Accuracy of TnT Tagging:", a)

Output:

 Accuracy of TnT Tagging: 0.8756313403842003 

How the TnT tagger works:

  • It maintains a number of internal
    • FreqDist and
    • ConditionalFreqDist instances built from the training data.
  • The frequency distributions (FreqDist) count unigrams, bigrams and trigrams.
  • These frequencies are used to calculate the probabilities of the possible tags for each word.
  • The TnT tagger uses all the n-gram models together to select the best tag, instead of building a backoff chain of NgramTagger subclasses.
  • Based on the probabilities of each possible tag, it chooses the most likely tag sequence for the entire sentence.
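
The bookkeeping described in this list can be pictured with a small sketch. It uses Counter and defaultdict from the standard library as stand-ins for NLTK's FreqDist and ConditionalFreqDist; the two toy sentences and all variable names are assumptions for illustration, and the real tagger's internal layout may differ.

```python
from collections import Counter, defaultdict

# Two made-up tagged sentences in the (word, tag) format of treebank.tagged_sents().
tagged_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("bark", "NN"), ("peels", "VBZ")],
]

tag_unigrams = Counter()
tag_bigrams = Counter()
tag_trigrams = Counter()
word_tags = defaultdict(Counter)  # word -> counts of the tags seen with it

for sent in tagged_sents:
    tags = [tag for _, tag in sent]
    tag_unigrams.update(tags)
    tag_bigrams.update(zip(tags, tags[1:]))
    tag_trigrams.update(zip(tags, tags[1:], tags[2:]))
    for word, tag in sent:
        word_tags[word][tag] += 1

# Relative frequency of each tag for one word.
the_counts = word_tags["the"]
the_probs = {tag: n / sum(the_counts.values()) for tag, n in the_counts.items()}
print(tag_trigrams[("DT", "NN", "VBZ")], the_probs)
```

Counts like these are the raw material for the tag probabilities: per-word tag frequencies say which tags a word can take, and the tag n-gram counts say which tag sequences are plausible.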

Code #2: Passing a tagger for unknown words as "unk"

from nltk.tag import tnt
from nltk.corpus import treebank
from nltk.tag import DefaultTagger

# split the corpus into training and test sets
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# tagger for unknown words: always answers 'NN'
unk = DefaultTagger('NN')

# Trained=True because DefaultTagger needs no training
tnt_tagging = tnt.TnT(unk=unk, Trained=True)

# train the TnT tagger
tnt_tagging.train(train_data)

# evaluate on the held-out sentences
a = tnt_tagging.evaluate(test_data)

print("Accuracy of TnT Tagging:", a)

Output:

 Accuracy of TnT Tagging: 0.892467083962875

  • The unknown-word tagger's tag() method is only ever called with a single-word sentence.
  • A tagger for unknown words can be passed to the TnT tagger as unk.
  • Trained=True can be passed if this tagger is already trained.
  • Otherwise, TnT calls unk.train(data) with the same data that is passed to its own train() method.
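
To make the shape of such an unknown-word tagger concrete, here is a minimal sketch of a hand-written one. The class SuffixUnknownTagger and its suffix rules are hypothetical and not part of NLTK; the only thing it mirrors is that tag() takes a list of tokens and returns (word, tag) pairs, as DefaultTagger.tag() does.

```python
class SuffixUnknownTagger:
    """Hypothetical unknown-word tagger: guesses a tag from the word's ending."""

    # Illustrative suffix rules, not derived from any corpus.
    SUFFIX_TAGS = (("ing", "VBG"), ("ly", "RB"), ("ed", "VBD"), ("s", "NNS"))

    def __init__(self, default="NN"):
        self.default = default

    def tag(self, tokens):
        """Return a list of (word, tag) pairs, one per input token."""
        tagged = []
        for word in tokens:
            guess = self.default
            for suffix, suffix_tag in self.SUFFIX_TAGS:
                if word.lower().endswith(suffix):
                    guess = suffix_tag
                    break
            tagged.append((word, guess))
        return tagged


unk = SuffixUnknownTagger()
# A single-word sentence, the only kind of input described above.
print(unk.tag(["quickly"]))
```

Since this class has nothing to train, it would presumably be passed together with Trained=True. Whether it beats DefaultTagger('NN') on the treebank split is untested here.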

Beam search control:

  • Another parameter that can be tuned for TnT is N, which controls the number of possible solutions the tagger maintains while it searches for the best tag sequence.
  • The default is N = 1000.
  • Increasing N increases memory use without any particular improvement in accuracy.
  • Decreasing N reduces memory use, but may also reduce accuracy.
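
The role of N can be illustrated with a toy beam search. This is a sketch under simplifying assumptions, not the TnT algorithm: the probabilities are made up, each word is scored independently of its neighbors, and the function name beam_search is hypothetical.

```python
import math

# Made-up per-word tag probabilities; a real tagger would also use tag n-gram context.
candidates = [
    {"DT": 0.9, "NN": 0.1},
    {"NN": 0.6, "VB": 0.4},
    {"VBZ": 0.7, "NNS": 0.3},
]


def beam_search(candidates, n):
    """Keep only the n best partial tag sequences after each word."""
    beam = [((), 0.0)]  # (tags so far, log probability)
    largest = 1  # largest number of hypotheses held at once
    for options in candidates:
        extended = [
            (tags + (tag,), score + math.log(p))
            for tags, score in beam
            for tag, p in options.items()
        ]
        extended.sort(key=lambda item: item[1], reverse=True)
        beam = extended[:n]  # pruning step controlled by n
        largest = max(largest, len(beam))
    return beam[0][0], largest


best_wide, size_wide = beam_search(candidates, n=1000)
best_narrow, size_narrow = beam_search(candidates, n=1)
print(best_wide, size_wide, size_narrow)
```

In this toy setup the narrow beam finds the same answer while holding far fewer hypotheses, because the scores do not depend on context. When a tag's score depends on the preceding tags, a narrow beam can discard the partial sequence that would have turned out best, which is where the accuracy loss comes from.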

Code #3: Using N = 100

from nltk.tag import tnt
from nltk.corpus import treebank

# split the corpus into training and test sets
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# create the tagger with a narrower beam
tnt_tagging = tnt.TnT(N=100)

# train the tagger
tnt_tagging.train(train_data)

# evaluate on the held-out sentences
a = tnt_tagging.evaluate(test_data)

print("Accuracy of TnT Tagging:", a)

Output:

 Accuracy of TnT Tagging: 0.8756313403842003 
