Python | PoS tagging and lemmatization using spaCy

How to install?

pip install spacy
python -m spacy download en_core_web_sm

spaCy main features:
1. Non-destructive tokenization
2. Named entity recognition
3. Support for more than 49 languages
4. 16 statistical models for 9 languages
5. Pretrained word vectors
6. Part-of-speech tagging
7. Labelled dependency parsing
8. Syntax-driven sentence segmentation
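Non-destructive tokenization, the first feature listed above, means that the original string can always be rebuilt from the tokens. The sketch below is one way to check this. It uses spacy.blank("en"), a tokenizer-only pipeline that needs no model download, and the sample sentence is made up for illustration.

```python
import spacy

# A blank English pipeline holds only the rule-based tokenizer,
# so no statistical model has to be downloaded for this check.
nlp = spacy.blank("en")

# Illustrative text with a double space and a newline on purpose.
text = "My name  is Shaurya Uppal.\nI enjoy writing articles."
doc = nlp(text)

# token.text_with_ws is the token text plus any whitespace that followed it.
rebuilt = "".join(token.text_with_ws for token in doc)
print(rebuilt == text)

# token.idx is the character offset of the token in the original string.
offsets_ok = all(
    text[token.idx:token.idx + len(token.text)] == token.text
    for token in doc
)
print(offsets_ok)
```

Because whitespace and offsets are preserved, any token can be mapped back to its exact position in the source text.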

Import and load library:

import spacy

# Download the model first if it is not installed yet:
# python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
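If the model package has not been downloaded, spacy.load raises an OSError. The following is a minimal sketch of one way to handle that. The helper name load_english is hypothetical, and falling back to a tokenizer-only pipeline is an assumption made for illustration: the fallback has no tagger or lemmatizer, so it is only useful for trying out tokenization.

```python
import spacy

def load_english(name="en_core_web_sm"):
    """Hypothetical helper: load `name`, or fall back to a blank pipeline."""
    try:
        return spacy.load(name)
    except OSError:
        # Raised by spaCy when the model package cannot be found.
        print(f"Model {name!r} not found; run: python -m spacy download {name}")
        # Tokenizer only: token.pos_ and token.lemma_ will not be filled in.
        return spacy.blank("en")

nlp = load_english()
# Lists the loaded components; empty for the blank fallback.
print(nlp.pipe_names)
```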

Part-of-speech (POS) tagging:

POS tagging labels each word with its grammatical category: noun, verb, adjective, adverb, and so on.

import spacy

# Load the English tokenizer, tagger,
# parser, NER and word vectors
nlp = spacy.load("en_core_web_sm")

# Process a whole document.
# Adjacent string literals are joined into one string, so the text
# contains no newline characters (these would become extra tokens).
text = ("My name is Shaurya Uppal. "
        "I enjoy writing articles on Python.Engineering checkout "
        "my other article by going to my profile section.")

doc = nlp(text)

# Token and tag
for token in doc:
    print(token, token.pos_)

# You want a list of verb tokens
print("Verbs:", [token.text for token in doc if token.pos_ == "VERB"])

Output:

My DET
name NOUN
is VERB
Shaurya PROPN
Uppal PROPN
. PUNCT
I PRON
enjoy VERB
writing VERB
articles NOUN
on ADP
Python.Engineering PROPN
checkout VERB
my DET
other ADJ
article NOUN
by ADP
going VERB
to ADP
my DET
profile NOUN
section NOUN
. PUNCT
# Verb based Tagged Reviews:-
Verbs: ['is', 'enjoy', 'writing', 'checkout', 'going']

This output comes from an older spaCy release. Newer versions may tag some tokens differently, for example "is" as AUX and "my" as PRON.
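Once tokens are tagged, the tags can be tallied or filtered with plain Python. The sketch below needs neither spaCy nor the model: it works on the (token, tag) pairs copied from the output above. The helper name words_with_tag is hypothetical.

```python
from collections import Counter

# (token, coarse POS tag) pairs copied from the output listed above.
tagged = [
    ("My", "DET"), ("name", "NOUN"), ("is", "VERB"),
    ("Shaurya", "PROPN"), ("Uppal", "PROPN"), (".", "PUNCT"),
    ("I", "PRON"), ("enjoy", "VERB"), ("writing", "VERB"),
    ("articles", "NOUN"), ("on", "ADP"), ("Python.Engineering", "PROPN"),
    ("checkout", "VERB"), ("my", "DET"), ("other", "ADJ"),
    ("article", "NOUN"), ("by", "ADP"), ("going", "VERB"),
    ("to", "ADP"), ("my", "DET"), ("profile", "NOUN"),
    ("section", "NOUN"), (".", "PUNCT"),
]

def words_with_tag(pairs, tag):
    """Return the tokens that carry the given POS tag, in document order."""
    return [word for word, pos in pairs if pos == tag]

# How often each tag occurs in the document.
tag_counts = Counter(pos for _, pos in tagged)

print(words_with_tag(tagged, "VERB"))
print(tag_counts.most_common())
```

With a live pipeline, the same pairs could be built as [(token.text, token.pos_) for token in doc].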

Lemmatization:

Lemmatization groups the inflected forms of a word so that they can be analysed as a single item, identified by the word's lemma, or dictionary form.

import spacy

# Load the English tokenizer, tagger,
# parser, NER and word vectors
nlp = spacy.load("en_core_web_sm")

# Process a whole document.
# Adjacent string literals are joined into one string, so the text
# contains no newline characters (these would become extra tokens).
text = ("My name is Shaurya Uppal. "
        "I enjoy writing articles on Python.Engineering checkout "
        "my other article by going to my profile section.")

doc = nlp(text)

# Token and lemma
for token in doc:
    print(token, token.lemma_)

Output:

My -PRON-
name name
is be
Shaurya Shaurya
Uppal Uppal
. .
I -PRON-
enjoy enjoy
writing write
articles article
on on
Python.Engineering Python.Engineering
checkout checkout
my -PRON-
other other
article article
by by
going go
to to
my -PRON-
profile profile
section section
. .

The -PRON- placeholder is specific to older spaCy releases. Newer versions generally return the pronoun itself as the lemma.
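The grouping that lemmatization makes possible can be shown with plain Python. The sketch below does not call spaCy: it works on the (token, lemma) pairs copied from the output above, with punctuation left out, and collects every surface form under its lemma.

```python
from collections import defaultdict

# (token, lemma) pairs copied from the output listed above; punctuation omitted.
pairs = [
    ("My", "-PRON-"), ("name", "name"), ("is", "be"),
    ("Shaurya", "Shaurya"), ("Uppal", "Uppal"),
    ("I", "-PRON-"), ("enjoy", "enjoy"), ("writing", "write"),
    ("articles", "article"), ("on", "on"),
    ("Python.Engineering", "Python.Engineering"),
    ("checkout", "checkout"), ("my", "-PRON-"), ("other", "other"),
    ("article", "article"), ("by", "by"), ("going", "go"),
    ("to", "to"), ("my", "-PRON-"), ("profile", "profile"),
    ("section", "section"),
]

# Map each lemma to the set of lower-cased forms that share it.
forms_by_lemma = defaultdict(set)
for form, lemma in pairs:
    forms_by_lemma[lemma].add(form.lower())

# "articles" and "article" end up under the same key.
print(sorted(forms_by_lemma["article"]))
```

With a live pipeline, the same pairs could be built as [(token.text, token.lemma_) for token in doc].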
