Example:
Input: 'You just gave me a scare'
Output: [('You', 'PRP'), ('just', 'RB'), ('gave', 'VBD'), ('me', 'PRP'),
('a', 'DT'), ('scare', 'NN')]
In this example, PRP stands for personal pronoun, RB for adverb, VBD for past-tense verb, DT for determiner, and NN for noun. Details of every part-of-speech tag can be looked up in the Penn Treebank tag set.
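To make the tags easier to read, the tagged output can be paired with plain-language names. The snippet below is only a minimal sketch: the dictionary TAG_NAMES and the function describe are hypothetical helpers written for this illustration, and the glossary covers just the five tags named above rather than the whole Penn Treebank tag set.

```python
# Hand-written glossary covering only the tags named in the text above;
# the full Penn Treebank tag set is much larger (assumption: illustration only).
TAG_NAMES = {
    "PRP": "personal pronoun",
    "RB": "adverb",
    "VBD": "past-tense verb",
    "DT": "determiner",
    "NN": "noun",
}


def describe(tagged_tokens):
    """Pair each word with a readable name for its tag ("unknown" if not listed)."""
    return [(word, TAG_NAMES.get(tag, "unknown")) for word, tag in tagged_tokens]


# Tagged tokens copied from the example output above.
tagged = [("You", "PRP"), ("just", "RB"), ("gave", "VBD"),
          ("me", "PRP"), ("a", "DT"), ("scare", "NN")]

for word, name in describe(tagged):
    print(f"{word:6} {name}")
```

Tags missing from the glossary are reported as "unknown" instead of raising an error, which keeps the sketch usable on sentences that contain other tags.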
Example:
Input: 'NN'
Output: NN: noun, common, singular or mass
common-carrier cabbage knuckle-duster Casino afghan shed thermostat
investment slide humor falloff slick wind hyena override subhumanity
machinist…
Chunking:
Chunking is the process of extracting phrases from unstructured text and adding structure to it. It is also known as shallow parsing and is performed on top of the part-of-speech tags. It groups words into "chunks", most often noun phrases. The chunks are defined using regular expressions over the tags.
The grammar for the example below is a simple regular-expression rule. It says that an NP (noun phrase) chunk is created whenever the chunker finds an optional determiner (DT), followed by any number of adjectives (JJ), followed by a noun (NN).
Libraries such as spaCy and TextBlob are better suited to chunking.
Example:
Input: 'the little yellow bird is flying in the sky'
Output:
(S
  (NP the/DT little/JJ yellow/JJ bird/NN)
  is/VBZ
  flying/VBG
  in/IN
  (NP the/DT sky/NN))
(NP the/DT little/JJ yellow/JJ bird/NN)
(NP the/DT sky/NN)
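The noun-phrase rule described above (an optional DT, any number of JJ, then an NN) can be sketched with nothing but Python's re module. This is an imitation of the idea, not the code that produced the output above: the function name chunk_noun_phrases is hypothetical, and the input is assumed to be already tagged. With NLTK, the same rule would usually be written as the grammar string NP: {<DT>?<JJ>*<NN>} and handed to nltk.RegexpParser.

```python
import re

# Pre-tagged tokens copied from the example above (assumed input).
tagged = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"), ("bird", "NN"),
          ("is", "VBZ"), ("flying", "VBG"), ("in", "IN"),
          ("the", "DT"), ("sky", "NN")]

# Optional determiner, any number of adjectives, then a singular noun.
NP_RULE = re.compile(r"(?:<DT>)?(?:<JJ>)*<NN>")


def chunk_noun_phrases(tagged_tokens):
    """Return the runs of (word, tag) pairs whose tags match NP_RULE."""
    # Encode the tag sequence as "<DT><JJ>..." so a regex can scan it.
    tag_string = "".join(f"<{tag}>" for _, tag in tagged_tokens)
    chunks = []
    for match in NP_RULE.finditer(tag_string):
        # Each tag contributes exactly one "<", so counting them
        # converts character offsets back into token positions.
        start = tag_string.count("<", 0, match.start())
        end = start + match.group().count("<")
        chunks.append(tagged_tokens[start:end])
    return chunks


for chunk in chunk_noun_phrases(tagged):
    print("NP:", " ".join(f"{word}/{tag}" for word, tag in chunk))
```

Wrapping every tag in angle brackets keeps the rule from matching inside longer tags: the pattern <NN> cannot match <NNP> or <NNS>, because the closing bracket must follow NN directly.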
Named Entity Recognition:
Named Entity Recognition (NER) is used to extract information from unstructured text. It classifies the entities present in the text into categories such as person, organization, event, and place. This gives us detailed knowledge about the text and the relationships between the different entities.
Example:
Input: 'Bill works for Python.Engineering so he went to Delhi for a meetup.'
Output:
(S
  (PERSON Bill/NNP)
  works/VBZ
  for/IN
  (ORGANIZATION Python.Engineering/NNP)
  so/RB
  he/PRP
  went/VBD
  to/TO
  (GPE Delhi/NNP)
  for/IN
  a/DT
  meetup/NN
  ./.)
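Once a tree like this has been printed, the labelled entities can be pulled out of the text with a short regular expression. The sketch below works on the printed form only and uses a hypothetical helper, extract_entities. It assumes the word/TAG layout with no spaces around the slash and that entity chunks are never nested. In NLTK itself one would normally walk the tree object instead of its printed text.

```python
import re

# Printed tree copied from the example above (word/TAG with no spaces).
tree_text = """(S
  (PERSON Bill/NNP)
  works/VBZ
  for/IN
  (ORGANIZATION Python.Engineering/NNP)
  so/RB
  he/PRP
  went/VBD
  to/TO
  (GPE Delhi/NNP)
  for/IN
  a/DT
  meetup/NN
  ./.)"""

# An innermost "(LABEL word/TAG ...)" group has no parentheses in its body.
ENTITY = re.compile(r"\((\w+)\s+([^()]+)\)")


def extract_entities(text):
    """Return (label, phrase) pairs for every labelled chunk except the root S."""
    entities = []
    for label, body in ENTITY.findall(text):
        if label == "S":  # a sentence without any chunks would match as a whole
            continue
        # Drop the "/TAG" suffix from each token, keeping dots inside words.
        words = [token.rsplit("/", 1)[0] for token in body.split()]
        entities.append((label, " ".join(words)))
    return entities


for label, phrase in extract_entities(tree_text):
    print(f"{label}: {phrase}")
```

Splitting each token on its last slash only is a deliberate choice, so that a word containing punctuation, such as Python.Engineering, survives intact.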