Python | Number to words using num2words

Installation
You can easily install num2words using pip.

 pip install num2words 

Consider the following two excerpts from different files in 20 Newsgroups, a popular NLP dataset. Preprocessing 20 Newsgroups effectively remains a topic of interest.

In article, Martin Preston writes: Why not use the PD C library for reading / writing TIFF files? It took me a good 20 minutes to start using them in your own app.
ISCIS VIII is the eighth of a series of meetings which have brought together computer scientists and engineers from about twenty countries. This year's conference will be held in the beautiful Mediterranean resort city of Antalya, in a region rich in natural as well as historical sites.

In the two excerpts above, the same number appears both as a numeral ("20") and spelled out ("twenty"). Standard preprocessing steps such as tokenization and lemmatization will not map "20" and "twenty" to the same token, even though they mean the same thing in context. Fortunately, the third-party library num2words solves this problem in one line.
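To make the mismatch concrete, here is a minimal sketch using only the standard library. The `tokenize` helper is a hypothetical stand-in for a real tokenizer, and the two sentences are shortened versions of the excerpts above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

first = tokenize("It took me a good 20 minutes to start using them.")
second = tokenize("Engineers from about twenty countries attended.")

# Both sentences mention the same quantity, but "20" and "twenty"
# are different tokens, so the two token sets do not overlap.
shared = set(first) & set(second)
print(shared)
```

A bag-of-words model built on these tokens would treat "20" and "twenty" as unrelated features.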

Below is an example of using the tool.

from num2words import num2words

# Most common usage: cardinal number.
print(num2words(36))

# Other conversion types, selected with the "to" argument.
print(num2words(36, to='ordinal'))
print(num2words(36, to='ordinal_num'))
print(num2words(36, to='year'))
print(num2words(36, to='currency'))

# Language support.
print(num2words(36, lang='es'))

Output:

thirty-six
thirty-sixth
36th
thirty-six
zero euro, thirty-six cents
treinta y seis

Therefore, in the preprocessing step, you can convert all numeric values to words so that numerals and spelled-out numbers are treated consistently in subsequent steps.
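One way this conversion could be wired into a preprocessing step is sketched below. The function name `numbers_to_words` and its `to_words` parameter are hypothetical, introduced only for illustration. The regular expression handles standalone integers only: decimals, dates and numbers with thousands separators would need extra handling.

```python
import re

# Matches standalone runs of digits such as "20". Digits glued to
# letters (for example "TIFF6") are left untouched.
INTEGER = re.compile(r"\b\d+\b")

def numbers_to_words(text, to_words=None):
    """Replace every standalone integer in text with its spelled-out form.

    to_words is an optional callable taking an int and returning a str.
    When omitted, num2words is used with its default settings.
    """
    if to_words is None:
        # Imported lazily so that a different converter can be passed in.
        from num2words import num2words as to_words
    return INTEGER.sub(lambda match: to_words(int(match.group())), text)
```

With num2words installed, numbers_to_words("It took me a good 20 minutes") returns "It took me a good twenty minutes", so both excerpts now contain the token "twenty". Passing a converter such as lambda n: num2words(n, lang='es') would switch the output language.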

Reference: https://pypi.org/project/num2words/