NLP | Custom Corpus


How to do this?
NLTK already defines a list of data paths (directories) in nltk.data.path. Our custom corpora must be present in one of these paths for NLTK to find them.
We can also create our own nltk_data directory in our home directory and make sure it appears in the list of known paths given by nltk.data.path.

Code # 1: Create a custom directory and validate.

# importing libraries
import os, os.path
import nltk.data

# build the path to ~/nltk_data
path = os.path.expanduser('~/nltk_data')

# create the directory if it does not already exist
if not os.path.exists(path):
    os.mkdir(path)

print("Does path exist:", os.path.exists(path))
print("Does path exist in nltk:", path in nltk.data.path)

 Does path exist: True
 Does path exist in nltk: True

Code # 2: Load a wordlist file.
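Before it can be loaded, the wordlist file must actually exist under one of the NLTK data paths. A minimal sketch of creating it (assuming the ~/nltk_data directory from Code # 1; the file name and one-word contents mirror the example below):

```python
import os

# assumption: ~/nltk_data is one of the NLTK data paths (see Code # 1)
corpus_dir = os.path.join(os.path.expanduser('~/nltk_data'),
                          'corpora', 'cookbook')

# create the nested corpora/cookbook directories if needed
os.makedirs(corpus_dir, exist_ok=True)

# write a one-word wordlist file
with open(os.path.join(corpus_dir, 'word_file.txt'), 'w') as f:
    f.write('nltk')
```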

# loading libraries
import nltk.data

# load the wordlist file in raw format
nltk.data.load('corpora/cookbook/word_file.txt', format='raw')

 b'nltk'

How does it all work?

  • nltk.data.load() recognizes the formats "raw", "pickle" and "yaml".
  • If no format is specified, the format is inferred from the file extension.
  • Since ".txt" is not a recognized extension, the format "raw" must be specified explicitly, as in the code above.
  • If the file ends with ".yaml", the format does not need to be specified.
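The extension-based inference described above can be sketched in plain Python (the mapping here is an illustration of the rule, not NLTK's actual implementation):

```python
import os.path

# illustrative mapping from extension to format (assumption: not NLTK's own table)
KNOWN_FORMATS = {'.pickle': 'pickle', '.yaml': 'yaml'}

def infer_format(resource_name):
    # split off the file extension and look it up
    ext = os.path.splitext(resource_name)[1]
    # unknown extensions have no inferred format,
    # so "raw" must then be given explicitly
    return KNOWN_FORMATS.get(ext)

print(infer_format('corpora/cookbook/synonyms.yaml'))   # yaml
print(infer_format('corpora/cookbook/word_file.txt'))   # None
```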

Code # 3: How to Load a YAML File
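As with the wordlist file, the YAML file must first exist under a known data path. A minimal sketch of creating it (assuming ~/nltk_data from Code # 1; the single key/value pair matches the output below):

```python
import os

# assumption: ~/nltk_data is one of the NLTK data paths (see Code # 1)
corpus_dir = os.path.join(os.path.expanduser('~/nltk_data'),
                          'corpora', 'cookbook')
os.makedirs(corpus_dir, exist_ok=True)

# write a tiny YAML mapping
with open(os.path.join(corpus_dir, 'synonyms.yaml'), 'w') as f:
    f.write('bday: birthday\n')
```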


# loading libraries
import nltk.data

# load the YAML file along the path
nltk.data.load('corpora/cookbook/synonyms.yaml')

 {'bday': 'birthday'}