NLP | Storing Conditional Frequency Distribution in Redis



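Code:

from nltk.probability import ConditionalFreqDist
from rediscollections import encode_key

# RedisHashFreqDist is assumed to be defined earlier in this same module
# (redisprob.py), since the usage code imports this class from redisprob.


class RedisConditionalHashFreqDist(ConditionalFreqDist):
    def __init__(self, r, name, cond_samples=None):
        # Set the connection and base name first: the parent constructor
        # calls self[condition], which relies on both attributes.
        self._r = r
        self._name = name
        ConditionalFreqDist.__init__(self, cond_samples)

        # Find existing hashes stored under the '<name>:*' prefix.
        for key in self._r.keys(encode_key('%s:*' % name)):
            # redis-py returns bytes keys unless decode_responses=True
            if isinstance(key, bytes):
                key = key.decode('utf-8')
            condition = key.split(':')[1]
            # calls self.__getitem__(condition)
            self[condition]

    def __getitem__(self, condition):
        # In NLTK 3, ConditionalFreqDist is a dict subclass, so membership
        # is tested on self (the original checked self._fdists).
        if condition not in self:
            key = '%s:%s' % (self._name, condition)
            val = RedisHashFreqDist(self._r, key)
            super(RedisConditionalHashFreqDist, self).__setitem__(
                condition, val)
        return super(
            RedisConditionalHashFreqDist, self).__getitem__(condition)

    def clear(self):
        # Clear every per-condition frequency distribution.
        for fdist in self.values():
            fdist.clear()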

This class can be instantiated by passing in a Redis connection and a base name. After that, it works the same as ConditionalFreqDist, as shown in the code below:
Code:

from redis import Redis
from redisprob import RedisConditionalHashFreqDist

r = Redis()
rchfd = RedisConditionalHashFreqDist(r, 'condhash')

print(rchfd.N())
print(rchfd.conditions())

rchfd['cond1']['foo'] += 1

print(rchfd.N())
print(rchfd['cond1']['foo'])
print(rchfd.conditions())

rchfd.clear()


Output:

0
[]
1
1
['cond1']

How does this work?

  • RedisConditionalHashFreqDist uses name prefixes to refer to instances of RedisHashFreqDist.
  • The name passed to RedisConditionalHashFreqDist is the base name, which is combined with each condition to create a unique name for each RedisHashFreqDist.
  • For example, if the base name of the RedisConditionalHashFreqDist is 'condhash' and the condition is 'cond1', then the final name of the RedisHashFreqDist is 'condhash:cond1'.
  • This naming pattern is used during initialization to find all existing hash maps using the keys command.
  • By searching for all keys matching 'condhash:*', the class can identify all existing conditions and create a RedisHashFreqDist instance for each.
  • Joining strings with a colon is a common naming convention for Redis keys, as a way to define namespaces.
  • Each instance of RedisConditionalHashFreqDist defines one namespace of hash maps.
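
The naming pattern described above can be tried without a Redis server. The following is only a minimal sketch: make_key and conditions_from_keys are hypothetical helper names (they are not part of redisprob or rediscollections), and the key list is made-up data standing in for what the keys command might return.

```python
def make_key(base_name, condition):
    """Join the base name and a condition with a colon."""
    return '%s:%s' % (base_name, condition)


def conditions_from_keys(base_name, keys):
    """Recover condition names from keys that start with '<base_name>:'."""
    prefix = base_name + ':'
    conditions = []
    for key in keys:
        # redis-py returns bytes unless the client uses decode_responses=True
        if isinstance(key, bytes):
            key = key.decode('utf-8')
        if key.startswith(prefix):
            conditions.append(key[len(prefix):])
    return conditions


# Hypothetical key listing, standing in for r.keys('condhash:*')
existing_keys = [b'condhash:cond1', b'condhash:cond2', b'other:cond3']

print(make_key('condhash', 'cond1'))                    # condhash:cond1
print(conditions_from_keys('condhash', existing_keys))  # ['cond1', 'cond2']
```

Slicing off the prefix, rather than splitting on the colon, also keeps conditions that themselves contain a colon intact. That is a design choice of this sketch, not something the class above does.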

RedisConditionalHashFreqDist also defines a clear() method. This is a helper method that calls clear() on all internal RedisHashFreqDist instances. The clear() method is not defined in ConditionalFreqDist.
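
To see the delegation in clear() in isolation, here is a rough in-memory sketch. FakeHashFreqDist and LazyConditionalDist are hypothetical stand-ins built on collections.Counter and dict. They imitate the lazy per-condition creation and the clear() loop, but they do not talk to Redis and are not the classes from redisprob.

```python
from collections import Counter


class FakeHashFreqDist(Counter):
    """Hypothetical in-memory stand-in for RedisHashFreqDist."""


class LazyConditionalDist(dict):
    """Creates one FakeHashFreqDist per condition on first access."""

    def __getitem__(self, condition):
        if condition not in self:
            super().__setitem__(condition, FakeHashFreqDist())
        return super().__getitem__(condition)

    def clear(self):
        # Empty each per-condition distribution; the conditions stay registered.
        for fdist in self.values():
            fdist.clear()


dist = LazyConditionalDist()
dist['cond1']['foo'] += 1
dist['cond2']['bar'] += 2

dist.clear()

remaining_conditions = sorted(dist.keys())
total_count = sum(sum(fd.values()) for fd in dist.values())
print(remaining_conditions)  # ['cond1', 'cond2']
print(total_count)           # 0
```

In this sketch, clear() overrides dict.clear(), so the condition keys remain in the mapping while their counts are emptied.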