What's a correct and good way to implement __hash__()?
I am talking about the function that returns a hashcode that is then used to insert objects into hashtables aka dictionaries.
Since __hash__() returns an integer and is used for "binning" objects into hashtables, I assume that the values of the returned integer should be uniformly distributed for common data (to minimize collisions).
What's a good practice to get such values? Are collisions a problem?
In my case I have a small class which acts as a container class holding some ints, some floats and a string.
An easy, correct way to implement __hash__() is to use a key tuple. It won't be as fast as a specialized hash, but if you need that then you should probably implement the type in C.
Here's an example of using a key for hash and equality:
    class A:
        def __key(self):
            return (self.attr_a, self.attr_b, self.attr_c)

        def __hash__(self):
            return hash(self.__key())

        def __eq__(self, other):
            if isinstance(other, A):
                return self.__key() == other.__key()
            return NotImplemented
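Applied to the kind of container described in the question (some ints, some floats and a string), the same pattern might look like the sketch below. The class name `Record` and the field names `count`, `weight` and `label` are made up for illustration; they are not from the question.

```python
class Record:
    """Hypothetical container; all names here are illustrative."""

    def __init__(self, count, weight, label):
        self.count = count      # int
        self.weight = weight    # float
        self.label = label      # str

    def __key(self):
        # Every field that takes part in equality goes into the key tuple.
        return (self.count, self.weight, self.label)

    def __hash__(self):
        return hash(self.__key())

    def __eq__(self, other):
        if isinstance(other, Record):
            return self.__key() == other.__key()
        return NotImplemented


a = Record(3, 0.5, "x")
b = Record(3, 0.5, "x")
c = Record(4, 0.5, "x")

assert a == b and hash(a) == hash(b)   # equal objects must hash equally
assert a != c
assert len({a, b, c}) == 2             # a and b collapse into one set entry
lookup = {a: "first"}
assert lookup[b] == "first"            # b finds the entry stored under a
```

Because the hash is computed from the attributes, they should not be changed while an instance is stored in a set or used as a dictionary key; otherwise the object can no longer be found under its old hash.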
Also, the documentation of __hash__ has more information that may be valuable in some particular circumstances.
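One such circumstance, as far as Python 3 is concerned: a class that defines __eq__() without also defining __hash__() has its __hash__ implicitly set to None, so its instances cannot be put into sets or used as dictionary keys. A small sketch, with the hypothetical class name `OnlyEq`:

```python
class OnlyEq:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, OnlyEq):
            return self.value == other.value
        return NotImplemented


# Python 3 sets __hash__ to None when only __eq__ is overridden.
assert OnlyEq.__hash__ is None

try:
    {OnlyEq(1)}             # building a set needs hash()
    unhashable = False
except TypeError:
    unhashable = True

assert unhashable
```

This is why the example above defines both methods together.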