pickle — serializing Python objects

The pickle module implements binary protocols for serializing and deserializing a Python object structure.

  • Pickling: the process by which a Python object hierarchy is converted into a byte stream.
  • Unpickling: the reverse of pickling, converting a byte stream back into an object hierarchy.

Module interface:

  • dumps(): serializes an object hierarchy to a bytes object; dump() writes the same stream to an open binary file.
  • loads(): deserializes a data stream from a bytes object; load() reads it from an open binary file.
  • For more control over serialization and deserialization, create Pickler or Unpickler objects, respectively.
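
The interface above can be sketched in a few lines. This is a minimal illustration, not a complete usage guide: the record dictionary and the variable names are made up for the example, and an in-memory io.BytesIO stream stands in for a real binary file.

```python
import io
import pickle

# Hypothetical sample data used only for this illustration.
record = {"name": "pickle", "versions": [0, 1, 2], "stable": True}

# Module-level functions: serialize to a bytes object and back.
payload = pickle.dumps(record)
restored = pickle.loads(payload)

# Pickler/Unpickler objects: the same round trip against a binary stream.
buffer = io.BytesIO()
pickle.Pickler(buffer).dump(record)
buffer.seek(0)  # rewind before reading
restored_from_stream = pickle.Unpickler(buffer).load()

print(restored == record)
print(restored_from_stream == record)
```

Pickler and Unpickler objects are mostly worth the extra lines when their behavior needs to be customized; for a plain round trip the module-level functions are enough.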

Constants provided by the pickle module:

  1. pickle.HIGHEST_PROTOCOL
    An integer, the highest protocol version available. It can be passed as the protocol argument to dump() and dumps(), and to the Pickler constructor.
  2. pickle.DEFAULT_PROTOCOL
    An integer, the default protocol version used for pickling. It may be lower than HIGHEST_PROTOCOL.
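
As a small sketch of how these constants relate to the protocol argument, the snippet below pickles the same object with the default protocol, the highest protocol, and a negative protocol number. The sample dictionary is an assumption made for the example; the actual protocol numbers printed depend on the Python version.

```python
import pickle

# Hypothetical sample object.
obj = {"a": "A", "b": 2, "c": 3.0}

default_bytes = pickle.dumps(obj)  # uses DEFAULT_PROTOCOL
highest_bytes = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
negative_bytes = pickle.dumps(obj, protocol=-1)  # a negative value selects HIGHEST_PROTOCOL

print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)
print(negative_bytes == highest_bytes)

# From protocol 2 onward the stream begins with a PROTO opcode (0x80)
# followed by one byte holding the protocol number.
print(highest_bytes[0] == 0x80, highest_bytes[1] == pickle.HIGHEST_PROTOCOL)
```

Whichever protocol wrote the data, loads() detects it automatically, so no protocol argument is needed when reading.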

Functions provided by the pickle module:

  1. pickle.dump(obj, file, protocol=None, *, fix_imports=True)
    This function is equivalent to Pickler(file, protocol).dump(obj). It writes the pickled representation of obj to the open file object file.

    The optional protocol argument is an integer that tells the pickler which protocol to use. Supported protocols are 0 to HIGHEST_PROTOCOL. If it is not specified, DEFAULT_PROTOCOL is used. If a negative number is specified, HIGHEST_PROTOCOL is selected.

    If fix_imports is true and protocol is less than 3, pickle tries to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.
    Example:

    # Python 3 program illustrating pickle.dump()
    import io
    import pickle


    class SimpleObject:
        def __init__(self, name):
            self.name = name
            # Store the reversed name as a second attribute
            self.name_backwards = name[::-1]


    data = []
    data.append(SimpleObject('pickle'))
    data.append(SimpleObject('cPickle'))
    data.append(SimpleObject('last'))

    # Simulate a binary file using io.BytesIO
    out_s = io.BytesIO()

    # Write each object to the stream
    for o in data:
        print('WRITING: %s (%s)' % (o.name, o.name_backwards))
        pickle.dump(o, out_s)
        out_s.flush()

    Output:

    WRITING: pickle (elkcip)
    WRITING: cPickle (elkciPc)
    WRITING: last (tsal)
  2. pickle.dumps(obj, protocol=None, *, fix_imports=True)
    This function returns the pickled representation of the object as a bytes object, instead of writing it to a file.

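    Example:

    # Python 3 program illustrating pickle.dumps()
    import pickle

    data = [{'a': 'A', 'b': 2, 'c': 3.0}]
    data_string = pickle.dumps(data)
    print('PICKLE:', data_string)

    Output:

    In Python 3 this prints a bytes literal whose exact content depends on the protocol in use. The output below was produced by the Python 2 form of this program, which used the text-based protocol 0:

    PICKLE: (lp0 (dp1 S'a' p2 S'A' p3 sS'c' p4 F3.0 sS'b' p5 I2 sa.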
  3. pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
    This function is equivalent to Unpickler(file).load(). It reads a pickled object representation from the open file object file and returns the reconstituted object hierarchy specified therein.

    Example:

    # Python 3 program illustrating pickle.load()
    import io
    import pickle


    class SimpleObject:
        def __init__(self, name):
            self.name = name
            # Store the reversed name as a second attribute
            self.name_backwards = name[::-1]


    data = []
    data.append(SimpleObject('pickle'))
    data.append(SimpleObject('cPickle'))
    data.append(SimpleObject('last'))

    # Simulate a binary file using io.BytesIO
    out_s = io.BytesIO()

    # Write each object to the stream
    for o in data:
        print('WRITING: %s (%s)' % (o.name, o.name_backwards))
        pickle.dump(o, out_s)
        out_s.flush()

    # Set up a readable stream over the same bytes
    in_s = io.BytesIO(out_s.getvalue())

    # Read objects until the stream is exhausted
    while True:
        try:
            o = pickle.load(in_s)
        except EOFError:
            break
        else:
            print('READ: %s (%s)' % (o.name, o.name_backwards))

    Output:

    WRITING: pickle (elkcip)
    WRITING: cPickle (elkciPc)
    WRITING: last (tsal)
    READ: pickle (elkcip)
    READ: cPickle (elkciPc)
    READ: last (tsal)
  4. pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
    This function reads a pickled object representation from a bytes object and returns the reconstituted object hierarchy specified therein.

    Example:

    # Python 3 program illustrating pickle.loads()
    import pickle
    import pprint

    data1 = [{'a': 'A', 'b': 2, 'c': 3.0}]
    print('BEFORE:', end=' ')
    pprint.pprint(data1)

    data1_string = pickle.dumps(data1)

    data2 = pickle.loads(data1_string)
    print('AFTER:', end=' ')
    pprint.pprint(data2)

    # The restored list is a new object with equal contents
    print('SAME?:', (data1 is data2))
    print('EQUAL?:', (data1 == data2))

    Output:

    BEFORE: [{'a': 'A', 'b': 2, 'c': 3.0}]
    AFTER: [{'a': 'A', 'b': 2, 'c': 3.0}]
    SAME?: False
    EQUAL?: True

Exceptions provided by the pickle module:

  1. pickle.PickleError
    This exception inherits from Exception. It is the common base class for the other pickling exceptions.
  2. pickle.PicklingError
    This exception inherits from PickleError. It is raised when the Pickler encounters an unpicklable object.
  3. pickle.UnpicklingError
    This exception inherits from PickleError. It is raised when there is a problem unpickling an object, such as data corruption or a security violation.
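
The hierarchy described above can be checked directly, and the unpickling error can be handled defensively. The sketch below is one possible approach: safe_loads is a hypothetical helper, not part of the pickle module, and it also catches EOFError because an empty or cut-off stream may be reported that way rather than as an UnpicklingError.

```python
import pickle

# Both specific errors derive from PickleError, which derives from Exception.
print(issubclass(pickle.PicklingError, pickle.PickleError))
print(issubclass(pickle.UnpicklingError, pickle.PickleError))
print(issubclass(pickle.PickleError, Exception))


def safe_loads(data, default=None):
    """Hypothetical helper: return `default` instead of raising on bad input."""
    try:
        return pickle.loads(data)
    except (pickle.UnpicklingError, EOFError):
        # Truncated or empty input ends up here.
        return default


good = pickle.dumps([1, 2, 3])
print(safe_loads(good))
print(safe_loads(good[:-3]))  # stream with its last bytes cut off
print(safe_loads(b""))        # empty stream
```

Other exceptions, such as AttributeError or ImportError, can also surface during unpickling, so a helper like this is a starting point rather than a complete guard.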

Classes exported by the pickle module:

  1. class pickle.Pickler(file, protocol=None, *, fix_imports=True)
    This class takes a binary file to which it writes the pickle data stream.
    1. dump(obj): writes the pickled representation of obj to the open file object given in the constructor.
    2. persistent_id(obj): if persistent_id() returns None, obj is pickled as usual; any other value is stored as a persistent ID for obj. It does nothing by default and exists so that subclasses can override it.
    3. dispatch_table: a pickler object's dispatch table is a mapping whose keys are classes and whose values are reduction functions.
      By default, a pickler object has no dispatch_table attribute and instead uses the global dispatch table managed by the copyreg module.

      Example: the code below creates a pickle.Pickler instance with a private dispatch table that handles the SomeClass class specially. SomeClass and reduce_SomeClass are assumed to be defined elsewhere.

      import copyreg
      import io
      import pickle

      f = io.BytesIO()
      p = pickle.Pickler(f)
      p.dispatch_table = copyreg.dispatch_table.copy()
      p.dispatch_table[SomeClass] = reduce_SomeClass
    4. fast: fast mode disables the use of the memo, which speeds up pickling by not generating superfluous PUT opcodes. It is deprecated and should not be used with self-referential objects.
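
A self-contained variant of the private dispatch table idea is sketched below. The Celsius class and the reduce_celsius function are hypothetical names invented for this illustration; the only assumption about the reduction function is that it returns a (callable, args) pair that rebuilds the object.

```python
import copyreg
import io
import pickle


class Celsius:
    """Hypothetical class used only for this illustration."""

    def __init__(self, degrees):
        self.degrees = degrees


def reduce_celsius(obj):
    # A reduction function returns (callable, args) used to rebuild the object.
    return Celsius, (obj.degrees,)


buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
# Start from a copy of the global table so copyreg registrations still apply.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[Celsius] = reduce_celsius
pickler.dump(Celsius(21.5))

restored = pickle.loads(buffer.getvalue())
print(type(restored).__name__, restored.degrees)
```

Because the table is attached to one Pickler instance, other picklers in the same program keep using the global copyreg table.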
  2. class pickle.Unpickler(file, *, fix_imports=True, encoding="ASCII", errors="strict")

    This class takes a binary file from which it reads the pickle data stream.
    1. load(): reads a pickled object representation from the open file object given in the constructor and returns the reconstituted object hierarchy specified therein.
    2. persistent_load(pid): raises an UnpicklingError by default. Subclasses override it to return the object identified by the persistent ID pid.
    3. find_class(module, name): imports module if necessary and returns the object called name from it, where the module and name arguments are str objects. Subclasses can override it to control which objects may be loaded.
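
One common use of overriding find_class() is to restrict which globals a pickle stream may reference. The following is a minimal sketch of that pattern: RestrictedUnpickler, restricted_loads, and the SAFE_BUILTINS allow-list are names and choices made up for this example, and a real allow-list would have to be designed for the application at hand.

```python
import builtins
import io
import pickle

# Assumption for this sketch: only these builtins may be resolved by name.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Allow a short list of harmless builtins and reject everything else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Hypothetical counterpart of pickle.loads() using the restricted class."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# A range object is rebuilt through builtins.range, which is on the allow-list.
print(restricted_loads(pickle.dumps([1, 2, range(15)])))

# The print function is pickled as a reference to builtins.print, which is not.
try:
    restricted_loads(pickle.dumps(print))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

Plain containers, numbers, and strings never go through find_class(), so they load normally; only objects stored by reference to a module-level name are checked.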
What can be pickled and unpickled?
    The following types can be pickled:

    • None, True, and False
    • integers, floating-point numbers, complex numbers
    • strings, bytes, bytearrays
    • tuples, lists, sets, and dictionaries containing only picklable objects
    • functions defined at the top level of a module (using def, not lambda)
    • built-in functions defined at the top level of a module
    • classes that are defined at the top level of a module
    • instances of such classes whose __dict__ or the result of calling __getstate__() is picklable
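
The list above can be probed with a small helper. In this sketch, is_picklable is a hypothetical function written for the illustration; it treats PicklingError, AttributeError, and TypeError as "not picklable", since unpicklable objects may be reported through any of them depending on the object.

```python
import pickle


def is_picklable(obj):
    """Hypothetical helper: report whether pickle.dumps() accepts obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


print(is_picklable(None))                      # one of the singletons
print(is_picklable([1, 2.0, 3j, "s", b"b"]))   # container of picklable values
print(is_picklable(len))                       # built-in function
print(is_picklable(lambda x: x))               # lambda, excluded by the list above
print(is_picklable([lambda x: x]))             # container holding a lambda
```

The last call shows that a container is only picklable if everything inside it is.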

    Pickling Class Instances:
    This section describes the general mechanisms available to define, customize, and control how class instances are pickled and unpickled.
    In most cases, no additional code is needed to make instances picklable. By default, pickle retrieves the class and the attributes of an instance via introspection.

    Classes can alter the default behavior by providing one or more special methods:

    1. object.__getnewargs_ex__()
      This method dictates the values passed to the __new__() method upon unpickling. It must return a pair (args, kwargs), where args is a tuple of positional arguments and kwargs is a dictionary of named arguments for constructing the object.
    2. object.__getnewargs__()
      This method serves a similar purpose but supports only positional arguments. It must return a tuple of arguments that will be passed to the __new__() method upon unpickling.
    3. object.__getstate__()
      If a class defines this method, it is called and the returned object is pickled as the contents of the instance, instead of the contents of the instance's dictionary.
    4. object.__setstate__(state)
      If a class defines this method, it is called with the unpickled state upon unpickling, and the state object then does not have to be a dictionary. Otherwise, the pickled state must be a dictionary, and its items are assigned to the new instance's dictionary.
    5. object.__reduce__()
      The __reduce__() method takes no arguments and must return either a string or, preferably, a tuple.
    6. object.__reduce_ex__(protocol)
      This method is similar to __reduce__() but takes a single integer argument, the protocol version. Its main use is to provide backwards-compatible reduce values for older Python releases.
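
Two of the hooks above, __getnewargs__() and __reduce__(), can be illustrated with a short sketch. Point and Connection are hypothetical classes invented for the example, and the tuple returned by __reduce__() uses only its simplest two-element form (callable, args).

```python
import pickle


class Point:
    """Hypothetical class whose attributes are set in __new__()."""

    def __new__(cls, x, y):
        self = super().__new__(cls)
        self.x, self.y = x, y
        return self

    def __getnewargs__(self):
        # Positional arguments handed to __new__() when unpickling
        # (honored by pickle protocol 2 and later).
        return (self.x, self.y)


class Connection:
    """Hypothetical class that is rebuilt by calling its constructor."""

    def __init__(self, host, port):
        self.host, self.port = host, port

    def __reduce__(self):
        # (callable, args): unpickling calls Connection(host, port).
        return (Connection, (self.host, self.port))


p = pickle.loads(pickle.dumps(Point(1, 2)))
c = pickle.loads(pickle.dumps(Connection("localhost", 8080)))
print(p.x, p.y)
print(c.host, c.port)
```

Without __getnewargs__(), unpickling Point would call __new__() with no coordinates and fail, which is the situation this hook exists to handle.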

    Example: Handling Stateful Objects
    This example shows how to modify pickling behavior for a class. The TextReader class opens a text file and returns the line number and line contents each time its readline() method is called.

    1. When a TextReader instance is pickled, all attributes except the file object member are saved.
    2. When the instance is unpickled, the file is reopened and reading resumes from the last location.

    import pickle


    class TextReader:
        """Print and number lines in a text file."""

        def __init__(self, filename):
            self.filename = filename
            self.file = open(filename)
            self.lineno = 0

        def readline(self):
            self.lineno += 1
            line = self.file.readline()
            if not line:
                return None
            if line.endswith('\n'):
                line = line[:-1]
            return "%i: %s" % (self.lineno, line)

        def __getstate__(self):
            # Copy the object's state from self.__dict__, which contains
            # all our instance attributes. Always use the dict.copy()
            # method to avoid modifying the original state.
            state = self.__dict__.copy()
            # Remove the unpicklable entries.
            del state['file']
            return state

        def __setstate__(self, state):
            # Restore instance attributes (i.e., filename and lineno).
            self.__dict__.update(state)
            # Restore the previously opened file's state. To do so, we need to
            # reopen it and read from it until the line count is restored.
            file = open(self.filename)
            for _ in range(self.lineno):
                file.readline()
            # Finally, save the file.
            self.file = file


    reader = TextReader("hello.txt")
    print(reader.readline())
    print(reader.readline())
    new_reader = pickle.loads(pickle.dumps(reader))
    print(new_reader.readline())

    Output (assuming hello.txt contains the three lines "Hello world!", "I am line number two." and "Goodbye!"):

    1: Hello world!
    2: I am line number two.
    3: Goodbye!

    This article is courtesy of Aditi Gupta.