Python encode() and decode() functions


The Python methods encode() and decode() convert a string to bytes, and bytes back to a string, using a given encoding. Let’s take a closer look at these two methods.

Encoding a given string

We call the encode() method on the input string; every string object has this method.

Format:

    input_string.encode(encoding, errors)

This encodes input_string using encoding, where errors defines the behavior to follow if the string cannot be encoded.

encode() returns a sequence of bytes (a bytes object).

    inp_string = 'Hello'
    bytes_encoded = inp_string.encode()
    print(type(bytes_encoded))

As expected, the resulting object is <class 'bytes'>:

    <class 'bytes'>

The type of encoding to be followed is indicated by the encoding parameter. There are different types of character encoding schemes, of which Python defaults to UTF-8.
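To see how the encoding parameter changes the result, here is a minimal sketch. The sample text 'café' is our own illustration, not part of the example that follows; it shows that the same character can become different bytes under different encodings.

```python
# Hypothetical sample text; 'é' is the code point U+00E9.
text = 'café'

utf8_bytes = text.encode('utf-8')      # 'é' becomes two bytes in UTF-8
latin1_bytes = text.encode('latin-1')  # 'é' becomes one byte in Latin-1
default_bytes = text.encode()          # no argument: UTF-8 is used

print(utf8_bytes)                    # b'caf\xc3\xa9'
print(latin1_bytes)                  # b'caf\xe9'
print(default_bytes == utf8_bytes)   # True
```

The encoding can be passed positionally, as here, or by keyword (encoding='utf-8').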

Let’s take an example of encoding.

    a = 'This is a simple sentence.'
    print('Original string:', a)

    # Encodes to UTF-8 by default
    a_utf = a.encode()
    print('Encoded string:', a_utf)

Output

    Original string: This is a simple sentence.
    Encoded string: b'This is a simple sentence.'

As you can see, we have encoded the input string in UTF-8 format. Although the output looks much the same, notice that it is prefixed with b. This means the string has been converted to a sequence of bytes.

The bytes are displayed as the original characters only for readability; the b prefix indicates that the object is not a string but a sequence of bytes.
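A short sketch can make this concrete. It assumes nothing beyond the built-in str and bytes types; the sample strings are ours. Each element of a bytes object is an integer from 0 to 255, and bytes outside the printable ASCII range are shown as \x escapes.

```python
data = 'Hi'.encode()  # UTF-8 by default

print(data)        # b'Hi'
print(list(data))  # [72, 105]
print(data[0])     # 72

# A non-ASCII character has no readable ASCII form,
# so its bytes are displayed as hexadecimal escapes.
non_ascii = 'ö'.encode('utf-8')
print(non_ascii)       # b'\xc3\xb6'
print(len(non_ascii))  # 2
```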

Error Handling

The errors parameter accepts several handler names, some of which are listed below:

    Error type        Behavior
    strict            Default behavior; raises UnicodeEncodeError on failure (UnicodeDecodeError when decoding).
    ignore            Drops unencodable characters from the result.
    replace           Replaces each unencodable character with a question mark (?).
    backslashreplace  Inserts a backslash escape sequence (such as \uNNNN) in place of each unencodable character.

Let’s look at the above concepts with a simple example. We will consider an input string in which not every character can be encoded in ASCII (for example, ö):

    a = 'This is a bit möre cömplex sentence.'
    print('Original string:', a)

    print('Encoding with errors = ignore:', a.encode(encoding='ascii', errors='ignore'))
    print('Encoding with errors = replace:', a.encode(encoding='ascii', errors='replace'))

Output

    Original string: This is a bit möre cömplex sentence.
    Encoding with errors = ignore: b'This is a bit mre cmplex sentence.'
    Encoding with errors = replace: b'This is a bit m?re c?mplex sentence.'
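The two remaining handlers from the table, strict and backslashreplace, can be sketched the same way. The variable names and sample strings here are our own. Note that the form of the escape depends on the code point: characters below U+0100 are written as \xNN, while larger ones use \uNNNN.

```python
text = 'möre'

# errors='strict' is the default, so an unencodable character raises.
try:
    text.encode('ascii')
    strict_result = 'no error'
except UnicodeEncodeError as exc:
    strict_result = type(exc).__name__

# backslashreplace keeps the information as an escape sequence.
escaped = text.encode('ascii', errors='backslashreplace')
escaped_wide = '√'.encode('ascii', errors='backslashreplace')

print(strict_result)  # UnicodeEncodeError
print(escaped)        # b'm\\xf6re'
print(escaped_wide)   # b'\\u221a'
```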

Decoding a byte stream

Similar to encoding a string, we can decode a byte sequence into a string object using the decode() method of the bytes object.

Format:

    encoded = input_string.encode()
    # Using decode()
    decoded = encoded.decode(decoding, errors)

Because encode() converts the string to bytes, decode() just does the opposite.

    byte_seq = b'Hello'
    decoded_string = byte_seq.decode()
    print(type(decoded_string))
    print(decoded_string)

Output

    <class 'str'>
    Hello

This indicates that decode() converts bytes to a Python string.

As with encode(), the first argument (shown as decoding in the format above; the actual parameter name is encoding) specifies the encoding from which the byte sequence is decoded. The errors parameter specifies the behavior when decoding fails and accepts the same values as for encode().
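As a rough sketch of how the errors parameter behaves when decoding, consider a byte sequence that is not valid UTF-8. The sample bytes are our own illustration. When decoding, replace substitutes the Unicode replacement character (U+FFFD) rather than a question mark.

```python
raw = b'abc\xff'  # the byte 0xff never appears in valid UTF-8

# errors='strict' is the default, so invalid input raises.
try:
    raw.decode('utf-8')
    strict_result = 'no error'
except UnicodeDecodeError as exc:
    strict_result = type(exc).__name__

replaced = raw.decode('utf-8', errors='replace')
ignored = raw.decode('utf-8', errors='ignore')

print(strict_result)                # UnicodeDecodeError
print(replaced == 'abc\ufffd')      # True
print(ignored)                      # abc
```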

Encoding importance

Since the encoding and decoding of the input string is format dependent, we must be careful with these operations. If we use the wrong format, it will lead to incorrect output and may cause errors.

In the example below, the first decoding is wrong because it tries to decode as ASCII a byte sequence that was encoded in UTF-8. The second is correct because the encoding and decoding formats are the same.

    a = 'This is a bit möre cömplex sentence.'
    print('Original string:', a)

    # Encoding in UTF-8
    encoded_bytes = a.encode('utf-8', 'replace')

    # Trying to decode via ASCII, which is incorrect
    decoded_incorrect = encoded_bytes.decode('ascii', 'replace')
    decoded_correct = encoded_bytes.decode('utf-8', 'replace')

    print('Incorrectly Decoded string:', decoded_incorrect)
    print('Correctly Decoded string:', decoded_correct)

Output

 Original string: This is a bit möre cömplex sentence. Incorrectly Decoded string: This is a bit m re c mplex sentence. Correctly Decoded string: This is a bit möre cömplex sentence. 
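A wrong encoding does not always raise an error or produce replacement characters. The sketch below is our own illustration, using Latin-1 as the mismatched codec. Because every byte is a valid Latin-1 character, the decoding succeeds silently and yields garbled text.

```python
original = 'möre'
encoded = original.encode('utf-8')  # b'm\xc3\xb6re'

round_trip = encoded.decode('utf-8')   # same codec: text is recovered
mojibake = encoded.decode('latin-1')   # wrong codec: no error, wrong text

print(round_trip == original)  # True
print(mojibake)                # mÃ¶re
print(len(mojibake))           # 5
```

This is why it helps to record which encoding produced a byte sequence rather than guessing it at decode time.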






