Dump to JSON adds additional double quotes and escaping of quotes


I am retrieving Twitter data with a Python tool and dumping it to disk in JSON format. I noticed that the entire data string for each tweet is unintentionally enclosed in double quotes. Furthermore, all double quotes that belong to the actual JSON formatting are escaped with a backslash.

They look like this:

"{\"created_at\":\"Fri Aug 08 11:04:40 +0000 2014\",\"id\":497699913925292032,

How do I avoid that? It should be:

{"created_at":"Fri Aug 08 11:04:40 +0000 2014" .....

My file-out code looks like this:

with io.open("data"+self.timestamp+".txt", "a", encoding="utf-8") as f:
    f.write(unicode(json.dumps(data, ensure_ascii=False)))

The unintended escaping causes problems when reading in the JSON file in a later processing step.

Answer rating: 157

You are double encoding your JSON strings. data is already a JSON string, and doesn't need to be encoded again:

>>> import json
>>> not_encoded = {"created_at":"Fri Aug 08 11:04:40 +0000 2014"}
>>> encoded_data = json.dumps(not_encoded)
>>> print encoded_data
{"created_at": "Fri Aug 08 11:04:40 +0000 2014"}
>>> double_encode = json.dumps(encoded_data)
>>> print double_encode
"{\"created_at\": \"Fri Aug 08 11:04:40 +0000 2014\"}"

Just write these directly to your file:

with open("data{}.txt".format(self.timestamp), "a") as f:
    f.write(data + "\n")
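
If some files were already written with the extra layer of encoding, they can usually still be read back. The following is a minimal sketch in Python 3 syntax, assuming one record per line; decode_record is a hypothetical helper name, not part of the json module. It decodes a line once and, if the result is still a string, decodes it a second time:

```python
import json


def decode_record(line):
    """Decode one line, undoing an accidental second layer of JSON encoding."""
    value = json.loads(line)
    # A double-encoded record decodes to a str that is itself JSON text,
    # so a second json.loads() recovers the original object.
    if isinstance(value, str):
        value = json.loads(value)
    return value


good = '{"created_at": "Fri Aug 08 11:04:40 +0000 2014"}'
double = json.dumps(good)  # same mistake as in the question: encoding a JSON string again

print(decode_record(good) == decode_record(double))  # True
```

This only works when every record is a JSON object or array. If a record is legitimately a plain JSON string, the second json.loads() call would misinterpret it or raise an error.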
