Given a list of nested dictionaries, we will write a Python program that creates a Pandas DataFrame from it. Let’s walk through the procedure step by step.
Python often receives data from sources in different formats, such as CSV or JSON, which are typically loaded into Python lists or dictionaries. To run computations on the data or parse it with a package like pandas, we need to convert it to a DataFrame. In this article, we will see how to convert a Python list whose elements are nested dictionaries to a Pandas DataFrame.
First, we take a list of nested dictionaries and loop over it to extract the rows of data. An inner for loop then appends each row to a list that starts out empty. Finally, we pass that list to the DataFrame constructor in the pandas library to create a DataFrame.
Example #1:
import pandas as pd

# Given list of nested dictionaries
# (named data_list so the built-in list is not shadowed)
data_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

rows = []

# Getting rows: tag every nested entry with its parent's name
for data in data_list:
    data_row = data['Fruit']
    n = data['Name']
    for row in data_row:
        row['Name'] = n
        rows.append(row)

# Convert to data frame
df = pd.DataFrame(rows)
print(df)
This code results in the following output:
Output #1:
   Price Quality    Name
0   15.2       A  Orange
1   19.0       B  Orange
2   17.8       C  Orange
3   23.2       A  Grapes
4   28.0       B  Grapes
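The same flattening can probably be done without an explicit loop. The sketch below uses pd.json_normalize, assuming pandas 1.0 or later; the variable name fruit_data is ours, not part of the original example.

```python
import pandas as pd

# Same sample data as in the example above (fruit_data is a hypothetical name).
fruit_data = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

# record_path selects the nested list to expand into rows;
# meta copies the parent-level "Name" onto each expanded row.
df = pd.json_normalize(fruit_data, record_path="Fruit", meta=["Name"])
print(df)
```

Unlike the loop, this does not add a Name key to the original nested dictionaries, which may matter if the input is reused later.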
Applying pivot
You can also apply the pivot_table method to reshape the data into the layout you want, for example one row per Name and one column per Quality.
Example #2:
import pandas as pd

# List of nested dictionaries
# (named data_list so the built-in list is not shadowed)
data_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]
# print(data_list)

rows = []

# Appending rows: tag every nested entry with its parent's name
for data in data_list:
    data_row = data['Fruit']
    n = data['Name']
    for row in data_row:
        row['Name'] = n
        rows.append(row)

# Build the data frame, then pivot Quality values into columns
df = pd.DataFrame(rows)
df = df.pivot_table(index='Name', columns=['Quality'], values=['Price']).reset_index()
print(df)
Running this piece of code provides the following result:
Output #2:
           Name Price
Quality             A     B     C
0        Grapes  23.2  28.0   NaN
1        Orange  15.2  19.0  17.8
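The two-level header in this output comes from passing lists to columns and values. If a single header row is preferred, one option is to pass plain strings instead. This is a minimal sketch; the rows literal simply restates what the loop above produces, and the name flat is ours.

```python
import pandas as pd

# Flat rows as produced by the loop in the example above.
rows = [
    {"Price": 15.2, "Quality": "A", "Name": "Orange"},
    {"Price": 19, "Quality": "B", "Name": "Orange"},
    {"Price": 17.8, "Quality": "C", "Name": "Orange"},
    {"Price": 23.2, "Quality": "A", "Name": "Grapes"},
    {"Price": 28, "Quality": "B", "Name": "Grapes"},
]
df = pd.DataFrame(rows)

# Scalar (non-list) arguments keep the pivoted columns single-level.
flat = df.pivot_table(index="Name", columns="Quality", values="Price").reset_index()

# Drop the leftover "Quality" label on the column axis.
flat.columns.name = None
print(flat)
```

Note that pivot_table aggregates with the mean by default, so duplicate Name and Quality pairs would be averaged rather than raising an error.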
How to convert list of nested dictionary to pandas DataFrame?
Stack Overflow question
I have some data containing nested dictionaries like below:
mylist = [{"a": 1, "b": {"c": 2, "d":3}}, {"a": 3, "b": {"c": 4, "d":3}}]
If we convert it to pandas DataFrame,
import pandas as pd
result_dataframe = pd.DataFrame(mylist)
print(result_dataframe)
It will output:
   a                 b
0  1  {'c': 2, 'd': 3}
1  3  {'c': 4, 'd': 3}
I want to convert the list of dictionaries and ignore the key of the nested dictionary. My code is below:
new_dataframe = result_dataframe.drop(columns=["b"])
b_dict_list = [document["b"] for document in mylist]
b_df = pd.DataFrame(b_dict_list)
frames = [new_dataframe, b_df]
total_frame = pd.concat(frames, axis=1)
total_frame is what I want:
a c d
0 1 2 3
1 3 4 3
But I think my code is a little complicated. Is there any simple way to deal with this problem? Thank you.
Answer:
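Use pop to extract the value of b, then merge the dictionaries with ** unpacking. The original one-liner, [{**x, **x.pop('b')} for x in mylist], relies on pop running before x is unpacked, and it can leave b in the result on newer Python versions, so the pop is done in a separate step here. Note that pop also removes b from the dictionaries in mylist:
a = []
for x in mylist:
    b = x.pop('b')
    a.append({**x, **b})
print(a)
[{'a': 1, 'c': 2, 'd': 3}, {'a': 3, 'c': 4, 'd': 3}]

result_dataframe = pd.DataFrame(a)
print(result_dataframe)
   a  c  d
0  1  2  3
1  3  4  3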
Another solution, thanks to @Sandeep Kadapa:
a = [{'a': x['a'], **x['b']} for x in mylist]
# alternative
a = [{'a': x['a'], **x.get('b')} for x in mylist]
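When the nested keys are not known in advance, pd.json_normalize may be a convenient alternative to listing them by hand. The sketch below assumes pandas 1.0 or later and assumes the inner key names do not collide with outer ones; the name flat is ours.

```python
import pandas as pd

mylist = [{"a": 1, "b": {"c": 2, "d": 3}}, {"a": 3, "b": {"c": 4, "d": 3}}]

# json_normalize flattens nested dicts into dotted column names: a, b.c, b.d
flat = pd.json_normalize(mylist)

# Strip the parent prefix so only the innermost key remains (c, d).
flat.columns = [name.split(".")[-1] for name in flat.columns]
print(flat)
```

This leaves mylist untouched, so it can be run more than once on the same data.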
How to convert list of nested dictionary to pandas DataFrame?
Stack Overflow question
I am new to Python so this may be pretty straightforward, but I have not been able to find a good answer for my problem after looking for a while. I am trying to create a Pandas dataframe from a list of dictionaries.
My list of nested dictionaries is the following:
my_list = [{0: {'a': '23', 'b': '15', 'c': '5', 'd': '-1'},
            1: {'a': '5', 'b': '6', 'c': '7', 'd': '9'},
            2: {'a': '9', 'b': '15', 'c': '5', 'd': '7'}},
           {0: {'a': '5', 'b': '249', 'c': '92', 'd': '-4'},
            1: {'a': '51', 'b': '5', 'c': '34', 'd': '1'},
            2: {'a': '3', 'b': '8', 'c': '3', 'd': '11'}}]
So each key in the main dictionaries has 3 values.
Putting these into a dataframe using data = pd.DataFrame(my_list) returns something unusable, as each cell has information on a, b, c and d in it.
I want to end up with a dataframe that looks like this:
name| a | b | c | d
0 | 23 | 15 | 5 | -1
1 | 5 | 6 | 7 | 9
2 | 9 | 15 | 5 | 7
0 | 5 |249 | 92| -4
1 |51 | 5 | 34| 1
2 | 3 | 8 | 3 | 11
Is this possible?
Answers
Variant #1
pd.concat([pd.DataFrame(l) for l in my_list],axis=1).T
Variant #2
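from itertools import chain

# Written for Python 2 and pandas < 1.0: dict.iteritems() is Python 2 only,
# and DataFrame.from_items was removed in pandas 1.0.
pd.DataFrame.from_items(list(chain.from_iterable(d.iteritems() for d in my_list))).T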
In my experiments, this is faster than using pd.concat (especially when the number of "sub-dataframes" is large) at the cost of being more verbose.
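Variant #2 depends on dict.iteritems() and DataFrame.from_items, neither of which is available on Python 3 with pandas 1.0 or later. A rough present-day equivalent might look like the sketch below; the names pairs and df are ours, and the to_numeric step is an assumption based on the numeric table the asker wants.

```python
from itertools import chain

import pandas as pd

my_list = [
    {
        0: {"a": "23", "b": "15", "c": "5", "d": "-1"},
        1: {"a": "5", "b": "6", "c": "7", "d": "9"},
        2: {"a": "9", "b": "15", "c": "5", "d": "7"},
    },
    {
        0: {"a": "5", "b": "249", "c": "92", "d": "-4"},
        1: {"a": "51", "b": "5", "c": "34", "d": "1"},
        2: {"a": "3", "b": "8", "c": "3", "d": "11"},
    },
]

# Flatten to (outer_key, inner_dict) pairs, keeping the repeated keys 0, 1, 2.
pairs = list(chain.from_iterable(d.items() for d in my_list))

# One row per inner dict, labelled with its outer key.
df = pd.DataFrame([inner for _, inner in pairs], index=[key for key, _ in pairs])
df.index.name = "name"

# The source values are strings; convert each column to numbers.
df = df.apply(pd.to_numeric)
print(df)
```

Building a single frame from a list of rows avoids creating one intermediate DataFrame per dictionary, which is the cost the remark above attributes to pd.concat.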
Convert list of nested dictionary into pandas dataframe
# Basic syntax:
dataframe = pd.DataFrame(nested_dictionary)
dataframe = dataframe.transpose()
# Note, this only works if your nested dictionaries are set up in a
# specific way. See below.

# Create nested dictionary:
import pandas as pd
student_data = {
    0: {'name': 'Aadi', 'age': 16, 'city': 'New york'},
    1: {'name': 'Jack', 'age': 34, 'city': 'Sydney'},
    2: {'name': 'Riti', 'age': 30, 'city': 'Delhi'},
}

# Example usage:
pandas_dataframe = pd.DataFrame(student_data)
print(pandas_dataframe)
             0       1      2   # Outer keys become column names
age         16      34     30
city  New york  Sydney  Delhi
name      Aadi    Jack   Riti

pandas_dataframe.transpose()
  age      city  name   # After transposing, inner keys become column names
0  16  New york  Aadi
1  34    Sydney  Jack
2  30     Delhi  Riti
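The transpose step can likely be skipped by telling pandas up front that the outer keys are row labels. This is a minimal sketch using DataFrame.from_dict with orient="index", reusing the student_data sample from above; the name df is ours.

```python
import pandas as pd

student_data = {
    0: {"name": "Aadi", "age": 16, "city": "New york"},
    1: {"name": "Jack", "age": 34, "city": "Sydney"},
    2: {"name": "Riti", "age": 30, "city": "Delhi"},
}

# orient="index" treats the outer keys as row labels,
# so the inner keys become columns without a transpose.
df = pd.DataFrame.from_dict(student_data, orient="index")
print(df)
```

One thing worth checking on your own data: transposing a frame with mixed types tends to leave every column with object dtype, whereas building it row-wise usually lets numeric columns such as age stay numeric.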
Archived version
Step #1: Create a list of nested dictionaries.
Step #2: Add the dictionary values to rows.
Step #3: Pivot the DataFrame and assign the column names.
Output:
            Name  Maths  Physics  Chemistry
0  Chunky Pandey     89       80        NaN
1     Paras Jain     90       99         97
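The code for these three steps is not included here, so the following is only a sketch of how they could fit together. The input structure is an assumption: the keys Student, Exam and Grade, the variable name students, and the mapping of grades a, b, c to Maths, Physics, Chemistry are all hypothetical, chosen so that the result lines up with the names and marks in the output above.

```python
import pandas as pd

# Step 1: a list of nested dictionaries (hypothetical input structure).
students = [
    {
        "Name": "Paras Jain",
        "Student": [
            {"Exam": 90, "Grade": "a"},
            {"Exam": 99, "Grade": "b"},
            {"Exam": 97, "Grade": "c"},
        ],
    },
    {
        "Name": "Chunky Pandey",
        "Student": [
            {"Exam": 89, "Grade": "a"},
            {"Exam": 80, "Grade": "b"},
        ],
    },
]

# Step 2: one flat row per nested entry, tagged with the parent's name.
rows = [{**entry, "Name": s["Name"]} for s in students for entry in s["Student"]]

# Step 3: pivot so each grade becomes a column, then assign readable column names.
df = pd.DataFrame(rows).pivot_table(index="Name", columns="Grade", values="Exam").reset_index()
df.columns = ["Name", "Maths", "Physics", "Chemistry"]
print(df)
```

Because pivot_table averages values and one cell is missing, the marks come back as floating-point numbers, so the printed formatting may differ slightly from the output shown above.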