
Python | Convert nested dictionary list to Pandas dataframe

Given a list of nested dictionaries, we will write a Python program to create a Pandas dataframe using it. Let’s walk through the step-by-step procedure for creating a Pandas Dataframe using a list of nested dictionaries.

Python programs often receive data from sources in different formats, such as CSV or JSON, which are first parsed into Python lists or dictionaries. To run computations on that data with a package like pandas, we need to convert it to a DataFrame. In this article, we will see how to convert a Python list whose elements are nested dictionaries into a Pandas DataFrame.

First, we take a list of nested dictionaries and loop over it to read each outer dictionary. An inner for loop then copies the outer name into each nested record and appends the record to a list that starts out empty. Finally, we pass that list to the DataFrame constructor of the pandas library to create a DataFrame.

import pandas as pd

# Given list of nested dictionaries
# (named data_list so it does not shadow the built-in list type)
data_list = [
   {
      "Fruit": [{"Price": 15.2, "Quality": "A"},
         {"Price": 19, "Quality": "B"},
         {"Price": 17.8, "Quality": "C"},
      ],
      "Name": "Orange"
   },
   {
      "Fruit": [{"Price": 23.2, "Quality": "A"},
         {"Price": 28, "Quality": "B"}
      ],
      "Name": "Grapes"
   }
]

rows = []

# Getting rows: tag every nested record with the name of its parent entry
for data in data_list:
   data_row = data['Fruit']
   n = data['Name']

   for row in data_row:
      row['Name'] = n
      rows.append(row)

# Convert to data frame
df = pd.DataFrame(rows)
print(df)

This code results in the following output:

Output #1:

   Price Quality    Name
0   15.2       A  Orange
1   19.0       B  Orange
2   17.8       C  Orange
3   23.2       A  Grapes
4   28.0       B  Grapes
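
As an alternative to the explicit loops, pandas provides a helper, pd.json_normalize, that can flatten this shape in one call. The sketch below assumes pandas 1.0 or later, where json_normalize is available at the top level, and uses an illustrative variable name, fruit_data, for the same data as above.

```python
import pandas as pd

# Same data as above; the name fruit_data is only illustrative
fruit_data = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

# record_path names the key holding the nested records (one output row each);
# meta lists the outer keys to repeat on every row built from that entry
flat_df = pd.json_normalize(fruit_data, record_path="Fruit", meta=["Name"])
print(flat_df)
```

Unlike the loop version, this call does not write a Name key into the input dictionaries, which may matter if the original data is reused later.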

Applying pivot

One can also apply the pivot_table method to reorganize the data into the layout we want, here one row per Name with a separate Price column for each Quality grade.

Example #2:

import pandas as pd

# List of nested dictionaries initialization
# (named data_list so it does not shadow the built-in list type)
data_list = [
   {
      "Fruit": [{"Price": 15.2, "Quality": "A"},
         {"Price": 19, "Quality": "B"},
         {"Price": 17.8, "Quality": "C"},
      ],
      "Name": "Orange"
   },
   {
      "Fruit": [{"Price": 23.2, "Quality": "A"},
         {"Price": 28, "Quality": "B"}
      ],
      "Name": "Grapes"
   }
]

rows = []

# appending rows
for data in data_list:
   data_row = data['Fruit']
   n = data['Name']

   for row in data_row:
      row['Name'] = n
      rows.append(row)

# using data frame
df = pd.DataFrame(rows)

# one row per Name, one Price column per Quality grade
df = df.pivot_table(index='Name', columns=['Quality'],
               values=['Price']).reset_index()
print(df)

Running this piece of code provides the following result:

Output #2:

       
           Name Price
Quality             A     B     C
0        Grapes  23.2  28.0   NaN
1        Orange  15.2  19.0  17.8
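
The two-level column header in Output #2 comes from passing values as a list. A minimal sketch, assuming a flat header is preferred: passing values as a plain string should yield single-level columns. The frame long_df below is a hypothetical stand-in for the row-wise DataFrame built in Example #2.

```python
import math

import pandas as pd

# Hypothetical stand-in for the row-wise frame built in Example #2
long_df = pd.DataFrame(
    {
        "Price": [15.2, 19.0, 17.8, 23.2, 28.0],
        "Quality": ["A", "B", "C", "A", "B"],
        "Name": ["Orange", "Orange", "Orange", "Grapes", "Grapes"],
    }
)

# A scalar `values` argument gives one column level (A, B, C) instead of two
wide_df = long_df.pivot_table(
    index="Name", columns="Quality", values="Price"
).reset_index()

# Drop the leftover "Quality" label that pandas keeps on the column axis
wide_df.columns.name = None
print(wide_df)

# Grapes has no grade C record, so that cell is NaN
grapes_c = wide_df.loc[wide_df["Name"] == "Grapes", "C"].iloc[0]
print(math.isnan(grapes_c))
```

Note that pivot_table aggregates duplicate Name/Quality pairs (by mean, by default), so this layout is only a faithful reshaping when each pair occurs at most once, as it does here.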


How to convert list of nested dictionary to pandas DataFrame?

Stack Overflow question

I have some data containing nested dictionaries like below:

mylist = [{"a": 1, "b": {"c": 2, "d":3}}, {"a": 3, "b": {"c": 4, "d":3}}]

If we convert it to pandas DataFrame,

import pandas as pd 

result_dataframe = pd.DataFrame(mylist)
print(result_dataframe)

It will output:

   a                 b
0  1  {'c': 2, 'd': 3}
1  3  {'c': 4, 'd': 3}

I want to convert the list of dictionaries while dropping the key b of the nested dictionary, so that c and d become ordinary columns. My code is below:

new_dataframe = result_dataframe.drop(columns=["b"])
b_dict_list = [document["b"] for document in mylist]
b_df = pd.DataFrame(b_dict_list)
frames = [new_dataframe, b_df]
total_frame = pd.concat(frames, axis=1)

The resulting total_frame is what I want:

    a   c   d
0   1   2   3
1   3   4   3

But I think my code is a little complicated. Is there any simple way to deal with this problem? Thank you.

Answer:

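Use a list comprehension that merges each outer dictionary, minus its b key, with the nested dictionary stored under b. Filtering b out, rather than calling x.pop('b') inside the same dict display, leaves mylist unchanged and does not depend on whether pop runs before or after x is unpacked:

a = [{**{k: v for k, v in x.items() if k != 'b'}, **x['b']} for x in mylist]
print(a)
[{'a': 1, 'c': 2, 'd': 3}, {'a': 3, 'c': 4, 'd': 3}]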

result_dataframe = pd.DataFrame(a)
print(result_dataframe)
   a  c  d
0  1  2  3
1  3  4  3

Another solution, thanks to @Sandeep Kadapa:

a = [{'a': x['a'], **x['b']} for x in mylist]
# alternative
a = [{'a': x['a'], **x.get('b')} for x in mylist]
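
A further option, not part of the original answers, is pd.json_normalize, which flattens nested dictionaries into dotted column names. The sketch below assumes pandas 1.0 or later and strips the "b." prefix afterwards; the variable name flat is only illustrative.

```python
import pandas as pd

mylist = [{"a": 1, "b": {"c": 2, "d": 3}}, {"a": 3, "b": {"c": 4, "d": 3}}]

# json_normalize flattens nested dictionaries into dotted column names,
# here: a, b.c, b.d
flat = pd.json_normalize(mylist)

# Keep only the last part of each dotted name to drop the "b." prefix
flat.columns = [name.split(".")[-1] for name in flat.columns]
print(flat)
```

Stripping prefixes this way is only safe when the inner key names do not collide with each other or with the outer keys; otherwise keeping the dotted names is the more cautious choice.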

How to convert list of nested dictionary to pandas DataFrame?

Stack Overflow question

I am new to Python so this may be pretty straightforward, but I have not been able to find a good answer for my problem after looking for a while. I am trying to create a Pandas dataframe from a list of dictionaries.

My list of nested dictionaries is the following:

my_list = [{0: {'a': '23', 'b': '15', 'c': '5', 'd': '-1'},
            1: {'a': '5', 'b': '6', 'c': '7', 'd': '9'},
            2: {'a': '9', 'b': '15', 'c': '5', 'd': '7'}},
           {0: {'a': '5', 'b': '249', 'c': '92', 'd': '-4'},
            1: {'a': '51', 'b': '5', 'c': '34', 'd': '1'},
            2: {'a': '3', 'b': '8', 'c': '3', 'd': '11'}}]

So each of the main dictionaries has 3 keys, and each key maps to a dictionary holding the values for a, b, c and d.

Putting these into a dataframe using data = pd.DataFrame(my_list) returns something unusable, as each cell has information on a, b, c and d in it.

I want to end up with a dataframe that looks like this:

name |  a |   b |  c |  d
0    | 23 |  15 |  5 | -1
1    |  5 |   6 |  7 |  9
2    |  9 |  15 |  5 |  7
0    |  5 | 249 | 92 | -4
1    | 51 |   5 | 34 |  1
2    |  3 |   8 |  3 | 11

Is this possible?

Answers

Variant #1

pd.concat([pd.DataFrame(d) for d in my_list], axis=1).T

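Variant #2

from itertools import chain

# DataFrame.from_items and dict.iteritems, used in the original answer, are
# not available in current pandas and Python 3, so the (key, row) pairs are
# split with zip and passed to the DataFrame constructor instead
keys, rows = zip(*chain.from_iterable(d.items() for d in my_list))
pd.DataFrame(list(rows), index=list(keys))

In the answer author's experiments, building a single frame from the chained items was faster than using pd.concat (especially when the number of "sub-dataframes" is large) at the cost of being more verbose.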
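
Both variants leave the values as strings, because that is how they are stored in my_list. A minimal sketch, assuming every value is an integer written as a string (as in the question), that also converts the columns to integers and labels the index "name" as in the desired table; the names rows, labels and table are illustrative.

```python
import pandas as pd

my_list = [{0: {'a': '23', 'b': '15', 'c': '5', 'd': '-1'},
            1: {'a': '5', 'b': '6', 'c': '7', 'd': '9'},
            2: {'a': '9', 'b': '15', 'c': '5', 'd': '7'}},
           {0: {'a': '5', 'b': '249', 'c': '92', 'd': '-4'},
            1: {'a': '51', 'b': '5', 'c': '34', 'd': '1'},
            2: {'a': '3', 'b': '8', 'c': '3', 'd': '11'}}]

# One inner dictionary per output row, in the order they appear
rows = [row for d in my_list for row in d.values()]

# The outer keys (0, 1, 2, 0, 1, 2) become the row labels
labels = [key for d in my_list for key in d]

# astype(int) parses the numeric strings; rename_axis names the index
table = pd.DataFrame(rows, index=labels).astype(int).rename_axis("name")
print(table)
```

If some values were not valid integers, astype(int) would raise an error; pd.to_numeric with errors="coerce" applied per column would be a more forgiving alternative.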

Convert list of nested dictionary into pandas dataframe

# Basic syntax:
dataframe = pd.DataFrame(nested_dictionary)
dataframe = dataframe.transpose()

# Note, this only works if your nested dictionaries are set up in a 
# 	specific way. See below. 

# Create nested dictionary:
import pandas as pd
student_data = {
0 : {
    'name' : 'Aadi',
    'age' : 16,
    'city' : 'New york'
    },
1 : {
    'name' : 'Jack',
    'age' : 34,
    'city' : 'Sydney'
    },
2 : {
    'name' : 'Riti',
    'age' : 30,
    'city' : 'Delhi'
    }
}

# Example usage:
pandas_dataframe = pd.DataFrame(student_data) 
print(pandas_dataframe)
             0       1      2 # Outer keys become column names
age         16      34     30
city  New york  Sydney  Delhi
name      Aadi    Jack   Riti

pandas_dataframe.transpose()
  age      city  name # After transposing, inner keys become column names
0  16  New york  Aadi
1  34    Sydney  Jack
2  30     Delhi  Riti
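
The construct-then-transpose pattern above can also be written in one step with DataFrame.from_dict and orient="index", which treats the outer keys as row labels. A short sketch using the same student_data; the variable name students is illustrative.

```python
import pandas as pd

student_data = {
    0: {"name": "Aadi", "age": 16, "city": "New york"},
    1: {"name": "Jack", "age": 34, "city": "Sydney"},
    2: {"name": "Riti", "age": 30, "city": "Delhi"},
}

# orient="index": outer keys become the index, inner keys become columns
students = pd.DataFrame.from_dict(student_data, orient="index")
print(students)
```

One practical difference, as far as I can tell, is that transposing a frame with mixed column types tends to leave every column with the generic object dtype, whereas building the frame row-wise lets pandas infer a numeric dtype for age.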

Archived version

Step #1: Create a list of nested dictionaries.

Step #2: Adding dictionary values to rows

Step #3: Making dataframe pivot with column names assignment

Output:

            Name  Maths  Physics  Chemistry
0  Chunky Pandey     89       80        NaN
1     Paras Jain     90       99         97
