Python | Convert nested dictionary list to Pandas dataframe

Given a list of nested dictionaries, we will write a Python program to create a Pandas dataframe using it. Let’s walk through the step-by-step procedure for creating a Pandas Dataframe using a list of nested dictionaries.

Python programs often receive data from various sources in formats such as CSV or JSON, which are typically parsed into Python lists or dictionaries. To run computations on that data with a package such as pandas, we need to convert it to a DataFrame. In this article, we will see how to convert a Python list whose elements are nested dictionaries into a Pandas DataFrame.

First, we take a list of nested dictionaries and loop over it to extract rows of data. An inner for loop then appends each row to a list that starts out empty. Finally, we pass that list to the DataFrame constructor in the pandas library to create a DataFrame.

import pandas as pd

# Given list of nested dictionaries
# (named fruit_list so that the built-in list type is not shadowed)
fruit_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

rows = []

# Getting rows: one row per entry of each inner "Fruit" list
for data in fruit_list:
    data_row = data['Fruit']
    n = data['Name']

    for row in data_row:
        # Copy the outer "Name" value onto the inner record
        row['Name'] = n
        rows.append(row)

# Convert to data frame
df = pd.DataFrame(rows)
print(df)

This code results in the following output:

Output #1:

   Price Quality    Name
0   15.2       A  Orange
1   19.0       B  Orange
2   17.8       C  Orange
3   23.2       A  Grapes
4   28.0       B  Grapes
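
The same table can also be produced without explicit loops. The following is a minimal sketch using the pandas json_normalize function (available as pd.json_normalize in pandas 1.0 and later), assuming the input has the same shape as in the example above. The names fruit_data and df_flat are chosen here for illustration only.

```python
import pandas as pd

# Same structure as the example above; the name fruit_data is illustrative.
fruit_data = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

# record_path selects the nested list that supplies the rows;
# meta lists the outer keys whose values are repeated on every row.
df_flat = pd.json_normalize(fruit_data, record_path="Fruit", meta=["Name"])
print(df_flat)
```

Unlike the loop version, this does not modify the inner dictionaries of the input.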

Applying pivot

One can also apply the pivot_table method to reorganize the data into the layout we want.

Example #2:

import pandas as pd

# List of nested dictionaries
# (named fruit_list so that the built-in list type is not shadowed)
fruit_list = [
    {
        "Fruit": [
            {"Price": 15.2, "Quality": "A"},
            {"Price": 19, "Quality": "B"},
            {"Price": 17.8, "Quality": "C"},
        ],
        "Name": "Orange",
    },
    {
        "Fruit": [
            {"Price": 23.2, "Quality": "A"},
            {"Price": 28, "Quality": "B"},
        ],
        "Name": "Grapes",
    },
]

rows = []

# Appending rows: one row per entry of each inner "Fruit" list
for data in fruit_list:
    data_row = data['Fruit']
    n = data['Name']

    for row in data_row:
        row['Name'] = n
        rows.append(row)

# Build the data frame
df = pd.DataFrame(rows)

# One row per Name, one Price column per Quality
df = df.pivot_table(index='Name', columns=['Quality'],
                    values=['Price']).reset_index()
print(df)

Running this piece of code provides the following result:

Output #2:

       
           Name  Price
Quality              A     B     C
0        Grapes   23.2  28.0   NaN
1        Orange   15.2  19.0  17.8
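
The two-level column header in Output #2 comes from passing lists to columns and values. As a hedged sketch, passing plain strings instead should give single-level columns; the rows list below restates the flattened data from Output #1, and the name flat is illustrative.

```python
import pandas as pd

# Flattened rows, as produced by the loops in the example above
rows = [
    {"Price": 15.2, "Quality": "A", "Name": "Orange"},
    {"Price": 19, "Quality": "B", "Name": "Orange"},
    {"Price": 17.8, "Quality": "C", "Name": "Orange"},
    {"Price": 23.2, "Quality": "A", "Name": "Grapes"},
    {"Price": 28, "Quality": "B", "Name": "Grapes"},
]
df = pd.DataFrame(rows)

# Scalar arguments (not lists) keep the resulting columns single-level
flat = df.pivot_table(index="Name", columns="Quality", values="Price")
flat.columns.name = None      # drop the "Quality" label on the column axis
flat = flat.reset_index()     # turn the Name index back into a column
print(flat)
```

Note that pivot_table aggregates duplicates with the mean by default, so this only reproduces the original prices when each Name and Quality pair occurs once.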


How to convert list of nested dictionary to pandas DataFrame?

Stack Overflow question

I have some data containing nested dictionaries like below:

mylist = [{"a": 1, "b": {"c": 2, "d":3}}, {"a": 3, "b": {"c": 4, "d":3}}]

If we convert it to pandas DataFrame,

import pandas as pd 

result_dataframe = pd.DataFrame(mylist)
print(result_dataframe)

It will output:

   a                 b
0  1  {'c': 2, 'd': 3}
1  3  {'c': 4, 'd': 3}

I want to convert the list of dictionaries and ignore the key of the nested dictionary. My code is below:

new_dataframe = result_dataframe.drop(columns=["b"])
b_dict_list = [document["b"] for document in mylist]
b_df = pd.DataFrame(b_dict_list)
frames = [new_dataframe, b_df]
total_frame = pd.concat(frames, axis=1)

total_frame is what I want:

    a   c   d
0   1   2   3
1   3   4   3

But I think my code is a little complicated. Is there any simple way to deal with this problem? Thank you.

Answer:

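Use a list comprehension that copies every key except b and merges in the nested dictionary:

# The shorter {**x, **x.pop('b')} relies on evaluation order: on Python 3.9 and later
# the b key is still copied into the result, and pop also modifies mylist.
a = [{**{k: v for k, v in x.items() if k != 'b'}, **x['b']} for x in mylist]
print(a)
[{'a': 1, 'c': 2, 'd': 3}, {'a': 3, 'c': 4, 'd': 3}]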

result_dataframe = pd.DataFrame(a)
print(result_dataframe)
   a  c  d
0  1  2  3
1  3  4  3

Another solution, thanks to @Sandeep Kadapa:

a = [{'a': x['a'], **x['b']} for x in mylist]
# alternative
a = [{'a': x['a'], **x.get('b')} for x in mylist]
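
For inputs with more keys than a and b, a more general option is pandas' json_normalize, which flattens nested dictionaries into dotted column names. The sketch below assumes pandas 1.0 or later; the name flat_df and the prefix-stripping step are illustrative additions, not part of the original answer.

```python
import pandas as pd

mylist = [{"a": 1, "b": {"c": 2, "d": 3}}, {"a": 3, "b": {"c": 4, "d": 3}}]

# Nested keys become dotted column names: a, b.c, b.d
flat_df = pd.json_normalize(mylist)

# Keep only the innermost key to "ignore" the outer key b.
# Assumption: inner key names do not clash with each other or with top-level keys.
flat_df.columns = [name.split(".")[-1] for name in flat_df.columns]
print(flat_df)
```

This leaves mylist itself unchanged.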

How to convert list of nested dictionary to pandas DataFrame?

Stack Overflow question

I am new to Python so this may be pretty straightforward, but I have not been able to find a good answer for my problem after looking for a while. I am trying to create a Pandas dataframe from a list of dictionaries.

My list of nested dictionaries is the following:

my_list = [{0: {'a': '23', 'b': '15', 'c': '5', 'd': '-1'},
            1: {'a': '5', 'b': '6', 'c': '7', 'd': '9'},
            2: {'a': '9', 'b': '15', 'c': '5', 'd': '7'}},
           {0: {'a': '5', 'b': '249', 'c': '92', 'd': '-4'},
            1: {'a': '51', 'b': '5', 'c': '34', 'd': '1'},
            2: {'a': '3', 'b': '8', 'c': '3', 'd': '11'}}]

So each key in the main dictionaries has 3 values.

Putting these into a dataframe using data = pd.DataFrame(my_list) returns something unusable, as each cell has information on a, b, c and d in it.

I want to end up with a dataframe that looks like this:

 name| a  | b  | c | d 
0    | 23 | 15 | 5 | -1 
1    | 5  | 6  | 7 |  9 
2    | 9  | 15 | 5 |  7 
0    | 5  |249 | 92| -4 
1    |51  | 5  | 34|  1 
2    | 3  | 8  | 3 | 11 

Is this possible?

Answers

Variant #1

pd.concat([pd.DataFrame(l) for l in my_list],axis=1).T

Variant #2

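from itertools import chain
# DataFrame.from_items and dict.iteritems() are not available in current pandas / Python 3,
# so the (key, row) pairs are chained and passed to the constructor directly.
items = list(chain.from_iterable(d.items() for d in my_list))
pd.DataFrame([row for _, row in items], index=[key for key, _ in items])

In my experiments with the original from_items version, this was faster than using pd.concat (especially when the number of "sub-dataframes" is large) at the cost of being more verbose.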
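
The variants above return string values and keep 0, 1, 2 as the index, whereas the desired table in the question has a name column. One possible sketch of the remaining steps, assuming every value is an integer string as in my_list (the names frames and result are illustrative):

```python
import pandas as pd

my_list = [{0: {'a': '23', 'b': '15', 'c': '5', 'd': '-1'},
            1: {'a': '5', 'b': '6', 'c': '7', 'd': '9'},
            2: {'a': '9', 'b': '15', 'c': '5', 'd': '7'}},
           {0: {'a': '5', 'b': '249', 'c': '92', 'd': '-4'},
            1: {'a': '51', 'b': '5', 'c': '34', 'd': '1'},
            2: {'a': '3', 'b': '8', 'c': '3', 'd': '11'}}]

# orient="index" turns the outer keys (0, 1, 2) into row labels
frames = [pd.DataFrame.from_dict(d, orient="index") for d in my_list]

result = (
    pd.concat(frames)        # stack the per-dictionary frames vertically
    .astype(int)             # assumption: all cells are integer strings
    .rename_axis("name")     # label the index so reset_index names the column
    .reset_index()
)
print(result)
```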

Convert list of nested dictionary into pandas dataframe

# Basic syntax:
dataframe = pd.DataFrame(nested_dictionary)
dataframe = dataframe.transpose()

# Note, this only works if your nested dictionaries are set up in a 
# 	specific way. See below. 

# Create nested dictionary:
import pandas as pd
student_data = {
    0: {
        'name': 'Aadi',
        'age': 16,
        'city': 'New york'
    },
    1: {
        'name': 'Jack',
        'age': 34,
        'city': 'Sydney'
    },
    2: {
        'name': 'Riti',
        'age': 30,
        'city': 'Delhi'
    }
}

# Example usage:
pandas_dataframe = pd.DataFrame(student_data)
print(pandas_dataframe)
             0       1      2 # Outer keys become column names
name      Aadi    Jack   Riti
age         16      34     30
city  New york  Sydney  Delhi

print(pandas_dataframe.transpose())
   name age      city # After transposing, inner keys become column names
0  Aadi  16  New york
1  Jack  34    Sydney
2  Riti  30     Delhi
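
Transposing a frame with mixed column types leaves every column with the generic object dtype. As a hedged alternative, DataFrame.from_dict with orient="index" builds the row-oriented frame directly, so numeric columns such as age keep a numeric dtype. The name students_df is illustrative.

```python
import pandas as pd

student_data = {
    0: {'name': 'Aadi', 'age': 16, 'city': 'New york'},
    1: {'name': 'Jack', 'age': 34, 'city': 'Sydney'},
    2: {'name': 'Riti', 'age': 30, 'city': 'Delhi'},
}

# orient="index": outer keys become row labels, inner keys become columns
students_df = pd.DataFrame.from_dict(student_data, orient="index")
print(students_df)
print(students_df.dtypes)
```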

Archived version

Step #1: Create a list of nested dictionaries.

Step #2: Adding dictionary values to rows

Step #3: Making dataframe pivot with column names assignment

Output:

            Name  Maths  Physics  Chemistry
0  Chunky Pandey     89       80        NaN
1     Paras Jain     90       99         97
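
The code for these three steps is not preserved in the archived version. The following is a sketch of how they might look; the input structure, the key names Marks, Subject and Score, and the variable names are all assumptions chosen so that the result resembles the output above.

```python
import pandas as pd

# Step 1: a list of nested dictionaries (hypothetical structure)
records = [
    {"Name": "Paras Jain",
     "Marks": [{"Subject": "Maths", "Score": 90},
               {"Subject": "Physics", "Score": 99},
               {"Subject": "Chemistry", "Score": 97}]},
    {"Name": "Chunky Pandey",
     "Marks": [{"Subject": "Maths", "Score": 89},
               {"Subject": "Physics", "Score": 80}]},
]

# Step 2: one flat row per (student, subject) pair
rows = [{"Name": rec["Name"], **mark}
        for rec in records for mark in rec["Marks"]]

# Step 3: pivot so each subject becomes a column, then fix the column order
marks_df = (
    pd.DataFrame(rows)
    .pivot_table(index="Name", columns="Subject", values="Score")
    .reset_index()
)
marks_df = marks_df[["Name", "Maths", "Physics", "Chemistry"]]
marks_df.columns.name = None
print(marks_df)
```

Because of the missing Chemistry score, pandas stores the score columns as floats here, so the printed numbers may carry a decimal point.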
