Way to read first few lines for pandas dataframe


Is there a built-in way to use read_csv to read only the first n lines of a file without knowing the length of the lines ahead of time? I have a large file that takes a long time to read, and occasionally only want to use the first, say, 20 lines to get a sample of it (and prefer not to load the full thing and take the head of it).

If I knew the total number of lines I could do something like footer_lines = total_lines - n and pass this to the skipfooter keyword arg. My current solution is to manually grab the first n lines with python and StringIO it to pandas:

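import pandas as pd
from io import StringIO  # Python 3 location; the old StringIO module is Python 2 only
from itertools import islice

n = 20
with open("big_file.csv", "r") as f:
    # islice stops after n lines; readlines(n) would treat n as a byte-size hint
    head = "".join(islice(f, n))

df = pd.read_csv(StringIO(head))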

It's not that bad, but is there a more concise, "pandasic" (?) way to do it with keywords or something?
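
For what it's worth, read_csv accepts an nrows keyword that limits how many data rows are parsed, which looks like the keyword-based approach the question is after. A minimal sketch, assuming a small throwaway CSV written to a temporary file (the file contents and the names sample_path and df_head are made up for illustration, not taken from the question):

```python
import os
import tempfile

import pandas as pd

# Build a throwaway CSV with a header and 100 data rows (illustrative data only).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("a,b\n")
    for i in range(100):
        tmp.write(f"{i},{i * 2}\n")
    sample_path = tmp.name

n = 20
# nrows counts data rows, so the header line is read in addition to these n rows.
df_head = pd.read_csv(sample_path, nrows=n)

os.remove(sample_path)
print(df_head.shape)
```

Because nrows counts rows after the header, nrows=20 consumes 21 lines of a file that has a header, which differs slightly from the manual approach of grabbing the first 20 raw lines.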


