Convert number strings with commas in pandas DataFrame to float

StackOverflow

I have a DataFrame that contains numbers as strings with commas for the thousands marker. I need to convert them to floats.

a = [["1,200", "4,200"], ["7,000", "-0.03"], [ "5", "0"]]
df=pandas.DataFrame(a)

I am guessing I need to use locale.atof. Indeed

df[0].apply(locale.atof)

works as expected. I get a Series of floats.

But when I apply it to the DataFrame, I get an error.

df.apply(locale.atof)

TypeError: ("cannot convert the series to ", u"occurred at index 0")

and

df[0:1].apply(locale.atof)

gives another error:

ValueError: ("invalid literal for float(): 1,200", u"occurred at index 0")

So, how do I convert this DataFrame of strings to a DataFrame of floats?

Answer rating: 171

If you"re reading in from csv then you can use the thousands arg:

df.read_csv("foo.tsv", sep="	", thousands=",")

This method is likely to be more efficient than performing the operation as a separate step.


You need to set the locale first:

In [ 9]: import locale

In [10]: from locale import atof

In [11]: locale.setlocale(locale.LC_NUMERIC, "")
Out[11]: "en_GB.UTF-8"

In [12]: df.applymap(atof)
Out[12]:
      0        1
0  1200  4200.00
1  7000    -0.03
2     5     0.00




Get Solution for free from DataCamp guru