Assume I have a pandas DataFrame with two columns, A and B. I'd like to modify this DataFrame (or create a copy) so that B is always NaN whenever A is 0. How would I achieve that?
I tried the following:
df["A"==0]["B"] = np.nan
Use .loc for label-based indexing:
df.loc[df.A==0, "B"] = np.nan
The df.A==0 expression creates a boolean Series that selects the rows, and
"B" selects the column. You can also use this to transform a subset of a column, e.g.:
df.loc[df.A==0, "B"] = df.loc[df.A==0, "B"] / 2
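To make both .loc patterns concrete, here is a minimal sketch on a made-up four-row DataFrame. The sample values and the names mask and halved are assumptions for illustration only; they do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this example.
df = pd.DataFrame({"A": [0, 1, 0, 2], "B": [10.0, 20.0, 30.0, 40.0]})

# Boolean Series: True for the rows where A is 0.
mask = df["A"] == 0

# Transform only the selected rows of B, working on a copy.
halved = df.copy()
halved.loc[mask, "B"] = halved.loc[mask, "B"] / 2

# Set B to NaN on the selected rows of the original, in one indexing operation.
df.loc[mask, "B"] = np.nan
```

Storing the mask in a variable avoids computing df.A==0 twice when it appears on both sides of the assignment.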
I don't know enough about pandas internals to say exactly why that works, but the basic issue is that indexing into a DataFrame sometimes returns a copy of the result and sometimes returns a view on the original object. According to the pandas documentation, this behavior depends on the underlying numpy behavior. I've found that accessing everything in one operation (rather than [one][two]) is more likely to work for setting.
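The difference between [one][two] and a single .loc call can be checked directly. The following is a small sketch, assuming a made-up three-row DataFrame; the variable names are hypothetical. It also shows that "A"==0 in the original attempt is plain Python comparing a string with an integer, so it never looks at column A at all.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical sample data, invented for this example.
df = pd.DataFrame({"A": [0, 1, 0], "B": [1.0, 2.0, 3.0]})

# A string compared with an integer is simply False in Python,
# so df["A"==0] asks for a column labelled False, not for rows where A is 0.
key = ("A" == 0)

# Chained indexing: df[mask] is a copy, so the assignment lands on a
# temporary object. pandas warns about this; the warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[df["A"] == 0]["B"] = np.nan
chained_changed = bool(df["B"].isna().any())

# Single operation: rows and column are selected together, so df is modified.
df.loc[df["A"] == 0, "B"] = np.nan
loc_changed = bool(df["B"].isna().any())
```

After running this, chained_changed should be False and loc_changed should be True, since boolean-mask selection returns a copy and only the .loc assignment writes to df itself.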