Modifying DataFrames inside a list is not working

Problem description

I have two DataFrames and I want to perform the same list of cleaning operations on both. I realize I could merge them into one and do everything in one pass, but I am still curious why this method is not working.

import numpy as np
import pandas as pd

test_1 = pd.DataFrame({
    "A": [1, 8, 5, 6, 0],
    "B": [15, 49, 34, 44, 63]
})
test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500]
})

Let's assume I only want to keep the rows without NaNs. I tried:

for df in [test_1, test_2]:
    df = df[pd.notnull(df["A"])]

but test_2 is left untouched. On the other hand, if I do:

test_2 = test_2[pd.notnull(test_2["A"])]

then the first row is gone.

Recommended answer

All these slicing/indexing operations create views/copies of the original DataFrame. You then reassign the loop variable df to these views/copies, which only changes what the name df refers to. Neither the list nor the names test_1 and test_2 are affected, so the originals are not touched at all.
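As a minimal sketch of that rebinding behaviour, the snippet below repeats the loop from the question on test_2 alone. The names frames, filtered, is_new_object, rows_after_loop, and list_still_holds_original are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500],
})

frames = [test_2]  # hypothetical list name, for illustration only
for df in frames:
    # Boolean indexing returns a new DataFrame object.
    filtered = df[pd.notnull(df["A"])]
    is_new_object = filtered is not df
    # This only rebinds the loop variable; test_2 and frames[0] are unaffected.
    df = filtered

rows_after_loop = len(test_2)                    # still 6 rows
list_still_holds_original = frames[0] is test_2  # the list entry did not change
print(is_new_object, rows_after_loop, list_still_holds_original)
```

The filtered result exists only under the name df, and it is discarded on the next iteration or when the loop ends.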

Option 1
dropna(..., inplace=True)
Try an in-place dropna call; this should modify the original object in place.

df_list = [test_1, test_2]
for df in df_list:
    df.dropna(subset=['A'], inplace=True)  

Note that this is one of the few times I will ever recommend an in-place modification, and only because of this particular use case.
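To check the effect of the in-place call, here is a small self-contained sketch using the data from the question. The variable names rows_1, rows_2, and index_2 are illustrative additions.

```python
import numpy as np
import pandas as pd

test_1 = pd.DataFrame({
    "A": [1, 8, 5, 6, 0],
    "B": [15, 49, 34, 44, 63],
})
test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500],
})

for df in [test_1, test_2]:
    # Mutates the object that df refers to, which is the same object as test_1 / test_2.
    df.dropna(subset=["A"], inplace=True)

rows_1 = len(test_1)          # test_1 had no NaN in "A", so nothing is dropped
rows_2 = len(test_2)          # the NaN row of test_2 is removed
index_2 = list(test_2.index)  # the remaining rows keep their original labels
print(rows_1, rows_2, index_2)
```

The index is not renumbered by dropna, so test_2 now starts at label 1. If a fresh 0-based index is wanted, a follow-up call to reset_index(drop=True, inplace=True) on each frame is one option.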

Option 2
enumerate with reassignment
Alternatively, you may reassign to the list:

for i, df in enumerate(df_list):
    df_list[i] = df.dropna(subset=['A'])  # df_list[i] = df[df.A.notnull()]
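One caveat with Option 2: the reassignment updates the entries of df_list, but the names test_1 and test_2 still refer to the original, unfiltered objects. The sketch below, using the question's data, shows this and one possible way to rebind the names by unpacking the list. The names rows_before_unpack and rows_after_unpack are illustrative.

```python
import numpy as np
import pandas as pd

test_1 = pd.DataFrame({
    "A": [1, 8, 5, 6, 0],
    "B": [15, 49, 34, 44, 63],
})
test_2 = pd.DataFrame({
    "A": [np.nan, 3, 6, 4, 9, 0],
    "B": [-100, 100, 200, 300, 400, 500],
})

df_list = [test_1, test_2]
for i, df in enumerate(df_list):
    # Store the filtered copy back into the list slot.
    df_list[i] = df.dropna(subset=["A"])

rows_before_unpack = len(test_2)  # the name test_2 still points at the 6-row original

# Rebind the original names to the cleaned frames held in the list.
test_1, test_2 = df_list
rows_after_unpack = len(test_2)
print(rows_before_unpack, rows_after_unpack)
```

If the rest of the code only works through df_list, the unpacking step is unnecessary.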
