Error "Can only compare identically-labeled Series objects" and sort_index

Question

I have two dataframes, df1 and df2, with the same number of rows and columns and the same variables, and I'm trying to compare the boolean variable choice between them. I then want to use if/else to manipulate the data, but something goes wrong when I try to compare the boolean variable.

Here are sample dataframes and my code:

#df1
v_100     choice #boolean
7          True
0          True
7          False
2          True

#df2
v_100     choice #boolean
1          False
2          True
74         True
6          True

def lastTwoTrials_outcome():
     df1 = df.iloc[5::6, :] #df1 and df2 are extracted from the same dataframe first
     df2 = df.iloc[4::6, :]

     if df1['choice'] != df2['choice']:  # if "choice" is different in the two dataframes
         df1['v_100'] = (df1['choice'] + df2['choice']) * 0.5

Here is the error:

if df1['choice'] != df2['choice']:
File "path", line 818, in wrapper
raise ValueError(msg)
ValueError: Can only compare identically-labeled Series objects

I found the same error here, and an answer suggests calling sort_index first, but I don't really understand why. Can anyone explain in more detail (if that's the correct solution)?

Thanks!

Recommended answer

I think you need reset_index so that both dataframes share the same index values, and then compare. To create the new column, it is better to use mask or numpy.where:

Also, use | instead of +, because you are working with booleans.

import numpy as np

# Give both frames the same 0..n-1 labels so the Series can be compared
df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)

# Option 1: Series.mask
df1['v_100'] = df1['choice'].mask(df1['choice'] != df2['choice'],
                                  (df1['choice'] | df2['choice']) * 0.5)

# Option 2: numpy.where
df1['v_100'] = np.where(df1['choice'] != df2['choice'],
                       (df1['choice'] | df2['choice']) * 0.5,
                        df1['choice'])
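As for why the labels matter: pandas only compares two Series whose indexes are equal, including their order. The sketch below rebuilds the sample data with the labels that the iloc slices leave behind (5 to 8 and 4 to 7). The constructor calls and the helper can_compare are my own illustration, not part of the original code. It suggests that sort_index helps when both Series carry the same labels in a different order, while here the labels themselves differ, so reset_index(drop=True) is what makes them match.

```python
import pandas as pd

# Sample data with the labels left over from slicing the original frame
df1 = pd.DataFrame({"v_100": [7, 0, 7, 2],
                    "choice": [True, True, False, True]}, index=[5, 6, 7, 8])
df2 = pd.DataFrame({"v_100": [1, 2, 74, 6],
                    "choice": [False, True, True, True]}, index=[4, 5, 6, 7])


def can_compare(a, b):
    """Hypothetical helper: True if `a != b` runs without a ValueError."""
    try:
        a != b
        return True
    except ValueError:
        return False


# Labels 5..8 vs 4..7: the indexes are not equal, so comparison fails
raw_ok = can_compare(df1["choice"], df2["choice"])

# Sorting does not change which labels exist, so it still fails here
sorted_ok = can_compare(df1["choice"].sort_index(), df2["choice"].sort_index())

# After reset_index both Series are labeled 0..3 and can be compared
reset_ok = can_compare(df1["choice"].reset_index(drop=True),
                       df2["choice"].reset_index(drop=True))

# Case where sort_index does help: same labels, different order
a = pd.Series([True, False], index=[0, 1])
b = pd.Series([False, True], index=[1, 0])
reordered_raw_ok = can_compare(a, b)
reordered_sorted_ok = can_compare(a.sort_index(), b.sort_index())

print(raw_ok, sorted_ok, reset_ok)
print(reordered_raw_ok, reordered_sorted_ok)
```

Note that reset_index pairs rows by position, which is only appropriate when row i of df1 is really meant to be compared with row i of df2.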

Sample:

print (df1)
   v_100  choice
5      7    True
6      0    True
7      7   False
8      2    True

print (df2)
   v_100  choice
4      1   False
5      2    True
6     74    True
7      6    True

df1 = df1.reset_index(drop=True)
df2 = df2.reset_index(drop=True)
print (df1)
   v_100  choice
0      7    True
1      0    True
2      7   False
3      2    True

print (df2)
   v_100  choice
0      1   False
1      2    True
2     74    True
3      6    True

df1['v_100'] = df1['choice'].mask(df1['choice'] != df2['choice'],
                                  (df1['choice'] | df2['choice']) * 0.5)

print (df1)
   v_100  choice
0    0.5    True
1    1.0    True
2    0.5   False
3    1.0    True
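One more point that may be worth checking: even with matching labels, the original `if df1['choice'] != df2['choice']:` cannot work as written, because the comparison returns a boolean Series and pandas refuses to reduce a multi-element Series to a single True/False. The sketch below, which assumes the sample data after reset_index (the variable names differs and ambiguous are mine), shows that behaviour and then applies the numpy.where variant row by row.

```python
import numpy as np
import pandas as pd

# Sample data as it looks after reset_index(drop=True)
df1 = pd.DataFrame({"v_100": [7, 0, 7, 2],
                    "choice": [True, True, False, True]})
df2 = pd.DataFrame({"v_100": [1, 2, 74, 6],
                    "choice": [False, True, True, True]})

# Element-wise comparison: one boolean per row
differs = df1["choice"] != df2["choice"]

# A plain `if` on a multi-element Series raises ValueError
try:
    if differs:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

# Vectorized if/else: 0.5 where the choices differ, else df1's choice
df1["v_100"] = np.where(differs,
                        (df1["choice"] | df2["choice"]) * 0.5,
                        df1["choice"])

print(ambiguous)
print(df1)
```

Because numpy.where combines a float array with a boolean array, the resulting column is float, so True appears as 1.0.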
