TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

dar*_*ils 20 python pandas

I ran the following command in Python pandas:

q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:'))]]

I got the following error:

TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

The solution I tried to use: error link

I changed the code accordingly to:

q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:')) & (q1_fisher_r[(q1_fisher_r['TP53']==1)])]

But I still get the same error: TypeError: Cannot perform 'rand_' with a dtyped [float64] array and scalar of type [bool]

jez*_*ael 26

To filter by multiple conditions, chain the conditions with & and filter once by boolean indexing:

q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r['TumorST'].str.contains(':1:')]
               ^^^^                        ^^^^
           first condition            second condition

The problem is that this code returns the filtered data rather than a boolean mask, so it cannot be chained as a condition:

q1_fisher_r[(q1_fisher_r['TumorST'].str.contains(':1:'))]

The same problem applies to:

q1_fisher_r[(q1_fisher_r['TP53']==1)]
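To make the difference concrete, here is a minimal sketch that compares the types involved. It assumes a small made-up frame that only borrows the question's column names; `mask_tp53`, `mask_st`, `filtered` and `result` are illustrative names, not anything from the original code.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
q1_fisher_r = pd.DataFrame({
    'TP53': [1, 1, 2, 1],
    'TumorST': ['5:1:', '9:1:', '5:1:', '6:1'],
})

# Each comparison produces a boolean Series (a mask), one value per row.
mask_tp53 = q1_fisher_r['TP53'] == 1
mask_st = q1_fisher_r['TumorST'].str.contains(':1:')
print(type(mask_tp53).__name__, mask_tp53.dtype)

# Indexing with a mask produces a filtered DataFrame, not a mask,
# so it is not a suitable operand for &.
filtered = q1_fisher_r[mask_st]
print(type(filtered).__name__, filtered.shape)

# Combine the masks first, then index once.
result = q1_fisher_r[mask_tp53 & mask_st]
print(result)
```

The point of the sketch is only that `&` should see two masks of the same length; the indexing step happens once, at the end.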

Sample:

import pandas as pd

q1_fisher_r = pd.DataFrame({'TP53':[1,1,2,1], 'TumorST':['5:1:','9:1:','5:1:','6:1']})
print (q1_fisher_r)
   TP53 TumorST
0     1    5:1:
1     1    9:1:
2     2    5:1:
3     1     6:1

df = q1_fisher_r[(q1_fisher_r['TP53']==1) & q1_fisher_r['TumorST'].str.contains(':1:')]
print (df)
   TP53 TumorST
0     1    5:1:
1     1    9:1:


Hed*_*e92 7

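I had a similar problem with the setup below, which produced the same error message. The very simple fix for me was to put each individual condition in its own parentheses. I should have known this, but I want to highlight it in case someone else runs into the same problem.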

Incorrect code:

conditions = [
    (df['A'] == '15min' & df['B'].dt.minute == 15),  # Note brackets only surrounding both conditions together, not each individual condition
    df['A'] == '30min' & df['B'].dt.minute == 30,  # Note no brackets at all 
]

output = [
    df['Time'] + dt.timedelta(minutes = 45),
    df['Time'] + dt.timedelta(minutes = 30),
]

df['TimeAdjusted'] = np.select(conditions, output, default = np.datetime64('NaT'))

Correct code:

conditions = [
    (df['A'] == '15min') & (df['B'].dt.minute == 15),  # Note brackets surrounding each condition
    (df['A'] == '30min') & (df['B'].dt.minute == 30),  # Note brackets surrounding each condition
]

output = [
    df['Time'] + dt.timedelta(minutes = 45),
    df['Time'] + dt.timedelta(minutes = 30),
]

df['TimeAdjusted'] = np.select(conditions, output, default = np.datetime64('NaT'))
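The parentheses matter because of Python operator precedence: `&` binds more tightly than `==`, so `a == x & b == y` is parsed as the chained comparison `a == (x & b) == y`. Here is a minimal sketch of that, first with plain integers and then with a small hypothetical frame. The columns `A` and `B` mirror the names above, but the data is made up and `B` holds plain integers instead of datetimes to keep it short.

```python
import pandas as pd

# With plain integers: & is evaluated before ==, so this is the
# chained comparison 1 == (1 & 2) == 2, i.e. 1 == 0 == 2.
unparenthesized = 1 == 1 & 2 == 2
parenthesized = (1 == 1) & (2 == 2)
print(unparenthesized, parenthesized)  # False True

# Hypothetical frame; B holds minutes as integers for simplicity.
df = pd.DataFrame({'A': ['15min', '30min', '15min'], 'B': [15, 30, 30]})

# Parenthesizing each comparison gives two boolean masks to combine.
mask = (df['A'] == '15min') & (df['B'] == 15)
print(mask.tolist())  # [True, False, False]

# Without parentheses, pandas is first asked for '15min' & df['B'],
# which is expected to fail; both exception types are caught here
# because the exact one may depend on the pandas version.
raised = False
try:
    df['A'] == '15min' & df['B'] == 15
except (TypeError, ValueError) as exc:
    raised = True
    print(type(exc).__name__)
```

The same reasoning applies to `|`, so wrapping every individual comparison in parentheses is the safe habit.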

  • This worked for me: wrapping each individual condition on either side of & or | in parentheses. (2 upvotes)