I am using boolean indexing in Pandas. The question is why the statement:
a[(a['some_column']==some_number) & (a['some_other_column']==some_other_number)]
works fine, whereas
a[(a['some_column']==some_number) and (a['some_other_column']==some_other_number)]
raises an error?
Example:
a=pd.DataFrame({'x':[1,1],'y':[10,20]})
In: a[(a['x']==1)&(a['y']==10)]
Out:    x   y
     0  1  10
In: a[(a['x']==1) and (a['y']==10)]
Out: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
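For context, a minimal sketch of the difference between the two forms, using the same small frame as the example. The names `mask`, `result` and `and_error` are introduced here for illustration only. `&` combines the two comparisons element by element, while `and` first asks Python for a single truth value of its left operand, which a Series with more than one element refuses to give.

```python
import pandas as pd

a = pd.DataFrame({'x': [1, 1], 'y': [10, 20]})

# `&` is applied element-wise, so the result is a boolean Series
# with one entry per row that can be used as a row filter.
mask = (a['x'] == 1) & (a['y'] == 10)
result = a[mask]

# `and` calls bool() on the left-hand Series before looking at the
# right-hand side; pandas raises ValueError because the truth value
# of a multi-element Series is ambiguous.
try:
    a[(a['x'] == 1) and (a['y'] == 10)]
    and_error = None
except ValueError as exc:
    and_error = exc
```

The parentheses around each comparison matter with `&`, because `&` binds more tightly than `==`.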
I am getting
TypeError: unhashable type: 'slice'
when executing the code below to encode categorical data in Python. Can anyone help?
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('50_Startups.csv')
y=dataset.iloc[:, 4]
X=dataset.iloc[:, 0:4]
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X = LabelEncoder()
X[:, 3] = labelencoder_X.fit_transform(X[:, 3])
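One likely cause, sketched under assumptions: `X` here is a DataFrame, and `X[:, 3]` is NumPy-style indexing that a DataFrame does not accept, whereas a NumPy array does. The sketch below converts the selections with `.values` first. The small frame stands in for `50_Startups.csv` (its column names and numbers are invented), and `pd.factorize(..., sort=True)` is swapped in for `LabelEncoder().fit_transform` only to keep the example free of extra dependencies.

```python
import pandas as pd

# Hypothetical stand-in for 50_Startups.csv; names and values are invented.
dataset = pd.DataFrame({
    'spend_a': [100.0, 200.0, 300.0],
    'spend_b': [10.0, 20.0, 30.0],
    'spend_c': [1.0, 2.0, 3.0],
    'state': ['New York', 'California', 'Florida'],
    'profit': [5.0, 6.0, 7.0],
})

# .values returns NumPy arrays, which support X[:, 3] indexing.
X = dataset.iloc[:, 0:4].values
y = dataset.iloc[:, 4].values

# Stand-in for labelencoder_X.fit_transform(X[:, 3]): integer codes
# assigned to the sorted unique labels.
codes, categories = pd.factorize(X[:, 3], sort=True)
X[:, 3] = codes
```

If `X` should stay a DataFrame instead, positional access goes through `X.iloc[:, 3]` rather than `X[:, 3]`.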
Suppose I have the following DataFrame:
         a        b        c        d
0 0.049531 0.408824 0.975756 0.658347
1 0.981644 0.520834 0.258911 0.639664
2 0.641042 0.534873 0.806442 0.066625
3 0.764057 0.063252 0.256748 0.045850
and I want only the subset of columns whose value in row 0 is greater than 0.5. I can do this:
df2 = df.T
myResult = df2[df2.iloc[:, 0] > 0.5].T
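But this feels like a horrible hack. Is there a better way to do boolean indexing along columns? Is there somewhere I can specify an axis argument?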
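One possible approach, offered as a sketch rather than the only answer: `.loc` takes a row selector and a column selector, so a boolean Series indexed by column name can go in the second position without transposing. The frame below reuses the values shown above; `by_loc` and `by_transpose` are names introduced here for comparison.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [0.049531, 0.981644, 0.641042, 0.764057],
    'b': [0.408824, 0.520834, 0.534873, 0.063252],
    'c': [0.975756, 0.258911, 0.806442, 0.256748],
    'd': [0.658347, 0.639664, 0.066625, 0.045850],
})

# df.iloc[0] is row 0 as a Series indexed by column name, so the
# comparison yields one boolean per column.
by_loc = df.loc[:, df.iloc[0] > 0.5]

# The transpose-based version from the question, for comparison.
df2 = df.T
by_transpose = df2[df2.iloc[:, 0] > 0.5].T
```

Both select columns `c` and `d` here; the `.loc` form avoids the two transposes.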