When I try to slice a Pandas DataFrame using a logical expression, I get an exception.
My data has the following form:
df
GDP_norm SP500_Index_deflated_norm
Year
1980 2.121190 0.769400
1981 2.176224 0.843933
1982 2.134638 0.700833
1983 2.233525 0.829402
1984 2.395658 0.923654
1985 2.497204 0.922986
1986 2.584896 1.09770
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 38 entries, 1980 to 2017
Data columns (total 2 columns):
GDP_norm 38 non-null float64
SP500_Index_deflated_norm 38 non-null float64
dtypes: float64(2)
memory usage: 912.0 bytes
The command is as follows:
df[((df['GDP_norm'] >=3.5 & df['GDP_norm'] <= 4.5) & (df['SP500_Index_deflated_norm'] > 3)) | (
(df['GDP_norm'] >= 4.0 & df['GDP_norm'] <= 5.0) & (df['SP500_Index_deflated_norm'] < …

I am trying to convert the index of a Pandas DataFrame to datetime using the to_datetime() method, but I get an exception.
For a toy reproducible example:
My data:
In JSON format:
'{"0":{"Id":1,"Name":"Lewis Alexander","Organization":"Nomura Securities International","Dec 2018":3.25,"June 2019":3.25,"Dec 2019":3.0,"June 2020":3.0,"Dec 2020":2.88},"1":{"Id":2,"Name":"Scott Anderson","Organization":"Bank of the West","Dec 2018":3.19,"June 2019":3.5,"Dec 2019":3.47,"June 2020":3.1,"Dec 2020":2.6},"2":{"Id":3,"Name":"Paul Ashworth","Organization":"Capital Economics","Dec 2018":3.25,"June 2019":3.0,"Dec 2019":2.5,"June 2020":2.25,"Dec 2020":2.25},"3":{"Id":4,"Name":"Daniel Bachman","Organization":"Deloitte LP","Dec 2018":3.2,"June 2019":3.4,"Dec 2019":3.5,"June 2020":3.6,"Dec 2020":3.7},"4":{"Id":5,"Name":"Bernard Baumohl","Organization":"Economic Outlook Group","Dec 2018":3.1,"June 2019":3.35,"Dec 2019":3.6,"June 2020":3.9,"Dec 2020":4.2}}'
When I try:
df3.index.to_datetime()
I get the following error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-53-5ff789e24651> in <module>()
----> 1 df3.index.to_datetime()
AttributeError: 'Index' object has no attribute 'to_datetime'
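A possible way around this error, sketched under assumptions: recent pandas versions do not provide an Index.to_datetime() method, but the top-level function pd.to_datetime() accepts an Index. The question does not show how df3 was built, so the frame below (its column name and its date-like string labels) is a hypothetical stand-in.

```python
import pandas as pd

# Hypothetical stand-in for df3: the question does not show its construction.
df3 = pd.DataFrame(
    {"forecast": [3.25, 3.25, 3.0]},
    index=["2018-12-01", "2019-06-01", "2019-12-01"],
)

# Convert the index with the top-level function instead of an Index method.
df3.index = pd.to_datetime(df3.index)

print(type(df3.index).__name__)
```

If the labels are not in a format pandas can infer, passing an explicit format= argument to pd.to_datetime() is usually the safer choice.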

I ran into a strange exception in Seaborn.
For a reproducible example:
toy_data.to_json()
'{"X":{"0":0.12765045,"1":0.0244816152,"2":0.1263715245,"3":0.0246376768,"4":0.1108581319,"5":0.1406719382,"6":0.1358105564,"7":0.1245863432,"8":0.1175445352,"9":0.1188479018,"10":0.1113148159,"11":0.117455495,"12":0.110555662,"13":0.1328567106,"14":0.103064284,"15":0.1119474442,"16":0.119390455,"17":0.1246727756,"18":0.1117827155,"19":0.1169972547},"Y":{"0":0.1241083714,"1":0.1394242378,"2":0.1225010796,"3":0.0077080173,"4":0.1198371354,"5":0.0029026989,"6":0.1259473297,"7":0.0,"8":0.0,"9":0.1214620231,"10":0.1204110739,"11":0.0,"12":0.1194570059,"13":0.0014971676,"14":0.1184584731,"15":0.1212061305,"16":0.1221438778,"17":0.0,"18":0.1209991075,"19":0.0},"Label":{"0":"17","1":"3","2":"17","3":"0","4":"14","5":"21","6":"16","7":"23","8":"20","9":"15","10":"14","11":"20","12":"14","13":"22","14":"13","15":"14","16":"15","17":"23","18":"14","19":"20"},"Probability":{"0":1.0,"1":1.0,"2":1.0,"3":1.0,"4":1.0,"5":1.0,"6":1.0,"7":1.0,"8":1.0,"9":1.0,"10":1.0,"11":1.0,"12":1.0,"13":1.0,"14":1.0,"15":1.0,"16":1.0,"17":1.0,"18":0.9101796407,"19":1.0}}'
toy_data.head()
X Y Label Probability
0 0.127650 0.124108 17 1.0
1 0.024482 0.139424 3 1.0
2 0.126372 0.122501 17 1.0
3 0.024638 0.007708 0 1.0
4 0.110858 0.119837 14 1.0
sns.scatterplot(x = toy_data.X, y = toy_data.Y, hue = toy_data.Label.values, alpha = 0.5)
AttributeError: 'str' object has no attribute 'view'
A similar exception occurs with this syntax:
sns.scatterplot(x = 'X', y = 'Y', data = toy_data, hue = 'Label', alpha = 0.5)

I know there is a method .argmax() that returns the index of the maximum value along an axis.
But what if we want the indices of the 10 highest values along an axis?
How can this be done?
For example:
data = pd.DataFrame(np.random.random_sample((50, 40)))
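One way this could be approached, as a minimal sketch rather than the definitive answer: sort the negated values with np.argsort and keep the first k positions along the chosen axis. The names k, values, top_idx and top_col0 are illustrative, and a seeded generator replaces np.random.random_sample only to make the run repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((50, 40)))

k = 10
values = data.to_numpy()

# argsort sorts ascending, so sort the negated values to get descending order
# and keep the first k rows. top_idx[i, j] is the row position of the
# (i+1)-th largest value in column j.
top_idx = np.argsort(-values, axis=0)[:k]

# Per-column alternative in pandas: index labels of the k largest values
# in column 0, ordered from largest to smallest.
top_col0 = data[0].nlargest(k).index.to_numpy()

print(top_idx.shape)
```

The first row of top_idx matches what argmax(axis=0) returns; using axis=1 instead (and slicing [:, :k]) would give the top positions within each row.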

I have a simple Python script that I run from a Jupyter Notebook. However, the arguments I pass to it seem to be ignored, which leads to an exception:
two_digits.py
import sys
input = sys.stdin.read()
tokens = input.split()
a = int(tokens[0])
b = int(tokens[1])
print(a + b)
%run two_digits 3 5
IndexError Traceback (most recent call last)
D:\Mint_ns\two_digits.py in <module>()
5 tokens = input.split()
6
----> 7 a = int(tokens[0])
8
9 b = int(tokens[1])
IndexError: list index out of range
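A likely explanation, offered as a sketch: values given after the script name in %run arrive as command-line arguments in sys.argv, not on standard input, so sys.stdin.read() has nothing to split. The helper add_two below is a hypothetical name introduced for illustration; it takes the argument list explicitly so the behaviour can be checked without a notebook.

```python
import sys


def add_two(args):
    """Return the sum of the first two tokens in args, parsed as integers."""
    a = int(args[0])
    b = int(args[1])
    return a + b


# Inside two_digits.py the call would be add_two(sys.argv[1:]);
# sys.argv[0] holds the script name. Here the arguments of
# "%run two_digits 3 5" are simulated with an explicit list.
print(add_two(["3", "5"]))  # 8
```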

I wrote the following custom evaluation function for use with xgboost, in order to optimize F1. Unfortunately, it raises an exception when run with xgboost.
The evaluation function is as follows:
def F1_eval(preds, labels):
t = np.arange(0, 1, 0.005)
f = np.repeat(0, 200)
Results = np.vstack([t, f]).T
P = sum(labels == 1)
for i in range(200):
m = (preds >= Results[i, 0])
TP = sum(labels[m] == 1)
FP = sum(labels[m] == 0)
if (FP + TP) > 0:
Precision = TP/(FP + TP)
Recall = TP/P
if (Precision + Recall >0) :
F1 = 2 * Precision * Recall / (Precision + Recall)
else: …

I wonder why my code doesn't work. I expected it to return 11, but it raises an exception instead:
def f():
    counter = 1
    def f1():
        global counter
        counter += 1
    while True:
        f1()
        if counter>10:
            return(counter)
f()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-219-0ec059b9bfe1> in <module>()
----> 1 f()
<ipython-input-218-50c23b042204> in f()
9 counter += 1
10
---> 11 f1()
12
13 if counter>10:
<ipython-input-218-50c23b042204> in f1()
7 global counter
8
----> 9 counter += 1
10
11 f1()
NameError: name 'counter' is not defined
Since counter is declared as global, and since it appears and is defined in the enclosing environment of f1() (inside f()), why do I get this error message?
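For comparison, here is a sketch of the same function using nonlocal. The global statement makes f1() look up counter in the module namespace, where no such name exists, while nonlocal binds to the variable in the enclosing function f(). The variable result is added only for the demonstration.

```python
def f():
    counter = 1

    def f1():
        # nonlocal refers to the counter defined in the enclosing f(),
        # whereas global would look for a module-level name.
        nonlocal counter
        counter += 1

    while True:
        f1()
        if counter > 10:
            return counter


result = f()
print(result)  # 11
```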
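
I want to create a contingency table in Pandas. I can do it with the code below, but I wonder whether there is a Pandas function that does this for me.
For a reproducible example: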
toy_data #json
'{"Light":{"321":"no_light","476":"night_light","342":"lamp","454":"lamp","25":"night_light","53":"night_light","120":"night_light","346":"night_light","360":"lamp","55":"no_light","391":"night_light","243":"no_light","101":"night_light","377":"night_light","124":"no_light","368":"lamp","400":"no_light","247":"night_light","270":"lamp","208":"night_light"},"Nearsightedness":{"321":"No","476":"Yes","342":"Yes","454":"Yes","25":"No","53":"Yes","120":"Yes","346":"No","360":"No","55":"Yes","391":"Yes","243":"No","101":"No","377":"Yes","124":"No","368":"No","400":"No","247":"No","270":"Yes","208":"No"}}'
toy_data.head()
Light Nearsightedness
321 no_light No
476 night_light Yes
342 lamp Yes
454 lamp Yes
25 night_light No
df = pd.DataFrame(toy_data.groupby(['Light', 'Nearsightedness']).size())
df = df.unstack('Nearsightedness')
df.columns = df.columns.droplevel()
df
Nearsightedness No Yes
Light
lamp 2 3
night_light 5 5
no_light 4 1
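pd.crosstab appears to cover this case: it counts the combinations of two categorical columns directly. The sketch below uses only the five rows shown by toy_data.head() above, so its counts are smaller than those in the full table; the name table is illustrative.

```python
import pandas as pd

# The five rows printed by toy_data.head() in the question.
toy_data = pd.DataFrame(
    {
        "Light": ["no_light", "night_light", "lamp", "lamp", "night_light"],
        "Nearsightedness": ["No", "Yes", "Yes", "Yes", "No"],
    },
    index=[321, 476, 342, 454, 25],
)

# Rows are the Light categories, columns the Nearsightedness categories;
# combinations that never occur are filled with 0.
table = pd.crosstab(toy_data["Light"], toy_data["Nearsightedness"])
print(table)
```

Passing margins=True adds row and column totals, which is often useful for a contingency table.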

The following code runs a Sequential Keras model on the MNIST data packaged with Keras; it is very simple.
When I run the following code, I get an exception.
The code is easy to reproduce.
import tensorflow as tf
class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if(logs.get('acc')>0.99):
            print("\nReached 99% accuracy so cancelling training!")
            self.model.stop_training = True
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
callbacks = myCallback()
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])
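One detail in the code above may be relevant: the model is compiled with metrics=['accuracy'], while the callback reads logs.get('acc'). Assuming the logs dictionary is keyed by the metric name given to compile, logs.get('acc') would return None, and comparing None with 0.99 raises a TypeError in Python 3. The sketch below illustrates that lookup in plain Python, without TensorFlow; reached_target is a hypothetical helper, not part of the Keras API.

```python
def reached_target(logs, target=0.99):
    """Return True when the logged accuracy exceeds target.

    Hypothetical helper: it checks 'accuracy' first, falls back to the
    older 'acc' key, and treats a missing metric as "not reached"
    instead of comparing None with a float.
    """
    logs = logs or {}
    acc = logs.get("accuracy", logs.get("acc"))
    return acc is not None and acc > target


print(reached_target({"loss": 0.05, "accuracy": 0.995}))  # True
print(reached_target({"loss": 0.20, "accuracy": 0.94}))   # False
print(reached_target({"loss": 0.20}))                     # False
```

Inside on_epoch_end, the same check would replace the direct comparison on logs.get('acc').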
The exception is:
Epoch 1/10
59296/60000 [============================>.] - ETA: 0s - loss: 0.2005 - accuracy: …

I am learning to use Tensorboard with Tensorflow 2.0.
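In particular, I want to monitor the learning curves in real time, and to visually inspect and communicate the architecture of my model.
Below I provide code for a reproducible example.
I have three problems:
Although I get the learning curves once training is over, I don't know what I should do to monitor them in real time.
The learning curves I get from Tensorboard do not match the plot of history.history. In fact, their reversals are strange and hard to interpret.
I cannot make sense of the graph. I trained a sequential model with 5 dense layers and dropout layers in between. What Tensorboard shows me is something with many more elements in it.
My code is the following: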
from keras.datasets import boston_housing
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()
inputs = Input(shape = (train_data.shape[1], ))
x1 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(inputs)
x1a = Dropout(0.5)(x1)
x2 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x1a)
x2a = Dropout(0.5)(x2)
x3 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x2a)
x3a = Dropout(0.5)(x3)
x4 = Dense(100, kernel_initializer = 'he_normal', activation = 'elu')(x3a)
x4a = Dropout(0.5)(x4)
x5 …