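Posts by bol*_*lla

Adding scikit-learn (sklearn) predictions to a pandas DataFrame

I am trying to add an sklearn prediction to a pandas DataFrame so that I can evaluate the prediction thoroughly. The relevant code snippet is: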

import pandas as pd
from sklearn import linear_model

clf = linear_model.LinearRegression()
clf.fit(Xtrain, ytrain)
ypred = pd.DataFrame({'pred_lin_regr': pd.Series(clf.predict(Xtest))})


The DataFrames look like this:

Xtest

       axial_MET  cos_theta_r1  deltaE_abs  lep1_eta   lep1_pT  lep2_eta  
8000   1.383026      0.332365    1.061852  0.184027  0.621598 -0.316297   
8001  -1.054412      0.046317    1.461788 -1.141486  0.488133  1.011445   
8002   0.259077      0.429920    0.769219  0.631206  0.353469  1.027781   
8003  -0.096647      0.066200    0.411222 -0.867441  0.856115 -1.357888   
8004   0.145412      0.371409    1.111035  1.374081  0.485231  0.900024


ytest


ypred

        pred_lin_regr
0       0.461636
1       0.314448
2       0.363751
3       0.291858
4       0.416056


Concatenating Xtest and ytest works fine:

df_total = pd.concat([Xtest, ytest], …

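The printed frames above show Xtest labelled 8000 to 8004 while ypred is labelled 0 to 4. Since pd.concat with axis=1 aligns rows by index label, that mismatch is one plausible source of trouble when appending the predictions. Below is a minimal sketch of keeping the labels in sync. It assumes only two of the Xtest columns for brevity and uses a plain NumPy array as a stand-in for clf.predict(Xtest); the name `misaligned` is introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# First two feature columns of Xtest as printed above, keeping its index 8000..8004.
Xtest = pd.DataFrame(
    {
        "axial_MET": [1.383026, -1.054412, 0.259077, -0.096647, 0.145412],
        "cos_theta_r1": [0.332365, 0.046317, 0.429920, 0.066200, 0.371409],
    },
    index=range(8000, 8005),
)

# Stand-in (assumption) for clf.predict(Xtest): a NumPy array carries no index.
preds = np.array([0.461636, 0.314448, 0.363751, 0.291858, 0.416056])

# Built without an index, the predictions are labelled 0..4, so concat(axis=1)
# finds no matching labels and pads both sides with NaN.
misaligned = pd.concat([Xtest, pd.DataFrame({"pred_lin_regr": preds})], axis=1)

# Reusing Xtest.index labels each prediction with the row it was computed from.
ypred = pd.DataFrame({"pred_lin_regr": preds}, index=Xtest.index)
df_total = pd.concat([Xtest, ypred], axis=1)

print(misaligned.shape)  # (10, 3)
print(df_total.shape)    # (5, 3)
```

Passing index=Xtest.index when building ypred keeps the original row labels, which is usually preferable to resetting the index on Xtest if those labels are needed later.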

python numpy pandas scikit-learn

bol*_*lla

2018-11-16

8
votes

1
answer

10k
views

Conditional mean and sum of the previous N rows in a pandas DataFrame

Consider this example pandas DataFrame:

      Measurement  Trigger  Valid
   0          2.0    False   True
   1          4.0    False   True
   2          3.0    False   True
   3          0.0     True  False
   4        100.0    False   True
   5          3.0    False   True
   6          2.0    False   True
   7          1.0     True   True


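Whenever Trigger is True, I want to compute the sum and mean of the last 3 valid measurements, counting back from the current row (inclusive). A measurement is considered valid if the Valid column is True. Two examples from the DataFrame above should make this clear:

Index 3: indices 2, 1, 0 should be used. Expected Sum = 9.0, Mean = 3.0
Index 7: indices 7, 6, 5 should be used. Expected Sum = 6.0, Mean = 2.0

I have tried using pandas.rolling to create new, shifted columns, but without success. See the following excerpt from my tests (it should run as-is):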

import unittest
import pandas as pd
import numpy as np …

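For reference, here is one possible way to get the expected values. This is a sketch of one approach, not the accepted answer: roll over the valid rows only, forward-fill the result onto the full index so an invalid trigger row picks up the most recent valid window, then keep values only where Trigger is True. The column names Sum and Mean and the variable names are assumptions made for this example.

```python
import pandas as pd

# The example DataFrame from the question.
df = pd.DataFrame(
    {
        "Measurement": [2.0, 4.0, 3.0, 0.0, 100.0, 3.0, 2.0, 1.0],
        "Trigger": [False, False, False, True, False, False, False, True],
        "Valid": [True, True, True, False, True, True, True, True],
    }
)

N = 3  # number of valid measurements to aggregate

# Roll over valid rows only, so invalid measurements never enter a window.
valid = df.loc[df["Valid"], "Measurement"]
roll_sum = valid.rolling(N).sum()
roll_mean = valid.rolling(N).mean()

# Put the results back on the full index; forward-filling lets an invalid row
# (such as index 3) reuse the window ending at the last valid row before it.
# Finally, blank out every row where Trigger is False.
df["Sum"] = roll_sum.reindex(df.index).ffill().where(df["Trigger"])
df["Mean"] = roll_mean.reindex(df.index).ffill().where(df["Trigger"])

print(df[["Trigger", "Sum", "Mean"]])
```

With the data above, rows 3 and 7 get Sum 9.0 / Mean 3.0 and Sum 6.0 / Mean 2.0, matching the expected values in the question; all other rows are NaN.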

python dataframe pandas rolling-computation rolling-sum

bol*_*lla

2019-02-18

7
votes

1
answer

6138
views

Tag statistics

pandas ×2

python ×2

dataframe ×1

numpy ×1

rolling-computation ×1

rolling-sum ×1

scikit-learn ×1
