Posts by zad*_*lik

Fitting a lognormal distribution in Python using curve_fit

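I have a hypothetical function y of x and am trying to fit the lognormal distribution curve that best matches the shape of the data. I am using the curve_fit function and was able to fit a normal distribution, but the resulting curve does not look optimal.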

The y and x data points are given below, where y = f(x).

y_axis = [0.00032425299473065838, 0.00063714106162861229, 0.00027009331177605913, 0.00096672396877715144, 0.002388766809835889, 0.0042233337680543182, 0.0053072824980722137, 0.0061291327849408699, 0.0064555344006149871, 0.0065601228278316746, 0.0052574034010282218, 0.0057924488798939255, 0.0048154093097913355, 0.0048619350036057446, 0.0048154093097913355, 0.0045114840997070331, 0.0034906838696562147, 0.0040069911024866456, 0.0027766995669134334, 0.0016595801819374015, 0.0012182145074882836, 0.00098231827111984341, 0.00098231827111984363, 0.0012863691645616997, 0.0012395921040321833, 0.00093554121059032721, 0.0012629806342969417, 0.0010057068013846018, 0.0006081017868837127, 0.00032743942370661445, 4.6777060529516312e-05, 7.0165590794274467e-05, 7.0165590794274467e-05, 4.6777060529516745e-05]


The y-axis values are the probabilities of an event occurring in each x-axis time bin:

x_axis = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0]


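Using Excel and a lognormal approach, I was able to get a better fit to my data. When I try to use a lognormal in Python, the fit does not work, so I must be doing something wrong.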

Below is the code I use to fit a normal distribution, which seems to be the only distribution I can fit in Python (hard to believe):

#fitting …

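For reference, a lognormal fit with curve_fit might be set up along the following lines. This is a minimal sketch on synthetic data, not the poster's code: the helper name `lognormal_pdf`, the extra `amplitude` parameter, the starting guess, and the bounds are all assumptions. The amplitude is included because the y values posted above sum to roughly 0.085, so a unit-area density could not match them without a scale factor.

```python
import numpy as np
from scipy.optimize import curve_fit


def lognormal_pdf(x, mu, sigma, amplitude):
    """Lognormal density scaled by `amplitude` (hypothetical helper, not from the post)."""
    return (amplitude / (x * sigma * np.sqrt(2.0 * np.pi))
            * np.exp(-((np.log(x) - mu) ** 2) / (2.0 * sigma ** 2)))


# Synthetic stand-in data: 34 unit-width bins, with assumed "true" parameters
# and a little seeded noise so the example is reproducible.
rng = np.random.default_rng(0)
x = np.arange(1.0, 35.0)
true_params = (2.3, 0.5, 0.085)
y = lognormal_pdf(x, *true_params) + rng.normal(0.0, 1e-4, size=x.size)

# Starting guess (assumption): log of the peak location for mu, a moderate
# sigma, and the summed bin values as the overall scale.
p0 = (np.log(x[np.argmax(y)]), 0.5, y.sum())

# Bounds (assumption): keep sigma and amplitude positive.
bounds = ([-np.inf, 1e-6, 0.0], [np.inf, np.inf, np.inf])

popt, pcov = curve_fit(lognormal_pdf, x, y, p0=p0, bounds=bounds)
mu_fit, sigma_fit, amplitude_fit = popt
print(mu_fit, sigma_fit, amplitude_fit)
```

To try this on the real data, `x` and `y` would be replaced by `np.array(x_axis)` and `np.array(y_axis)`; whether a single lognormal describes that data well is a separate question.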

python statistics numpy distribution scipy

zad*_*lik

2017 04-06

5
votes

1
answer

2099
views

Python hyperparameter optimization for XGBClassifier using RandomizedSearchCV

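I am trying to find the best hyperparameters for XGBClassifier, the ones that yield the most predictive attributes. I am trying to use RandomizedSearchCV to iterate and validate through KFold.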

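Since I run this process a total of 5 times (numFolds=5), I want the best results to be saved in a dataframe called collector (specified below). On each iteration, I want the best result and its score appended to the collector dataframe.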

import xgboost as xgb  # needed for xgb.XGBClassifier below
from scipy import stats
from scipy.stats import randint
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import precision_score, recall_score, accuracy_score, f1_score, roc_auc_score

clf_xgb = xgb.XGBClassifier(objective = 'binary:logistic')
param_dist = {'n_estimators': stats.randint(150, 1000),
              'learning_rate': stats.uniform(0.01, 0.6),
              'subsample': stats.uniform(0.3, 0.9),
              'max_depth': [3, 4, 5, 6, 7, 8, 9],
              'colsample_bytree': stats.uniform(0.5, 0.9),
              'min_child_weight': [1, 2, 3, 4]
             }
clf = RandomizedSearchCV(clf_xgb, param_distributions = param_dist, n_iter = 25, scoring = 'roc_auc', error_score = 0, verbose = 3, n_jobs …
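
A note on the distributions above: `scipy.stats.uniform(loc, scale)` samples from the interval [loc, loc + scale], not [loc, scale]. The sketch below checks the range implied by the `subsample` entry as posted and shows one possible alternative capped at 1.0, since XGBoost expects `subsample` and `colsample_bytree` to lie in (0, 1]. The helper `bounds_of` and the capped values are illustrative assumptions, not part of the original post.

```python
from scipy import stats


def bounds_of(dist):
    """Return the (lower, upper) end points of a frozen distribution (hypothetical helper)."""
    return float(dist.ppf(0.0)), float(dist.ppf(1.0))


# As posted: loc=0.3, scale=0.9, so the upper end is 0.3 + 0.9 = 1.2.
as_posted = stats.uniform(0.3, 0.9)
posted_low, posted_high = bounds_of(as_posted)

# One possible alternative (assumption): keep loc=0.3 and choose the scale
# so that the upper end is 1.0.
capped = stats.uniform(loc=0.3, scale=0.7)
capped_low, capped_high = bounds_of(capped)

# Draw some samples to confirm they stay inside the capped range.
samples = capped.rvs(size=1000, random_state=0)
print(posted_low, posted_high, capped_low, capped_high, samples.min(), samples.max())
```

With `error_score = 0`, any candidate that fails to fit because of an out-of-range value would be scored as 0 instead of raising an error, which can make such draws easy to overlook.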

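The per-fold collection described above could be sketched roughly as follows. This is a hedged illustration, not the poster's code: `LogisticRegression` stands in for `XGBClassifier` so the example runs without xgboost, the data is synthetic, and the column names in `collector` are assumptions. Rows are gathered in a list and turned into a dataframe once at the end, since `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd
from scipy import stats
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold, RandomizedSearchCV

# Synthetic binary classification data (stand-in for the real features/labels).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

num_folds = 5
outer_cv = KFold(n_splits=num_folds, shuffle=True, random_state=0)

# Stand-in search space; for XGBClassifier this would be the param_dist above.
param_dist = {"C": stats.uniform(0.01, 10.0)}

rows = []
for fold, (train_idx, test_idx) in enumerate(outer_cv.split(X), start=1):
    search = RandomizedSearchCV(
        LogisticRegression(max_iter=1000),
        param_distributions=param_dist,
        n_iter=5,
        scoring="roc_auc",
        cv=3,
        random_state=fold,
    )
    search.fit(X[train_idx], y[train_idx])

    # Score the refitted best model on the held-out fold.
    proba = search.predict_proba(X[test_idx])[:, 1]
    rows.append({
        "fold": fold,
        "best_cv_auc": search.best_score_,
        "test_auc": roc_auc_score(y[test_idx], proba),
        **search.best_params_,
    })

# One row per outer fold: best parameters plus their scores.
collector = pd.DataFrame(rows)
print(collector)
```

Swapping in `clf_xgb` and the posted `param_dist` should follow the same pattern, though that has not been verified here.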

python classification bayesian xgboost

zad*_*lik

lucky-day

4
votes

1
answer

10k
views