Posts by a_g*_*geo

Extracting the p-value from gam.check in R

When I run gam.check(my_spline_gam), I get the following output.

Method: GCV   Optimizer: magic
Smoothing parameter selection converged after 9 iterations.
The RMS GCV score gradiant at convergence was 4.785628e-06 .
The Hessian was positive definite.
The estimated model rank was 25 (maximum possible: 25)
Model rank =  25 / 25 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

         k'    edf k-index p-value
s(x) 24.000 22.098   0.849    0.06

My question is whether I can extract this p-value on its own into a table.

r gam p-value

5 votes, 1 answer, 418 views

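How do I get datediff() in seconds in PySpark?

I have tried the code from (this_post), but I cannot get the date difference in seconds. I am simply calling datediff() on the 'Attributes_Timestamp_fix' and 'lagged_date' columns shown below. Any hints? My code and output follow.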

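from pyspark.sql import Window
from pyspark.sql.functions import datediff, lag

# Previous timestamp within each id, ordered by time.
eg = eg.withColumn("lagged_date", lag(eg.Attributes_Timestamp_fix, 1)
                   .over(Window.partitionBy("id")
                         .orderBy("Attributes_Timestamp_fix")))

# Difference between the current and previous timestamp (datediff counts days).
eg = eg.withColumn("time_diff",
                   datediff(eg.Attributes_Timestamp_fix, eg.lagged_date))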

        id      Attributes_Timestamp_fix time_diff
0   3.531611e+14    2018-04-01 00:01:02 NaN
1   3.531611e+14    2018-04-01 00:01:02 0.0
2   3.531611e+14    2018-04-01 00:03:13 0.0
3   3.531611e+14    2018-04-01 00:03:13 0.0
4   3.531611e+14    2018-04-01 00:03:13 0.0
5   3.531611e+14    2018-04-01 00:03:13 0.0
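
A possible explanation for the zeros, offered as a sketch rather than a confirmed answer: `datediff()` returns the number of whole days between two dates, so timestamps that fall on the same calendar day give 0. The plain-Python part below reproduces that on three timestamps taken from the output above and contrasts it with a difference in seconds. The helper `add_time_diff_seconds` is a hypothetical name, not something from the post. It assumes `Attributes_Timestamp_fix` is a timestamp column (or a string in the default `yyyy-MM-dd HH:mm:ss` format) and subtracts `unix_timestamp()` values, which is one common way to get seconds in PySpark.

```python
from datetime import datetime

# Three timestamps copied from the output in the question.
rows = ["2018-04-01 00:01:02", "2018-04-01 00:01:02", "2018-04-01 00:03:13"]
parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in rows]

# Day-based difference between consecutive rows (the behaviour datediff has).
day_diffs = [(b.date() - a.date()).days for a, b in zip(parsed, parsed[1:])]

# Second-based difference between the same consecutive rows.
second_diffs = [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]

print(day_diffs)
print(second_diffs)


def add_time_diff_seconds(df, ts_col="Attributes_Timestamp_fix", id_col="id"):
    """Hypothetical helper: add a 'time_diff' column in seconds to a Spark DataFrame.

    Assumes ts_col is a timestamp (or a default-format timestamp string).
    PySpark is imported here so the plain-Python part above runs without Spark.
    """
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    w = Window.partitionBy(id_col).orderBy(ts_col)
    lagged = F.lag(F.col(ts_col), 1).over(w)
    # unix_timestamp() yields seconds since the epoch, so the subtraction is in seconds.
    return df.withColumn(
        "time_diff", F.unix_timestamp(F.col(ts_col)) - F.unix_timestamp(lagged)
    )
```

With the DataFrame from the question this would be called as `eg = add_time_diff_seconds(eg)`. The first row of each `id` has no previous timestamp, so its `time_diff` stays null.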

python datediff apache-spark pyspark

4 votes, 1 answer, 5988 views

Tag statistics

apache-spark ×1

datediff ×1

gam ×1

p-value ×1

pyspark ×1

python ×1

r ×1