Posts by Dam*_*ips

Spark/Scala: repeated calls to withColumn() using the same function on multiple columns

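I currently have code in which I repeatedly apply the same procedure to multiple DataFrame columns via a chain of .withColumn calls, and I would like to write a function that streamlines this. In my case, I am computing cumulative sums of several columns, aggregated by key: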

val newDF = oldDF
  .withColumn("cumA", sum("A").over(Window.partitionBy("ID").orderBy("time")))
  .withColumn("cumB", sum("B").over(Window.partitionBy("ID").orderBy("time")))
  .withColumn("cumC", sum("C").over(Window.partitionBy("ID").orderBy("time")))
  //.withColumn(...)

What I would like is something like:

def createCumulativeColums(cols: Array[String], df: DataFrame): DataFrame = {
  // Implement the above cumulative sums, partitioning, and ordering
}

Or, better yet:

def withColumns(cols: Array[String], df: DataFrame, f: function): DataFrame = {
  // Implement a udf/arbitrary function on all the specified columns
}

scala user-defined-functions dataframe apache-spark apache-spark-sql

Votes: 15 | Answers: 2 | Views: 10k

Pip install fails because cmake cannot be found

Trying pip3 install pyportfolioopt, the build fails with

...ERROR: Failed building wheel for osqp
Failed to build osqp
ERROR: Could not build wheels for osqp, which is required to install pyproject.toml-based projects

...and in the traceback I can see that the problem is

Traceback (most recent call last):
        File "/Library/Frameworks/Python.framework/Versions/3.10/bin/cmake", line 5, in <module>
          from cmake import cmake
      ModuleNotFoundError: No module named 'cmake'

So I ran pip3 install cmake; pip3 list | grep cmake now shows cmake 3.24.0. I tried the install again, but got the same error.

OSX 12.5 (Monterey), M1 chip, Python 3.10.6

Edit: note that the Python cmake module is not the same thing as the cmake build tool.
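
The traceback shows that the cmake found at /Library/Frameworks/Python.framework/Versions/3.10/bin/cmake is a Python wrapper script that fails on from cmake import cmake. One way to narrow this down is to check which cmake the shell resolves, which interpreter that wrapper targets, and whether the current interpreter can import the cmake module. The following is a minimal diagnostic sketch using only the standard library; diagnose_cmake and wrapper_shebang are hypothetical helper names introduced here, not part of pip or cmake.

```python
import importlib.util
import shutil
import sys


def wrapper_shebang(path):
    """Return the shebang of a script at `path`, or None for a native binary."""
    try:
        with open(path, "rb") as handle:
            first_line = handle.readline()
    except OSError:
        return None
    if first_line.startswith(b"#!"):
        # Drop the leading "#!" and surrounding whitespace.
        return first_line[2:].decode("utf-8", errors="replace").strip()
    return None


def diagnose_cmake():
    """Collect facts about how `cmake` resolves for the running interpreter."""
    # First `cmake` on PATH; a pip-generated wrapper can shadow a real binary.
    executable = shutil.which("cmake")
    # find_spec returns None when this interpreter cannot import `cmake`.
    module_spec = importlib.util.find_spec("cmake")
    return {
        "interpreter": sys.executable,
        "cmake_on_path": executable,
        "wrapper_shebang": wrapper_shebang(executable) if executable else None,
        "cmake_module_importable": module_spec is not None,
    }


if __name__ == "__main__":
    for key, value in diagnose_cmake().items():
        print(f"{key}: {value}")
```

Running this with the same python3 that backs pip3 should show whether the wrapper's shebang points at a different interpreter than the one where pip3 installed the cmake module. If they differ, that mismatch would be a plausible explanation for the ModuleNotFoundError, though the sketch only reports the facts and does not confirm the cause.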

cmake python-3.10

Votes: 7 | Answers: 1 | Views: 10k

Julia UndefVarError: subtypes not defined

It is not clear to me why I get ERROR: LoadError: UndefVarError: subtypes not defined when executing a .jl file, but not when running the same code from the REPL.

For example:

abstract type Asset end

abstract type Property <: Asset end
abstract type Investment <: Asset end
abstract type Cash <: Asset end
println(subtypes(Asset))

> 3-element Array{Any,1}:
 Cash
 Investment
 Property

...but putting the same code into test.jl and running it:

julia test.jl

> ERROR: LoadError: UndefVarError: subtypes not defined
Stacktrace:
 [1] top-level scope at /.../test.jl:6
 [2] include(::Module, ::String) at ./Base.jl:377
 [3] exec_options(::Base.JLOptions) at ./client.jl:288
 [4] _start() at ./client.jl:484
in expression starting at /.../test.jl:6

Julia …

julia

Votes: 4 | Answers: 1 | Views: 363