标签: netlib-java

为什么netlib-java native blas/lapack库不能提高性能?

我正在使用这段代码来计算火花推荐:

    SparkSession spark = SparkSession
            .builder()
            .appName("SomeAppName")
            .config("spark.master", "local[" + args[2] + "]")
            .config("spark.local.dir",args[4])
            .getOrCreate();
    JavaRDD<Rating> ratingsRDD = spark
            .read().textFile(args[0]).javaRDD()
            .map(Rating::parseRating);
    Dataset<Row> ratings = spark.createDataFrame(ratingsRDD, Rating.class);
    ALS als = new ALS()
            .setMaxIter(Integer.parseInt(args[3]))
            .setRegParam(0.01)
            .setUserCol("userId")
            .setItemCol("movieId")
            .setRatingCol("rating").setImplicitPrefs(true);

    ALSModel model = als.fit(ratings);
    model.setColdStartStrategy("drop");
    Dataset<Row> rowDataset = model.recommendForAllUsers(50);
Run Code Online (Sandbox Code Playgroud)

这些是maven依赖项,使这段代码工作:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.11</artifactId>
        <version>2.4.0</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.11.8</version>
    </dependency>
Run Code Online (Sandbox Code Playgroud)

使用此代码计算建议需要大约70秒的数据文件.此代码生成以下警告:

WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
WARN BLAS: Failed to load …
Run Code Online (Sandbox Code Playgroud)

maven apache-spark-mllib netlib-java recommender-systems

5
推荐指数
1
解决办法
242
查看次数