小编Kha*_*Mei的帖子

Spark java.lang.StackOverflowError

我正在使用spark来计算用户评论的页面,但是java.lang.StackOverflowError当我在大数据集上运行我的代码时,我会不断获得Spark (40k条目).当在少量条目上运行代码时,它工作正常.

输入示例:

product/productId: B00004CK40   review/userId: A39IIHQF18YGZA   review/profileName: C. A. M. Salas  review/helpfulness: 0/0 review/score: 4.0   review/time: 1175817600 review/summary: Reliable comedy review/text: Nice script, well acted comedy, and a young Nicolette Sheridan. Cusak is in top form.
Run Code Online (Sandbox Code Playgroud)

代码:

public void calculatePageRank() {
    sc.clearCallSite();
    sc.clearJobGroup();

    JavaRDD < String > rddFileData = sc.textFile(inputFileName).cache();
    sc.setCheckpointDir("pagerankCheckpoint/");

    JavaRDD < String > rddMovieData = rddFileData.map(new Function < String, String > () {

        @Override
        public String call(String arg0) throws Exception {
            String[] data = arg0.split("\t"); …
Run Code Online (Sandbox Code Playgroud)

java mapreduce apache-spark

8
推荐指数
3
解决办法
1万
查看次数

标签 统计

apache-spark ×1

java ×1

mapreduce ×1