PCA implementation in Java

Tru*_*rup 8 java pca

I need to implement PCA in Java. I'm interested in finding something that is well documented, practical and easy to use. Any recommendations?

Lot*_*oti 14

There are now a number of Principal Component Analysis implementations for Java.

  1. Apache Spark: https://spark.apache.org/docs/2.1.0/mllib-dimensionality-reduction.html#principal-component-analysis-pca

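    import java.util.Arrays;
    import java.util.List;
    
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.mllib.feature.PCA;
    import org.apache.spark.mllib.feature.PCAModel;
    import org.apache.spark.mllib.linalg.Matrix;
    import org.apache.spark.mllib.linalg.Vector;
    import org.apache.spark.mllib.linalg.Vectors;
    import org.apache.spark.rdd.RDD;
    
    SparkConf conf = new SparkConf().setAppName("PCAExample").setMaster("local");
    try (JavaSparkContext sc = new JavaSparkContext(conf)) {
        // Create points as Spark Vectors
        List<Vector> vectors = Arrays.asList(
                Vectors.dense( -1.0, -1.0 ),
                Vectors.dense( -1.0, 1.0 ),
                Vectors.dense( 1.0, 1.0 ));
    
        // Create Spark MLlib RDD
        JavaRDD<Vector> distData = sc.parallelize(vectors);
        RDD<Vector> vectorRDD = distData.rdd();
    
        // Fit PCA, keeping the top 2 principal components
        PCA pca = new PCA(2);
        PCAModel pcaModel = pca.fit(vectorRDD);
        // Each column of this matrix is one principal component
        Matrix matrix = pcaModel.pc();
    }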
    
  2. ND4J: http://nd4j.org/doc/org/nd4j/linalg/dimensionalityreduction/PCA.html

    //Create points as NDArray instances
    List<INDArray> ndArrays = Arrays.asList(
            new NDArray(new float [] {-1.0F, -1.0F}),
            new NDArray(new float [] {-1.0F, 1.0F}),
            new NDArray(new float [] {1.0F, 1.0F}));
    
    //Create matrix of points (rows are observations; columns are features)
    INDArray matrix = new NDArray(ndArrays, new int [] {3,2});
    
    //Execute PCA - again to 2 dimensions
    INDArray factors = PCA.pca_factor(matrix, 2, false);
    
  3. Apache Commons Math (single-threaded; no framework)

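    import org.apache.commons.math3.linear.EigenDecomposition;
    import org.apache.commons.math3.linear.MatrixUtils;
    import org.apache.commons.math3.linear.RealMatrix;
    import org.apache.commons.math3.stat.correlation.Covariance;
    
    // Create points in a double array (rows are observations; columns are features)
    double[][] pointsArray = new double[][] { 
        new double[] { -1.0, -1.0 }, 
        new double[] { -1.0, 1.0 },
        new double[] { 1.0, 1.0 } };
    
    // Create real matrix
    RealMatrix realMatrix = MatrixUtils.createRealMatrix(pointsArray);
    
    // Create covariance matrix of points, then find its eigenvectors
    // (the eigenvectors are the principal components)
    // See https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
    
    Covariance covariance = new Covariance(realMatrix);
    RealMatrix covarianceMatrix = covariance.getCovarianceMatrix();
    EigenDecomposition ed = new EigenDecomposition(covarianceMatrix);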
    
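To make the covariance-then-eigenvectors idea in option 3 concrete, here is a minimal library-free sketch on the same three points. It is not what any of the libraries above do internally: the class and method names (`PcaSketch`, `covariance`, `firstComponent`) are made up for illustration, and it swaps in plain power iteration, which only recovers the first principal component and assumes the largest eigenvalue is unique and the covariance matrix is non-zero.

```java
import java.util.Arrays;

public class PcaSketch {

    // Sample covariance matrix (divides by n - 1).
    // Rows are observations; columns are features.
    static double[][] covariance(double[][] data) {
        int n = data.length, d = data[0].length;
        double[] mean = new double[d];
        for (double[] row : data)
            for (int j = 0; j < d; j++) mean[j] += row[j] / n;

        double[][] cov = new double[d][d];
        for (double[] row : data)
            for (int j = 0; j < d; j++)
                for (int k = 0; k < d; k++)
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
        return cov;
    }

    // Power iteration: repeatedly multiply by the matrix and renormalise.
    // Converges to the eigenvector of the largest eigenvalue, provided the
    // start vector is not orthogonal to it.
    static double[] firstComponent(double[][] cov, int iterations) {
        int d = cov.length;
        double[] v = new double[d];
        v[0] = 1.0; // arbitrary start vector (an assumption of this sketch)
        for (int it = 0; it < iterations; it++) {
            double[] w = new double[d];
            for (int j = 0; j < d; j++)
                for (int k = 0; k < d; k++) w[j] += cov[j][k] * v[k];
            double norm = 0.0;
            for (double x : w) norm += x * x;
            norm = Math.sqrt(norm);
            for (int j = 0; j < d; j++) v[j] = w[j] / norm;
        }
        return v;
    }

    public static void main(String[] args) {
        double[][] points = { { -1.0, -1.0 }, { -1.0, 1.0 }, { 1.0, 1.0 } };
        double[][] cov = covariance(points);
        double[] pc1 = firstComponent(cov, 100);
        System.out.println("covariance = " + Arrays.deepToString(cov));
        System.out.println("first principal component = " + Arrays.toString(pc1));
    }
}
```

For these points the covariance matrix works out to [[4/3, 2/3], [2/3, 4/3]], so the first component points along the (1, 1) diagonal. For anything beyond a toy case, a full eigendecomposition from one of the libraries above is the better choice.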

Note that Singular Value Decomposition, which can also be used to find principal components, has equivalent implementations.
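The reason SVD can stand in for the covariance eigendecomposition can be written out briefly. The notation here is an assumption for this sketch, not something taken from the libraries: $X_c$ is the $n \times d$ data matrix with column means subtracted, $X_c = U \Sigma V^{\top}$ is its thin SVD, and $C$ is the sample covariance matrix.

```latex
\begin{aligned}
X_c &= U \Sigma V^{\top}, \\
C &= \frac{1}{n-1} X_c^{\top} X_c
   = \frac{1}{n-1} V \Sigma U^{\top} U \Sigma V^{\top}
   = V \, \frac{\Sigma^{2}}{n-1} \, V^{\top}, \\
\lambda_i &= \frac{\sigma_i^{2}}{n-1}.
\end{aligned}
```

So the columns of $V$ are the principal components, and each covariance eigenvalue $\lambda_i$ is the squared singular value $\sigma_i^2$ divided by $n-1$.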


NPE*_*NPE 7

Here is one: the PCA class.

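This class contains the methods necessary for a basic Principal Component Analysis with a varimax rotation. Options are available for an analysis using either the covariance or the correlation matrix. A parallel analysis, using Monte Carlo simulations, is performed. Extraction criteria based on eigenvalues greater than unity, greater than a Monte Carlo eigenvalue percentile or greater than the Monte Carlo eigenvalue means are available.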
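As a rough illustration of what such extraction criteria decide, the sketch below counts how many components each rule would keep. It is not the API of the class described above: `ExtractionCriteria` and both method names are hypothetical, the eigenvalues are made-up numbers, and the Monte Carlo simulation that would produce the thresholds is not shown. The parallel-analysis rule here assumes both arrays are sorted in descending order and stops at the first component that fails.

```java
public class ExtractionCriteria {

    // Eigenvalue-greater-than-one rule: count eigenvalues of the
    // correlation matrix that exceed 1.
    static int countGreaterThanOne(double[] eigenvalues) {
        int count = 0;
        for (double ev : eigenvalues)
            if (ev > 1.0) count++;
        return count;
    }

    // Parallel-analysis rule: keep leading components while the observed
    // eigenvalue exceeds the simulated threshold at the same rank
    // (the threshold may be a Monte Carlo mean or a percentile).
    static int countAboveSimulated(double[] observed, double[] simulated) {
        int count = 0;
        while (count < observed.length && count < simulated.length
                && observed[count] > simulated[count]) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical values for illustration only, not from a real dataset.
        double[] observed = { 2.6, 1.1, 0.2, 0.1 };
        double[] simulated = { 1.4, 1.2, 0.9, 0.5 };
        System.out.println("eigenvalue > 1 keeps: " + countGreaterThanOne(observed));
        System.out.println("parallel analysis keeps: " + countAboveSimulated(observed, simulated));
    }
}
```

With these illustrative numbers the two rules disagree (two components versus one), which is the usual reason a library offers more than one criterion.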