use*_*654 -1 java arrays loops while-loop
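We are currently trying to implement the k-means algorithm in Java. Our problem is this:
We are using the getData() method to fill a two-dimensional array with data from a file. Inside the while loop of getData() we have a println(), and just before the return statement we have another one.
The first println() prints the correct values that we read from the file.
The second println() gives us 0.0 for every field in the array except arrayList[299][0].
Why is that?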
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

class KMeans {
    // Number of clusters
    int numberOfClusters = 4;
    // Starting point for each cluster (these values should be better than completely random values for our given data set)
    static double[] a = new double[]{-1.5, 2.0};
    static double[] b = new double[]{-1.5, 7.0};
    static double[] c = new double[]{1.5, 7.0};
    static double[] d = new double[]{1.5, 2.0};
    static double[][] pointArray;

    // This calculates the distance between a given point from the data set and a centroid
    public static double calculateDistance(double[] point, double[] centroid) {
        // get difference for X coordinates
        double maxX = Math.max(point[0], centroid[0]);
        double minX = Math.min(point[0], centroid[0]);
        double differenceX = maxX - minX;
        double differenceXSquared = Math.pow(differenceX, 2);
        // get difference for Y coordinates
        double maxY = Math.max(point[1], centroid[1]);
        double minY = Math.min(point[1], centroid[1]);
        double differenceY = maxY - minY;
        double differenceYSquared = Math.pow(differenceY, 2);
        // The whole thing is nothing other than Pythagoras
        double zSquared = differenceXSquared + differenceYSquared;
        double z = Math.sqrt(zSquared);
        return z;
    }

    // This calculates which of the given distances is the lowest
    public static double[] nearestCluster(double e, double f, double g, double h) {
        double x = Math.min(e, f);
        double y = Math.min(x, g);
        double z = Math.min(y, h);
        if (z == e) {
            return a;
        }
        if (z == f) {
            return b;
        }
        if (z == g) {
            return c;
        } else {
            return d;
        }
    }

    // Read the file
    public static double[][] getData() {
        try (BufferedReader br = new BufferedReader(new FileReader("/home/john/Downloads/data.txt"))) {
            String line;
            int i = 1;
            int j = 0;
            while ((line = br.readLine()) != null) {
                // Create the array in which we store each value
                pointArray = new double[i][4];
                // Splits each line at whitespace and writes it to an array
                String[] split = line.split("\\s+");
                // Parse the strings as doubles and write them to our pointArray
                pointArray[j][0] = Double.parseDouble(split[0]);
                pointArray[j][1] = Double.parseDouble(split[1]);
                System.out.println(pointArray[0][0]);
                i++;
                j++;
            }
        } catch (IOException e) {
        }
        System.out.println(pointArray[0][0]);
        return pointArray;
    }

    public static void main(String[] args) throws FileNotFoundException, IOException {
        pointArray = getData();
        for (double[] x : pointArray) {
            double distanceA = calculateDistance(x, a);
            double distanceB = calculateDistance(x, b);
            double distanceC = calculateDistance(x, c);
            double distanceD = calculateDistance(x, d);
            // Assigns the closest cluster to each point (not too efficient because we call the function twice, but it works)
            x[2] = nearestCluster(distanceA, distanceB, distanceC, distanceD)[0];
            x[3] = nearestCluster(distanceA, distanceB, distanceC, distanceD)[1];
        }
    }
}
This line
pointArray = new double[i][4];
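re-initializes the array on every pass through the loop. Each new array starts out filled with 0.0, so in effect you discard every value except the one from the last line you read.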
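The effect can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the class ReallocationDemo and the method fillWithReallocation are hypothetical names, and the input is an in-memory array instead of a file. It allocates a fresh array on each iteration in the same way as the loop in getData().

```java
// Hypothetical demo class; it only mirrors the allocation pattern from the question.
class ReallocationDemo {

    // Stores each value in column 0 of its own row, but allocates a new
    // array on every iteration, exactly like the loop in getData().
    static double[][] fillWithReallocation(double[] values) {
        double[][] arr = null;
        for (int j = 0; j < values.length; j++) {
            // A new array is zero-filled; rows written in earlier iterations are gone.
            arr = new double[j + 1][4];
            arr[j][0] = values[j];
        }
        return arr;
    }
}
```

Calling fillWithReallocation(new double[]{1.5, 2.5, 3.5}) returns an array with three rows in which only the last row holds a value. This matches the output described in the question, where only index 299 is non-zero.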
Instead, use an ArrayList to hold each row. Set it up before the while loop like this:
List<Double[]> pointList = new ArrayList<>();
Then, for each line you read, add a row to it like this:
Double[] points = new Double[4];
// ...
points[0] = Double.parseDouble(split[0]);
// etc.
pointList.add(points);
Then either return pointList directly or convert it to an array before returning.
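Put together, the approach might look like the sketch below. Several things here are assumptions rather than part of the original answer: the class name PointReader and the method readPoints are hypothetical, the method takes a Reader so it is not tied to a file path, and it stores primitive double[] rows rather than Double[] so that the unused columns 2 and 3 start at 0.0 instead of null. It also skips blank lines and rethrows IOException instead of swallowing it, which is a design choice of this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class illustrating the list-based approach.
class PointReader {

    // Reads whitespace-separated "x y" pairs, one per line, into rows of
    // length 4. Columns 2 and 3 are left at 0.0 for the cluster assignment.
    static double[][] readPoints(Reader source) {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                String[] split = line.split("\\s+");
                double[] point = new double[4]; // one new row per line, never a new table
                point[0] = Double.parseDouble(split[0]);
                point[1] = Double.parseDouble(split[1]);
                rows.add(point);
            }
        } catch (IOException e) {
            // Surface the failure instead of silently returning partial data.
            throw new UncheckedIOException(e);
        }
        // Convert the list to a two-dimensional array for the rest of the program.
        return rows.toArray(new double[0][]);
    }
}
```

In the question's code, getData() could then call something like readPoints(new FileReader(path)) and handle FileNotFoundException at the call site.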