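I have been given a large CSV file that I need to strip down for use in machine learning. I have managed to find a way to cut the file down to the 2 columns I need, but I have a problem.
I basically have a file structured like this.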
"David", "Red"
"David", "Ford"
"David", "Blue"
"David", "Aspergers"
"Steve", "Red"
"Steve", "Vauxhall"
I need the data to look more like...
"David", "Red", "Ford", "Blue", "Aspergers"
"Steve", "Red", "Vauxhall"
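One way to get from the first layout to the second is to group the second column by the first. The following is only a minimal sketch in Python 3, not the code from the question: `group_rows` is a hypothetical helper name, the sample rows are copied from the example above, and it assumes the two needed columns have already been extracted and that dicts preserve insertion order (Python 3.7+).

```python
import csv
import io


def group_rows(rows):
    """Collect second-column values under each first-column key, in first-seen order."""
    grouped = {}
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        grouped.setdefault(row[0], []).append(row[1])
    return [[name] + values for name, values in grouped.items()]


# Sample rows copied from the example above.
sample = "\n".join([
    '"David", "Red"',
    '"David", "Ford"',
    '"David", "Blue"',
    '"David", "Aspergers"',
    '"Steve", "Red"',
    '"Steve", "Vauxhall"',
])

# skipinitialspace lets the reader treat the quote after ", " as a field quote.
reader = csv.reader(io.StringIO(sample), skipinitialspace=True)
merged = group_rows(reader)

# Write the grouped rows back out as CSV, one line per name.
out = io.StringIO()
csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(merged)
print(out.getvalue())
```

The grouped rows have different lengths, which `csv.writer` accepts, but a fixed-width format may be needed depending on what the machine learning tool expects.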
I currently have this to strip the CSV file:
import csv
cr = csv.reader(open("traits.csv","rb"), delimiter=',', lineterminator='\n')
cr.next() # skipping header line, no point in removing it as I need to standardise data manipulation.
# Print out the id of species and trait values
print 'Stripping input'
vals = [(row[1], row[4]) for row in cr]
print str(vals) + '\n'
with open("output.csv", "wb") as f:
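    writer …

I am trying to sort a large data file using an insertion-based sorting algorithm. The code runs fine, but the output is incorrect. I have gone over it again and again to no avail. Can anyone see where I am going wrong?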
public void sort(Comparable[] items) {
    for (int i = 1; i < items.length; i++) {
        Comparable temp = items[i];
        int j = i - 1;
        while (j >= 0 && items[j].compareTo(items[j]) > 0) {
            items[j + 1] = items[j];
            j = j - 1;
        }
        items[j] = temp;
    }
}
A sample data file I made is...
2
1
3
5
9
6
7
4
8
Obviously the output should be 1, 2, 3, 4 ..., but I get 1 3 5 9 6 7 4 8 8.
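Two details in the `sort` method look like the cause. The test `items[j].compareTo(items[j])` compares an element with itself, so it always returns 0 and the inner loop never runs. The final write then goes to `items[j]` instead of `items[j + 1]`, which copies each element one slot to the left. Tracing the sample input through those two steps gives exactly 1 3 5 9 6 7 4 8 8. Below is a minimal sketch of the corrected logic, written in Python for brevity rather than in the Java of the question (`insertion_sort` is an illustrative name). The same two changes would apply in Java: compare against `temp`, and store into `items[j + 1]`.

```python
def insertion_sort(items):
    """Sort a list in place with insertion sort and return it."""
    for i in range(1, len(items)):
        temp = items[i]
        j = i - 1
        # Compare each earlier element against temp, not against itself.
        while j >= 0 and items[j] > temp:
            items[j + 1] = items[j]  # shift the larger element one slot right
            j -= 1
        # The gap left by the shifting is at j + 1, so temp goes there.
        items[j + 1] = temp
    return items


data = [2, 1, 3, 5, 9, 6, 7, 4, 8]
print(insertion_sort(data))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Writing to `j + 1` also matters when `temp` is the smallest value seen so far: the loop ends with `j == -1`, and `items[j]` would then be out of bounds in Java.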