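I have a folder containing multiple files, and the files have different numbers of columns. I want to walk the directory, open each file, and iterate over every line, writing each line to a new CSV file according to the number of columns in that line. I want to end up with one big CSV of all rows that have 14 columns, another big CSV of all rows that have 18 columns, and a final CSV containing all the other rows.
Here is what I have so far.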
import pandas as pd
import glob
import os
import csv
path = r'C:\Users\Vladimir\Documents\projects\ETLassig\W3SVC2'
all_files = glob.glob(os.path.join(path, "*.log"))
for file in all_files:
    for line in file:
        if len(line.split()) == 14:
            with open('c14.csv', 'wb') as csvfile:
                csvwriter = csv.writer(csvfile, delimiter=' ')
                csvwriter.writerow([line])
        elif len(line.split()) == 18:
            with open('c14.csv', 'wb') as csvfile:
                csvwriter = csv.writer(csvfile, delimiter=' ')
                csvwriter.writerow([line])
            #open 18.csv
        else:
            with open('misc.csv', 'wb') as csvfile:
                csvwriter = csv.writer(csvfile, delimiter=' ')
                csvwriter.writerow([line])
print(c14.csv)
Can anyone offer feedback on how to approach this?
You can collect all the rows as a list of lists:
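l = []
for file in all_files:  # the list of paths built with glob.glob in the question
    with open(file, 'r') as f:
        for line in f:
            # split() with no argument drops the trailing newline and
            # collapses runs of whitespace, so len() is the column count
            l.append(line.split())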
Now you have a list of lists, so just sort them by the length of each sublist and write them out to the new files:
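l.sort(key=len)  # rows with the same number of columns end up next to each other
with open(outputfile, 'w') as out:  # outputfile is a path of your choosing
    # Write lines here as you want, for example one row per line:
    for row in l:
        out.write(' '.join(row) + '\n')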
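For the three-file layout described in the question, a variation on the idea above is to skip the sort and route each row to its output file as it is read. The following is only a minimal sketch under a few assumptions: the function name `split_by_column_count` and the file name `c18.csv` are made up for illustration, columns are taken to be whitespace-separated (as in the question's `line.split()`), and Python 3 is assumed, where the `csv` module wants files opened in text mode with `newline=""` rather than `'wb'`.

```python
import csv
import os


def split_by_column_count(paths, out_dir):
    """Write each row to c14.csv, c18.csv, or misc.csv by its column count.

    Hypothetical helper for illustration; returns rows written per file.
    """
    names = {14: "c14.csv", 18: "c18.csv"}  # c18.csv is an assumed name
    counts = {"c14.csv": 0, "c18.csv": 0, "misc.csv": 0}
    # Open every output once, so earlier rows are not overwritten
    # each time a new row arrives.
    handles = {
        name: open(os.path.join(out_dir, name), "w", newline="")
        for name in counts
    }
    try:
        writers = {name: csv.writer(fh) for name, fh in handles.items()}
        for path in paths:
            with open(path, "r") as src:
                for line in src:
                    fields = line.split()  # whitespace-separated columns
                    if not fields:
                        continue  # skip blank lines
                    # Anything that is not 14 or 18 columns goes to misc.csv
                    name = names.get(len(fields), "misc.csv")
                    writers[name].writerow(fields)
                    counts[name] += 1
    finally:
        for fh in handles.values():
            fh.close()
    return counts
```

With the `all_files` list from the question, this could be called as `split_by_column_count(all_files, ".")`. Opening the three outputs once, outside the loop, matters because opening a file in write mode truncates it; reopening it for every row would leave only the last row in each file.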