E-Prime outputs a .txt file that looks like this:
*** Header Start ***
VersionPersist: 1
LevelName: Session
Subject: 7
Session: 1
RandomSeed: -1983293234
Group: 1
Display.RefreshRate: 59.654
*** Header End ***
Level: 2
*** LogFrame Start ***
MeansEffectBias: 7
Procedure: trialProc
itemID: 7
bias1Answer: 1
*** LogFrame End ***
Level: 2
*** LogFrame Start ***
MeansEffectBias: 2
Procedure: trialProc
itemID: 2
bias1Answer: 0
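I want to parse it and write it to a .csv file, leaving out a number of the lines.
I tried to build a dictionary that uses the text before the colon as the key and the text after it as the value: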
{subject: [7, 7], bias1Answer : [1, 0], itemID: [7, 2]}
def load_data(filename):
    data = {}
    eprime = open(filename, 'r')
    for line in eprime:
        rows = re.sub('\s+', ' ', line).strip().split(':')
        try:
            data[rows[0]] += rows[1]
        except KeyError:
            data[rows[0]] = rows[1]
    eprime.close()
    return data

for line in open(fileName, 'r'):
    if ':' in line:
        row = line.strip().split(':')
        fullDict[row[0]] = row[1]
print fullDict
Both of the scripts above produce garbage:
{'\x00\t\x00M\x00e\x00a\x00n\x00s\x00E\x00f\x00f\x00e\x00c\x00t\x00B\x00i\x00a\x00s\x00': '\x00 \x005\x00\r\x00', '\x00\t\x00B\x00i\x00a\x00s\x002\x00Q\x00.\x00D\x00u\x00r\x00a\x00t\x00i\x00o\x00n\x00E\x00r\x00r\x00o\x00r\x00': '\x00 \x00-\x009\x009\x009\x009\x009\x009\x00\r\x00'
If I could get the dictionary set up, I could write it to a csv file that looks like this:
Subject  itemID  ...  bias1Answer
7        7            1
7        2            0
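You don't need to create a dictionary. The \x00 bytes in your output indicate that the file is UTF-16 encoded, so decode it as UTF-16 when opening it, then write each row as soon as its last field has been read.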
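To see where the \x00 characters come from, here is a minimal sketch, assuming Python 3 and a made-up sample line (the value 5 and the little-endian byte order are assumptions for illustration, not taken from the real file). It reads UTF-16 bytes as a single-byte encoding, then decodes them properly:

```python
# Hypothetical sample line, encoded as UTF-16 (little-endian, no BOM).
raw = "\tMeansEffectBias: 5\r\n".encode("utf-16-le")

# Treating the bytes as a single-byte encoding leaves a NUL beside every character.
as_garbage = raw.decode("latin-1")

# Decoding with the matching codec recovers the readable text.
as_text = raw.decode("utf-16-le")

print(repr(as_garbage))
print(repr(as_text))
```

The first printed value is full of \x00 escapes, much like the dictionary keys in the question; the second is the plain line.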
import codecs
import csv

# The file is UTF-16 encoded, so let codecs decode it while reading.
with codecs.open('eprime.txt', encoding='utf-16') as f, open('output.csv', 'w') as fout:
    writer = csv.writer(fout, delimiter='\t')
    writer.writerow(['Subject', 'itemID', 'bias1Answer'])
    for line in f:
        if ':' in line:
            # The value is the last whitespace-separated token on the line.
            value = line.split()[-1]
            if 'Subject:' in line:
                subject = value
            elif 'itemID:' in line:
                itemID = value
            elif 'bias1Answer:' in line:
                # bias1Answer is the last field of a frame, so the row is complete.
                bias1Answer = value
                writer.writerow([subject, itemID, bias1Answer])
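For Python 3, the same idea could be sketched as a small function. The name `eprime_to_csv` and its `fields` parameter are hypothetical, and the sketch assumes the file starts with a UTF-16 byte order mark and that the last name in `fields` is the last one to appear in each LogFrame:

```python
import csv

def eprime_to_csv(src, dst, fields=("Subject", "itemID", "bias1Answer")):
    """Write one tab-separated row each time the last name in `fields` is seen."""
    current = {}
    # newline="" lets the csv module control line endings itself.
    with open(src, encoding="utf-16") as f, open(dst, "w", newline="") as fout:
        writer = csv.writer(fout, delimiter="\t")
        writer.writerow(fields)
        for line in f:
            key, sep, value = line.partition(":")
            if not sep:
                continue  # marker lines such as "*** LogFrame Start ***"
            key = key.strip()
            if key in fields:
                current[key] = value.strip()
                if key == fields[-1]:
                    # Header values such as Subject are repeated on every row.
                    writer.writerow([current.get(name, "") for name in fields])
```

Calling `eprime_to_csv('eprime.txt', 'output.csv')` would then produce the same three columns as the script above.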
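If a dictionary of lists like the one in the question is still wanted, one possible sketch (Python 3, with the hypothetical helper name `collect_values`) appends each value to a list instead of concatenating strings with `+=`. It assumes the lines have already been decoded as text:

```python
from collections import defaultdict

def collect_values(lines):
    """Map each key before a colon to the list of values seen after it."""
    data = defaultdict(list)
    for line in lines:
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, e.g. "*** LogFrame Start ***"
            data[key.strip()].append(value.strip())
    return dict(data)
```

Note that header keys such as Subject occur only once, so their lists are shorter than the per-frame ones and would need to be repeated when writing rows.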