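I'm trying to join two CSV files, where the key is the value in the first column. There are no headers.
The files have different numbers of rows.
The row order of file a must be preserved.
File a: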
john,red,34
andrew,green,18
tonny,black,50
jack,yellow,27
phill,orange,45
kurt,blue,29
mike,pink,61
File b:
tonny,driver,new york
phill,scientist,boston
Desired result:
john,red,34
andrew,green,18
tonny,black,50,driver,new york
jack,yellow,27
phill,orange,45,scientist,boston
kurt,blue,29
mike,pink,61
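I've checked all the related threads, and I'm sure some of you will flag this question as a duplicate, but I haven't found a solution yet.
I picked up a dictionary-based solution, but that approach doesn't meet the requirement of preserving the row order from file 'a'.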
import csv
from collections import defaultdict

# Python 2 code: dict.iteritems() does not exist in Python 3.
with open('a.csv') as f:
    r = csv.reader(f, delimiter=',')
    dict1 = {}
    for row in r:
        dict1.update({row[0]: row[1:]})

with open('b.csv') as f:
    r = csv.reader(f, delimiter=',')
    dict2 = {}
    for row in r:
        dict2.update({row[0]: row[1:]})

# Plain dicts are unordered in Python 2, so the row order of a.csv is lost here.
result = defaultdict(list)
for d in (dict1, dict2):
    for key, value in d.iteritems():
        result[key].append(value)
I'd also like to avoid loading these CSV files into a database such as sqlite, or using the pandas module.
Thanks in advance.
Something like
import csv
from collections import OrderedDict

# Python 2 code: binary file modes for csv, and dict.iteritems().
with open('b.csv', 'rb') as f:
    r = csv.reader(f)
    dict2 = {row[0]: row[1:] for row in r}

with open('a.csv', 'rb') as f:
    r = csv.reader(f)
    # OrderedDict remembers insertion order, so the row order of a.csv is kept.
    dict1 = OrderedDict((row[0], row[1:]) for row in r)

# dict1 goes first, so its keys fix the output order; dict2's values are appended.
result = OrderedDict()
for d in (dict1, dict2):
    for key, value in d.iteritems():
        result.setdefault(key, []).extend(value)

with open('ab_combined.csv', 'wb') as f:
    w = csv.writer(f)
    for key, value in result.iteritems():
        w.writerow([key] + value)
produces
john,red,34
andrew,green,18
tonny,black,50,driver,new york
jack,yellow,27
phill,orange,45,scientist,boston
kurt,blue,29
mike,pink,61
(Note that I haven't bothered to guard against the case where dict2 has a key that isn't in dict1; that's easy to add if you want.)
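If you're on Python 3, where `iteritems()` is gone and `csv` wants text-mode files, one possible variant is sketched below. It is not the code above, just a rough alternative: it indexes file b in a plain dict and then walks file a in order, so keys that exist only in b are simply ignored. The helper name `merge_rows` and the extra `zed` row are made up for illustration, and in-memory strings stand in for the real files.

```python
import csv
import io


def merge_rows(a_rows, b_rows):
    """Append b's extra columns to matching rows of a, keeping a's order."""
    # Index file b by its first column; a later duplicate key overwrites an earlier one.
    extras = {row[0]: row[1:] for row in b_rows if row}
    # Walk file a in its original order; keys that appear only in b are ignored.
    return [row + extras.get(row[0], []) for row in a_rows if row]


# In-memory stand-ins for a.csv and b.csv; the "zed" row is hypothetical
# and only there to show that a key missing from file a is dropped.
a_text = "john,red,34\ntonny,black,50\nphill,orange,45\n"
b_text = "tonny,driver,new york\nphill,scientist,boston\nzed,pilot,denver\n"

merged = merge_rows(csv.reader(io.StringIO(a_text)),
                    csv.reader(io.StringIO(b_text)))

out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(merged)
print(out.getvalue(), end="")
```

With real files you would pass `csv.reader(f)` objects instead, opening each file with `open(path, newline='')` as the `csv` module documentation recommends for Python 3.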