I have a CSV file with about 2000 records.
Each record has a string and a category:
This is the first line, Line1
This is the second line, Line2
This is the third line, Line3
I need to read this file into a list that looks like this:
List = [('This is the first line', 'Line1'),
('This is the second line', 'Line2'),
('This is the third line', 'Line3')]
How can I import this CSV into the list I need using Python?
Mac*_*Gol 279
Use the csv module (Python 2.x):
import csv
with open('file.csv', 'rb') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print your_list
# [['This is the first line', 'Line1'],
# ['This is the second line', 'Line2'],
# ['This is the third line', 'Line3']]
If you need tuples:
import csv
with open('test.csv', 'rb') as f:
    reader = csv.reader(f)
    your_list = map(tuple, reader)
print your_list
# [('This is the first line', ' Line1'),
# ('This is the second line', ' Line2'),
# ('This is the third line', ' Line3')]
Python 3.x version (by @seokhoonlee, below):
import csv
with open('file.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print(your_list)
# [['This is the first line', 'Line1'],
# ['This is the second line', 'Line2'],
# ['This is the third line', 'Line3']]
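The sample rows in the question have a space after each comma, which is why the second field can come back as ' Line1' rather than 'Line1'. A minimal sketch for Python 3, assuming that leading space is unwanted: csv.reader accepts skipinitialspace=True, which ignores whitespace immediately after the delimiter. The in-memory sample_csv string is only a stand-in for the real file.

```python
import csv
import io

# Illustrative stand-in for the file from the question (assumed content).
sample_csv = (
    "This is the first line, Line1\n"
    "This is the second line, Line2\n"
    "This is the third line, Line3\n"
)

# skipinitialspace=True ignores whitespace right after each delimiter.
reader = csv.reader(io.StringIO(sample_csv), skipinitialspace=True)

# Convert each row (a list of strings) into a tuple.
your_list = [tuple(row) for row in reader]

print(your_list)
```

With a real file, replace io.StringIO(sample_csv) with a file object opened via open('file.csv', newline='').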
seo*_*lee 50
Update for Python 3:
import csv
with open('file.csv', 'r') as f:
    reader = csv.reader(f)
    your_list = list(reader)
print(your_list)
# [['This is the first line', 'Line1'],
# ['This is the second line', 'Line2'],
# ['This is the third line', 'Line3']]
Mar*_*oma 38
pandas is very good at handling data. Here is one example of how to use it:
import pandas as pd
# Read the CSV into a pandas data frame (df)
# With a df you can do many things
# most important: visualize data with Seaborn
df = pd.read_csv('filename.csv', delimiter=',')
# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]
# or export it as a list of dicts (one dict per row)
dicts = df.to_dict('records')
One big advantage is that pandas handles header rows automatically.
If you haven't heard of Seaborn yet, I recommend having a look at it.
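Because pandas treats the first row as a header by default, a file without one (like the sample in the question) would lose its first record to the column names. A minimal sketch, assuming a headerless file: pass header=None so the first row is read as data, and use itertuples(index=False, name=None) to get plain tuples. The column names text and category are hypothetical labels chosen for this example, and sample_csv stands in for the real file.

```python
import io

import pandas as pd

# Illustrative stand-in for a headerless CSV file (assumed content).
sample_csv = (
    "This is the first line, Line1\n"
    "This is the second line, Line2\n"
    "This is the third line, Line3\n"
)

# header=None: the first row is data, not column names.
# names=...: hypothetical column labels for this sketch.
# skipinitialspace=True: drop the space that follows each comma.
df = pd.read_csv(
    io.StringIO(sample_csv),
    header=None,
    names=["text", "category"],
    skipinitialspace=True,
)

# index=False omits the row index; name=None yields plain tuples.
tuples = list(df.itertuples(index=False, name=None))

print(tuples)
```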
import pandas as pd
# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()
# Convert
dicts = df.to_dict('records')
The content of df is:
     country   population population_time    EUR
0    Germany   82521653.0      2016-12-01   True
1     France   66991000.0      2017-01-01   True
2  Indonesia  255461700.0      2017-01-01  False
3    Ireland    4761865.0             NaT   True
4      Spain   46549045.0      2017-06-01   True
5    Vatican          NaN             NaT   True
The content of dicts is:
[{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
{'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
{'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
{'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
{'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
{'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]
import pandas as pd
# Get data - reading the CSV file
import mpu.pd
df = mpu.pd.example_df()
# Convert
tuples = [[row[col] for col in df.columns] for row in df.to_dict('records')]
The content of tuples is:
[['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
['Ireland', 4761865.0, NaT, True],
['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
['Vatican', nan, NaT, True]]
import csv
from pprint import pprint
with open('text.csv', newline='') as file:
    reader = csv.reader(file)
    l = list(map(tuple, reader))
pprint(l)
[('This is the first line', ' Line1'),
('This is the second line', ' Line2'),
('This is the third line', ' Line3')]
If csvfile is a file object, it should be opened with newline=''.
Source: csv module documentation
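The csv documentation explains the reason: without newline='', newlines embedded inside quoted fields may not be interpreted correctly, and on platforms that write \r\n line endings an extra \r can be added. A small sketch of a round trip through a file, assuming a field that contains a line break; the temporary path and the sample rows are made up for this demonstration.

```python
import csv
import os
import tempfile

# Sample rows (assumed data); the first field contains an embedded newline.
rows_in = [
    ("first line\nstill the first field", "Line1"),
    ("second line", "Line2"),
]

# Hypothetical temporary file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "multiline.csv")

# newline='' lets the csv module control line endings when writing.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows_in)

# newline='' when reading keeps the embedded newline inside its field.
with open(path, newline="") as f:
    rows_out = [tuple(row) for row in csv.reader(f)]

# True when the data survived the round trip unchanged.
print(rows_out == rows_in)
```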
小智 5
# text is assumed to hold the whole CSV content as a single string
result = []
for line in text.splitlines():
    result.append(tuple(line.split(",")))
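Splitting on commas works for simple data like the sample in the question, but it does not understand CSV quoting. A short comparison, using a hypothetical line whose first field contains a comma inside quotes, shows where the two approaches diverge:

```python
import csv

# A hypothetical line whose first field contains a comma inside quotes.
line = '"Hello, world",Line1'

# Plain string splitting cuts the quoted field in two and keeps the quotes.
naive = tuple(line.split(","))

# csv.reader accepts any iterable of lines and honours the quoting rules.
parsed = tuple(next(csv.reader([line])))

print(naive)   # ('"Hello', ' world"', 'Line1')
print(parsed)  # ('Hello, world', 'Line1')
```

If the data may ever contain quoted fields, the csv module is the safer choice.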
You can use the list() function to convert the csv reader object into a list:
import csv
with open('input.csv', newline='') as csv_file:
    reader = csv.reader(csv_file, delimiter=',')
    rows = list(reader)
print(rows)