ele*_*aby 0 python string format numpy genfromtxt
I have a tab-delimited file (city-data.txt):
Alabama Montgomery 32.361538 -86.279118
Alaska Juneau 58.301935 -134.41974
Is it possible to read the first two columns as strings and the last two columns as floats?
My output should look like this:
[(Alabama,Montgomery,32.36,-86.28),
(Alaska,Juneau,58.30,-134.42)]
I tried:
import numpy as np
mylist2 = np.genfromtxt(r'city-data.txt', delimiter='\t', dtype=("<S15", "<S15", float, float)).tolist()
This gives me the first two columns as bytes:
[(b'Alabama', b'Montgomery', 32.361538, -86.279118),
(b'Alaska', b'Juneau', 58.301935, -134.41974)]
I also tried:
with open('city-data.txt') as f:
    mylist = [tuple(i.strip().split('\t')) for i in f]
This gives me every column as a string:
[('Alabama', 'Montgomery', '32.361538', '-86.279118'),
('Alaska', 'Juneau', '58.301935', '-134.41974')]
I can't figure out how to get what I need...
You can use pandas read_csv to read the file into a DataFrame, then convert its rows to a list of lists with df.values.tolist().
Example:
import pandas as pd
filename = 'city-data.txt'
df = pd.read_csv(filename, sep="\t", header=None)
print(df.values.tolist())
#[['Alabama', 'Montgomery', 32.361538, -86.27911800000001],
# ['Alaska', 'Juneau', 58.301935, -134.41974]]
If you need the rows as tuples, just use map():
print(list(map(tuple, df.values.tolist())))  # list() is needed on Python 3, where map() returns an iterator
#[('Alabama', 'Montgomery', 32.361538, -86.27911800000001),
# ('Alaska', 'Juneau', 58.301935, -134.41974)]
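If pandas is not available, the same list of tuples can be built with the standard library by converting the last two fields of each row yourself. The following is only a sketch: read_city_rows and its ndigits parameter are names made up for this example, and the sample file is written to a temporary directory so the snippet runs on its own.

```python
import csv
import os
import tempfile


def read_city_rows(path, ndigits=None):
    """Read a tab-delimited file into (state, city, lat, lon) tuples.

    read_city_rows and ndigits are illustrative names, not a library API.
    """
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        for state, city, lat, lon in csv.reader(f, delimiter="\t"):
            # Keep the first two fields as str, convert the last two to float.
            lat, lon = float(lat), float(lon)
            if ndigits is not None:
                lat, lon = round(lat, ndigits), round(lon, ndigits)
            rows.append((state, city, lat, lon))
    return rows


# Write the two sample rows from the question to a temporary file.
sample = (
    "Alabama\tMontgomery\t32.361538\t-86.279118\n"
    "Alaska\tJuneau\t58.301935\t-134.41974\n"
)
path = os.path.join(tempfile.mkdtemp(), "city-data.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

rows = read_city_rows(path, ndigits=2)
print(rows)
```

Note that round() changes the stored value, not how it is displayed, so a latitude of 58.30 prints as 58.3. If a fixed two-decimal display matters, format the number when printing, for example with f"{lat:.2f}".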
Edit
If you would rather use numpy, this slight modification of your existing code should work. Change the dtype of the text fields to "O":
mylist2 = np.genfromtxt(filename, delimiter='\t', dtype=("O", "O", float, float)).tolist()
#[('Alabama', 'Montgomery', 32.361538, -86.279118),
# ('Alaska', 'Juneau', 58.301935, -134.41974)]
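If the text columns still come back as bytes on your setup, another option that may help is to request a Unicode dtype and pass an explicit encoding. This is a sketch rather than a tested recipe for every version: it assumes NumPy 1.14 or later, where genfromtxt accepts an encoding argument. The field names and the width of 15 characters are placeholders chosen for this example, and the sample file is written to a temporary directory so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np

# Write the two sample rows from the question to a temporary file.
sample = (
    "Alabama\tMontgomery\t32.361538\t-86.279118\n"
    "Alaska\tJuneau\t58.301935\t-134.41974\n"
)
path = os.path.join(tempfile.mkdtemp(), "city-data.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)

# "U15" is a Unicode string of up to 15 characters; longer values are truncated.
# The field names are placeholders for this sketch.
dtype = [("state", "U15"), ("city", "U15"), ("lat", float), ("lon", float)]

# Assumes NumPy >= 1.14, where genfromtxt takes an encoding argument.
data = np.genfromtxt(path, delimiter="\t", dtype=dtype, encoding="utf-8")

# tolist() on a structured array yields a list of tuples of Python objects.
mylist2 = data.tolist()
print(mylist2)
```

Unlike "O", a fixed-width "U" dtype silently truncates longer values, so pick a width that covers the longest name in the file.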