hsq*_*red 4 python dataframe pandas
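I'm sure this is a simple thing, but I'm new to Python and can't work it out!
I have a dataframe with a column containing coordinates, and I want to remove the brackets and put the latitude/longitude values into separate columns.
Current dataframe: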
gridReference
(56.37769816725615, -4.325049868061924)
(56.37769816725615, -4.325049868061924)
(51.749167440074324, -4.963575226888083)
Desired dataframe:
Latitude Longitude
56.37769816725615 -4.325049868061924
56.37769816725615 -4.325049868061924
51.749167440074324 -4.963575226888083
Thanks for your help.
Edit: I tried:
df['lat'], df['lon'] = df.gridReference.str.strip(')').str.strip('(').str.split(', ').values.tolist()
But I get the error:
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
Then I tried adding:
df['gridReference'] = df['gridReference'].astype('str')
and got the error:
ValueError: too many values to unpack (expected 2)
Any help would be appreciated, as I don't know how to make this work! :)
Edit:
I keep getting the error
AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas
The output of df.info() is:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 22899 entries, 0 to 22898
Data columns (total 1 columns):
LatLon 22899 non-null object
dtypes: object(1)
The output of df.dtypes is:
gridReference object
dtype: object
df['gridReference'].str.strip('()') \
.str.split(', ', expand=True) \
.rename(columns={0:'Latitude', 1:'Longitude'})
Latitude Longitude
0 56.37769816725615 -4.325049868061924
1 56.37769816725615 -4.325049868061924
2 51.749167440074324 -4.963575226888083
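Note that `str.split` returns strings, so the Latitude and Longitude columns above are still text. Here is a minimal sketch of one way to convert them to floats and attach them to the original frame, assuming every row has the `(lat, lon)` form shown in the question (the two-row sample frame is made up for illustration):

```python
import pandas as pd

# Illustrative sample data in the same format as the question
df = pd.DataFrame({
    "gridReference": [
        "(56.37769816725615, -4.325049868061924)",
        "(51.749167440074324, -4.963575226888083)",
    ]
})

coords = (
    df["gridReference"]
    .str.strip("()")               # drop the surrounding brackets
    .str.split(", ", expand=True)  # one column per coordinate
    .astype(float)                 # text -> float64
)
coords.columns = ["Latitude", "Longitude"]

# join aligns on the index, so each row keeps its own coordinates
df = df.join(coords)
```

If some rows might be malformed, `astype(float)` will raise; applying `pd.to_numeric` with `errors="coerce"` to each column is a possible alternative that yields NaN instead.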
>>> df = pd.DataFrame({'latlong': ['(12, 32)', '(43, 54)']})
>>> df
latlong
0 (12, 32)
1 (43, 54)
>>> split_data = df.latlong.str.strip(')').str.strip('(').str.split(', ')
>>> df['lat'] = split_data.apply(lambda x: x[0])
>>> df['long'] = split_data.apply(lambda x: x[1])
>>> df
latlong lat long
0 (12, 32) 12 32
1 (43, 54) 43 54
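As for the `ValueError` in the question: `.values.tolist()` yields one `[lat, lon]` list per row, so unpacking it into two names fails for any frame that doesn't have exactly two rows. Here is a small sketch, assuming string values as in the examples above, that transposes the pairs with `zip` before assigning (the three-row frame is illustrative):

```python
import pandas as pd

# Illustrative sample data; three rows so naive unpacking would fail
df = pd.DataFrame({
    "gridReference": [
        "(56.37769816725615, -4.325049868061924)",
        "(51.749167440074324, -4.963575226888083)",
        "(1.0, 2.0)",
    ]
})

# One [lat, lon] list per row
pairs = df["gridReference"].str.strip("()").str.split(", ").tolist()

# zip(*pairs) turns N pairs into 2 tuples of length N,
# which is the shape that two-name unpacking expects
df["lat"], df["lon"] = zip(*pairs)
```

The values are still strings at this point; the `expand=True` approach in the other answer avoids the manual transpose altogether.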