Jop*_*ppy 6 pandas python-xarray
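Is there a simple way to convert an xarray DataArray to a pandas DataFrame where I can specify which dimensions become the index and which become the columns? For example, suppose I have a DataArray: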
import xarray as xr

weather = xr.DataArray(
    name='weather',
    data=[['Sunny', 'Windy'], ['Rainy', 'Foggy']],
    dims=['date', 'time'],
    coords={
        'date': ['Thursday', 'Friday'],
        'time': ['Morning', 'Afternoon'],
    }
)
This results in:
<xarray.DataArray 'weather' (date: 2, time: 2)>
array([['Sunny', 'Windy'],
       ['Rainy', 'Foggy']], dtype='<U5')
Coordinates:
  * date     (date) <U8 'Thursday' 'Friday'
  * time     (time) <U9 'Morning' 'Afternoon'
Suppose I now want to move this into a pandas DataFrame indexed by date, with the times as columns. I can do this by calling .to_dataframe() and then .unstack() on the resulting dataframe:

>>> weather.to_dataframe().unstack()
           weather
time     Afternoon Morning
date
Friday       Foggy   Rainy
Thursday     Windy   Sunny

However, pandas sorts things, so instead of 'Morning' followed by 'Afternoon' I get 'Afternoon' followed by 'Morning'. I would prefer an API that does this reshaping for me, without my having to reorder the index and columns afterwards.
In xarray 0.16.1, dim_order was added to .to_dataframe. Does this do what you need?
xr.DataArray.to_dataframe(
    self,
    name: Hashable = None,
    dim_order: List[Hashable] = None,
) -> pandas.core.frame.DataFrame
Docstring:
Convert this array and its coordinates into a tidy pandas.DataFrame.

The DataFrame is indexed by the Cartesian product of index coordinates
(in the form of a :py:class:`pandas.MultiIndex`).

Other coordinates are included as columns in the DataFrame.

Parameters
----------
name
    Name to give to this array (required if unnamed).
dim_order
    Hierarchical dimension order for the resulting dataframe.
    Array content is transposed to this order and then written out as flat
    vectors in contiguous order, so the last dimension in this list
    will be contiguous in the resulting DataFrame. This has a major
    influence on which operations are efficient on the resulting
    dataframe.

    If provided, must include all dimensions of this DataArray. By default,
    dimensions are sorted according to the DataArray dimensions order.
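
As a rough sketch of how this might be applied to the weather array from the question: dim_order sets the order of the MultiIndex levels in the long-format frame. For the wide layout, the second half of the sketch uses DataArray.to_pandas() together with transpose(). That route is not part of the suggestion above and is added here only as an assumption about what the question is after; for a 2-D array it should keep the coordinate order as given. The variable names long_df and wide are illustrative.

```python
import xarray as xr

weather = xr.DataArray(
    name='weather',
    data=[['Sunny', 'Windy'], ['Rainy', 'Foggy']],
    dims=['date', 'time'],
    coords={
        'date': ['Thursday', 'Friday'],
        'time': ['Morning', 'Afternoon'],
    },
)

# dim_order (xarray >= 0.16.1) controls the order of the MultiIndex levels
# in the long-format DataFrame; it must list every dimension of the array.
long_df = weather.to_dataframe(dim_order=['time', 'date'])
print(long_df)

# Assumption: for a 2-D DataArray, to_pandas() returns a DataFrame whose
# index is the first dimension and whose columns are the second, without
# sorting. transpose() chooses which dimension plays which role.
wide = weather.transpose('date', 'time').to_pandas()
print(wide)
```

If the array has more than two dimensions, to_pandas() no longer returns a DataFrame, so this shortcut only covers the 2-D case shown in the question.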