Add a "time" dimension to an xarray Dataset and assign coordinates from another Dataset to it

ale*_*xtc 4 numpy python-xarray

I have a Dataset object named ds, imported from a netCDF file via xarray.open_dataset. It contains a variable named variable1 with latitude and longitude dimensions.

>>>ds
<xarray.Dataset>
Dimensions:    (latitude: 681, longitude: 841)
Coordinates:
  * latitude   (latitude) float64 -10.0 -10.05 -10.1 ... -43.9 -43.95 -44.0
  * longitude  (longitude) float64 112.0 112.0 112.1 112.2 ... 153.9 153.9 154.0
Data variables:
    variable1     (latitude, longitude) float32 ...

I also have a time DataArray object, time_da, whose coordinates run from 2017-01-01 to 2017-12-31:

>>>times = pd.date_range("2017/01/01","2018/01/01",freq='D',closed='left')
>>>time_da = xr.DataArray(times, [('time', times)])
>>>time_da
<xarray.DataArray (time: 365)>
array(['2017-01-01T00:00:00.000000000', '2017-01-02T00:00:00.000000000',
       '2017-01-03T00:00:00.000000000', ..., '2017-12-29T00:00:00.000000000',
       '2017-12-30T00:00:00.000000000', '2017-12-31T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31

I want to add a new dimension named time and assign the coordinates of time_da to it, so that the new dataset ds2 looks like this:

>>>ds2
<xarray.Dataset>
Dimensions:    (latitude: 681, longitude: 841, time: 365)
Coordinates:
  * longitude  (longitude) float64 112.0 112.0 112.1 112.2 ... 153.9 153.9 154.0
  * latitude   (latitude) float64 -10.0 -10.05 -10.1 ... -43.9 -43.95 -44.0
  * time       (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31
Data variables:
    sm_pct     (time, latitude, longitude) float32 nan nan nan ... nan nan nan

In other words, the original [latitude, longitude] DataArray should be repeated 365 times along the time dimension.

I tried using ds.expand_dims to create the time dimension and then assigning time_da to it, but that does not work. The error is:

>>> ds2 = ds.expand_dims(dim='time', axis=0)
>>> ds2.coords['time'] = ('time',time_da)
ValueError: conflicting sizes for dimension 'time': length 1 on <this-array> and length 730 on 'time'

pai*_*ime 6

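expand_dims has a calling form that fits your case: pass the new dimension name as a keyword argument with the coordinate values as its value.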

>>> dst = ds.expand_dims(time=time_da)
>>> dst
<xarray.Dataset>
Dimensions:    (latitude: 681, longitude: 841, time: 365)
Coordinates:
  * time       (time) datetime64[ns] 2017-01-01 2017-01-02 ... 2017-12-31
  * latitude   (latitude) int64 0 1 2 3 4 5 6 7 ... 674 675 676 677 678 679 680
  * longitude  (longitude) int64 0 1 2 3 4 5 6 7 ... 834 835 836 837 838 839 840
Data variables:
    variable   (time, latitude, longitude) float64 0.03968 2.156 ... -1.752
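For a version you can run without the original netCDF file, here is a minimal sketch on a small synthetic dataset. The grid sizes and random values are assumptions for illustration only, and the dates are built with an explicit end date because the closed keyword of pd.date_range is not available in newer pandas releases. It also shows why the attempt in the question failed: passing only a dimension name creates a dimension of length 1, which cannot hold 365 coordinate values.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic stand-in for the dataset in the question
# (grid sizes and values are assumptions, not the real data).
lat = np.linspace(-10.0, -44.0, 5)
lon = np.linspace(112.0, 154.0, 7)
ds = xr.Dataset(
    {
        "variable1": (
            ("latitude", "longitude"),
            np.random.rand(lat.size, lon.size).astype("float32"),
        )
    },
    coords={"latitude": lat, "longitude": lon},
)

# Daily timestamps for 2017, both endpoints included.
times = pd.date_range("2017-01-01", "2017-12-31", freq="D")

# Passing only a name creates a new dimension of length 1,
# so a 365-element coordinate cannot be attached to it afterwards.
single = ds.expand_dims(dim="time", axis=0)
print(single.sizes["time"])  # 1

# Passing name=values creates one entry per value and attaches
# the values as the coordinate of the new dimension.
dst = ds.expand_dims(time=times)
print(dst.sizes["time"])  # 365
print(dst["variable1"].dims)
```

The new dimension is inserted in front of the existing ones, which matches the (time, latitude, longitude) layout asked for in the question.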

Check that variable is identical at every time step:

>>> np.all(np.diff(dst["variable"], axis=0) == 0)
True
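The same check can also be written with xarray's own indexing, and one practical detail is worth knowing: the xarray documentation describes the result of expand_dims as a view of the original data rather than a copy, so it is safest to take a deep copy before modifying individual time steps. The sketch below assumes a tiny synthetic dataset (the names and values are illustrative, not the data from the question).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Tiny synthetic dataset (values are illustrative assumptions).
ds = xr.Dataset(
    {
        "variable1": (
            ("latitude", "longitude"),
            np.arange(6, dtype="float32").reshape(2, 3),
        )
    },
    coords={"latitude": [-10.0, -10.05], "longitude": [112.0, 112.05, 112.1]},
)
times = pd.date_range("2017-01-01", periods=4, freq="D")
dst = ds.expand_dims(time=times)

# Compare every time slice against the first one; drop=True removes the
# scalar time coordinate so the comparison broadcasts along time cleanly.
first = dst["variable1"].isel(time=0, drop=True)
all_same = bool((dst["variable1"] == first).all())
print(all_same)

# Take a deep copy to get independent data before editing a single step.
dst_writable = dst.copy(deep=True)
dst_writable["variable1"].values[0, 0, 0] = -1.0
```

Note that an equality comparison treats NaN as unequal to itself, so for data containing missing values (as in the question's expected output) a NaN-aware comparison would be needed instead.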