Efficient way to extract data from NetCDF files

Question

Sej*_*eji 7 python netcdf nco python-xarray cdo-climate

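I have many coordinates (about 20,000) for which I need to extract data from many NetCDF files, each with about 30,000 time steps (future climate scenarios). The solution here is not efficient, because of the time it takes to convert "dsloc" to a "dataframe" for each i,j (see the code below). An example NetCDF file can be downloaded from here.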

import pandas as pd
import xarray as xr
import time

# Generate some coordinates
coords_data = [{'lat': 68.04, 'lon': 15.20, 'stid': 1},
               {'lat': 67.96, 'lon': 14.95, 'stid': 2}]
crd = pd.DataFrame(coords_data)
lat = crd["lat"]
lon = crd["lon"]
stid = crd["stid"]

# nc_file is the path to one NetCDF file (it is not defined in the post)
NC = xr.open_dataset(nc_file)
point_list = zip(lat, lon, stid)
frames = []  # one DataFrame per station; the post appended to an undefined `temp`
start_time = time.time()
for i, j, station_id in point_list:
    print(i, j)
    dsloc = NC.sel(lat=i, lon=j, method='nearest')
    print("--- %s seconds ---" % (time.time() - start_time))
    DT = dsloc.to_dataframe()  # this conversion is the slow step
    DT.insert(loc=0, column="station", value=station_id)
    DT.reset_index(inplace=True)
    frames.append(DT)
    print("--- %s seconds ---" % (time.time() - start_time))

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
temp = pd.concat(frames, sort=True)

The output is:

68.04 15.2
--- 0.005853414535522461 seconds ---
--- 9.02660846710205 seconds ---
67.96 14.95
--- 9.028568267822266 seconds ---
--- 16.429600715637207 seconds ---

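This means each i,j takes roughly 7 to 9 seconds to process. Given the large number of coordinates and the NetCDF files with many time steps, I wonder whether there is a Pythonic way to optimize the code. I could also use the CDO and NCO operators, but I ran into similar issues with them.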

Answer 1

Mic*_*ado 8

This is a perfect use case for xarray's advanced indexing using DataArray indexers.

# Make the index on your coordinates DataFrame the station ID,
# then convert to a dataset.
# This results in a Dataset with two DataArrays, lat and lon, each
# of which is indexed by a single dimension, stid
crd_ix = crd.set_index('stid').to_xarray()

# now, select using the arrays, and the data will be re-oriented to have
# the data only for the desired pixels, indexed by 'stid'. The
# non-indexing coordinates lat and lon will be indexed by (stid) as well.
NC.sel(lon=crd_ix.lon, lat=crd_ix.lat, method='nearest')

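Other dimensions in the data are unaffected by the selection, so if your original data has dimensions (lat, lon, z, time), the new data will have dimensions (stid, z, time).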
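To make the dimension behaviour concrete, here is a minimal sketch on a small synthetic dataset. The variable name `tas`, the grid spacing, and the `z` levels are assumptions made up for illustration; they are not taken from the NetCDF files in the question. It contrasts the pointwise selection with what happens when plain lists are passed instead, and converts to a DataFrame once for all stations rather than once per station.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real NetCDF data (hypothetical variable "tas").
times = pd.date_range("2000-01-01", periods=5, freq="D")
lats = np.arange(67.0, 69.01, 0.25)
lons = np.arange(14.0, 16.01, 0.25)
levels = [0, 10]
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {
        "tas": (
            ("time", "z", "lat", "lon"),
            rng.normal(size=(len(times), len(levels), len(lats), len(lons))),
        )
    },
    coords={"time": times, "z": levels, "lat": lats, "lon": lons},
)

# Same station table as in the question.
crd = pd.DataFrame(
    [
        {"lat": 68.04, "lon": 15.20, "stid": 1},
        {"lat": 67.96, "lon": 14.95, "stid": 2},
    ]
)
crd_ix = crd.set_index("stid").to_xarray()

# Pointwise selection: lat and lon indexers share the "stid" dimension,
# so one grid cell is picked per station.
points = ds.sel(lat=crd_ix.lat, lon=crd_ix.lon, method="nearest")

# For contrast: plain lists select every lat/lon combination (a 2 x 2 block).
grid = ds.sel(
    lat=crd["lat"].tolist(), lon=crd["lon"].tolist(), method="nearest"
)

# Convert once for all stations; "stid" becomes an ordinary column.
df = points.to_dataframe().reset_index()

print(points["tas"].dims)
print(dict(grid.sizes))
print(df.head())
```

On the real files, the single `to_dataframe()` call still has to read the selected values from disk, so how much time this saves depends on the file layout and chunking; the sketch only shows the shape of the approach.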
