Extract data at given indices into new columns of a DataFrame

Giu*_*ppe 3 python indexing numpy pandas

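How can I extract data into separate columns based on index values?

So far I have been able to extract data based on the index numbers into a single column (in blocks of 5).

The DataFrame looks like this: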

3017     39517.3886
3018     39517.4211
3019     39517.4683
3020     39517.5005
3021     39517.5486
5652     39628.1622
5653     39628.2104
5654     39628.2424
5655     39628.2897
5656     39628.3229
5677     39629.2020
5678     39629.2342
5679     39629.2825
5680     39629.3304
5681     39629.3628

The data extracted into the column are the +/- 2 rows around each index value.

I would like something that looks like this:

  3017-3021   5652-5656   5677-5681
1 39517.3886  39628.1622  39629.2020
2 39517.4211  39628.2104  39629.2342
3 39517.4683  39628.2424  39629.2825
4 39517.5005  39628.2897  39629.3304
5 39517.5486  39628.3229  39629.3628

The number of columns depends on how much data I want to extract.

The code I use to extract the data based on the index is:

import numpy as np

## find the index of the first 1 after each 0 -> 1 transition in a 000 - 111 list
a = stim_epoc[1:]
ss = [(num+1) for num,i in enumerate(zip(stim_epoc,a)) if i == (0,1)]

## extract data from a df (time_fip) at +/- 2 rows around each index in 'ss',
## keeping only indices that fall within the length of GCaMP_ps
fin = [i for x in ss for i in range(x-2, x + 2 + 1) if i in range(len(GCaMP_ps))]
df = time_fip.loc[np.unique(fin)]
print(df)
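
For readers who want to try the indexing step in isolation, here is a minimal sketch on toy data. The contents of `stim_epoc` and the length `n_rows` are made up for illustration, and the NumPy calls are a vectorized variation of the list comprehensions above rather than the code actually used.

```python
import numpy as np

# Hypothetical toy signal: runs of 0s followed by runs of 1s.
stim_epoc = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
n_rows = len(stim_epoc)  # assumed length of the data being indexed

# Position of the first 1 after each 0 -> 1 transition.
arr = np.asarray(stim_epoc)
ss = np.flatnonzero(np.diff(arr) == 1) + 1

# +/- 2 positions around each center (one row of 5 per center).
windows = ss[:, None] + np.arange(-2, 3)

# Keep only positions that fall inside the data, without duplicates.
fin = np.unique(windows[(windows >= 0) & (windows < n_rows)])

print(ss)
print(fin)
```

Note that a window near the end of the data can come out shorter than 5 rows, which matters if a later step assumes every block has exactly 5 values.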

ALo*_*llz 5

Form groups of 5 consecutive rows (since you pull +/- 2 rows from a center). Then create the column and index labels, and pivot.

df = df.reset_index()
s = df.index//5   # If always 5 consecutive values. I.e. +/-2 rows from a center.    

df['col'] = df.groupby(s)['index'].transform(lambda x: '-'.join(map(str, x.agg(['min', 'max']))))
df['idx'] = df.groupby(s).cumcount()

df.pivot(index='idx', columns='col', values=0)  # Assuming column named `0`

Output:

col   3017-3021   5652-5656   5677-5681
idx                                    
0    39517.3886  39628.1622  39629.2020
1    39517.4211  39628.2104  39629.2342
2    39517.4683  39628.2424  39629.2825
3    39517.5005  39628.2897  39629.3304
4    39517.5486  39628.3229  39629.3628
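
The same group, label, and pivot idea can be sketched as a self-contained script using the sample values from the question. The column name `value`, the axis name `row`, and the `block` variable are assumptions made for this example. It also builds the labels with two `transform` calls instead of the lambda, and uses a 1-based `idx` to match the layout requested in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question: values indexed by original row number.
index = [3017, 3018, 3019, 3020, 3021,
         5652, 5653, 5654, 5655, 5656,
         5677, 5678, 5679, 5680, 5681]
values = [39517.3886, 39517.4211, 39517.4683, 39517.5005, 39517.5486,
          39628.1622, 39628.2104, 39628.2424, 39628.2897, 39628.3229,
          39629.2020, 39629.2342, 39629.2825, 39629.3304, 39629.3628]
df = pd.DataFrame({"value": values}, index=index)  # "value" is an assumed name

block = 5  # the center row plus 2 rows on either side
long = df.rename_axis("row").reset_index()
group = np.arange(len(long)) // block  # 0,0,0,0,0,1,1,1,1,1,2,...

# Label each block "first-last" using its original row numbers.
first = long.groupby(group)["row"].transform("min").astype(str)
last = long.groupby(group)["row"].transform("max").astype(str)
long["col"] = first + "-" + last

# 1-based position of each row within its block.
long["idx"] = long.groupby(group).cumcount() + 1

wide = long.pivot(index="idx", columns="col", values="value")
print(wide)
```

This relies on every block having exactly `block` rows. If a window was clipped at the start or end of the data, the integer-division grouping will misalign the blocks.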

  • Clever solution! I like it! (4 upvotes)