Convert a float Series to an integer Series in pandas

Question

I have the following DataFrame:

In [31]: rise_p
Out[31]: 
         time    magnitude
0  1379945444   156.627598
1  1379945447  1474.648726
2  1379945448  1477.448999
3  1379945449  1474.886202
4  1379945699  1371.454224

Now I want to group the rows that fall within the same minute, so I divide the time Series by 100. This is what I get:

In [32]: rise_p/100
Out[32]: 
          time  magnitude
0  13799454.44   1.566276
1  13799454.47  14.746487
2  13799454.48  14.774490
3  13799454.49  14.748862
4  13799456.99  13.714542

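As described above, I want to create groups based on the time. The expected subgroups would therefore be the rows with time 13799454 and the row with time 13799456. This is what I tried: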

In [37]: ts = rise_p['time']/100

In [38]: s = rise_p/100

In [39]: new_re_df = [s.iloc[np.where(int(ts) == int(i))] for i in ts]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-5ea498cf32b2> in <module>()
----> 1 new_re_df = [s.iloc[np.where(int(ts) == int(i))] for i in ts]

TypeError: only length-1 arrays can be converted to Python scalars

How can I convert ts into an integer Series, given that int() does not accept a Series or a list as an argument? Does pandas have a method for this?

Answer 1

dre*_*iya 14

Try converting with astype:

new_re_df = [s.iloc[np.where(ts.astype(int) == int(i))] for i in ts]
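
To illustrate what the conversion does on its own, here is a minimal sketch that rebuilds the `time` column from the question by hand (the literal values are copied from the question; the name `ts_int` is made up for this example). `astype(int)` works element-wise and drops the fractional part of each value:

```python
import pandas as pd

# Time values from the question, divided by 100 (this produces floats).
ts = pd.Series([1379945444, 1379945447, 1379945448, 1379945449, 1379945699]) / 100

# astype(int) converts every element, discarding the fractional part.
ts_int = ts.astype(int)

print(ts_int.tolist())
# [13799454, 13799454, 13799454, 13799454, 13799456]
```

Note that the list comprehension above builds one sub-frame per row of `ts`, so rows that share a time bucket appear in several identical results.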

Edit

As suggested by @Rutger Kassies, a better approach is to cast the Series and then group by it:

rise_p['ts'] = (rise_p.time / 100).astype('int')

ts_grouped = rise_p.groupby('ts')

...
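
A self-contained sketch of how the grouped result might be used, assuming the data from the question is typed in by hand. The dictionary `groups` is an illustrative name, not part of the original answer; it simply collects one sub-DataFrame per bucket:

```python
import pandas as pd

# Data copied from the question.
rise_p = pd.DataFrame({
    "time": [1379945444, 1379945447, 1379945448, 1379945449, 1379945699],
    "magnitude": [156.627598, 1474.648726, 1477.448999, 1474.886202, 1371.454224],
})

# Integer bucket column, as in the answer.
rise_p["ts"] = (rise_p["time"] / 100).astype(int)

# Iterating a groupby yields (key, sub-DataFrame) pairs, one per distinct bucket.
groups = {key: frame for key, frame in rise_p.groupby("ts")}

for key, frame in groups.items():
    print(key, len(frame))
# 13799454 4
# 13799456 1
```

This gives one frame per bucket rather than one per row, which is what the list comprehension was trying to achieve.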

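Using `astype()` is definitely correct, but it would be better to avoid the list comprehension altogether. Something like `ts['time'] = (ts.time/100).astype('int')` and then grouping with `ts.groupby('time')`, and so on. (3 upvotes)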
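
A variation that is not mentioned in the thread: since the `time` column already holds integers, floor division with `//` produces the same buckets without going through floats at all. This is only a sketch with hand-typed values and made-up names (`time`, `buckets`, `same`); floor division and truncation agree here because the values are non-negative, but they differ for negative numbers.

```python
import pandas as pd

# Integer time values copied from the question.
time = pd.Series([1379945444, 1379945447, 1379945448, 1379945449, 1379945699])

# Floor division on an integer Series keeps an integer dtype.
buckets = time // 100

# Compare against the divide-then-cast approach from the answer.
same = buckets.equals((time / 100).astype(buckets.dtype))

print(buckets.tolist())
# [13799454, 13799454, 13799454, 13799454, 13799456]
```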
