Eli*_*ina 6 python time-series pandas
I have a pandas.core.series.Series with the following data:
0 [00115840, 00110005, 001000033, 00116000...
1 [00267285, 00263627, 00267010, 0026513...
2 [00335595, 00350750]
I want to strip the leading zeros from the series. I tried
x.astype('int64')
but got this error message:
ValueError: setting an array element with a sequence.
Can you suggest how to do this in Python 3.x?
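The failure can be reproduced with a minimal sketch. The data below is a made-up, shortened stand-in for the Series in the question: each element is itself a list, so `astype` has no single scalar per row to cast. The exact exception type and message may vary with the pandas and NumPy versions, which is why the sketch catches both `ValueError` and `TypeError`.

```python
import pandas as pd

# Hypothetical sample mirroring the question: every element is a list of strings.
x = pd.Series([["00115840", "00110005"], ["00335595"]])

try:
    x.astype("int64")
    cast_failed = False
except (ValueError, TypeError) as exc:
    # The exact exception type and message depend on the pandas/NumPy version.
    cast_failed = True
    print(type(exc).__name__, exc)
```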
If you want to convert each list of strings to a list of integers, use a list comprehension:
s = pd.Series([[int(y) for y in x] for x in s], index=s.index)

Or with apply:

s = s.apply(lambda x: [int(y) for y in x])

Sample:

import pandas as pd

a = [['00115840', '00110005', '001000033', '00116000'],
     ['00267285', '00263627', '00267010', '0026513'],
     ['00335595', '00350750']]

s = pd.Series(a)
print(s)
0    [00115840, 00110005, 001000033, 00116000]
1      [00267285, 00263627, 00267010, 0026513]
2                         [00335595, 00350750]
dtype: object

s = s.apply(lambda x: [int(y) for y in x])
print(s)
0    [115840, 110005, 1000033, 116000]
1      [267285, 263627, 267010, 26513]
2                     [335595, 350750]
dtype: object

Edit:

If you only want integers, you can flatten the values and convert them to ints:
s = pd.Series([item for sublist in s for item in sublist]).astype(int)

Alternative solution:

import itertools
s = pd.Series(list(itertools.chain(*s))).astype(int)

print(s)
0     115840
1     110005
2    1000033
3     116000
4     267285
5     263627
6     267010
7      26513
8     335595
9     350750
dtype: int32

Timings:

a = [['00115840', '00110005', '001000033', '00116000'],
     ['00267285', '00263627', '00267010', '0026513'],
     ['00335595', '00350750']]

s = pd.Series(a)
s = pd.concat([s]*1000).reset_index(drop=True)

In [203]: %timeit pd.Series([[int(y) for y in x] for x in s], index=s.index)
100 loops, best of 3: 4.66 ms per loop

In [204]: %timeit s.apply(lambda x: [int(y) for y in x])
100 loops, best of 3: 5.13 ms per loop

#cᴏʟᴅsᴘᴇᴇᴅ sol
In [205]: %%timeit
     ...: v = pd.Series(np.concatenate(s.values.tolist()))
     ...: v.astype(int).groupby(s.index.repeat(s.str.len())).agg(pd.Series.tolist)
     ...: 
1 loop, best of 3: 226 ms per loop

#Wen solution
In [211]: %timeit pd.Series(s.apply(pd.Series).stack().astype(int).groupby(level=0).apply(list))
1 loop, best of 3: 1.12 s per loop

Flattening solutions (@cᴏʟᴅsᴘᴇᴇᴅ's idea):

In [208]: %timeit pd.Series([item for sublist in s for item in sublist]).astype(int)
100 loops, best of 3: 2.55 ms per loop

In [209]: %timeit pd.Series(list(itertools.chain(*s))).astype(int)
100 loops, best of 3: 2.2 ms per loop

#cᴏʟᴅsᴘᴇᴇᴅ sol
In [210]: %timeit pd.Series(np.concatenate(s.values.tolist()))
100 loops, best of 3: 7.71 ms per loop
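
For reference, the two fastest variants (nested conversion and flattening) can be combined into one self-contained script. This is only a sketch: the helper names `to_int_lists` and `flatten_to_ints` are made up for illustration, the sample data is a shortened version of the one above, and `itertools.chain.from_iterable` is swapped in for `itertools.chain(*s)` so the Series is not unpacked into positional arguments.

```python
import itertools

import pandas as pd


def to_int_lists(s):
    """Convert each list of digit strings to a list of ints, keeping the index."""
    return pd.Series([[int(y) for y in x] for x in s], index=s.index)


def flatten_to_ints(s):
    """Flatten all sublists into a single integer Series with a fresh index."""
    return pd.Series(list(itertools.chain.from_iterable(s))).astype(int)


# Shortened, hypothetical sample in the same shape as the question's data.
s = pd.Series([["00115840", "00110005"], ["0026513"], ["00335595", "00350750"]])

nested = to_int_lists(s)
flat = flatten_to_ints(s)

print(nested.tolist())  # [[115840, 110005], [26513], [335595, 350750]]
print(flat.tolist())    # [115840, 110005, 26513, 335595, 350750]
```

`to_int_lists` keeps one list per original row, while `flatten_to_ints` discards the row grouping, so pick whichever matches the shape you need downstream.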