Limit the number of words in a DataFrame column

Sye*_*ail 3 split dataframe python-3.x pandas

My DataFrame looks like this:

      Abc                       XYZ 
0  Hello   How are you doing today
1   Good                 This is a
2    Bye                   See you
3  Books  Read chapter 1 to 5 only

With max_size = 3, I want to truncate the XYZ column to at most 3 words (max_size). Some rows have fewer than max_size words and should be left as they are.

Expected output:

     Abc                       XYZ
0  Hello               How are you
1   Good                 This is a
2    Bye                   See you
3  Books            Read chapter 1

jez*_*ael 7

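Use split with a limit, drop the trailing remainder by keeping only the first max_size elements, then join each list back into a string: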

max_size = 3

df['XYZ'] = df['XYZ'].str.split(n=max_size).str[:max_size].str.join(' ')
print(df)
     Abc             XYZ
0  Hello     How are you
1   Good       This is a
2    Bye         See you
3  Books  Read chapter 1
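To make the chain easier to follow, here is a minimal sketch that runs the three steps one at a time. The Series `s` and the intermediate names `parts`, `kept`, and `result` are only illustrative and are not part of the original answer.

```python
import pandas as pd

max_size = 3
s = pd.Series(["How are you doing today", "See you"])

# Step 1: split on whitespace with at most max_size splits, so any
# words past the third stay together in one trailing element.
parts = s.str.split(n=max_size)

# Step 2: keep only the first max_size elements, which drops that
# trailing remainder when it exists.
kept = parts.str[:max_size]

# Step 3: join the remaining words back into a single string.
result = kept.str.join(" ")

print(list(parts.iloc[0]))  # ['How', 'are', 'you', 'doing today']
print(result.tolist())      # ['How are you', 'See you']
```

Rows that already have max_size words or fewer produce no trailing remainder, so the slice leaves them unchanged.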

Another solution, using a lambda function:

df['XYZ'] = df['XYZ'].apply(lambda x: ' '.join(x.split(maxsplit=max_size)[:max_size]))
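For anyone who wants to reproduce this end to end, the following sketch rebuilds the DataFrame from the question and checks that both approaches agree. The helper `truncate_words` is a hypothetical name introduced here; it adds an `isinstance` guard on the assumption that the column might contain missing values, since a bare `x.split(...)` raises an AttributeError on NaN while the `.str` accessor passes NaN through.

```python
import pandas as pd

df = pd.DataFrame({
    "Abc": ["Hello", "Good", "Bye", "Books"],
    "XYZ": [
        "How are you doing today",
        "This is a",
        "See you",
        "Read chapter 1 to 5 only",
    ],
})
max_size = 3


def truncate_words(text, limit):
    # Hypothetical helper: return non-string values (e.g. NaN) unchanged.
    if not isinstance(text, str):
        return text
    # Same logic as the lambda: split with a limit, keep the first words.
    return " ".join(text.split(maxsplit=limit)[:limit])


# Vectorized version using the .str accessor.
vectorized = df["XYZ"].str.split(n=max_size).str[:max_size].str.join(" ")

# Element-wise version using apply.
applied = df["XYZ"].apply(lambda x: truncate_words(x, max_size))

print(vectorized.tolist() == applied.tolist())  # True
print(applied.tolist())
# ['How are you', 'This is a', 'See you', 'Read chapter 1']
```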