Nab*_*zir 6 python string dataframe pandas
How can I cut a string based on the first few digits of a number?
This is my data:
Id  actual_pattern
1   100101
2   10101
3   1010101
4   101
This is the expected output.
cut_pattern1 is the first 4 digits of actual_pattern.
cut_pattern2 is whatever remains of actual_pattern after cut_pattern1; if nothing remains, set cut_pattern2 = 0.
If cut_pattern2 contains a 1, set binary_cut2 = 1; otherwise set binary_cut2 = 0.
Id  actual_pattern  cut_pattern1  cut_pattern2  binary_cut2
1   100101          1001          01            1
2   10101           1010          1             1
3   1010101         1010          101           1
4   101             101           0             0
Create the new columns by indexing with str, use replace to change empty strings to '0', and use Series.str.contains with a cast to integer for the last column:
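# Convert to strings so the digits can be sliced by position
df['actual_pattern'] = df['actual_pattern'].astype(str)
# First four characters (fewer if the value is shorter)
df['cut_pattern1'] = df['actual_pattern'].str[:4]
# Remainder after the fourth character; an empty remainder becomes '0'
df['cut_pattern2'] = df['actual_pattern'].str[4:].replace('','0')
# 1 if the remainder contains a '1', otherwise 0
df['binary_cut2'] = df['cut_pattern2'].str.contains('1').astype(int)
print(df)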
   Id actual_pattern cut_pattern1 cut_pattern2  binary_cut2
0   1         100101         1001           01            1
1   2          10101         1010            1            1
2   3        1010101         1010          101            1
3   4            101          101            0            0
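
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the question's data starts out as an integer column; the DataFrame construction is my own reconstruction of the sample, not something posted in the question.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to start as integers).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "actual_pattern": [100101, 10101, 1010101, 101],
})

# Work on strings so positional slicing applies to the digits.
df["actual_pattern"] = df["actual_pattern"].astype(str)

# First four characters (or fewer, if the value is shorter).
df["cut_pattern1"] = df["actual_pattern"].str[:4]

# Everything after the fourth character; an empty remainder becomes "0".
# Series.replace without regex matches whole values, so only "" is changed.
df["cut_pattern2"] = df["actual_pattern"].str[4:].replace("", "0")

# 1 if the remainder contains the character "1", otherwise 0.
df["binary_cut2"] = df["cut_pattern2"].str.contains("1").astype(int)

print(df)
```

Note that slicing past the end of a short string simply returns an empty string, which is why Id 4 needs the replace step.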
Edit:
Solution from @Rick Hitchcock, from the comments:
df['actual_pattern'] = df['actual_pattern'].astype(str)
df['cut_pattern1'] = df['actual_pattern'].str[:4]
df['cut_pattern2'] = df['actual_pattern'].str[4:].replace('','0')
df['binary_cut2'] = df['cut_pattern2'].str.contains('1').astype(int)
print(df)
   Id actual_pattern cut_pattern1 cut_pattern2  binary_cut2
0   1         100101         1001           01            1
1   2          10101         1010            1            1
2   3        1010101         1010          101            1
3   4       00001111         0000         1111            1
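
The last row above has a value with leading zeros (00001111), which can only survive if the column already holds strings. That is my inference from the output, not something the answer states. As a rough sketch, assuming the data comes from a CSV file (the CSV text below is hypothetical), passing dtype to pandas.read_csv keeps the digits as written:

```python
import io

import pandas as pd

# Hypothetical CSV text; the second row mirrors the 00001111 value above.
csv_text = "Id,actual_pattern\n1,100101\n4,00001111\n"

# dtype=str for the column keeps the digits exactly as written.
as_str = pd.read_csv(io.StringIO(csv_text), dtype={"actual_pattern": str})

# With default type inference the column is parsed as integers,
# so the leading zeros are lost before astype(str) can help.
as_int = pd.read_csv(io.StringIO(csv_text))

print(as_str["actual_pattern"].tolist())
print(as_int["actual_pattern"].astype(str).tolist())
```

With the string column, .str[:4] and .str[4:] would then cut 00001111 into 0000 and 1111 as in the table.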