Nab*_*zir 4 python text non-alphanumeric feature-extraction pandas
Here is my data:
No  Body
1   DaTa, Analytics 2
2   StackOver. 67%
Here is my expected output:
No  Body               Non Alphanumeric
1   DaTa, Analytics 2  1
2   StackOver. 67%     2
I only want to count non-alphanumeric characters, such as ! @ # & ( ) % – [ { } ] : ; ', ? / *. Spaces and digits are not counted.
You can use:
df['Non Alphanumeric'] = df['Body'].str.findall(r'[^a-zA-Z0-9 ]').str.len()
Or:
df['Non Alphanumeric'] = df['Body'].str.count(r'[^a-zA-Z0-9 ]')
print(df)
   No               Body  Non Alphanumeric
0   1  DaTa, Analytics 2                 1
1   2     StackOver. 67%                 2
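To see exactly what the character class `[^a-zA-Z0-9 ]` matches, the same pattern can be checked outside pandas with the standard `re` module. This is only a minimal sketch: the helper name `count_non_alnum` and the `rows` list are made up for illustration, and the sample strings are the two `Body` values from the question.

```python
import re

# Same character class as in the answer: match anything that is NOT
# an ASCII letter, an ASCII digit, or a plain space.
PATTERN = re.compile(r'[^a-zA-Z0-9 ]')

def count_non_alnum(text):
    """Count characters matched by PATTERN (hypothetical helper name)."""
    return len(PATTERN.findall(text))

# The two Body values from the question.
rows = ['DaTa, Analytics 2', 'StackOver. 67%']
counts = [count_non_alnum(body) for body in rows]
print(counts)  # [1, 2]
```

The class is ASCII-only, so an accented letter such as `é` would be counted as non-alphanumeric, and so would a tab or newline, because only the plain space is excluded. If that matters for your data, the pattern would need adjusting.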