I have a DataFrame `df` like this:
Pattern String
101 hi, how are you?
104 what are you doing?
108 Python is good to learn.
I want to create n-grams for the String column. I have already created unigrams using split() and stack():
new= df.String.str.split(expand=True).stack()
However, I also want to create longer n-grams (bigrams, trigrams, four-grams, and so on).
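One possible approach, as a minimal sketch: build the n-grams per row with a small helper and then explode them into one row per n-gram. The helper name `make_ngrams` and the column name `bigrams` are made up for illustration, tokens are split on whitespace only (so punctuation stays attached to the words, as with `split()` above), and `Series.explode` assumes pandas 0.25 or later.

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "Pattern": [101, 104, 108],
    "String": [
        "hi, how are you?",
        "what are you doing?",
        "Python is good to learn.",
    ],
})

def make_ngrams(text, n):
    """Return all runs of n consecutive whitespace-separated tokens in text."""
    tokens = text.split()
    # A sliding window of width n; yields an empty list if there are fewer than n tokens
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# One list of bigrams per row (hypothetical column name)
df["bigrams"] = df["String"].apply(lambda s: make_ngrams(s, 2))

# Long format: one row per bigram, indexed by Pattern
bigrams_long = df.set_index("Pattern")["bigrams"].explode()
print(bigrams_long)
```

Changing the second argument to 3 or 4 gives trigrams or four-grams. Rows with fewer than n tokens produce an empty list, which `explode` turns into a missing value.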
DataFrame (test1):
cons_flag
Mas
Mas
Wood
Wood
Wood
Mas
Conc
Wood
OUTPUT:
cons_flag new_var
Mas MASOM
Mas MASOM
Wood MASOM
Wood MASOM
Wood MASOM
Mas MASOM
Conc MASOM
Wood MASOM
Code used:
for x in test1['cons_flag']:
    if x.find('Mas'):
        test1['new_var'] = "MASOM"
    elif x.find('Wood'):
        test1['new_var'] = "WOODEN"
My problem is that the new_var column values are not updated according to my logic.
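Two things in the loop explain the output. First, `str.find` returns the index of the match, or -1 when there is no match. -1 is truthy and 0 (a match at the start of the string) is falsy, so the `if` test is effectively inverted. Second, `test1['new_var'] = "MASOM"` assigns the scalar to the whole column on every iteration, so only the last row's result survives. A minimal sketch of a vectorized alternative follows. It assumes substring matching is what was intended, and that rows matching neither pattern (such as `Conc`) should be left empty, which the question does not specify.

```python
import pandas as pd

# Sample data taken from the question
test1 = pd.DataFrame(
    {"cons_flag": ["Mas", "Mas", "Wood", "Wood", "Wood", "Mas", "Conc", "Wood"]}
)

# Start with an empty column; rows that match nothing stay None (assumption)
test1["new_var"] = None

# Boolean masks select only the matching rows, so each assignment is row-wise.
# 'Wood' is assigned first so that 'Mas' takes precedence, like the if/elif order.
test1.loc[test1["cons_flag"].str.contains("Wood", regex=False), "new_var"] = "WOODEN"
test1.loc[test1["cons_flag"].str.contains("Mas", regex=False), "new_var"] = "MASOM"

print(test1)
```

If the values are always exact labels rather than substrings, `test1["cons_flag"].map({"Mas": "MASOM", "Wood": "WOODEN"})` is a shorter option; unmapped values become NaN.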