'DataFrame' object has no attribute 'str'

bor*_*rfo 2 python pandas

I am trying to drop rows that contain a certain string. However, I get this error:

pandas - 'DataFrame' object has no attribute 'str'

Here is my code:

df = df[~df['colB'].str.contains('Example:')] 

How can I fix this?

jez*_*ael 6

The first possible cause is duplicated column names: selecting colB then returns a DataFrame instead of a Series:

import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colB','colB','colC'])
print (df)
         colB colB  colC
0  Example: s   as     2
1          dd  aaa     3

print (df['colB'])
         colB colB
0  Example: s   as
1          dd  aaa

#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'
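To confirm that duplicated column names are the cause, a quick check along these lines may help. This is only a sketch on the same sample data; the names `selected` and `dupes` are illustrative and not part of the original answer.

```python
import pandas as pd

# Same sample frame as above, with the label 'colB' appearing twice.
df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colB', 'colB', 'colC'])

# Selecting a duplicated label returns every matching column,
# so the result is a DataFrame, which has no .str accessor.
selected = df['colB']
print(type(selected).__name__)

# Index.duplicated() flags the second and later occurrences of a label.
dupes = df.columns[df.columns.duplicated()].tolist()
print(dupes)
```

If `dupes` is not empty, the column selection is ambiguous and `.str` will fail in the way described above.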

One solution is to join the duplicated columns together:

print (df['colB'].apply(' '.join, axis=1))
0    Example: s as
1           dd aaa

df['colB'] = df.pop('colB').apply(' '.join, axis=1)
df = df[~df['colB'].str.contains('Example:')] 
print (df)
   colC    colB
1     3  dd aaa
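If joining the values is not what you want, a different option (not from the answer above, and assuming only the first of the duplicated columns matters) is to drop the later duplicates before filtering. The names `deduped` and `filtered` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colB', 'colB', 'colC'])

# Keep only the first occurrence of each column label.
deduped = df.loc[:, ~df.columns.duplicated()]

# 'colB' is unique now, so selecting it returns a Series and .str works.
filtered = deduped[~deduped['colB'].str.contains('Example:')]
print(filtered)
```

Note that this discards the data in the second colB column, so it only fits cases where that column is redundant.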

The second possible cause is a hidden MultiIndex:

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colA','colB','colC'])
df.columns = pd.MultiIndex.from_arrays([df.columns])
print (df)
         colA colB colC
0  Example: s   as    2
1          dd  aaa    3

print (df['colB'])
  colB
0   as
1  aaa

#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'
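Because a one-level MultiIndex prints exactly like ordinary columns, it may be worth inspecting the columns object directly. A minimal sketch, with `is_multi` and `levels` as illustrative names:

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colA', 'colB', 'colC'])
df.columns = pd.MultiIndex.from_arrays([df.columns])

# The printed frame looks normal, but the columns are a MultiIndex.
is_multi = isinstance(df.columns, pd.MultiIndex)
levels = df.columns.nlevels
print(is_multi, levels)
```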

The solution is to reassign the columns from the first level:

df.columns = df.columns.get_level_values(0)
df = df[~df['colB'].str.contains('Example:')] 
print (df)
         colA colB  colC
0  Example: s   as     2
1          dd  aaa     3

The third possible cause is a MultiIndex with more than one level:

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]], columns=['colA','colB','colC'])
df.columns = pd.MultiIndex.from_product([df.columns, ['a']])
print (df)
         colA colB colC
            a    a    a
0  Example: s   as    2
1          dd  aaa    3

print (df['colB'])
     a
0   as
1  aaa

print (df.columns)
MultiIndex(levels=[['colA', 'colB', 'colC'], ['a']],
           codes=[[0, 1, 2], [0, 0, 0]])

#print (df['colB'].str.contains('Example:'))
#>AttributeError: 'DataFrame' object has no attribute 'str'

The solution is to select the MultiIndex column by tuple:

df1 = df[~df[('colB', 'a')].str.contains('Example:')] 
print (df1)
         colA colB colC
            a    a    a
0  Example: s   as    2
1          dd  aaa    3
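The tuple works because it names every level of the column, so the selection yields a single Series. A small sketch to verify that on the sample data (`col` and `mask` are illustrative names):

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colA', 'colB', 'colC'])
df.columns = pd.MultiIndex.from_product([df.columns, ['a']])

# A full tuple key addresses exactly one column, so this is a Series.
col = df[('colB', 'a')]
print(type(col).__name__)

# The .str accessor is available on the Series.
mask = col.str.contains('Example:')
print(mask.tolist())
```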

Or assign the first level back to the columns:

df.columns = df.columns.get_level_values(0)
df2 = df[~df['colB'].str.contains('Example:')] 
print (df2)
         colA colB  colC
0  Example: s   as     2
1          dd  aaa     3

Or drop the second level:

df.columns = df.columns.droplevel(1)
df2 = df[~df['colB'].str.contains('Example:')] 
print (df2)
         colA colB  colC
0  Example: s   as     2
1          dd  aaa     3
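If the second level carries information you want to keep, another option (not part of the answer above) is to flatten the MultiIndex by joining the level values into single strings. This sketch assumes every level value is a string; the `'_'` separator and the name `filtered` are arbitrary choices.

```python
import pandas as pd

df = pd.DataFrame([['Example: s', 'as', 2], ['dd', 'aaa', 3]],
                  columns=['colA', 'colB', 'colC'])
df.columns = pd.MultiIndex.from_product([df.columns, ['a']])

# Each MultiIndex entry is a tuple such as ('colB', 'a'); join it into 'colB_a'.
df.columns = ['_'.join(col) for col in df.columns]
print(df.columns.tolist())

# The flattened label selects a Series, so .str.contains works.
filtered = df[~df['colB_a'].str.contains('Example:')]
print(filtered)
```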