If a DataFrame column value matches a dictionary key, check whether a different column matches the dictionary value

Question

D.L*_*kin 5 dictionary python-2.7 pandas

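I have a DataFrame with two columns of interest, both containing strings. I also have a dictionary of key-value pairs, which are strings as well. I use the dictionary's keys to filter the DataFrame on the first column, keeping only the rows whose value is one of those keys.

The end goal is to find the rows where the first column matches a key in the dictionary, and then confirm that the value in the second column matches the corresponding value in the dictionary.

Filtering the DataFrame on the keys of interest works as expected, so I am left with a two-column DataFrame containing only rows whose first-column value is a key in the dictionary. The filtered DataFrame can be anywhere from a few rows to thousands of rows, but the length of the dictionary is static.

The final output should be a DataFrame containing the rows of the filtered DataFrame where the value in the second column does not match the dictionary's value.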

import pandas as pd

pairs = {'red': 'apple', 'blue': 'blueberry', 'yellow': 'banana'}
filtered_data = {'Color': ['red', 'blue'], 'Fruit': ['appl', 'blueberry']}
filtered_df = pd.DataFrame(filtered_data)

# so the filtered_df would resemble:
#   Color     Fruit
#   red       appl
#   blue      blueberry

for row in filtered_df.iterrows():
   for k, v in pairs.items():
       # Here's where I'd like to check the value of column 1, find it in the dict,
       # then, if the values don't match between column 2 in the df and the dict,
       # append the mismatched row to a new df.
       if row['Color'] == k:
          new_df.append(row).where(row['Fruit'] != v)


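I'm sure I need the index of the row from the first for loop, but I'm not sure how to write the rest of the nested loop structure.

Ideally, when I export the new_df DataFrame in this case, it would have 1 row, with red in the Color column and appl in the Fruit column, because that row does not match the dictionary. It would resemble the following: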

Color   Fruit
red     appl


Answer 1

Mar*_*anD 3

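# Build one (Color, Fruit) tuple per row, keeping the original index
color_fruit = pd.Series((tuple(x) for x in filtered_df.values), index=filtered_df.index)
# Keep only the rows whose tuple is not a (key, value) pair of the dictionary
result = filtered_df[~color_fruit.isin(pairs.items())]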


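Explanation:

In the first line, we create a Series of tuples (pairs) from the columns of the original DataFrame, with the same index.

Then we use it to filter the original DataFrame, selecting only the rows that do not (~) satisfy the membership condition (.isin()) against the (key, value) pairs of the pairs dictionary.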
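For reference, here is a minimal, self-contained sketch of the same idea run against the sample data from the question. It builds the tuples with zip rather than from .values, and it adds a cross-check based on Series.map. Neither the expected_pairs and alt names nor the map-based variant come from the original answer; they are assumptions added for illustration.

```python
import pandas as pd

# Sample data from the question
pairs = {"red": "apple", "blue": "blueberry", "yellow": "banana"}
filtered_df = pd.DataFrame(
    {"Color": ["red", "blue"], "Fruit": ["appl", "blueberry"]}
)

# One (Color, Fruit) tuple per row, aligned with the DataFrame's index
color_fruit = pd.Series(
    list(zip(filtered_df["Color"], filtered_df["Fruit"])),
    index=filtered_df.index,
)

# The dictionary's (key, value) pairs as a plain list of tuples
expected_pairs = list(pairs.items())

# Keep the rows whose tuple is NOT one of the expected pairs
result = filtered_df[~color_fruit.isin(expected_pairs)]

# Alternative (not from the answer): look up the expected fruit for each
# color with Series.map and compare it to the actual Fruit column
alt = filtered_df[filtered_df["Fruit"] != filtered_df["Color"].map(pairs)]

print(result)
```

With this sample data, both result and alt should contain only the red / appl row, which is the output the question asks for. The map-based variant avoids building tuples, but it relies on every Color being a key in the dictionary, which holds here because the DataFrame was already filtered on those keys.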

Archived:	5 years, 9 months ago
Views:	9650
Last recorded:	2 years ago