Using a variable in a Pandas query

Question


I am trying to query a Pandas DataFrame like this:

        import pandas as pd

        inv = pd.read_csv(infile)
        inv.columns = ['County', 'Site', 'Role', 'Hostname']
        clist = inv.County.unique()  # Get list of counties
        for county in clist:  # for each county
            csub = inv.query('County == county')  # create a county subset
            # ... do stuff on subset


But I get an error:

pandas.core.computation.ops.UndefinedVariableError: name 'county' is not defined


I'm sure this is a trivial mistake, but I can't figure it out. How do I pass a variable to the query method?

Answer 1

pci*_*icz 50

According to the documentation, you can reference a variable by prefixing it with @:

csub = inv.query('County == @county')
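
As a minimal sketch of how this fits into the loop from the question (the inline sample data is made up for illustration; the question reads its data from a CSV file):

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV file in the question.
inv = pd.DataFrame({
    'County': ['Kent', 'Kent', 'Essex'],
    'Site': ['A', 'B', 'C'],
    'Role': ['web', 'db', 'web'],
    'Hostname': ['h1', 'h2', 'h3'],
})

subsets = {}
for county in inv.County.unique():
    # @county refers to the local Python variable, not to a column named "county".
    subsets[county] = inv.query('County == @county')

print({name: len(sub) for name, sub in subsets.items()})
```

Without the @, query looks for a column or name called county inside the expression, which is what raises the UndefinedVariableError.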


Answer 2

Dr.*_*ner 13

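The string format method

I found another (more general) solution that may be of interest: the string format method (see, for example, section 6.1.3.2. Format examples in the Python documentation).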

xyz = df.query('ColumnName >= {}'.format(VariableName))


The {} is replaced by the value of VariableName.
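
One caveat worth sketching: the substituted text becomes part of the query expression itself, so a string value needs quotes around it, otherwise pandas reads it as a column or variable name. The example below uses made-up data; the {!r} conversion, which inserts repr() of the value, is one way to get the quotes for simple strings.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'County': ['Kent', 'Essex', 'Kent'], 'Hosts': [5, 12, 8]})

threshold = 6
county = 'Kent'

# Numeric value: the query string becomes 'Hosts >= 6'.
big = df.query('Hosts >= {}'.format(threshold))

# String value: {!r} inserts repr(county), so the query string
# becomes "County == 'Kent'" rather than "County == Kent".
kent = df.query('County == {!r}'.format(county))

print(len(big), len(kent))
```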

f-strings

In addition, user pciunkiewicz mentioned in the comments another solution using so-called f-strings, introduced in Python 3.6 (proposed in PEP 498 in August 2015):

xyz = df.query(f'ColumnName >= {VariableName}')


A more general f-string example, taken from here:

>>> name = "Eric"
>>> age = 74
>>> f"Hello, {name}. You are {age}."
'Hello, Eric. You are 74.'


PS: I am new to Python.

Note: simple string substitution may be unsafe in production; see https://github.com/pandas-dev/pandas/issues/13521#issuecomment-593900167 (3 upvotes)
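
To make that risk concrete, here is a small sketch (made-up data and a made-up input string) of how text spliced in with format can change the meaning of the query when it comes from an untrusted source:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({'x': [1, 2, 3]})

# Intended use: a plain number, giving the query 'x >= 2'.
intended = df.query('x >= {}'.format(2))

# Hypothetical untrusted input: the resulting query is
# 'x >= 2 or x < 2', which matches every row.
untrusted = '2 or x < 2'
widened = df.query('x >= {}'.format(untrusted))

print(len(intended), len(widened))
```

Passing the value with @ keeps it as data rather than as part of the expression, which avoids this kind of rewriting.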
