Pandas: select rows based on a function of a column

Ale*_*ins 4 python pandas

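I am trying to learn pandas. I have found several examples of how to build a pandas DataFrame and how to add columns, and they work well. I would like to learn how to select all rows based on the value of a column. I have also found multiple examples of how to do the selection when the column value should be less than or greater than some number, and that works too. My question is how to make a more general selection: I want to first compute a function of a column and then select all rows for which the function's value is greater or less than some number.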

import names
import numpy as np
import pandas as pd
from datetime import date
import random

def randomBirthday(startyear, endyear):
    T1 = date.today().replace(day=1, month=1, year=startyear).toordinal()
    T2 = date.today().replace(day=1, month=1, year=endyear).toordinal()
    return date.fromordinal(random.randint(T1, T2))

def age(birthday):
    today = date.today()
    return today.year - birthday.year - ((today.month, today.day) < (birthday.month, birthday.day))

N_PEOPLE = 20
dict_people = { }
dict_people['gender'] = np.array(['male','female'])[np.random.randint(0, 2, N_PEOPLE)]
dict_people['names'] = [names.get_full_name(gender=g) for g in dict_people['gender']]

peopleFrame = pd.DataFrame(dict_people)

# Example 1: Add new columns to the data frame
peopleFrame['birthday'] = [randomBirthday(1920, 2020) for i in range(N_PEOPLE)]

# Example 2: Select all people with a certain age
peopleFrame.loc[age(peopleFrame['birthday']) >= 20]

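This code works except for the last line. What is the correct way to write that line? I have considered adding an extra column containing the values of the function age and then selecting on that column. That works, but I would like to know whether I have to do it that way. What if I don't want to store a person's age and only want to use it for the selection?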

jez*_*ael 6

Use Series.apply, which calls age on each birthday and returns a Series you can compare directly:

peopleFrame.loc[peopleFrame['birthday'].apply(age) >= 20]
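As a rough, self-contained sketch of why this works: the original call most likely fails because age uses scalar attribute access such as birthday.year, which a whole Series does not provide, while apply passes one date at a time. The three-row frame, the short names, and the optional `today` parameter below are assumptions added only to make the example reproducible; they are not part of the question's code.

```python
from datetime import date

import pandas as pd


def age(birthday, today=None):
    # `today` is a hypothetical parameter added for reproducibility;
    # the original function always uses date.today().
    if today is None:
        today = date.today()
    # Subtract one year if this year's birthday has not happened yet.
    return today.year - birthday.year - (
        (today.month, today.day) < (birthday.month, birthday.day)
    )


# Small illustrative frame (made-up data, not from the question).
people = pd.DataFrame({
    "names": ["Ann", "Bob", "Cid"],
    "birthday": [date(1950, 6, 1), date(2010, 3, 15), date(2000, 12, 31)],
})

reference_day = date(2021, 1, 1)

# apply() calls age() once per element and returns a Series of ints;
# comparing that Series with 20 gives a boolean mask for .loc.
mask = people["birthday"].apply(lambda b: age(b, today=reference_day)) >= 20
adults = people.loc[mask]

print(adults)
```

The mask exists only as a temporary Series, so no age column is ever stored in the frame.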