oza*_*arm 6 python random dictionary dataframe pandas
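I'm new to pandas and I'd like to play with random text data. I'm trying to add 2 new columns to a DataFrame df, each filled with a randomly chosen key (newcol1) and its value (newcol2) from a dictionary.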
countries = {'Africa':'Ghana','Europe':'France','Europe':'Greece','Asia':'Vietnam','Europe':'Lithuania'}
My df already has 2 columns, and I'd like something like this:
   Year Approved Continent    Country
0  2016      Yes    Africa      Ghana
1  2016      Yes    Europe  Lithuania
2  2017       No    Europe     Greece
I could of course use a for or while loop to fill df['Continent'] and df['Country'], but I have a feeling that .apply() and np.random.choice could offer a simpler, more idiomatic pandas solution.
Yes, you're right. You can use np.random.choice + map:
df
   Year Approved
0  2016      Yes
1  2016      Yes
2  2017       No
df['Continent'] = np.random.choice(list(countries), len(df))
df['Country'] = df['Continent'].map(countries)
df
   Year Approved Continent    Country
0  2016      Yes    Africa      Ghana
1  2016      Yes      Asia    Vietnam
2  2017       No    Europe  Lithuania
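You pick len(df) keys at random from the list of keys of the countries dictionary, then use countries as a mapper to look up the country that corresponds to each chosen key.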
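The two steps can be put together as a self-contained sketch. The seeded generator and the rebuilt df are assumptions added only to make the run repeatable; np.random.choice works the same way. Note that a dict literal keeps only the last value for a repeated key, so the dictionary from the question ends up with three entries and 'Europe' always maps to 'Lithuania'.

```python
import numpy as np
import pandas as pd

# Same dictionary as in the question. The repeated 'Europe' key collapses
# to its last value, so only three keys survive.
countries = {'Africa': 'Ghana', 'Europe': 'France', 'Europe': 'Greece',
             'Asia': 'Vietnam', 'Europe': 'Lithuania'}

# Stand-in for the asker's existing two-column DataFrame.
df = pd.DataFrame({'Year': [2016, 2016, 2017],
                   'Approved': ['Yes', 'Yes', 'No']})

# Seeded generator (an assumption, used here only for repeatability).
rng = np.random.default_rng(0)

# Step 1: draw one key per row, with replacement.
df['Continent'] = rng.choice(list(countries), size=len(df))

# Step 2: look up the value stored under each drawn key.
df['Country'] = df['Continent'].map(countries)

print(df)
```

Because the draw is with replacement, the same continent can appear in more than one row.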
For the second step, pd.Series.replace also works:
df['Country'] = df.Continent.replace(countries)
df
   Year Approved Continent    Country
0  2016      Yes    Africa      Ghana
1  2016      Yes      Asia    Vietnam
2  2017       No    Europe  Lithuania
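map and replace give the same result here because every continent is a key of the dictionary. They differ when a value has no entry, which the small sketch below illustrates. 'Oceania' is a hypothetical value added for the demonstration and is not part of the question's data.

```python
import pandas as pd

countries = {'Africa': 'Ghana', 'Asia': 'Vietnam', 'Europe': 'Lithuania'}

# 'Oceania' is a made-up value with no entry in the dictionary.
continents = pd.Series(['Africa', 'Oceania', 'Europe'])

mapped = continents.map(countries)        # value without a key becomes NaN
replaced = continents.replace(countries)  # value without a key is left as-is

print(mapped.tolist())
print(replaced.tolist())
```

So map is the safer choice when an unmatched value should stand out as missing, while replace quietly keeps the original text.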
For completeness, you can also use apply + dict.get:
df['Country'] = df.Continent.apply(countries.get)
df
   Year Approved Continent    Country
0  2016      Yes    Africa      Ghana
1  2016      Yes      Asia    Vietnam
2  2017       No    Europe  Lithuania
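One caveat with all three variants: a dictionary cannot hold 'Europe' three times, so France and Greece can never be drawn. If several countries per continent are wanted, as in the desired output in the question, one possible approach is to store the data as (continent, country) pairs and sample whole pairs. This is only a sketch; the pairs list, the seed, and the picked helper frame are assumptions, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical restructuring: a list of pairs, so that several countries
# can share one continent.
pairs = [('Africa', 'Ghana'), ('Europe', 'France'), ('Europe', 'Greece'),
         ('Asia', 'Vietnam'), ('Europe', 'Lithuania')]

# Stand-in for the asker's existing two-column DataFrame.
df = pd.DataFrame({'Year': [2016, 2016, 2017],
                   'Approved': ['Yes', 'Yes', 'No']})

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability

# Draw one pair position per row, with replacement.
idx = rng.integers(0, len(pairs), size=len(df))

# Build both new columns at once, aligned on df's index.
picked = pd.DataFrame([pairs[i] for i in idx],
                      columns=['Continent', 'Country'], index=df.index)
df = df.join(picked)

print(df)
```

Sampling pairs keeps each country attached to its own continent, so no lookup step is needed afterwards.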