Posts by Ars*_*nin

Counting unique values per group with pandas

I need to count the unique ID values in each domain. My data:

ID, domain
123, 'vk.com'
123, 'vk.com'
123, 'twitter.com'
456, 'vk.com'
456, 'facebook.com'
456, 'vk.com'
456, 'google.com'
789, 'twitter.com'
789, 'vk.com'

I tried df.groupby(['domain', 'ID']).count(), but I want to get

domain, count
vk.com   3
twitter.com   2
facebook.com   1
google.com   1
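A minimal sketch of one way to get these counts: group on domain alone and call nunique() on the ID column, rather than grouping on both columns. The DataFrame is rebuilt from the sample data above; the name unique_ids and the "count" label are illustrative assumptions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": ["vk.com", "vk.com", "twitter.com", "vk.com", "facebook.com",
               "vk.com", "google.com", "twitter.com", "vk.com"],
})

# Count distinct IDs within each domain, largest count first.
# "unique_ids" and the "count" label are illustrative names.
unique_ids = (
    df.groupby("domain")["ID"]
      .nunique()
      .sort_values(ascending=False)
      .rename("count")
)
print(unique_ids)
```

Grouping on ['domain', 'ID'] produces one row per (domain, ID) pair, which is why the original attempt does not collapse to a single count per domain.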

python group-by unique pandas pandas-groupby

163
Score
4
Answers
380k
Views

Grouping by datetime interval with pandas

I have this data

data    id  url size    domain  subdomain
13/Jun/2016:06:27:26    30055   https://api.weather.com/v1/geocode/55.740002/37.610001/aggregate.json?apiKey=e45ff1b7c7bda231216c7ab7c33509b8&products=conditionsshort,fcstdaily10short,fcsthourly24short,nowlinks  3929    weather.com api.weather.com
13/Jun/2016:06:27:26    30055   https://api.weather.com/v1/geocode/54.720001/20.469999/aggregate.json?apiKey=e45ff1b7c7bda231216c7ab7c33509b8&products=conditionsshort,fcstdaily10short,fcsthourly24short,nowlinks  3845    weather.com api.weather.com
13/Jun/2016:06:27:27    3845    https://api.weather.com/v1/geocode/54.970001/73.370003/aggregate.json?apiKey=e45ff1b7c7bda231216c7ab7c33509b8&products=conditionsshort,fcstdaily10short,fcsthourly24short,nowlinks  30055   weather.com api.weather.com
13/Jun/2016:06:27:27    30055   https://api.weather.com/v1/geocode/59.919998/30.219999/aggregate.json?apiKey=e45ff1b7c7bda231216c7ab7c33509b8&products=conditionsshort,fcstdaily10short,fcsthourly24short,nowlinks  3914    weather.com api.weather.com
13/Jun/2016:06:27:28    30055   https://facebook.com    4005    facebook.com    facebook.com

I need to group it into 5-minute intervals. Desired output:

data    id  url size    domain  subdomain
13/Jun/2016:06:27:26    30055   https://api.weather.com/v1/geocode/55.740002/37.610001/aggregate.json?apiKey=e45ff1b7c7bda231216c7ab7c33509b8&products=conditionsshort,fcstdaily10short,fcsthourly24short,nowlinks  3929    weather.com api.weather.com
13/Jun/2016:06:27:27    3845    https://api.weather.com/v1/geocode/54.970001/73.370003/aggregate.json?apiKey=e45ff1b7c7bda231216c7ab7c33509b8&products=conditionsshort,fcstdaily10short,fcsthourly24short,nowlinks  30055   weather.com api.weather.com
13/Jun/2016:06:27:28    30055   https://facebook.com    4005    facebook.com    facebook.com

I need to group by id and subdomain and build 5-minute intervals. I tried

print(df.groupby([df['data'], pd.TimeGrouper(freq='Min')]))

to group by minute first, but it returns TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but …
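A sketch of one possible approach, under some assumptions: the error suggests the time grouper is being applied to something that is not a datetime index, so the example parses the data column with pd.to_datetime and points pd.Grouper at that column through key=. pd.Grouper is used here because pd.TimeGrouper is no longer available in recent pandas releases. The trimmed DataFrame (url and domain columns omitted), the name first_rows, and the choice of .first() as the aggregation are all illustrative assumptions.

```python
import pandas as pd

# Trimmed copy of the sample rows (url and domain columns omitted for brevity).
df = pd.DataFrame({
    "data": ["13/Jun/2016:06:27:26", "13/Jun/2016:06:27:26",
             "13/Jun/2016:06:27:27", "13/Jun/2016:06:27:27",
             "13/Jun/2016:06:27:28"],
    "id": [30055, 30055, 3845, 30055, 30055],
    "size": [3929, 3845, 30055, 3914, 4005],
    "subdomain": ["api.weather.com"] * 4 + ["facebook.com"],
})

# Parse the log-style timestamps into real datetimes.
df["data"] = pd.to_datetime(df["data"], format="%d/%b/%Y:%H:%M:%S")

# Group by id, subdomain and 5-minute bins of the "data" column,
# keeping the first row of each group (assumed aggregation).
first_rows = (
    df.groupby(["id", "subdomain", pd.Grouper(key="data", freq="5min")])
      .first()
      .reset_index()
)
print(first_rows)
```

After the groupby, the data column holds the start of each 5-minute bin rather than the original timestamp; if the original timestamp is needed, it could be copied to a second column before grouping.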

python datetime pandas

5
Score
1
Answers
1735
Views

How to add a plot to a subplot in matplotlib

I have a plot like this

fig = plt.figure()
desire_salary = (df[(df['inc'] <= int(salary_people))])
print(desire_salary)
# Create the pivot_table
result = desire_salary.pivot_table('city', 'cult', aggfunc='count')

# plot it in a separate step. this returns the matplotlib axes
ax = result.plot(kind='bar', alpha=0.75, rot=0, label="Presence / Absence of cultural centre")

ax.set_xlabel("Cultural centre")
ax.set_ylabel("Frequency")
ax.set_title('The relationship between the wage level and the presence of the cultural center')
plt.show()

I want to add this to a subplot. I tried

fig, ax = plt.subplots(2, 3)
...
ax = result.add_subplot()

But it returns AttributeError: 'Series' object has no attribute …
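A minimal sketch of one way around this: add_subplot is a method of a matplotlib Figure, not of a pandas Series, but the pandas plot method accepts an ax= argument that draws onto an existing Axes. The small Series below is a hypothetical stand-in for the pivot_table result, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # optional: non-interactive backend for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the pivot_table result in the question.
result = pd.Series([12, 7], index=pd.Index([0, 1], name="cult"), name="city")

# Create the 2x3 grid first; "axes" is a 2-D array of Axes objects.
fig, axes = plt.subplots(2, 3, figsize=(12, 6))

# Draw the bar chart into the top-left cell by passing that Axes via ax=.
ax = result.plot(kind="bar", alpha=0.75, rot=0, ax=axes[0, 0])
ax.set_xlabel("Cultural centre")
ax.set_ylabel("Frequency")
fig.tight_layout()
```

The remaining cells can be filled the same way by passing axes[0, 1], axes[1, 0] and so on to other plot calls, followed by plt.show() in an interactive session.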

python matplotlib pandas

3
Score
1
Answers
20k
Views