Getting the mean/average over a number of days in R

ben*_*_14 2 r dplyr

Let's say I have this data frame:

Date           DayOfWeek    Url    Hits
09/01/2016     Thursday     url1   3
09/01/2016     Thursday     url2   5
09/01/2016     Thursday     url3   4
09/02/2016     Friday       url1   7
09/02/2016     Friday       url3   6
09/03/2016     Saturday     url2   9
09/03/2016     Saturday     url1   5
09/04/2016     Sunday       url2   6
09/07/2016     Wednesday    url10  4
09/07/2016     Thursday     url2   3
09/07/2016     Thursday     url4   2
09/07/2016     Thursday     url5   3
09/07/2016     Thursday     url1   3
09/08/2016     Friday       url1   3
09/08/2016     Friday       url4   3
09/08/2016     Friday       url5   2
09/08/2016     Friday       url8   6
09/09/2016     Saturday     url2   1
09/09/2016     Saturday     url3   2
09/09/2016     Saturday     url5   4
09/09/2016     Saturday     url1   8
09/14/2016     Thursday     url1   3
09/14/2016     Thursday     url2   2
09/14/2016     Thursday     url3   3

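I want to find the busiest day of the week in terms of the number of unique URLs visited. For example, there are 3 Thursdays in the data frame: the first Thursday has 3 unique URLs visited, the second has 4, and the last has 3. What I intend to compute is (sum of unique URLs = 3 + 4 + 3) / (number of Thursdays = 3) = the average number of unique URLs for that day.

For Friday, the first one has 2 URLs and the second has 4, so the calculation would be (2 + 4) / (number of Fridays in the dataset = 2), which comes to 3.

I'm trying to solve this with dplyr. I have been trying group_by, but I can't seem to work out the right combination of functions to get what I need.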

akr*_*run 6

We get the number of distinct 'Url' values ('N') for each 'Date' and 'DayOfWeek' with n_distinct, and then take the mean of 'N' for each 'DayOfWeek'.

library(dplyr)
df1 %>% 
    group_by(Date, DayOfWeek) %>%
    summarise(N = n_distinct(Url)) %>% 
    group_by(DayOfWeek) %>% 
    summarise(N = mean(N))
# DayOfWeek        N
#      <chr>    <dbl>
#1    Friday 3.000000
#2  Saturday 3.000000
#3    Sunday 1.000000
#4  Thursday 3.333333
#5 Wednesday 1.000000
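To make the above reproducible and to answer the "busiest day" part of the question, here is a minimal sketch. It rebuilds the sample data by hand under the name df1, reading the odd-looking date in the last Thursday block as 09/14/2016 (an assumption). The names daily_means and busiest are only illustrative, and filtering on max(N) is one possible way to pick the top day, not necessarily what the asker had in mind.

```r
library(dplyr)

# Sample data from the question, typed in by hand.
# Assumption: the malformed date in the last Thursday block is 09/14/2016.
df1 <- data.frame(
  Date = c(rep("09/01/2016", 3), rep("09/02/2016", 2), rep("09/03/2016", 2),
           "09/04/2016", rep("09/07/2016", 5), rep("09/08/2016", 4),
           rep("09/09/2016", 4), rep("09/14/2016", 3)),
  DayOfWeek = c(rep("Thursday", 3), rep("Friday", 2), rep("Saturday", 2),
                "Sunday", "Wednesday", rep("Thursday", 4), rep("Friday", 4),
                rep("Saturday", 4), rep("Thursday", 3)),
  Url = c("url1", "url2", "url3", "url1", "url3", "url2", "url1", "url2",
          "url10", "url2", "url4", "url5", "url1",
          "url1", "url4", "url5", "url8",
          "url2", "url3", "url5", "url1",
          "url1", "url2", "url3"),
  Hits = c(3, 5, 4, 7, 6, 9, 5, 6, 4, 3, 2, 3, 3,
           3, 3, 2, 6, 1, 2, 4, 8, 3, 2, 3),
  stringsAsFactors = FALSE
)

# Step 1: count distinct URLs per calendar date (and its weekday label).
# Step 2: average those per-date counts within each weekday.
daily_means <- df1 %>%
  group_by(Date, DayOfWeek) %>%
  summarise(N = n_distinct(Url)) %>%
  group_by(DayOfWeek) %>%
  summarise(N = mean(N))

# Keep the weekday(s) with the highest average; ties would all be returned.
busiest <- daily_means %>%
  filter(N == max(N))

print(busiest)
```

With the sample data this should select Thursday, with a mean of about 3.33 unique URLs per day. Recent dplyr versions may print a message about the grouping after the first summarise; it does not affect the result.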