gae*_*cia 7 r frequency dataframe dplyr janitor
I have a data frame containing sample classifications:
   Seq_ID   Family Father   Mother   Sex    Role    Type
   <chr>     <dbl> <chr>    <chr>    <chr>  <chr>   <chr>
 1 SSC02219 11000. 0        0        Male   Father  Parent
 2 SSC02217 11000. 0        0        Female Mother  Parent
 3 SSC02254 11000. SSC02219 SSC02217 Male   Proband Child
 4 SSC02220 11000. SSC02219 SSC02217 Female Sibling Child
 5 SSC02184 11001. 0        0        Male   Father  Parent
 6 SSC02181 11001. 0        0        Female Mother  Parent
 7 SSC02178 11001. SSC02184 SSC02181 Male   Proband Child
 8 SSC03092 11002. 0        0        Male   Father  Parent
 9 SSC03078 11002. 0        0        Female Mother  Parent
10 SSC03070 11002. SSC03092 SSC03078 Female Proband Child
Currently, to get from a to b, I have to do this:
library(tidyverse)
library(janitor)
sample.df %>%
  tabyl(Role, Sex) %>%
  adorn_totals(where = c("row", "col")) %>%
  as.tibble() %>%
  select(1, 4, 3, 2) %>%
  # Part 2
  mutate(type = c("parent", "parent", "child", "child", " ")) %>%
  inner_join(., group_by(., type) %>%
               summarise(total = sum(Total))) %>%
  select(5, 6, 1, 2, 3, 4)
This feels like a fairly crude workaround. Is there a more direct way to do the second part in dplyr?
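Here is an option. as.tibble is not necessary. mutate with case_when is easier to manage when you have many classes to assign to "parent" or "child". inner_join is not necessary either, because we can use group_by and mutate to calculate total. Finally, I like to write out the column names when using the select function, because that makes the code easier to read later. You can certainly use column indices instead, as long as you are confident they will not change no matter what new analysis you add to the pipe.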
library(tidyverse)
library(janitor)
sample.df %>%
  tabyl(Role, Sex) %>%
  adorn_totals(where = c("row", "col")) %>%
  select(Role, Total, Male, Female) %>%
  # Part 2
  mutate(type = case_when(
    Role %in% c("Mother", "Father")   ~ "parent",
    Role %in% c("Proband", "Sibling") ~ "child",
    TRUE                              ~ " "
  )) %>%
  group_by(type) %>%
  mutate(total = sum(Total)) %>%
  ungroup() %>%
  select(type, total, Role, Total, Male, Female)
# # A tibble: 5 x 6
#   type   total Role    Total  Male Female
#   <chr>  <dbl> <chr>   <dbl> <dbl>  <dbl>
# 1 parent    6. Father     3.    3.     0.
# 2 parent    6. Mother     3.    0.     3.
# 3 child     4. Proband    3.    2.     1.
# 4 child     4. Sibling    1.    0.     1.
# 5 " "      10. Total     10.    5.     5.
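To make the point about inner_join concrete, here is a minimal sketch that uses only dplyr and compares the two routes side by side. The `counts` tibble is a hypothetical stand-in whose values are copied from the Role and Total columns of the output above (without the Total row), and the names `labelled`, `via_join` and `via_mutate` are illustrative only.

```r
library(dplyr)

# Hypothetical stand-in for the tabyl result: per-role totals only
counts <- tibble(
  Role  = c("Father", "Mother", "Proband", "Sibling"),
  Total = c(3, 3, 3, 1)
)

# Label each role as parent or child, as in the answer
labelled <- counts %>%
  mutate(type = case_when(
    Role %in% c("Mother", "Father")   ~ "parent",
    Role %in% c("Proband", "Sibling") ~ "child",
    TRUE                              ~ " "
  ))

# Route 1: summarise() collapses to one row per type,
# so the sums have to be joined back onto the original rows
via_join <- labelled %>%
  inner_join(
    labelled %>% group_by(type) %>% summarise(total = sum(Total)),
    by = "type"
  )

# Route 2: a grouped mutate() keeps every row and attaches
# the group sum directly, so no join is needed
via_mutate <- labelled %>%
  group_by(type) %>%
  mutate(total = sum(Total)) %>%
  ungroup()
```

Both results should carry the same `total` column. The grouped mutate just gets there in one step, because it does not collapse the rows the way summarise does.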
Data
library(tidyverse)
library(janitor)
sample.df <- read.table(text = "Seq_ID Family Father Mother Sex Role Type
1 SSC02219 11000 0 0 Male Father Parent
2 SSC02217 11000 0 0 Female Mother Parent
3 SSC02254 11000 SSC02219 SSC02217 Male Proband Child
4 SSC02220 11000 SSC02219 SSC02217 Female Sibling Child
5 SSC02184 11001 0 0 Male Father Parent
6 SSC02181 11001 0 0 Female Mother Parent
7 SSC02178 11001 SSC02184 SSC02181 Male Proband Child
8 SSC03092 11002 0 0 Male Father Parent
9 SSC03078 11002 0 0 Female Mother Parent
10 SSC03070 11002 SSC03092 SSC03078 Female Proband Child ",
                        header = TRUE, stringsAsFactors = FALSE)
sample.df <- as_tibble(sample.df)