I have a data frame in R that looks like this:
Genes snps X0 X1 X2 X3
2 WASH7P 1_14677 0 2 2 2
3 WASH7P 1_14684 0 1 2 0
4 WASH7P 1_14685 0 0 0 0
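Is it possible to do a conditional replacement, so that if the frequency of the integer 2 across columns X0-X3 in a row is > 0.5, every 0 in that row is replaced with 2 and every 2 with 0? The new data frame would then be: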
Genes snps X0 X1 X2 X3
2 WASH7P 1_14677 2 0 0 0
3 WASH7P 1_14684 0 1 2 0
4 WASH7P 1_14685 0 0 0 0
Thanks in advance!
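Using R, we can create an index ('i1') of the columns whose names start with 'X'. Then we get a logical row index ('i2') from the condition that the rowMeans of the 'X' values equal to 2 is greater than 0.5, that is, more than half of the 'X' values in that row are 2. We subset 'df1' with the row/column index, loop over the columns with lapply(...), and use recode from library(car) to replace '2' with '0' and '0' with '2'. The output is assigned back to the same row/column subset of 'df1'.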
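library(car)
# positions of the columns whose names start with 'X'
i1 <- grep('^X', names(df1))
# rows in which more than half of the 'X' values equal 2
i2 <- rowMeans(df1[i1] == 2) > 0.5
# swap 0 and 2 in those rows only; other values (e.g. 1) are left unchanged
df1[i1][i2,] <- lapply(df1[i1][i2,], recode, '2=0;0=2')
df1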
# Genes snps X0 X1 X2 X3
#2 WASH7P 1_14677 2 0 0 0
#3 WASH7P 1_14684 0 1 2 0
#4 WASH7P 1_14685 0 0 0 0
# data used above
df1 <- structure(list(Genes = c("WASH7P", "WASH7P", "WASH7P"),
    snps = c("1_14677", "1_14684", "1_14685"),
    X0 = c(0L, 0L, 0L), X1 = c(2L, 1L, 0L),
    X2 = c(2L, 2L, 0L), X3 = c(2L, 0L, 0L)),
    .Names = c("Genes", "snps", "X0", "X1", "X2", "X3"),
    class = "data.frame", row.names = c("2", "3", "4"))
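For comparison, here is a minimal base-R sketch of the same steps that does not need the car package. It is not the approach from the answer above: it assumes the 'X' columns only ever contain the codes 0, 1 and 2, so that the swap can be written as 2 - x. The names x_cols, m and flip are made up for this illustration, and the data frame is rebuilt with data.frame() so the snippet runs on its own.

```r
# Sample data, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  Genes = c("WASH7P", "WASH7P", "WASH7P"),
  snps  = c("1_14677", "1_14684", "1_14685"),
  X0 = c(0L, 0L, 0L),
  X1 = c(2L, 1L, 0L),
  X2 = c(2L, 2L, 0L),
  X3 = c(2L, 0L, 0L),
  row.names = c("2", "3", "4"),
  stringsAsFactors = FALSE
)

x_cols <- grep("^X", names(df1))   # positions of X0..X3
m <- as.matrix(df1[x_cols])        # integer matrix of the 'X' values
flip <- rowMeans(m == 2) > 0.5     # TRUE for rows where more than half the values are 2

# Assumption: values are only 0, 1 or 2, so 2 - x maps 0 -> 2, 2 -> 0, 1 -> 1
m[flip, ] <- 2L - m[flip, ]

df1[x_cols] <- m                   # write the modified columns back
df1
```

With the sample data only row "2" meets the condition (three of its four 'X' values are 2), so it becomes 2 0 0 0 and the other rows are unchanged. If the columns could hold other codes, the arithmetic shortcut would no longer be safe and an explicit recode as in the answer is the more general choice.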