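I am trying to create a series of high-quality ggboxplots, like the one shown below:
For the example above, the post-hoc comparison statistics were obtained in the way you can find on this linked page, and I ran the following code: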
library(tidyverse)  # dplyr, tidyr, forcats
library(rstatix)    # pairwise_t_test()
# Compute the post-hocs
postHocs <- df %>%
  tidyr::pivot_longer(., -c(A, C, D), names_to = "s") %>%
  mutate(s = fct_relevel(s,
                         c("E", "F", "G",
                           "H", "I", "J",
                           "K", "L", "M",
                           "N", "O", "P"))) %>%
  arrange(s) %>%
  group_by(s) %>%
  pairwise_t_test(
    value ~ D, paired = TRUE,
    p.adjust.method = "bonferroni"
  ) %>%
  #dplyr::select(., -'s')%>%
  print()
At the same time I obtained the ANOVA statistics:
res.aov <- df %>%
  tidyr::pivot_longer(., -c(A, C, D), names_to = "s") %>%
  mutate(s = fct_relevel(s, c("E", "F", "G",
                              "H", "I", "J",
                              "K", "L", "M",
                              "N", "O", "P"))) %>%
  arrange(s) …
I have the output of a flextable function, obtained with the function as_grouped_data().
df = structure(list(variable = c("something", NA, NA, NA), var = c(NA, "(Intercept)", "mutate1", "variable"), estimate = c(NA, 3.64770410229416, -0.230158472032055, -0.000692974348090823), std.error = c(NA, 0.88, 0.0831, 0.9315), statistic = c(NA, 0.1933, -0.5458, -0.613), df = c(NA, 67.03, 53.27, 58.285), p.value = c(NA, "<0.001", "0.80", "0.87")), row.names = c(NA, 4L), class = c("grouped_data", "data.frame"))
This gives something like:
variable var estimate std.error statistic df p.value
1 something <NA> NA NA NA NA <NA>
2 <NA> (Intercept) 3.6477041023 0.8800 …
I am trying to figure out how fast or slow each of the following operations is at computing the cumulative sum from 1 to 100.
install.packages('microbenchmark')
library(microbenchmark)
# Method 1: for loop
cs_for = function(x) {
  for (i in x) {
    if (i == 1) {
      xc = x[i]
    } else {
      xc = c(xc, sum(x[1:i]))
    }
  }
  xc
}
cs_for(1:100)
# Method 2: with apply (3 lines)
cs_apply = function(x) {
  sapply(x, function(x) sum(1:x))
}
cs_apply(100)
# Method 3:
cumsum(1:100)
microbenchmark(cs_for(1:100), cs_apply(100), cumsum(1:100))
The output I get is as follows:
Unit: nanoseconds
expr min lq mean median uq max neval cld
cs_for(1:100) 97702 100551 106129.05 102500.5 105151 …
I have this character vector:
D = c("0" , "11", "12", "13", "14", "15", "16", "21", "22", "23", "24", "25", "26", "31", "32", "33", "34", "35", "36", "41", "42", "43", "44","45", "46","51","52", "53", "54", "55", "56", "61", "62", "63", "64", "65", "66")
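If I want to remove the elements below 30 without converting the strings to numeric, how can I do that?
If I want to reorder the strings by their units digit (meaning all strings ending in 1 come before those ending in 2, and so on, e.g. 31, 41, 51, 61, 32, 42, etc.), what would the operation be?
Thanks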
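The question is about R, but the logic does not depend on the language, so here is a rough sketch in Python. It assumes that every element other than "0" is a two-character string, which is what makes plain string comparison against "30" safe; with mixed widths it would not be. The names `kept` and `ordered` are only illustrative.

```python
# Rebuild the vector from the question: "0", then "11".."16", "21".."26", ..., "61".."66"
D = ["0"] + [f"{tens}{units}" for tens in range(1, 7) for units in range(1, 7)]

# Drop everything below 30 using string comparison only.
# This relies on the assumption that the remaining strings all have two characters.
kept = [s for s in D if s >= "30"]

# Reorder by the last character (units digit) first, then by the leading part.
ordered = sorted(kept, key=lambda s: (s[-1], s[:-1]))

print(ordered[:6])
```

In R, the same idea would compare and sort character values directly instead of calling as.numeric().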
I did a partial correlation analysis with the ggm package:
library(dplyr)  # for %>% and mutate()
list = list(mtcars, mtcars)
list = lapply(list, function(x) x %>%
  mutate(gear = as.factor(gear)))
library(ggm)
lapply(list, function(x) {
  sapply(split(x, x$gear), function(x) {
    pcor(u = c('mpg', 'disp', 'hp', 'vs'), S = var(x))
  })
})
and with a package's pcor function:
pcorr1 = list %>%
  map(function(x) split(x[c('mpg', 'disp', 'hp', 'vs')], x$gear))
coeff = c("pearson", "spearman")
res = lapply(1:2, function(x) lapply(seq(coeff), function(x) {
  lapply(pcorr1[[x]], function(y) pcor(y, method = coeff[[x]]))}))
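Can anyone recommend a way to plot these correlations with ggplot2?
Thanks
UPDATE: Just to clarify, I would like to know whether it is possible to use the correlation coefficient as y and all the levels of the grouping variable as x (it should be a kind of bar chart).
I want to generate an infinite series built from 0 and 1, specifically in the following order: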
0, 1, 0, -1, 0, 1, 0, -1, ...
I wrote the following code, but it does not return what I expect:
import itertools

# for in loop
for i in itertools.cycle(range(0,2)):
    if i == 0:
        i += 1
    if i == 1:
        i -= 1
    if i == 0:
        i -= 1
    print(i, end = " ")
It just returns a series of -1. I can't figure out where the error is. Can anyone offer any suggestions?
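One likely explanation, assuming the three `if` statements run one after another on the same `i`: each test sees the value left by the previous one, so a 0 becomes 1, then 0 again, then -1, and a 1 becomes 0 and then -1. Every pass therefore prints -1. A minimal sketch of one alternative is to cycle over the four-element pattern itself rather than modifying the loop variable. The helper name `zero_one_cycle` is made up for illustration, and `islice` is only there so the example terminates.

```python
import itertools

def zero_one_cycle():
    """Return an endless iterator over the pattern 0, 1, 0, -1."""
    return itertools.cycle([0, 1, 0, -1])

# Take only the first eight values so the example stops.
first_eight = list(itertools.islice(zero_one_cycle(), 8))
print(*first_eight, sep=", ")
```

Iterating over `zero_one_cycle()` directly, without `islice`, produces the sequence without end.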