Posts by Dal*_*i71

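ggplot2: facet_wrap strip color based on a variable in the data set

Is there a way to fill the facet strips created by facet_wrap based on a variable supplied with the data frame?

Example data: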

MYdata <- data.frame(fruit = rep(c("apple", "orange", "plum", "banana", "pear", "grape")), farm = rep(c(0,1,3,6,9,12), each=6), weight = rnorm(36, 10000, 2500), size=rep(c("small", "large")))

Example plot:

p1 = ggplot(data = MYdata, aes(x = farm, y = weight)) + geom_jitter(position = position_jitter(width = 0.3), aes(color = factor(farm)), size = 2.5, alpha = 1) + facet_wrap(~fruit)

I know how to change the background color of the strips (e.g. to orange):

p1 + theme(strip.background = element_rect(fill="orange"))

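[Figure: facet_wrap plot with orange strip background]

Is there a way to pass the values of the variable size in MYdata to the fill argument of element_rect?

Basically, instead of one color for all strips, I would like the strip background to be green for small fruits (apple, plum, pear) and red for large fruits (orange, banana, grape).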

r ggplot2 facet-wrap

Score: 52 · Answers: 4 · Views: 30k

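Add color to boxplot - "Continuous value supplied to discrete scale" error

There is probably a very simple solution to my problem, but I could not find a satisfying answer online.

Using the following command I was able to create the following boxplot and overlay it with the individual data points: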

ggplot(data = MYdata, aes(x = Age, y = Richness)) + 
  geom_boxplot(aes(group=Age)) + 
  geom_point(aes(color = Age))

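There are a few things I would like to add or change:

1. Change the line color and/or fill of each boxplot (depending on "Age") using 6 different colors from left to right: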

c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00")

I tried:

ggplot(data = MYdata, aes(Age, Richness)) + 
  geom_boxplot(aes(group=Age)) + 
  scale_colour_manual(values = c("#E69F00", "#56B4E9", "#009E73", 
                                 "#F0E442", "#0072B2", "#D55E00")) 

but it results in a "Continuous value supplied to discrete scale" error.

2. Change the color of each data point (depending on "Age") using 6 different colors from left to right:

c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00")

I tried:

ggplot(data = MYdata, aes(Age, Richness)) + 
  geom_boxplot(aes(group=Age)) + 
  geom_point(aes(color = Age)) + 
  scale_colour_manual(values = c("#E69F00", "#56B4E9", "#009E73", 
                                 "#F0E442", "#0072B2", "#D55E00")) …
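
A likely cause of the error in the second attempt, assuming Age is stored as a numeric column: scale_colour_manual() is a discrete scale, while a numeric Age mapped to color is continuous. The sketch below converts Age to a factor before mapping it. The data frame is hypothetical (the original MYdata for this question is not shown), and the six Age values are an assumption chosen to match the six colors.

```r
library(ggplot2)

# Hypothetical stand-in for the question's MYdata (the real data is not shown)
set.seed(1)
MYdata <- data.frame(
  Age      = rep(c(0, 1, 3, 6, 9, 12), each = 10),  # assumed: six distinct ages
  Richness = rnorm(60, mean = 50, sd = 10)
)

pal <- c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00")

# factor(Age) makes the variable discrete, so the manual scale can be applied
p <- ggplot(MYdata, aes(x = factor(Age), y = Richness)) +
  geom_boxplot(aes(colour = factor(Age))) +
  geom_point(aes(colour = factor(Age))) +
  scale_colour_manual(values = pal) +
  labs(x = "Age", colour = "Age")
```

To color the box interiors rather than the outlines, the same idea should work with aes(fill = factor(Age)) and scale_fill_manual(values = pal).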

r colors ggplot2 boxplot

Score: 38 · Answers: 1 · Views: 70k

Filter a data frame based on values in a second data frame

I have 2 data frames:

at1 = data.frame(ID = c("A", "B", "C", "D", "E"), Sample1 = rnorm(5, 50000, 2500),
      Sample2 = rnorm(5, 50000, 2500), Sample3 = rnorm(5, 50000, 2500),
      row.names = "ID")

  Sample1  Sample2  Sample3
A 52626.55 51924.51 50919.90
B 51430.51 49100.38 51005.92
C 50038.27 52254.73 50014.78
D 48644.46 53926.53 51590.05
E 46462.01 45097.48 50963.39

bt1 = data.frame(ID = c("A", "B", "C", "D", "E"), Sample1 = c(0,1,1,1,1),
      Sample2 = c(0,0,0,1,0), Sample3 = c(1,0,1,1,0), 
      row.names = "ID")

   Sample1 Sample2 Sample3
A       0       0       1
B …

r subset dataframe

Score: 9 · Answers: 3 · Views: 1622

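Plotting the ordiellipse function from the vegan package onto an NMDS plot created in ggplot2

Instead of the normal plot function I am using ggplot2 to create an NMDS plot. I would like to display groups in the NMDS plot using the function ordiellipse() from the vegan package.

Example data: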

library(vegan)
library(ggplot2)
data(dune)
# calculate distance for NMDS
sol <- metaMDS(dune)
# Create meta data for grouping
MyMeta = data.frame(
  sites = c(2,13,4,16,6,1,8,5,17,15,10,11,9,18,3,20,14,19,12,7),
  amt = c("hi", "hi", "hi", "md", "lo", "hi", "hi", "lo", "md", "md", "lo", 
          "lo", "hi", "lo", "hi", "md", "md", "lo", "hi", "lo"),
  row.names = "sites")
# plot NMDS using basic plot function and color points by "amt" from MyMeta
# (as.factor() so that col receives integer codes even when amt is a character column)
plot(sol$points, col = as.factor(MyMeta$amt))
# draw dispersion ellipses around data …

r ggplot2 vegan

Score: 8 · Answers: 1 · Views: 20k

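Regression on subsets of a data set

I would like to do the following and need some help:

Calculate the slope and intercept of "Height" against "Age" [lm(Height ~ Age)] separately for

(A) each individual

(B) each gender

and create a table containing the results (slope and intercept). Can I use "apply" for this?

In the next step I would like to run a statistical test to determine whether the slopes and intercepts differ significantly between genders. I know how to run the test in R, but maybe there is a way to combine the slope/intercept calculation with the t-test.

Example data: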

example = data.frame(Age = c(1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12,
                             1, 3, 6, 9, 12), 
                Individual = c("Jack", "Jack", "Jack", "Jack", "Jack",
                               "Jill", "Jill", "Jill", "Jill", "Jill",
                               "Tony", "Tony", "Tony", "Tony", "Tony",
                               "Jen", "Jen", "Jen", "Jen","Jen"),
                    Gender = c("M", "M", "M", "M", "M",
                               "F", "F", "F", "F", "F",
                               "M", "M", "M", "M", "M",
                               "F", "F", "F", "F", "F"),
                    Height = c(38, 62, 92, 119, …
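
One possible way to approach the per-group fits described above, sketched in base R with split() and lapply(). The Height values are truncated in this excerpt, so the sketch simulates them; those numbers, and the helper name fit_by, are hypothetical and only there to make the example run.

```r
# Same layout as the question's data; Height is SIMULATED because the
# original values are truncated in this excerpt
set.seed(42)
example <- data.frame(
  Age        = rep(c(1, 3, 6, 9, 12), times = 4),
  Individual = rep(c("Jack", "Jill", "Tony", "Jen"), each = 5),
  Gender     = rep(c("M", "F", "M", "F"), each = 5)
)
example$Height <- 30 + 10 * example$Age + rnorm(nrow(example), sd = 3)

# Hypothetical helper: fit lm(Height ~ Age) within each level of `group`
# and return one row per level with its intercept and slope
fit_by <- function(data, group) {
  fits <- lapply(split(data, data[[group]]),
                 function(d) coef(lm(Height ~ Age, data = d)))
  out <- do.call(rbind, fits)
  data.frame(group     = rownames(out),
             intercept = out[, "(Intercept)"],
             slope     = out[, "Age"],
             row.names = NULL)
}

by_individual <- fit_by(example, "Individual")  # (A) one row per individual
by_gender     <- fit_by(example, "Gender")      # (B) one row per gender

# One model with an Age x Gender interaction: the "GenderM" row tests the
# difference in intercepts, the "Age:GenderM" row the difference in slopes
interaction_fit <- lm(Height ~ Age * Gender, data = example)
gender_tests <- summary(interaction_fit)$coefficients
```

The interaction model treats all rows as independent; with repeated measurements per individual, a mixed model may be more appropriate, which is beyond this sketch.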

r linear-regression

Score: 2 · Answers: 1 · Views: 1629

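How to count the occurrences of unknown strings in a column?

I have another question. Thanks to everyone for the help and patience with an R newbie!

How can I count the number of times a string occurs in a column? Example: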

MYdata <- data.frame(fruits = c("apples", "pears", "unknown_f", "unknown_f", "unknown_f"), 
                     veggies = c("beans", "carrots", "carrots", "unknown_v", "unknown_v"), 
                     sales = rnorm(5, 10000, 2500))

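The problem is that my real data set contains several thousand rows and several hundred unknown fruits and unknown veggies. I played around with table() and levels(), but without much success. I guess it is more complicated than that. It would be great to have an output table that lists the name of each unique fruit/veggie and the number of times it occurs in the column. Any hints in the right direction would be greatly appreciated.

Thanks,

Markus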
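
For reference, a minimal sketch of how table() can produce such an output table, using the example data from this question. The result names fruit_counts and veggie_counts and the column names are illustrative choices, not part of the original post.

```r
set.seed(1)
MYdata <- data.frame(
  fruits  = c("apples", "pears", "unknown_f", "unknown_f", "unknown_f"),
  veggies = c("beans", "carrots", "carrots", "unknown_v", "unknown_v"),
  sales   = rnorm(5, 10000, 2500)
)

# table() tallies every distinct value; as.data.frame() turns the tally
# into a two-column table of value and count
fruit_counts <- as.data.frame(table(MYdata$fruits), stringsAsFactors = FALSE)
names(fruit_counts) <- c("fruit", "n")

veggie_counts <- as.data.frame(table(MYdata$veggies), stringsAsFactors = FALSE)
names(veggie_counts) <- c("veggie", "n")

# Most frequent values first
fruit_counts <- fruit_counts[order(-fruit_counts$n), ]
```

Because table() discovers the distinct values itself, the names of the unknown fruits and veggies do not need to be known in advance.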

string r count

Score: 1 · Answers: 1 · Views: 7811