我想输出一个类似于本页(右侧)所示的图表,使用R和任何可以使它看起来很好的包:
http://processtrends.com/pg_charts_monthly_cycle_chart.htm
有谁接受挑战?:)
谢谢!
在上一个问题中,我询问基本R中是否存在方便的包装器以将数字格式化为百分比.
这引起了三个回应:
sprintf,可以高度灵活地格式化数字.不过,在我看来,这个sprintf函数对于R初学者来说只是有点过于模糊(除非它们来自C背景).也许更好的解决方案是修改format或prettyNum添加前缀和后缀的选项,这样您就可以轻松创建百分比,货币,度数等.
题:
您将如何设计一个函数,类或一组函数来优雅地处理格式数字作为百分比,货币,度数等?
我在lubridate包中偶然发现了一个奇怪的行为:dmy(NA)拖出一个错误,而不仅仅是返回一个NA.当我想转换一个包含一些元素为NA的列和一些通常转换没有问题的日期字符串时,这会导致我出现问题.
这是最小的例子:
library(lubridate)
df <- data.frame(ID=letters[1:5],
Datum=c("01.01.1990", NA, "11.01.1990", NA, "01.02.1990"))
df_copy <- df
#Question 1: Why does dmy(NA) not return NA, but throws an error?
df$Datum <- dmy(df$Datum)
Error in function (..., sep = " ", collapse = NULL) : invalid separator
df <- df_copy
#Question 2: What's a work around?
#1. Idea: Only convert those elements that are not NAs
#RHS works, but assigning that to the LHS doesn't work (Most likely problem::
#column …Run Code Online (Sandbox Code Playgroud) 我期待计算方程的不定积分.
我有来自加速计的数据通过一个可视化的C程序馈入R,并从那里它是足够简单拿出一个方程来表示加速曲线.这一切都很好,但我也需要计算冲击速度.从好醇"天高中我的理解,我的加速曲线的不定积分将产生的速度方程.
我知道这是很容易与执行数值积分integrate()功能,还有什么是可以与之相比的不定积分?
有人可以帮助我完成解决这个问题的一般方法吗?
我正在尝试使用我自己的一些火车运动数据来复制像这样的(全尺寸)火车图.

该图表看起来像......
我的数据看起来像这样......

谢谢您的帮助 :)
样本可以像这样再现......
dat <- structure(list(id = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
5L, 5L, 5L, 5L, 5L, 5L, …Run Code Online (Sandbox Code Playgroud) 我一直在尝试使用ggstructure绘制热图.如果我能得到一些指示,我不介意使用基本图形.我希望热图看起来像这样:
http://www.vistadatavision.com/uploads/images/reports/intensity_plot_1.PNG
我没有设法进一步输入ggstructure(df3),然后在错误中摸不着头脑 - 这似乎是因为它不喜欢日期/时间数据?
这是两周数据样本的"输入",间隔为10分钟.编辑:我最后只有4天的空间.我希望它就够了.
任何指针都非常赞赏.
structure(list(date = c("2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01",
"2010-01-01", "2010-01-01", "2010-01-01", …Run Code Online (Sandbox Code Playgroud) 我一直在经历提供这些示例此页面,但由于某些原因无法找到这样做的正确方法.
我有一些这样的数据:
Group Member Percentage
[1,] "1" "A" "60"
[2,] "1" "A" "20"
[3,] "1" "A" "20"
[4,] "1" "B" "80"
[5,] "1" "B" "5"
[6,] "1" "B" "5"
[7,] "1" "B" "5"
[8,] "2" "C" "50"
[9,] "2" "C" "50"
[10,] "2" "D" "25"
[11,] "2" "D" "25"
[12,] "2" "D" "25"
[13,] "2" "D" "20"
[14,] "2" "D" "5"
Run Code Online (Sandbox Code Playgroud)
并可以使用以下命令创建:
a = c(1,1,1,1,1,1,1,2,2,2,2,2,2,2)
b = c("A","A","A","B","B","B","B","C","C","D","D","D","D","D")
c = c(60,20,20,80,5,5,5,50,50,25,25,25,20,5)
dat = data.frame(Group=a, Member=b, Percentage=c)
ggplot(dat, aes(x=Member, …Run Code Online (Sandbox Code Playgroud) 另一个ggplot传奇问题!
我有一个表格的数据集
test <- data.frame(
cond = factor(rep(c("A", "B"), each=200)),
value = c(rnorm(200), rnorm(200, mean=0.8))
)
Run Code Online (Sandbox Code Playgroud)
所以两组和一些值我想绘制密度.我还想在剧情中添加一行表示每组的平均值,所以我:
test.cdf <- ddply(test, .(cond), summarise, value.mean=mean(value))
Run Code Online (Sandbox Code Playgroud)
然后在ggplot调用:
ggplot(test, aes(value, fill=cond)) +
geom_density(alpha=0.5) +
labs(x='Energy', y='Density', fill='Group') +
opts(
panel.background=theme_blank(),
panel.grid.major=theme_blank(),
panel.grid.minor=theme_blank(),
panel.border=theme_blank(),
axis.line=theme_segment()
) +
geom_vline(data=test.cdf, aes(xintercept=value.mean, colour=cond),
linetype='dashed', size=1)
Run Code Online (Sandbox Code Playgroud)
如果运行上面的代码,则会得到一个表示每个组的图例,但也会显示一个表示平均指示符vline的图例.我的问题是如何摆脱传说geom_vline()?
我比较了内置R函数的性能rnorm,qnorm以及pnorm等效的Matlab函数.
似乎rnorm和pnorm函数在R中比在Matlab中慢3-6倍,而qnorm函数是ca. 40%的R.更快我试图RCPP包通过使用这导致减少在运行期间由〜30%,其仍然比显著Matlab的较慢相应的C库来加快R的功能rnorm和pnorm.
是否有可用的包提供了一种更快的方法来模拟R中的正态分布随机变量(除了使用标准rnorm函数)?
我正在创建两个图ggplot2,然后使用grid.arrange它们将它们合并在一起.我应该说这两个图也facet_grid用于视觉调整.

我的问题是,底部的情节,实际上是一个数据表,最终在左侧和右侧被"切断",因为小平面的起始位置和结束位置.有没有办法让我调整一下?我想调整一下,所以积分不会被切断.
以下是重现它的数据:
df <- structure(list(SurveyID = c(16L, 16L, 16L, 16L, 16L, 16L, 16L,
16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L,
16L, 16L, 16L, 16L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L,
26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L, 26L,
26L, 26L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L,
47L, 47L, 47L, 47L, 47L, 47L, 47L, 47L, …Run Code Online (Sandbox Code Playgroud)