I have hourly weather data in the following format:
Date,DBT
01/01/2000 01:00,30
01/01/2000 02:00,31
01/01/2000 03:00,33
...
...
12/31/2000 23:00,25
What I need is the daily aggregate (maximum, minimum, and average), like this:
Date,MaxDBT,MinDBT,AveDBT
01/01/2000,36,23,28
01/02/2000,34,22,29
01/03/2000,32,25,30
...
...
12/31/2000,35,9,20
How can I do this in R?
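One possible approach, as a minimal base-R sketch: parse the timestamp into a calendar day, then summarise DBT per day with `tapply`. The five-row sample and the names `DF`, `Day`, and `daily` are made up for illustration; only the column layout comes from the question.

```r
# Small made-up sample in the same layout as the question's data.
Lines <- "Date,DBT
01/01/2000 01:00,30
01/01/2000 02:00,31
01/01/2000 03:00,33
01/02/2000 01:00,25
01/02/2000 02:00,27"
DF <- read.csv(text = Lines, stringsAsFactors = FALSE)

# Keep only the calendar day of each hourly timestamp.
DF$Day <- as.Date(DF$Date, format = "%m/%d/%Y %H:%M")

# One row per day; tapply groups by day in sorted order,
# which matches sort(unique(DF$Day)).
daily <- data.frame(
  Date   = sort(unique(DF$Day)),
  MaxDBT = as.vector(tapply(DF$DBT, DF$Day, max)),
  MinDBT = as.vector(tapply(DF$DBT, DF$Day, min)),
  AveDBT = as.vector(tapply(DF$DBT, DF$Day, mean))
)
print(daily)
```

`format(daily$Date, "%m/%d/%Y")` would turn the dates back into the layout shown in the desired output.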
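Following up on my earlier question about aggregating hourly data to daily data, I would like to continue with (a) monthly aggregation and (b) merging the monthly aggregates back into the original data frame.
My original data frame looks like this: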
Lines <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"
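The daily aggregation was already answered in my previous question, and from there I was able to work out how to produce the monthly aggregates, as shown below: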
Lines <- "Date,Month,OutdoorAVE
01/01/2000,Jan,31.33
02/01/2000,Feb,31.67
12/01/2000,Dec,31.33"
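where OutdoorAVE is the monthly average of the daily minimum and maximum outdoor temperatures. What I ultimately want is something like this: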
Lines <- "Date,Outdoor,Indoor,Month,OutdoorAVE
01/01/2000 01:00,30,25,Jan,31.33
01/01/2000 02:00,31,26,Jan,31.33
01/01/2000 03:00,33,24,Jan,31.33
02/01/2000 01:00,29,25,Feb,31.67
02/01/2000 02:00,27,26,Feb,31.67
02/01/2000 03:00,39,24,Feb,31.67
12/01/2000 02:00,27,26,Dec,31.33
12/01/2000 03:00,39,24,Dec,31.33
12/31/2000 23:00,28,25,Dec,31.33"
I don't know how to do this. Any help is greatly appreciated.
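A minimal sketch of one way to attach a monthly value to every hourly row is `ave()`, which returns a group statistic repeated for each row of its group, so no separate merge step is needed. It assumes OutdoorAVE is the plain monthly mean of Outdoor, which is what reproduces the 31.33 and 31.67 values in the example; `stamp` is a helper name introduced here.

```r
Lines <- "Date,Outdoor,Indoor
01/01/2000 01:00,30,25
01/01/2000 02:00,31,26
01/01/2000 03:00,33,24
02/01/2000 01:00,29,25
02/01/2000 02:00,27,26
02/01/2000 03:00,39,24
12/01/2000 02:00,27,26
12/01/2000 03:00,39,24
12/31/2000 23:00,28,25"
DF <- read.csv(text = Lines, stringsAsFactors = FALSE)

# Parse the timestamp, then derive a locale-independent month label.
stamp <- as.POSIXct(DF$Date, format = "%m/%d/%Y %H:%M", tz = "GMT")
DF$Month <- month.abb[as.integer(format(stamp, "%m"))]

# ave() computes the mean per month and repeats it on every row of that month.
DF$OutdoorAVE <- round(ave(DF$Outdoor, DF$Month), 2)
print(DF)
```

For data spanning several years, the grouping would need to include the year as well, for example `ave(DF$Outdoor, format(stamp, "%Y-%m"))`.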
I want to join each even-numbered line with the line above it. Something like:
Line one,csv,csv,csv
Line two,csv,csv
Line three,csv,csv,csv,csv
Line four,csv
The result should look like this:
Line one,csv,csv,csv,Line two,csv,csv
Line three,csv,csv,csv,csv,Line four,csv
Any ideas on how to do this in Perl or sed/awk?
I have the following data:
subject = c("S01","S02","S03","S04","S05","S06","S07","S08","S09","S10")
post = c(100,80,75,120,85,90,95,90,110,100)
pre = c(45,60,80,75,45,60,55,50,35,40)
data1 = as.data.frame(cbind(subject, post, pre))
I then sorted the data by the post column:
data1 = data1[order(data1$post),]
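What I ultimately want is a scatter plot comparing the post and pre columns, each in a different color. The x-axis is simply the index of the data frame, but labeled with the subject IDs, so the axis labels follow the order of the subjects in the sorted data frame, since the data frame is sorted by the post column.
If I do this: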
plot(data1$post)
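what I get is a bar chart, not a scatter plot at all. Is this because the post column is a factor? I tried "as.numeric" on the post and pre columns, but the result is the same.
If I do this: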
plot(data1$post,data1$pre)
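I get a scatter plot, but the index runs from 1 to 20. So instead of comparing the two series over the same index 1 to 10, I get two scatters, with indices 1-10 and 11-20.
Any help pointing out my mistake would be much appreciated.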
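A likely cause is that `cbind(subject, post, pre)` builds a character matrix, so post and pre stop being numeric once converted to a data frame. A minimal sketch that avoids this by building the data frame directly and then drawing both series against the same row index (the colors, axis titles, and the name `idx` are arbitrary choices made here):

```r
subject <- c("S01","S02","S03","S04","S05","S06","S07","S08","S09","S10")
post <- c(100,80,75,120,85,90,95,90,110,100)
pre  <- c(45,60,80,75,45,60,55,50,35,40)

# data.frame() keeps post and pre numeric; cbind() would coerce them to text.
data1 <- data.frame(subject, post, pre, stringsAsFactors = FALSE)
data1 <- data1[order(data1$post), ]

# Plot both columns against the row index of the sorted data frame.
idx <- seq_len(nrow(data1))
plot(idx, data1$post, col = "red", pch = 16, xaxt = "n",
     xlab = "Subject", ylab = "Value",
     ylim = range(data1$post, data1$pre))
points(idx, data1$pre, col = "blue", pch = 16)

# Label the x-axis with the subject IDs in sorted order.
axis(1, at = idx, labels = data1$subject)
legend("topleft", legend = c("post", "pre"), col = c("red", "blue"), pch = 16)
```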
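I have a question about column names in zoo. I usually create zoo objects from a data frame, selecting columns of the data frame as the zoo columns. I found that if I specify only one column for the zoo object, zoo does not keep the column name. Does this mean it is not treated as a "column" in zoo?
Below are examples of how I usually do this with one column and with two columns.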
Lines.1 = "Index,dbt
2008-08-20 15:03:18,88.74
2008-08-20 15:08:18,88.74
2008-08-20 15:13:18,86.56
2008-08-20 15:18:18,85.82"
Lines.2 = "Index,dbt,rh
2008-08-20 15:03:18,88.74,18.25
2008-08-20 15:08:18,88.74,17.25
2008-08-20 15:13:18,86.56,18.75
2008-08-20 15:18:18,85.82,19.75"
x =read.table(text = Lines.1, header = TRUE, sep = ",")
y =read.table(text = Lines.2, header = TRUE, sep = ",")
colnames(x)
colnames(y)
library(zoo)
zx = zoo(x[,2], as.POSIXct(x$Index, tz="GMT"))
zy = zoo(y[,2:3], as.POSIXct(y$Index, tz="GMT"))
colnames(zx)
colnames(zy)
The results are as follows:
> colnames(zx)
NULL
> colnames(zy)
[1] "dbt" "rh"
Am I missing something?
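A possible explanation is that `x[,2]` drops to a plain vector before zoo ever sees it, so there is no column name left to keep. The sketch below shows the difference in base R and then, only if the zoo package is installed, passes the one-column data frame to `zoo()`. The names `v` and `m` are introduced for illustration, and the sample is shortened to two rows.

```r
Lines.1 <- "Index,dbt
2008-08-20 15:03:18,88.74
2008-08-20 15:08:18,88.74"
x <- read.table(text = Lines.1, header = TRUE, sep = ",")

v <- x[, 2]                # plain numeric vector: no dimensions, no column names
m <- x[, 2, drop = FALSE]  # one-column data frame: the name "dbt" is retained

print(colnames(v))
print(colnames(m))

# Assumption: zoo is installed; the block is skipped otherwise.
if (requireNamespace("zoo", quietly = TRUE)) {
  zx <- zoo::zoo(m, as.POSIXct(x$Index, tz = "GMT"))
  print(colnames(zx))
}
```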
I am trying to run t-tests on all columns of my data frame (two at a time) and extract only the p-values. Here is what I came up with:
for (i in c(5:525) ) {
t_test_p.value =sapply( Data[5:525], function(x) t.test(Data[,i],x, na.rm=TRUE)$p.value)
}
My questions are: 1. Is there a way to do this without a loop? 2. How do I capture the results of the t-tests?
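One way to avoid the explicit `for` loop and keep every result, as a minimal sketch, is to nest two `sapply` calls so that the p-values come back as a matrix with one row and one column per variable. The three-column `Data` below is made up and stands in for columns 5 to 525 of the real data frame.

```r
set.seed(1)
# Made-up stand-in for the real data frame.
Data <- data.frame(a = rnorm(20), b = rnorm(20, mean = 1), c = rnorm(20, mean = 2))
cols <- names(Data)

# Outer sapply picks the first column, inner sapply the second;
# the result is a square matrix of p-values named by column.
pvals <- sapply(cols, function(i) {
  sapply(cols, function(j) t.test(Data[[i]], Data[[j]])$p.value)
})
print(round(pvals, 4))
```

Here `pvals["a", "b"]` is the p-value for column a against column b. In the original loop, `t_test_p.value` is overwritten on every iteration, so only the last set of results survives.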
I have data similar to the following, and I use lattice to produce a boxplot:
mydata <- data.frame(Y = rnorm(3*1000),
INDFACT =rep(c("A", "B", "C"), each=1000),
CLUSFACT=factor(rep(c("M","F"), 1500)))
library(lattice)
bwplot(Y ~ INDFACT | CLUSFACT, data=mydata, layout=c(2,1))
My problem is that I want a different color for each of the factor levels A, B, and C. I tried this:
bwplot(Y ~ INDFACT | CLUSFACT, data=mydata, layout=c(2,1), col=c("red","blue","green"))
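But it only changes the color of the dots. What I want is to change the color of the whole thing (dots, boxes, and umbrellas).
Is there a way to do this?
Folks,
I have temperature data for the zones of a building, as follows: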
Lines <- "Date,Zone01,Zone02
01/01 01:00:00,24.5,21.3
01/01 02:00:00,24.3,21.1
01/01 03:00:00,24.1,21.1
01/01 04:00:00,24.1,20.9
01/01 05:00:00,25.,21.
01/01 06:00:00,26.,21.
01/01 07:00:00,26.6,22.3
01/01 08:00:00,28.,24.
01/01 09:00:00,28.9,26.5
01/01 10:00:00,29.4,29
01/01 11:00:00,30.,32.
01/01 12:00:00,33.,35.
01/01 13:00:00,33.4,36
01/01 14:00:00,35.8,38
01/01 15:00:00,32.3,37
01/01 16:00:00,30.,34.
01/01 17:00:00,29.,33.
01/01 18:00:00,28.,32.
01/01 19:00:00,26.3,30
01/01 20:00:00,26.,28.
01/01 21:00:00,25.9,25
01/01 22:00:00,25.8,21.3
01/01 23:00:00,25.6,21.4
01/01 24:00:00,25.5,21.5
01/02 01:00:00,25.4,21.6
01/02 02:00:00,25.3,21.8"
What I want to do is compute the 99th percentile of temperature for each zone. I would use this command:
Q=quantile(Lines$Zone01,0.99)
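However, I would have to do this manually for every column in the data set. Is there a way to have this command loop over all columns (starting from the second column)?
Thanks a lot.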
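A minimal sketch of one way to do this: read the text into a data frame, drop the first column, and let `sapply` apply `quantile` to each remaining column. The sample is shortened to four rows, and `DF` is a name introduced here for the parsed data.

```r
Lines <- "Date,Zone01,Zone02
01/01 01:00:00,24.5,21.3
01/01 02:00:00,24.3,21.1
01/01 03:00:00,24.1,21.1
01/01 04:00:00,24.1,20.9"
DF <- read.csv(text = Lines, stringsAsFactors = FALSE)

# DF[-1] is every column except Date; the result is a named numeric
# vector with one 99th percentile per zone.
Q <- sapply(DF[-1], quantile, probs = 0.99, names = FALSE)
print(Q)
```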
Folks, I have hourly temperature data like this:
Lines <- "Date,Outdoor,Indoor
01/01 01:00:00,24.5,21.3
01/01 02:00:00,24.3,21.1
01/01 03:00:00,24.1,21.1
01/01 04:00:00,24.1,20.9
01/01 05:00:00,25.,21.
01/01 06:00:00,26.,21.
01/01 07:00:00,26.6,22.3
01/01 08:00:00,28.,24.
01/01 09:00:00,28.9,26.5
01/01 10:00:00,29.4,29
01/01 11:00:00,30.,32.
01/01 12:00:00,33.,35.
01/01 13:00:00,33.4,36
01/01 14:00:00,35.8,38
01/01 15:00:00,32.3,37
01/01 16:00:00,30.,34.
01/01 17:00:00,29.,33.
01/01 18:00:00,28.,32.
01/01 19:00:00,26.3,30
01/01 20:00:00,26.,28.
01/01 21:00:00,25.9,25
01/01 22:00:00,25.8,21.3
01/01 23:00:00,25.6,21.4
01/01 24:00:00,25.5,21.5
01/02 01:00:00,25.4,21.6
01/02 02:00:00,25.3,21.8"
I need to create another column that indicates 1 if Indoor is at least 1 degree higher than Outdoor.
I tried:
DF$Time = 0
if ((Indoor-Outdoor) >= 1) DF$Time = 1
But the above does not work. Any suggestions?
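`if` expects a single TRUE or FALSE, and `Indoor` and `Outdoor` are columns of the data frame rather than standalone variables, so a vectorised comparison is probably what is needed. A minimal sketch on four rows taken from the data above, assuming the text has been read into a data frame named `DF`:

```r
Lines <- "Date,Outdoor,Indoor
01/01 01:00:00,24.5,21.3
01/01 10:00:00,29.4,29
01/01 11:00:00,30.,32.
01/01 13:00:00,33.4,36"
DF <- read.csv(text = Lines, stringsAsFactors = FALSE)

# The comparison runs row by row; as.integer() turns TRUE/FALSE into 1/0.
DF$Time <- as.integer(DF$Indoor - DF$Outdoor >= 1)
print(DF)
```

`ifelse(DF$Indoor - DF$Outdoor >= 1, 1, 0)` is an equivalent spelling if other values than 1 and 0 are wanted.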
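This is a follow-up to the question "R: t-test over all columns".
Suppose I have a huge data set, and I create many subsets from it based on certain conditions. The subsets should all have the same number of columns. I then want to run t-tests on two subsets at a time (outer loop), and for each pair of subsets, go through all the columns one column at a time (inner loop).
Below is what I came up with based on the previous answer. It stops with an error.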
C <- c("c1","c1","c1","c1","c1",
"c2","c2","c2","c2","c2",
"c3","c3","c3","c3","c3",
"c4","c4","c4","c4","c4",
"c5","c5","c5","c5","c5",
"c6","c6","c6","c6","c6",
"c7","c7","c7","c7","c7",
"c8","c8","c8","c8","c8",
"c9","c9","c9","c9","c9",
"c10","c10","c10","c10","c10")
X <- rnorm(n=50, mean = 10, sd = 5)
Y <- rnorm(n=50, mean = 15, sd = 6)
Z <- rnorm(n=50, mean = 20, sd = 5)
Data <- data.frame(C, X, Y, Z)
Data.c1 = subset(Data, C == "c1",select=X:Z)
Data.c2 = subset(Data, C == "c2",select=X:Z)
Data.c3 = subset(Data, C == "c3",select=X:Z)
Data.c4 = subset(Data, C == "c4",select=X:Z)
Data.c5 = subset(Data, C == "c5",select=X:Z) …