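I am working with a huge data table in R that contains monthly temperature measurements for multiple locations, taken from different sources.
The data set looks like this: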
library(data.table)
# Generate random data:
loc <- 1:10
dates <- seq(as.Date("2000-01-01"), as.Date("2004-12-31"), by="month")
mods <- c("A","B", "C", "D", "E")
temp <- runif(length(loc)*length(dates)*length(mods), min=0, max=30)
df <- data.table(expand.grid(Location=loc,Date=dates,Model=mods),Temperature=temp)
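So basically, for location 1, I have measurements from model A running from January 2000 to December 2004. Then I have the same measurements from model B, and so on for models C, D and E. The same pattern then repeats for location 2 through location 10.
What I need to do is, instead of keeping five different temperature measurements (one from each model), take the mean temperature across all models.
As a result, for each location and each date I would have not five but only ONE temperature measurement (which would be the multi-model mean).
I tried this: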
df2 <- df[, Mean:=mean(Temperature), by=list(Model, Location, Date)]
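This did not work as I expected. At the very least, I would expect the resulting data table to have 1/5 of the rows of the original table, since I am summarizing five measurements into one.
What am I doing wrong?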
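One possible reading of the problem, sketched here under the assumption that the goal is one row per Location and Date: `:=` adds a column by reference and keeps every row, and including Model in `by` puts each measurement in a group of its own, so each "mean" is just the original value. Aggregating in `j` without `:=` and grouping only by Location and Date collapses the five models into one row. The `set.seed()` call and the name `df.mean` are additions for illustration only.

```r
library(data.table)

# Rebuild the example data from the question (seed added for reproducibility)
set.seed(1)
loc   <- 1:10
dates <- seq(as.Date("2000-01-01"), as.Date("2004-12-31"), by = "month")
mods  <- c("A", "B", "C", "D", "E")
temp  <- runif(length(loc) * length(dates) * length(mods), min = 0, max = 30)
df    <- data.table(expand.grid(Location = loc, Date = dates, Model = mods),
                    Temperature = temp)

# Aggregate instead of assigning by reference:
# one row per Location/Date, averaged over the five models
df.mean <- df[, .(Temperature = mean(Temperature)), by = .(Location, Date)]

nrow(df)       # number of rows before aggregation
nrow(df.mean)  # number of rows after aggregation
```

With this grouping the result has one fifth of the original rows, which is the reduction the question expects.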
I have the following data:
> vec
[1] 0.0 0.5 1.0 1.4 1.9 2.4 3.1 3.6 4.1 4.6 5.0 5.5 6.0 6.5 7.0 7.4 7.9 8.4 9.1
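I need to round the values to the nearest 0.5.
Let me be more specific: 1.4 becomes 1.5 and 1.9 becomes 2.0. Likewise, 2.4 becomes 2.5 and 3.1 becomes 3.0, and so on. The vector I expect is: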
> vec
[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0
Any ideas?
Thanks a lot.
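A minimal sketch of one common approach: divide by the step size so that multiples of 0.5 become whole numbers, round, then multiply back. The name `rounded` is introduced only for this example. Note that R's `round()` follows the IEC 60559 convention for exact ties (round half to even), so a value lying exactly halfway between two steps, such as 0.25, may not round upward.

```r
vec <- c(0.0, 0.5, 1.0, 1.4, 1.9, 2.4, 3.1, 3.6, 4.1, 4.6,
         5.0, 5.5, 6.0, 6.5, 7.0, 7.4, 7.9, 8.4, 9.1)

# Scale so that 0.5 steps map to integers, round, then scale back
step    <- 0.5
rounded <- round(vec / step) * step

print(rounded)
```

Changing `step` (for instance to 0.25) rounds to a different grid with the same expression.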
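I need to process a long list of images using a loop. Running everything takes a considerable amount of time, so I would like to keep track of the progress.
Here is my loop: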
files.list <- c("LC82210802013322LGN00_B1.TIF", "LC82210802013322LGN00_B10.TIF",
"LC82210802013322LGN00_B11.TIF", "LC82210802013322LGN00_B2.TIF",
"LC82210802013322LGN00_B3.TIF", "LC82210802013322LGN00_B4.TIF",
"LC82210802013322LGN00_B5.TIF", "LC82210802013322LGN00_B6.TIF",
"LC82210802013322LGN00_B7.TIF", "LC82210802013322LGN00_B8.TIF",
"LC82210802013322LGN00_B9.TIF", "LC82210802013322LGN00_BQA.TIF",
"LC82210802013354LGN00_B1.TIF", "LC82210802013354LGN00_B10.TIF",
"LC82210802013354LGN00_B11.TIF", "LC82210802013354LGN00_B2.TIF",
"LC82210802013354LGN00_B3.TIF", "LC82210802013354LGN00_B4.TIF",
"LC82210802013354LGN00_B5.TIF", "LC82210802013354LGN00_B6.TIF",
"LC82210802013354LGN00_B7.TIF", "LC82210802013354LGN00_B8.TIF",
"LC82210802013354LGN00_B9.TIF", "LC82210802013354LGN00_BQA.TIF",
"LC82210802014021LGN00_B1.TIF", "LC82210802014021LGN00_B10.TIF",
"LC82210802014021LGN00_B11.TIF", "LC82210802014021LGN00_B2.TIF",
"LC82210802014021LGN00_B3.TIF", "LC82210802014021LGN00_B4.TIF",
"LC82210802014021LGN00_B5.TIF", "LC82210802014021LGN00_B6.TIF",
"LC82210802014021LGN00_B7.TIF", "LC82210802014021LGN00_B8.TIF",
"LC82210802014021LGN00_B9.TIF", "LC82210802014021LGN00_BQA.TIF",
"LC82210802014037LGN00_B1.TIF", "LC82210802014037LGN00_B10.TIF",
"LC82210802014037LGN00_B11.TIF", "LC82210802014037LGN00_B2.TIF",
"LC82210802014037LGN00_B3.TIF", "LC82210802014037LGN00_B4.TIF",
"LC82210802014037LGN00_B5.TIF", "LC82210802014037LGN00_B6.TIF",
"LC82210802014037LGN00_B7.TIF", "LC82210802014037LGN00_B8.TIF",
"LC82210802014037LGN00_B9.TIF", "LC82210802014037LGN00_BQA.TIF",
"LC82210802014085LGN00_B1.TIF", "LC82210802014085LGN00_B10.TIF",
"LC82210802014085LGN00_B11.TIF", "LC82210802014085LGN00_B2.TIF",
"LC82210802014085LGN00_B3.TIF", "LC82210802014085LGN00_B4.TIF",
"LC82210802014085LGN00_B5.TIF", "LC82210802014085LGN00_B6.TIF",
"LC82210802014085LGN00_B7.TIF", "LC82210802014085LGN00_B8.TIF",
"LC82210802014085LGN00_B9.TIF", "LC82210802014085LGN00_BQA.TIF"
)
for (x in files.list) { #loop over files
# Tell about progress
cat('Processing image', x, …
My current learning goal in R is to avoid for loops. I often need to list the files in a directory (or loop over directories) in order to perform various operations on those files.
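An example of my task is the following: I have to call a system application named cdo to merge two files. The syntax of this command is, say: cdo merge input_file1 input_file2 output_file.
My current R code looks like this: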
# set lists of files
u.files <- c("uas_Amon_ACCESS1-3.nc", "uas_Amon_CMCC-CESM.nc", "uas_Amon_CMCC-CESM.nc")
v.files <- c("vas_Amon_ACCESS1-3.nc", "vas_Amon_CMCC-CESM.nc", "vas_Amon_CMCC-CESM.nc")
for (i in 1:length(u.files)) {
# set input file 1 to use on cdo
input1 <- paste(u.files[i], sep='')
# set input file 2 to use on cdo
input2 <- paste(v.files[i], sep='')
# set output file to use on cdo
output <- paste('output_', u.files[i], sep='')
# assemble the command string
comm <- paste('cdo …
I want to add a title to the legend of a levelplot that is saved to a variable.
For example, this code works:
library(lattice)
library(grid)
x = 1:10
y = rep(x,rep(10,10))
x = rep(x,rep(10))
z = x+y
levelplot(z~x*y, colorkey=list(labels=list(cex=1,font=2,col="brown"),height=1,width=1.4),main=list('b',side=1,line=0.5))
trellis.focus("legend", side="right", clipp.off=TRUE, highlight=FALSE)
grid.text(expression(m^3/m^3), 0.2, 0, hjust=0.5, vjust=1)
trellis.unfocus()
But this code (the same plot saved to a variable) does not work:
p1 <- levelplot(z~x*y, colorkey=list(labels=list(cex=1,font=2,col="brown"),height=1,width=1.4),main=list('b',side=1,line=0.5))
trellis.focus("legend", side="right", clipp.off=TRUE, highlight=FALSE)
grid.text(expression(m^3/m^3), 0.2, 0, hjust=0.5, vjust=1)
trellis.unfocus()
How can I achieve this?
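A likely reason, offered as a sketch rather than a confirmed diagnosis: assigning a lattice plot to a variable only creates the trellis object and does not draw it, so there is no plot on the device for trellis.focus() to act on. Printing the stored object first draws it, after which the same annotation calls can follow. The spelling `clip.off` below is the argument name documented for trellis.focus(); the snippets in the question spell it `clipp.off`.

```r
library(lattice)
library(grid)

x <- 1:10
y <- rep(x, rep(10, 10))
x <- rep(x, 10)
z <- x + y

# Store the plot; nothing is drawn at this point
p1 <- levelplot(z ~ x * y,
                colorkey = list(labels = list(cex = 1, font = 2, col = "brown"),
                                height = 1, width = 1.4),
                main = list('b', side = 1, line = 0.5))

# Draw the stored plot explicitly, then annotate the legend viewport
print(p1)
trellis.focus("legend", side = "right", clip.off = TRUE, highlight = FALSE)
grid.text(expression(m^3/m^3), 0.2, 0, hjust = 0.5, vjust = 1)
trellis.unfocus()
```

The annotation lives on the drawn page, not in `p1`, so it has to be repeated each time the object is printed.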
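I am trying to plot horizontal error bars combined with geom_point using ggplot2. Because the data pairs overlap a lot and make the graph hard to read, I would like to dodge them. Please see the example below: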
DF = structure(list(co2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), .Label = "dynamic", class = "factor"), exp = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("co2-only",
"co2+clim"), class = "factor"), scen = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("RCP4.5", "RCP8.5"
), class = "factor"), period = structure(c(3L, 2L, 1L, 3L, 2L,
1L, 3L, 2L, 1L, 3L, 2L, 1L), …
I am studying the thermal requirements for crop growth. I have a table containing the accumulated temperature over a 6-month period. An example is shown below:
          date      temp   cum_temp
 1: 2020-03-01  9.339748   9.339748
 2: 2020-03-02 23.860849  33.200597
 3: 2020-03-03 12.860331  46.060928
 4: 2020-03-04 26.607505  72.668432
 5: 2020-03-05 28.273551 100.941984
 6: 2020-03-06  2.321138 103.263122
 7: 2020-03-07 16.315059 119.578181
 8: 2020-03-08 26.880152 146.458334
 9: 2020-03-09 16.991615 163.449949
10: 2020-03-10 14.241827 177.691776
11: 2020-03-11 28.748167 206.439943
12: 2020-03-12 14.146691 220.586634
13: 2020-03-13 20.649548 241.236182
14: 2020-03-14 17.606369 258.842551
15: 2020-03-15  3.984816 262.827367
I also have a table listing the crop growth stages and their thermal requirements (i.e. the thermal threshold needed to reach each stage):
growth_stage thermal_req
1: VE 120
2: V2 200
3: V3 350
4: V5-V6 …
I am trying to list files that are organized as follows:
/Volumes/Macintosh HD 2/data/cmip5/historical/
----clt
-----------------------file1.txt
-----------------------file2.txt
---------------models
-----------------------file1.txt
-----------------------file2.txt
----hurs
-----------------------file1.txt
-----------------------file2.txt
---------------models
-----------------------file1.txt
-----------------------file2.txt
----precip
-----------------------file1.txt
-----------------------file2.txt
---------------models
-----------------------file1.txt
-----------------------file2.txt
----temp
-----------------------file1.txt
-----------------------file2.txt
---------------models
-----------------------file1.txt
-----------------------file2.txt
----wind
-----------------------file1.txt
-----------------------file2.txt
---------------models
-----------------------file1.txt
-----------------------file2.txt
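What I want to do is collect, in a single list, all the files contained in the "models" subdirectories.
What I tried, without success, is this command: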
> Sys.glob(file.path('/Volumes/Macintosh HD 2/data/cmip5/historical/', "models","*.txt"))
character(0)
Is there a straightforward way to do this in R?
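The Sys.glob() call above looks for a "models" folder directly under historical/, but in the layout shown each "models" folder sits one level down, inside a variable folder (clt, hurs, ...). Two possible fixes are sketched below on a small mock tree built in a temporary directory; the mock tree and the names `root`, `model.files` and `glob.files` are assumptions for illustration and would be replaced by the real path.

```r
# Build a small mock directory tree resembling the layout in the question
root <- file.path(tempdir(), "historical")
for (v in c("clt", "hurs")) {
  dir.create(file.path(root, v, "models"), recursive = TRUE, showWarnings = FALSE)
  file.create(file.path(root, v, c("file1.txt", "file2.txt")))
  file.create(file.path(root, v, "models", c("file1.txt", "file2.txt")))
}

# Option 1: list every .txt file recursively, then keep only those
# whose immediate parent directory is named "models"
all.txt     <- list.files(root, pattern = "\\.txt$", recursive = TRUE,
                          full.names = TRUE)
model.files <- all.txt[basename(dirname(all.txt)) == "models"]

# Option 2: add a wildcard for the variable-level directory in the glob
glob.files <- Sys.glob(file.path(root, "*", "models", "*.txt"))

print(model.files)
```

Option 1 does not depend on how deep the "models" folders sit, while option 2 assumes they are exactly one level below the root.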
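I have a nested list named mylist, of length 4.
Each element of this list is an experiment: exp1.1, exp1.2, exp2.1 and exp2.2.
Each experiment contains observations of the length (in days) of four plant growth stages: EM-V6, V6-R0, R0-R4 and R4-R9.
Each growth stage is organized as a data frame with the columns year and mean.
Here is the full data: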
mylist=structure(list(exp1.1 = structure(list(`EM-V6` = structure(list(
year = 2011:2100, mean = c(34, 34, 32, 28, 25, 32, 32, 28,
27, 30, 32, 31, 33, 28, 26, 31, 33, 27, 34, 26, 28, 27, 27,
30, 29, 31, 34, 30, 26, 31, 33, 33, 27, 30, 28, 32, 31, 29,
32, 31, 25, 28, 28, 26, 32, …
I am trying to create a chart showing monthly vertical profiles of soil moisture, including both observed and modelled data for multiple sites.
So far I have only been able to plot one set of values, either observed or modelled, as in the example below:
library(ggplot2)
library(RColorBrewer)
# Create customized color palette
mypal <- colorRampPalette(brewer.pal(6,"PuBu"))
ggplot(df1, aes(x=value, y=depth, colour=as.factor(month))) +
geom_path() +
facet_wrap(~ site) +
scale_y_reverse() +
scale_colour_manual(values=mypal(12)) +
theme_bw(base_size=18) +
ylab("Depth") + xlab(bquote('Soil moisture (' ~m^3~m^-3*')')) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
Here is the data to reproduce it:
df1 <- structure(list(site = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, …