I want to create an empty list so that I can replace its elements with other lists.
For example:
simulations = 10
seeds = sample(10000:99999, simulations, replace=F)
test_function <- function(seedX){
lista = list(seedX=seedX,
dataframe=data.frame(x=runif(10, -5.0, 5.0),y=rnorm(10,mean = 0,sd = 1)))
return(lista)
}
results <- vector("list", simulations)
results[1] = test_function(seedX = seeds[1])
I get the following warning:
Warning message:
In results[1] = test_function(seedX = seeds[1]) :
number of items to replace is not a multiple of replacement length
What am I doing wrong?
Thanks!
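One likely explanation, sketched under the assumption that each simulation result should occupy exactly one slot of `results`: single brackets (`results[1]`) replace a sub-list of length one, so assigning a two-element list raises the warning and keeps only the first item. Double brackets (`results[[1]]`) store the whole returned list in that slot. The loop over `seeds` and the `set.seed()` call are illustrative additions, not part of the original code.

```r
simulations <- 10
seeds <- sample(10000:99999, simulations, replace = FALSE)

test_function <- function(seedX) {
  set.seed(seedX)  # assumption: the seed is meant to make each run reproducible
  list(
    seedX = seedX,
    dataframe = data.frame(x = runif(10, -5.0, 5.0),
                           y = rnorm(10, mean = 0, sd = 1))
  )
}

# Pre-allocate the empty list, then fill each slot with [[ ]]
results <- vector("list", simulations)
for (i in seq_along(seeds)) {
  results[[i]] <- test_function(seedX = seeds[i])
}
```

Wrapping the value instead, as in `results[1] <- list(test_function(seedX = seeds[1]))`, should have the same effect, because the replacement then has length one.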
I want to merge a SpatialPolygonsDataFrame:
library(rgdal)  # provides readOGR()
# From https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html
states <- readOGR(dsn = "./cb_2014_us_state_20m.shp",
layer = "cb_2014_us_state_20m", verbose = FALSE)
with a regular data frame:
my_counts <- data.frame(
State = c(
"CA", "TX", "IL", "FL", "NY", "OH",
"NJ", "GA", "MI", "PA", "MA", "CO", "AZ", "NC", "VA", "WA", "IN",
"MD", "MN", "WI", "MO", "TN", "IA", "KY", "LA", "SC", "CT", "AL",
"KS", "OR", "OK", "AR", "NV", "UT", "NE", "ID", "MS", "DC", "NM",
"NH", "ME", "AK", "RI", "MT", "HI", "WV", "SD", "ND", "DE", "VT",
"WY", "PR", "GU", "VI", "MP", "AS", …
I want a tick for every day on the x-axis of the plot below, rather than one every 2 days.
df1 <- structure(
list(
Timestamp = structure(
c(
1441837436, 1441843661,
1441885583, 1441966341, 1441985621, 1442048926, 1442321691, 1442329081,
1442349761, 1442408140, 1442417679, 1442508871, 1442513339, 1442513395,
1442514010, 1442525088, 1442553226, 1442562304
), tzone = "UTC", class = c("POSIXct",
"POSIXt")
), number = 7:24
), class = "data.frame", row.names = c(NA,-18L), .Names = c("Timestamp", "number")
)
library(ggplot2)
ggplot(df1, aes(x = Timestamp, y = number)) +
geom_line(size=2) + geom_point(size=5) +
scale_y_continuous(breaks = seq(0, 50, by = 2))
I tried adding + scale_x_date(breaks = "1 day"), but I get the following error:
Error: Invalid input: …
My data looks like this:
df1 <-
structure(
list(
y = c(-0.19, 0.3,-0.05, 0.15,-0.05, 0.15),
lb = c(-0.61,
0.1,-0.19,-0.06,-0.19,-0.06),
ub = c(0.22, 0.51, 0.09, 0.36,
0.09, 0.36),
x = structure(
c(1L, 2L, 1L, 2L, 1L, 2L),
.Label = c("X1",
"X2"),
class = "factor"
),
Group = c("A", "A", "B", "B", "C",
"C")
),
.Names = c("y", "lb", "ub", "x", "Group"),
row.names = c(NA,-6L),
class = "data.frame"
)
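I want to use ggplot2 to plot the points x, y colored by Group, with error bars from lb to ub. Because x is discrete, I want to jitter so that the points and bars do not overlap. Right now I can jitter the points, but not the lines. I would also like the points to be ordered A, B, C.
ggplot(data …
I have an array that tells me the number of observations for each country.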
countries <- structure(c(532L, 3L, 1L, 15L, 1L, 1L, 2L, 3L, 16L, 2L, 43L,
1L, 2L, 2L, 1L, 1L, 1L, 3L, 2L, 1L, 4L, 4L, 16L, 13L, 2L, 2L,
9L, 1L, 1L, 5L, 3L, 5L, 1L, 1L, 3L, 1L, 10L, 11L, 4L, 2L, 1L,
7L, 1L, 2L, 6L, 7L, 1L, 6L, 1L, 2L, 7L, 1L, 20L, 1L, 2L, 1L,
3L, 2L, 5L, 76L, 2L, 1L, 1L), .Dim = 63L, .Dimnames = structure(list(
c("United States", "Argentina", "Armenia", "Australia", "Austria",
"Bangladesh", "Belarus", …
Suppose my data looks like this:
df1 = data.frame(A=c(1000000.51,5000.33), B=c(0.565,0.794))
I want to use DataTables and have column A displayed as (1,000,001; 5,000):
library(DT)
datatable(df1) %>% formatPercentage('B', 2) %>%
formatRound('A',digits = 0)
I know I can use the scales package:
library(scales)
comma_format()(1000000)
But I am not sure how to combine it with DataTables.
Thanks!
I can use dplyr to connect to a SQLite database:
library(dplyr)
mydb<- src_sqlite("DATA/mydb.db")
How can I list the tables in mydb? I could not find anything about this in the help files.
I have a data frame that looks like this (with many more observations):
df <- structure(list(session_user_id = c("1803f6c3625c397afb4619804861f75268dfc567",
"1924cb2ebdf29f052187b9a2d21673e4d314199b", "1924cb2ebdf29f052187b9a2d21673e4d314199b",
"1924cb2ebdf29f052187b9a2d21673e4d314199b", "1924cb2ebdf29f052187b9a2d21673e4d314199b",
"198b83b365fef0ed637576fe1bde786fc09817b2", "19fd8069c094fb0697508cc9646513596bea30c4",
"19fd8069c094fb0697508cc9646513596bea30c4", "19fd8069c094fb0697508cc9646513596bea30c4",
"19fd8069c094fb0697508cc9646513596bea30c4", "1a3d33c9cbb2aa41515e6ef76f123b2ea8ee2f13",
"1b64c142b1540c43e3f813ccec09cb2dd7907c14", "1b7346d13f714c97725ba2e1c21b600535164291"
), raw_score = c(1, 1, 1, 1, 1, 0.2, NA, 1, 1, 1, 1, 0.2, 1),
submission_time = c(1389707078L, 1389694184L, 1389694188L,
1389694189L, 1389694194L, 1390115495L, 1389696939L, 1389696971L,
1389741306L, 1389985033L, 1389983862L, 1389854836L, 1389692240L
)), .Names = c("session_user_id", "raw_score", "submission_time"
), row.names = 28:40, class = "data.frame")
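I want to create a new data frame with only one observation per "session_user_id", keeping the one with the latest "submission_time".
The only idea I have come up with is to build a list of unique users, write a loop to find the maximum submission_time for each user, and then write another loop to get the raw score for that user and time.
Can someone show me a better way to do this in R?
Thanks!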
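One loop-free possibility in base R, offered as a sketch rather than the only approach: sort the rows so that each user's most recent submission comes first, then drop every later row for that user with `duplicated()`. The small data frame below is a hypothetical stand-in for the one in the question, and `ordered` and `latest` are illustrative names.

```r
# Small stand-in for the data frame in the question (hypothetical values)
df <- data.frame(
  session_user_id = c("u1", "u2", "u2", "u3", "u3"),
  raw_score       = c(1, 0.2, 1, NA, 1),
  submission_time = c(100L, 200L, 250L, 300L, 150L),
  stringsAsFactors = FALSE
)

# Sort so that, within each user, the most recent submission comes first
ordered <- df[order(df$session_user_id, -df$submission_time), ]

# duplicated() flags every row after the first one per user,
# so negating it keeps only the latest submission
latest <- ordered[!duplicated(ordered$session_user_id), ]
```

If two submissions by the same user share the same timestamp, this keeps whichever of them sorts first, so a tie-breaking column may be needed in that case.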
I have 2 vectors, x and y:
x <- seq(from=21.6, to=22, by=0.01)
y <- seq(from=58.77, to=58.93, by=0.01)
I want to create a data frame df1 <- data.frame(x, y) containing all possible combinations of x and y. How can I do this?
For the following plot:
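A minimal sketch of one way to do this, assuming "all possible combinations" means the Cartesian product of the two vectors: base R's `expand.grid()` returns a data frame with one row per pair.

```r
x <- seq(from = 21.6, to = 22, by = 0.01)
y <- seq(from = 58.77, to = 58.93, by = 0.01)

# One row for every (x, y) pair; the first argument varies fastest
df1 <- expand.grid(x = x, y = y)
```

The result has `length(x) * length(y)` rows, which can be checked with `nrow(df1)`.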
df.plot <-structure(list(color = structure(c(2L, 2L, 3L, 1L, 3L, 4L, 3L,
1L, 4L, 1L, 2L, 4L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 2L,
3L, 3L, 3L, 3L), .Label = c("54", "55", "61", "69"), class = "factor"),
date = structure(c(16687, 16687, 16687, 16687, 16687, 16687,
16688, 16688, 16688, 16689, 16689, 16690, 16693, 16693, 16693,
16694, 16694, 16695, 16695, 16695, 16695, 16696, 16696, 16696,
16696, 16696), class = "Date"), facet = c("A",
"A", "A", "A", "A", "B", …