Question
Using dplyr, how do I select the top and bottom observations/rows of grouped data in a single statement?
Data and example
Given a data frame
df <- data.frame(id=c(1,1,1,2,2,2,3,3,3),
stopId=c("a","b","c","a","b","c","a","b","c"),
stopSequence=c(1,2,3,3,1,4,3,1,2))
I can get the top and bottom observations from each group using slice, but with two separate statements:
firstStop <- df %>%
group_by(id) %>%
arrange(stopSequence) %>%
slice(1) %>%
ungroup
lastStop <- df %>%
group_by(id) %>%
arrange(stopSequence) %>%
slice(n()) %>%
ungroup
Can I combine these two statements into one that selects both the top and bottom observations?
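A minimal sketch of one possible single-statement version, assuming the df defined above and that dplyr is installed. It relies on slice() accepting a vector of row positions, so c(1, n()) picks the first and last row of each group. The name firstAndLast is made up for illustration.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                 stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

# Sort by stopSequence, then keep the first and last row of each id group
firstAndLast <- df %>%
  arrange(stopSequence) %>%
  group_by(id) %>%
  slice(c(1, n())) %>%
  ungroup()

firstAndLast
```

A group with a single row would be returned twice by this sketch; wrapping the positions in unique(), as in slice(unique(c(1, n()))), is one way to avoid that.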
Situation and data
I have a data frame df of athlete positions in a race (I have already melted it for use in ggplot2):
df <- structure(list(athlete = c("A", "B", "C", "D", "E", "F", "G",
"H", "I", "J", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J",
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "A", "B", "C",
"D", "E", "F", "G", "H", "I", "J", "A", "B", "C", "D", "E", "F",
"G", "H", "I", "J", "A", "B", "C", "D", "E", "F", "G", "H", "I",
"J"), distanceRemaining = structure(c(1L, 1L, 1L, 1L, 1L, 1L, …
I would like to understand rolling joins in data.table. The data to reproduce this is given at the end.
Given a data table of transactions at airports, at given times:
> dt
t_id airport thisTime
1: 1 a 5.1
2: 3 a 5.1
3: 2 a 6.2
(Note that t_ids 1 and 3 have the same airport and time)
and a lookup table of flights departing from airports:
> dt_lookup
f_id airport thisTime
1: 1 a 6
2: 2 a 6
3: 1 b 7
4: 1 c 8
5: 2 d 7
6: 1 d 9
7: 2 e 8
> tables()
NAME NROW NCOL MB COLS KEY
[1,] dt 3 3 1 t_id,airport,thisTime airport,thisTime
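[2,] dt_lookup 7 3 …
Question
In a code chunk of an R Markdown (.Rmd) document, how do I parse a string containing the newline character \n so that the text is displayed on new lines?
Data and example
I would like to parse text <- "this is\nsome\ntext" so that it is displayed as: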
this is
some
text
Here is an example code chunk with a few attempts (none of which produces the desired output):
```{r, echo=FALSE, results='asis'}
text <- "this is\nsome\ntext" # This is the text I would like displayed
cat(text, sep="\n") # combines all to one line
print(text) # ignores everything after the first \n
text # same as print
```
Additional information
The text will come from user input in a Shiny app.
For example, in ui.R
tags$textarea(name="txt_comment") ## comment box for user input
I then have a download button that uses a .Rmd document to render the input:
```{r, echo=FALSE, results='asis'}
input$txt_comment
```
The R Studio gallery is a …
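A minimal sketch of one way to get the line breaks, assuming the code runs inside a chunk with results='asis' so that the output is treated as raw Markdown. In Markdown, a line ending in two spaces becomes a hard line break, so each \n is replaced with two spaces followed by \n. The name md_text is made up for illustration.

```r
text <- "this is\nsome\ntext"  # the text to display

# Turn each newline into a Markdown hard line break (two trailing spaces)
md_text <- gsub("\n", "  \n", text, fixed = TRUE)

# With results='asis', cat() writes the string into the document as Markdown
cat(md_text)
```

The same replacement could be applied to input$txt_comment before it is printed, though how the output renders may depend on the output format chosen for the document.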
I would like to use data.table to improve the speed of a given function, but I'm not sure I'm implementing it the right way:
Data
Given two data.tables (dt and dt_lookup)
library(data.table)
set.seed(1234)
t <- seq(1,100); l <- letters; la <- letters[1:13]; lb <- letters[14:26]
n <- 10000
dt <- data.table(id=seq(1:n),
thisTime=sample(t, n, replace=TRUE),
thisLocation=sample(la,n,replace=TRUE),
finalLocation=sample(lb,n,replace=TRUE))
setkey(dt, thisLocation)
set.seed(4321)
dt_lookup <- data.table(lkpId = paste0("l-",seq(1,1000)),
lkpTime=sample(t, 10000, replace=TRUE),
lkpLocation=sample(l, 10000, replace=TRUE))
## NOTE: lkpId is purposely recycled
setkey(dt_lookup, lkpLocation)
I have a function that finds the lkpId that contains both thisLocation and finalLocation, and has the "nearest" lkpTime (i.e. the smallest non-negative value of thisTime - lkpTime)
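To illustrate only the "nearest" criterion described above, here is a minimal sketch using a data.table rolling join on a tiny made-up dataset (the values below are not from the question). It matches on a single location and does not implement the additional requirement that the lkpId covers both thisLocation and finalLocation. With roll = TRUE, each row of dt is matched to the lookup row with the largest lkpTime that is less than or equal to thisTime, which is the smallest non-negative thisTime - lkpTime.

```r
library(data.table)

# Small illustrative tables (hypothetical values)
dt_lookup <- data.table(lkpId = c("l-1", "l-2", "l-3"),
                        lkpLocation = c("a", "a", "b"),
                        lkpTime = c(10, 20, 15))
dt <- data.table(id = 1:3,
                 thisLocation = c("a", "a", "b"),
                 thisTime = c(25, 5, 15))

# Rolling join: exact match on location, roll on the last 'on' column (time)
res <- dt_lookup[dt,
                 on = .(lkpLocation = thisLocation, lkpTime = thisTime),
                 roll = TRUE]

res  # one row per row of dt; lkpId is NA where no earlier lkpTime exists
```

In the result, the lkpTime column holds the thisTime values from dt, so the matched lookup time would need to be carried in a separate column if it is needed afterwards.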
Function
## function to get the 'next' lkpId (i.e. …
Background
(Not required for the question, but may be useful to read)
Data
library(data.table) ## using version 1.9.6
## arrival timetable
dt_arrive <- structure(list(txn_id = c(1L, 1L, 1L, 1L, 1L), place = c("place_a",
"place_a", "place_a", "place_a", "place_a"), arrival_minutes = c(515,
534, 547, 561, 581), journey_id = 1:5), .Names = c("txn_id",
"place", "arrival_minutes", "journey_id"), class = c("data.table",
"data.frame"), row.names = c(NA, -5L), sorted = c("txn_id",
"place"))
## departure timetable
dt_depart <- structure(list(txn_id = c(1L, 1L, 1L, 1L), place = c("place_a",
"place_a", "place_a", "place_a"), arrival_minutes = c(489, 507,
519, 543), …
I am trying to use GKOctree to efficiently retrieve objects in 3D space. However, the following code does not seem to work as expected:
import GameplayKit
let tree = GKOctree(boundingBox: GKBox(
boxMin: vector_float3(x: -10, y: -10, z: -10),
boxMax: vector_float3(x: 10, y: 10, z: 10)
), minimumCellSize: 0.1)
tree.add(NSObject(), at: vector_float3(x: 0, y: 0, z: 0))
tree.elements(at: vector_float3(x: 0, y: 0, z: 0)).count // 1, fine
tree.elements(in: GKBox(
boxMin: vector_float3(x: -1, y: -1, z: -1),
boxMax: vector_float3(x: 1, y: 1, z: 1)
)).count // 0, ??
tree.elements(in: GKBox(
boxMin: vector_float3(x: 1, y: 1, z: 1),
boxMax: vector_float3(x: -1, y: …
Question
Is there a function, or a way to get rowSums to work, on just a single column?
Example data
col1 <- c(1,2,3)
col2 <- c(1,2,3)
df <- data.frame(col1, col2)
I can use rowSums to add up the values in each row across two or more specified columns:
colsToAdd <- c("col1", "col2")
rowSums(df[,colsToAdd])
[1] 2 4 6
However, it fails when only one column is passed
colsToAdd <- c("col1")
rowSums(df[,colsToAdd])
Error in rowSums(df[, colsToAdd]) :
'x' must be an array of at least two dimensions
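A minimal sketch of one way around the error, assuming the df defined above. Subsetting a data frame with a single column name drops the result to a plain vector by default; passing drop = FALSE keeps it as a one-column data frame, which rowSums accepts. The name oneCol is made up for illustration.

```r
col1 <- c(1, 2, 3)
col2 <- c(1, 2, 3)
df <- data.frame(col1, col2)

colsToAdd <- c("col1")

# drop = FALSE keeps the two-dimensional structure even for one column
oneCol <- rowSums(df[, colsToAdd, drop = FALSE])
oneCol
```

The same call works unchanged when colsToAdd holds two or more column names, so no special case is needed for the single-column situation.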
The failure makes sense given the source of rowSums():
> rowSums
function (x, na.rm = FALSE, dims = 1L)
{
if (is.data.frame(x))
x <- as.matrix(x)
if (!is.array(x) || length(dn <- dim(x)) < 2L)
## This line 'stops' the function …
Is there a way to use summarise_each() to count the number of records in a data frame, but ignoring NAs?
Example / sample data
df_sample <- structure(list(var_1 = c(NA, NA, NA, NA, 1, NA), var_2 = c(NA,
NA, NA, NA, 2, 1), var_3 = c(NA, NA, NA, NA, 3, 2), var_4 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), var_5 = c(NA,
NA, NA, NA, 4, 3)), .Names = c("var_1", "var_2", "var_3", "var_4",
"var_5"), row.names = 5:10, class = "data.frame")
> df_sample
var_1 var_2 var_3 var_4 var_5
5 NA NA NA NA NA
6 NA NA NA NA NA …
The more I use data.table, the more it replaces dplyr as my 'go-to' package, as the speed it offers is a big plus.
Question
Can you pass a variable into i in data.table (dt[i, j]) without creating an expression?
Example
Given a data.table:
library(data.table)
dt <- data.table(val1 = c(1,2,3),
val2 = c(3,2,1))
I would like to evaluate:
dt[(val1 > val2)]
but using variables to reference the column names. For example,
myCol <- c("val1", "val2") ## vector of column names
I have read many questions that show ways of doing this via an expression:
## create an expression to evaluate
expr <- parse(text = paste0(myCol[1], " > ", myCol[2]))
## evaluate expression
dt[(eval(expr))]
val1 val2
1: 3 1
But I was wondering if there is a more "direct" way to do this that I'm missing, similar to:
dt[(myCol[1] > myCol[2])]
Or is the expression the route one should take?
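A minimal sketch of one more direct possibility, assuming the dt and myCol defined above. Inside dt[...], get() looks up a column by the name held in a character string, so the comparison can be written without building and parsing an expression. Whether this is preferable to the expression route is a judgment call; the name res is made up for illustration.

```r
library(data.table)

dt <- data.table(val1 = c(1, 2, 3),
                 val2 = c(3, 2, 1))
myCol <- c("val1", "val2")  # vector of column names

# get() resolves each string to the column of that name within dt
res <- dt[get(myCol[1]) > get(myCol[2])]
res
```

By contrast, dt[(myCol[1] > myCol[2])] compares the two strings "val1" and "val2" themselves rather than the columns they name, which is why it does not behave like dt[(val1 > val2)].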