use*_*193 5 r multidimensional-array
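I have some code that works with a 4-dimensional array, and I need to apply which.max over several of its dimensions. It is slow, and I would like to find a way to speed it up.
Example: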
library(microbenchmark)
array4d <- array( runif(5*500*50*5 ,-1,0),
dim = c(5, 500, 50, 5) )
microbenchmark(
max_idx <- apply(array4d, c(1,2,3), which.max )
)
Any hints are appreciated, thanks!
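Edit: by coding it directly as nested for loops I managed to make it slightly faster (although it is ugly), but I am hoping someone out there has a better idea!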
method1 <- function(z) {
apply(z, c(1,2,3), which.max)
}
method2 <- function(z){
result <- array( , dim = dim(z)[1:3] )
for(i in 1:dim(z)[1]){
for(j in 1:dim(z)[2]){
for(k in 1:dim(z)[3]){
result[i, j, k] <- which.max(z[i,j,k,])
}
}
}
return(result)
}
microbenchmark(
result1 <- method1(array4d),
result2 <- method2(array4d))
> microbenchmark(
+ result1 <- method1(array4d),
+ result2 <- method2(array4d)
+ )
Unit: milliseconds
expr min lq mean median uq max neval cld
result1 <- method1(array4d) 111.9061 140.1400 165.2441 155.6773 170.3967 384.6425 100 b
result2 <- method2(array4d) 113.4572 123.2429 136.8583 130.8505 141.9620 215.0968 100 a
Adding some more methods. One uses plain R, the other adapts the code from @Allan-Cameron:
method4 <- function(z){
result <- array(integer(1) , dim = head(dim(z), -1))
n <- prod(head(dim(z), -1))
j <- seq_len(tail(dim(z),1)) * n - n
for(i in seq_len(n)) result[i] <- which.max(z[i+j])
result}
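
The index arithmetic in method4 relies on R storing arrays in column-major order: the elements of one fiber along the last dimension sit exactly `n` positions apart, where `n` is the product of the leading dimensions. The following is a minimal sketch on a small array (the sizes, the index `i <- 17L`, and the names `lead`, `last`, `fiber`, `m` and `res` are illustrative assumptions, not part of the original answer). It also shows that the same layout lets the array be viewed as a matrix with one fiber per row, so base R's `max.col()` with `ties.method = "first"` should give the same answer; that variant was not part of the benchmark below.

```r
# Small array so the result can be checked by eye; the sizes are arbitrary.
set.seed(1)
z <- array(runif(2 * 3 * 4 * 5), dim = c(2, 3, 4, 5))

lead <- head(dim(z), -1)           # leading dimensions: 2, 3, 4
n    <- prod(lead)                 # number of fibers along the last dimension
last <- tail(dim(z), 1)            # length of each fiber
j    <- (seq_len(last) - 1L) * n   # linear offsets of one fiber's elements

# Linear index i corresponds to the subscripts returned by arrayInd().
i     <- 17L
sub   <- arrayInd(i, lead)
fiber <- z[sub[1], sub[2], sub[3], ]
stopifnot(identical(z[i + j], fiber))

# Viewing the same data as an n x last matrix puts each fiber in a row,
# so max.col() with ties.method = "first" gives which.max of every row.
m   <- matrix(z, nrow = n, ncol = last)
res <- array(max.col(m, ties.method = "first"), dim = lead)
stopifnot(all(res == apply(z, c(1, 2, 3), which.max)))
```

The reshape works for any number of leading dimensions, because only the size of the last dimension matters for the offsets.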
Rcpp::cppFunction("
NumericVector method5(const NumericVector &input){
std::vector<int> dims = input.attr(\"dim\");
int last_dim = dims.back();
int diff = input.size()/last_dim;
std::vector<int> result(diff);
dims.pop_back();
for(int i = 0; i < diff; ++i)
{
double max_val = input[i];
int max_ind = 0;
for(int j = 0; j < last_dim; ++j)
{
if(input[i+j*diff] > max_val) {
max_val = input[i+j*diff];
max_ind = j;
}
}
result[i] = max_ind + 1;
}
NumericVector arr = wrap(result);
arr.attr(\"dim\") = dims;
return arr ;
}"
)
Timings:
set.seed(42)
array4d <- array(runif(5*500*50*5, -1, 0), dim = c(5, 500, 50, 5))
library(microbenchmark)
microbenchmark(
check = "equal", control=list(order="block")
, method1(array4d) #Using code from Question
, method2(array4d) #Using code from Question
, apply_which_max(array4d) #Using code from Allan Cameron
, method4(array4d)
, method5(array4d)
)
#Unit: microseconds
# expr min lq mean median uq max neval cld
# method1(array4d) 200857.804 228567.850 266815.6275 254530.3050 294578.1125 423838.879 100 d
# method2(array4d) 144767.680 149616.981 162367.6556 150688.1860 182290.4980 315650.052 100 c
# apply_which_max(array4d) 3131.482 3153.712 3346.1025 3175.9445 3206.2220 5922.866 100 a
# method4(array4d) 58618.275 60777.584 62334.8258 61198.1815 61702.2170 165254.042 100 b
# method5(array4d) 894.823 902.862 972.2953 911.9845 927.0885 2643.957 100 a
For a random choice among tied maxima, instead of the first match:
Rcpp::cppFunction("
NumericVector method6(const NumericVector &input){
std::srand(std::time(nullptr));
std::vector<int> dims = input.attr(\"dim\");
int last_dim = dims.back();
int diff = input.size()/last_dim;
std::vector<int> result(diff);
dims.pop_back();
for(int i = 0; i < diff; ++i)
{
double max_val = input[i];
std::vector<int> max_ind = {0};
for(int j = 1; j < last_dim; ++j)
{
if(input[i+j*diff] > max_val) {
max_val = input[i+j*diff];
max_ind.clear();
max_ind.push_back(j);
} else if(input[i+j*diff] == max_val) max_ind.push_back(j);
}
result[i] = max_ind[std::rand() % max_ind.size()] + 1;
}
NumericVector arr = wrap(result);
arr.attr(\"dim\") = dims;
return arr ;
}"
)
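
For checking the tie-breaking behaviour without compiling anything, a plain R reference can be handy. The sketch below is an assumption of mine rather than part of the original answer: `which_max_random` and `pick` are hypothetical names, and the function is meant for correctness checks on small inputs, not for speed. Unlike `std::rand()` in method6, it draws from R's own random number generator, so `set.seed()` makes the result reproducible.

```r
# Hypothetical helper (not from the original answer): for every fiber along
# the last dimension, return the position of a maximum, chosen uniformly at
# random when the maximum occurs more than once.
which_max_random <- function(z) {
  lead <- head(dim(z), -1)
  last <- tail(dim(z), 1)
  m <- matrix(z, ncol = last)        # one fiber per row (column-major layout)
  pick <- function(x) {
    idx <- which(x == max(x))
    # sample.int() avoids the sample(idx, 1) pitfall when length(idx) == 1
    idx[sample.int(length(idx), 1L)]
  }
  array(apply(m, 1, pick), dim = lead)
}

# Every fiber is all zeros, so each of the 3 positions is a valid answer.
z <- array(0, dim = c(2, 2, 2, 3))
set.seed(42)
r <- which_max_random(z)
stopifnot(all(r %in% 1:3), identical(dim(r), c(2L, 2L, 2L)))
```

When there are no ties, this should agree with `apply(z, c(1, 2, 3), which.max)`.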