Confused about string concatenation in Rcpp

Tig*_*pes 2 string rcpp

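I am trying to loop through a data frame in Rcpp and concatenate chunks of words that are separated by spaces.

I tried reading some answers on Stack Overflow, but I am very confused about how string concatenation works in Rcpp (for example, "Concatenate StringVector with Rcpp").

I know that in C++ you can simply use the + operator to concatenate strings.

Here is my Rcpp function: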

cppFunction('
Rcpp::StringVector formTextBlocks(DataFrame frame) {
#include <string> 
using namespace Rcpp;
 NumericVector frame_x = as<NumericVector>(frame["x"]);

   LogicalVector space = as<LogicalVector>(frame["space"]);
   Rcpp::StringVector text=as<StringVector>(frame["text"]);
  if (text.size() == 0) {
    return text;
  }
  int dfSize = text.size();

  for(int i = 0;  i < dfSize; ++i) {
    if ( i !=dfSize  ) {
     if (space[i]==true) {

     text[i]=text[i] + text[i+1]  ;

    }
  }

  }
  return text;
}
')

Compilation fails with: error: no match for 'operator+'

How can I concatenate strings inside the loop?

Ral*_*ner 5

Since operator+ is defined for std::string, the simplest approach is to convert the text column to a std::vector<std::string> and work with that instead of an Rcpp::StringVector:

Rcpp::cppFunction('
std::vector<std::string> formTextBlocks(DataFrame frame) {
  LogicalVector space = as<LogicalVector>(frame["space"]);
  std::vector<std::string> text=as<std::vector<std::string>>(frame["text"]);
  if (text.size() == 0) {
    return text;
  }
  int dfSize = text.size();

  for(int i = 0;  i < dfSize - 1; ++i) {
    if (space[i]==true) {
      text[i]=text[i] + text[i+1];
    }
  }
  return text;
}
')

set.seed(20191129)
textBlock <- data.frame(space = sample(c(TRUE, FALSE), 100, replace = TRUE),
                        text = sample(LETTERS, 100, replace = TRUE),
                        stringsAsFactors = FALSE)
formTextBlocks(textBlock)
#>   [1] "B"  "N"  "G"  "BM" "M"  "O"  "C"  "F"  "OQ" "Q"  "FH" "H"  "D"  "HK" "KH"
#>  [16] "H"  "S"  "LX" "XO" "OY" "Y"  "E"  "VD" "D"  "TN" "N"  "LL" "LQ" "Q"  "F" 
#>  [31] "XX" "X"  "S"  "R"  "P"  "L"  "M"  "GK" "KD" "DD" "D"  "H"  "M"  "M"  "K" 
#>  [46] "N"  "GP" "PG" "G"  "P"  "G"  "O"  "N"  "NY" "Y"  "OX" "X"  "LX" "XF" "FS"
#>  [61] "SE" "E"  "PS" "S"  "YD" "D"  "F"  "Z"  "H"  "ZN" "N"  "OM" "M"  "XH" "HV"
#>  [76] "V"  "OX" "X"  "J"  "BZ" "Z"  "FZ" "ZE" "E"  "SV" "V"  "G"  "F"  "DZ" "ZF"
#>  [91] "F"  "PB" "B"  "K"  "N"  "U"  "B"  "PV" "V"  "C"

Created on 2019-11-29 by the reprex package (v0.3.0)

Notes:

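  • I have removed the #include and using lines. They are not needed and do not belong inside a function definition.
  • I have removed the i != dfSize test, which can never be false anyway, because the loop condition already guarantees i < dfSize.
  • The loop runs one iteration fewer, since you are accessing element i+1.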
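
To make the loop bound in the last note concrete, here is a minimal sketch of the same logic in plain C++ without Rcpp. The function name formTextBlocksPlain and the use of std::vector<bool> for the space column are assumptions made for illustration; they are not part of the answer above. The condition i + 1 < text.size() plays the role of i < dfSize - 1 and stays safe even if the vector is empty and an unsigned index type is used.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-C++ stand-in for formTextBlocks: `space` and `text`
// play the role of the data frame columns of the same names.
std::vector<std::string> formTextBlocksPlain(const std::vector<bool>& space,
                                             std::vector<std::string> text) {
  // Columns of a data frame always have the same length.
  assert(space.size() == text.size());

  // Stop one element early so that text[i + 1] is always a valid access.
  // Writing the bound as i + 1 < size() avoids unsigned underflow on empty input.
  for (std::size_t i = 0; i + 1 < text.size(); ++i) {
    if (space[i]) {
      // text[i + 1] has not been modified yet at this point,
      // so its original value is what gets appended.
      text[i] += text[i + 1];
    }
  }
  return text;
}
```

With space = {true, false, true, true} and text = {"A", "B", "C", "D"}, this sketch returns {"AB", "B", "CD", "D"}: the last element is never extended because there is no element after it, which matches the pattern in the reprex output above.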