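The tm package extends c so that, given a set of PlainTextDocument objects, it automatically creates a Corpus. Unfortunately, it seems that each PlainTextDocument must be specified individually.
For example, if I have: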
foolist <- list(a, b, c); # where a,b,c are PlainTextDocument objects
I do this to get a Corpus:
foocorpus <- c(foolist[[1]], foolist[[2]], foolist[[3]]);
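One way to avoid naming every element by hand is sketched below. It relies on base R's do.call, which passes each element of a list to a function as a separate argument. The documents doc_a, doc_b and doc_c are hypothetical stand-ins, since the original a, b and c are not shown, and the sketch assumes an installed tm version whose c method for text documents returns a Corpus, as described above.

```r
library(tm)  # assumption: the tm package is installed

# Hypothetical stand-in documents; the original a, b, c are not shown.
doc_a <- PlainTextDocument("first document")
doc_b <- PlainTextDocument("second document")
doc_c <- PlainTextDocument("third document")
foolist <- list(doc_a, doc_b, doc_c)

# do.call() spreads the list elements out as separate arguments, so this
# is equivalent to c(foolist[[1]], foolist[[2]], foolist[[3]]).
# Passing the name "c" as a string makes R look up the function even if
# a variable called c exists in the workspace.
foocorpus <- do.call("c", foolist)
```

The same call works for a list of any length, so it does not need to change when documents are added to foolist.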
I have a list of lists of PlainTextDocument objects that looks like this:
> str(sectioned)
List of 154
$ :List of 6
..$ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:1] Developing assessment models Developing models
.. .. ..- attr(*, "Author")= chr "John Smith"
.. .. ..- attr(*, "DateTimeStamp")= POSIXlt[1:1], format: "2013-04-30 12:03:49"
.. .. ..- attr(*, "Description")= chr(0)
.. .. …
I have a list with nested lists inside it.
LIST2 <- list(list("USA","WY","TX","AZ","Canada", "CA", "NY", 'Russia', 'NY'),
list(c("USA","Canada","CA","WY", 'China', 'AZ', 'AZ', 'AZ', 'WY')),
list(c("USA","Australia","CA","AR", 'AZ', 'WY', 'New Zealand', 'Japan', 'Japan', 'NJ')),
list(list('Australia', 'Australia', 'Japan', 'Malaysia' )),
list(c('USA', 'Australia', 'Japan', 'Malaysia' )))
I would like to somehow flatten the 1st and 4th lists so that they have the same form as the rest. Is this possible?
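A minimal base R sketch of one possible approach, assuming the target form is the one used by the 2nd, 3rd and 5th elements (a list of length one holding a single character vector). The names flatten_one and LIST2_flat are made up for illustration.

```r
LIST2 <- list(list("USA","WY","TX","AZ","Canada", "CA", "NY", 'Russia', 'NY'),
              list(c("USA","Canada","CA","WY", 'China', 'AZ', 'AZ', 'AZ', 'WY')),
              list(c("USA","Australia","CA","AR", 'AZ', 'WY', 'New Zealand', 'Japan', 'Japan', 'NJ')),
              list(list('Australia', 'Australia', 'Japan', 'Malaysia' )),
              list(c('USA', 'Australia', 'Japan', 'Malaysia' )))

# Hypothetical helper: unlist() collapses any depth of nesting into one
# character vector, and list() wraps it again so every element ends up
# as a list of length one.
flatten_one <- function(x) list(unlist(x))

LIST2_flat <- lapply(LIST2, flatten_one)
str(LIST2_flat)
```

Elements that already have the target form pass through unchanged, so the helper can be applied to the whole list without picking out the 1st and 4th entries first.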