Posts by lit*_*ger

Combining frequency tables into a single data frame

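I have a list in which each item is a word frequency table derived by calling `table()` on a different sample text, so the tables differ in length. I now want to convert the list into a single data frame in which each column is a word and each row is a sample text. Here is a dummy example of my data: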

t1<-table(strsplit(tolower("this is a test in the event of a real word file you would see many more words here"), "\\W"))

t2<-table(strsplit(tolower("Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal"), "\\W"))

t3<-table(strsplit(tolower("Ask not what your country can do for you - ask what you can do for your country"), "\\W"))

myList <- list(t1, t2, t3)


So one gets this structure:

> class(myList[[3]])
[1] …
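
The goal described above (one row per text, one column per word, zero where a text lacks a word) can be sketched in base R. This is only a minimal illustration and not an answer taken from the thread: `count_words`, `all_words`, `rows`, and `freq_df` are hypothetical names, and the three short texts are stand-ins for the real `myList`.

```r
# Hypothetical helper: split on runs of non-word characters and count the words
count_words <- function(text) table(strsplit(tolower(text), "\\W+")[[1]])

# Small stand-in for the list of frequency tables described in the question
myList <- list(
  count_words("a test of a test"),
  count_words("ask not what you can do"),
  count_words("you can test")
)

# Every word that occurs in at least one table, sorted for a stable column order
all_words <- sort(unique(unlist(lapply(myList, names))))

# For each table, look up every word; words missing from a text become 0
rows <- lapply(myList, function(tab) {
  counts <- as.vector(tab)[match(all_words, names(tab))]
  counts[is.na(counts)] <- 0L
  counts
})

# Stack the per-text count vectors into one data frame, one column per word
freq_df <- as.data.frame(do.call(rbind, rows))
names(freq_df) <- all_words
print(freq_df)
```

Because the columns come from the union of all table names, tables of different lengths line up without any manual padding. Note that splitting on `"\\W"` as in the question, rather than `"\\W+"`, yields empty-string tokens wherever punctuation is followed by a space, so the real data may contain a column named `""`.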


r plyr

lit*_*ger

2017-05-03

6 votes

2 answers

2998 views

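I am using the xml2 package in R to parse some very large XML files. read_xml() loads the large files successfully, but when I try to use xml_find_all(), I get "Error: Memory allocation failed : growing nodeset hit limit". I assume this limit is set in libxml2, perhaps in the XPATH_MAX_NODESET_LENGTH variable? So maybe this is not a problem with the xml2 package itself. But is there a solution within xml2? I have tried removing nodes and freeing memory, without luck. Thanks.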
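
One possible way to stay under a per-query node-set limit, offered here only as an untested assumption and not as a fix confirmed by the post, is to avoid a single document-wide XPath query. The sketch below walks the top-level children with `xml_children()` and runs a relative `xml_find_all()` inside each one, so each query returns a smaller node set. The tiny inline document and the element names `group` and `item` are hypothetical.

```r
library(xml2)

# Hypothetical miniature document standing in for a very large file
doc <- read_xml(
  "<root><group><item>1</item><item>2</item></group><group><item>3</item></group></root>"
)

# Top-level children of the root element
groups <- xml_children(doc)

# Query each child separately with a relative XPath and collect the text,
# instead of running one xml_find_all(doc, "//item") over the whole document
items <- unlist(lapply(seq_along(groups), function(i) {
  xml_text(xml_find_all(groups[[i]], ".//item"))
}))
print(items)
```

Whether this helps depends on how the matching nodes are distributed: if a single child still contains more matches than the limit allows, the same error would be expected for that child.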

r libxml2 xml2

lit*_*ger

lucky-day

1 vote

1 answer

1443 views