I am trying to save Twitter search results to a database (SQL Server), and I get an error when I pull the search results from twitteR.
If I run:
library(twitteR)
puppy <- as.data.frame(searchTwitter("puppy", session=getCurlHandle(),num=100))
I get an error:
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class structure("status", package = "twitteR") into a data.frame
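This matters because, in order to use RODBC to add the results to a table with sqlSave, the object needs to be a data.frame. At least, that is the error message I got: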
Error in sqlSave(localSQLServer, puppy, tablename = "puppy_staging", :
should be a data frame
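So does anyone have any suggestions on how to coerce the list to a data.frame, or on how to load the list through RODBC?
My final goal is to have a table that mirrors the structure of the values returned by searchTwitter. Here is an example of what I am trying to retrieve and load: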
library(twitteR)
puppy <- searchTwitter("puppy", session=getCurlHandle(),num=2)
str(puppy)
List of 2
$ :Formal class 'status' [package "twitteR"] with 10 slots
.. ..@ text : chr "beautifull and kc reg Beagle Mix for rehomes: This little puppy is looking for a new loving family wh... http://bit.ly/9stN7V "| __truncated__
.. ..@ favorited : logi FALSE
.. ..@ replyToSN : chr(0)
.. ..@ created : chr "Wed, 16 Jun 2010 19:04:03 +0000"
.. ..@ truncated : logi FALSE
.. ..@ replyToSID : num(0)
.. ..@ id : num 1.63e+10
.. ..@ replyToUID : num(0)
.. ..@ statusSource: chr "<a href="http://twitterfeed.com" rel="nofollow">twitterfeed</a>"
.. ..@ screenName : chr "puppy_ads"
$ :Formal class 'status' [package "twitteR"] with 10 slots
.. ..@ text : chr "the cutest puppy followed me on my walk, my grandma won't let me keep it. taking it to the pound sadface"
.. ..@ favorited : logi FALSE
.. ..@ replyToSN : chr(0)
.. ..@ created : chr "Wed, 16 Jun 2010 19:04:01 +0000"
.. ..@ truncated : logi FALSE
.. ..@ replyToSID : num(0)
.. ..@ id : num 1.63e+10
.. ..@ replyToUID : num(0)
.. ..@ statusSource: chr "<a href="http://blackberry.com/twitter" rel="nofollow">Twitter for BlackBerry®</a>"
.. ..@ screenName : chr "iamsweaters"
So I think the data.frame for puppy should have column names like:
- text
- favorited
- replyToSN
- created
- truncated
- replyToSID
- id
- replyToUID
- statusSource
- screenName
ARo*_*son 17
I use this code, which I found at http://blog.ouseful.info/2011/11/09/getting-started-with-twitter-analysis-in-r/:
#get data
tws<-searchTwitter('#keyword',n=10)
#make data frame
df <- do.call("rbind", lapply(tws, as.data.frame))
#write to csv file (or your RODBC code)
write.csv(df,file="twitterList.csv")
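The `do.call("rbind", lapply(...))` idiom is not specific to twitteR: it converts each list element to a one-row data frame and then stacks the rows. Here is a minimal sketch of the same pattern in base R. The `records` list and its fields are made up for illustration and stand in for real search results:

```r
# Hypothetical stand-in for a list of tweets: each element is a plain named list.
records <- list(
  list(text = "first puppy tweet",  screenName = "user_a", favorited = FALSE),
  list(text = "second puppy tweet", screenName = "user_b", favorited = TRUE)
)

# Convert every element to a one-row data frame, then bind the rows together.
rows <- lapply(records, function(r) as.data.frame(r, stringsAsFactors = FALSE))
df <- do.call("rbind", rows)

str(df)
```

This only works when every element has the same fields and each field holds exactly one value, which is why empty slots need special care.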
I know this is an old question, but I think this is the "modern" way to solve it. Just use the function twListToDF:
gvegayon <- getUser("gvegayon")
timeline <- userTimeline(gvegayon,n=400)
tl <- twListToDF(timeline)
Hope this helps.
Try this:
library(plyr)  # provides ldply
ldply(searchTwitter("#rstats", n=100), text)
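twitteR returns S4 objects, so you need to either use one of its helper functions or work with the slots directly. You can see the slots by using unclass(), for example: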
unclass(searchTwitter("#rstats", n=100)[[1]])
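To see what working with the slots directly looks like without calling the Twitter API, here is a small sketch using a mock S4 class. The `mock_status` class is an assumption that imitates three of the slots shown in the `str()` output in the question; it is not twitteR's real `status` class.

```r
library(methods)

# Hypothetical class imitating three slots of a twitteR status object.
setClass("mock_status",
         representation(text = "character",
                        favorited = "logical",
                        replyToSN = "character"))

s <- new("mock_status",
         text = "the cutest puppy followed me on my walk",
         favorited = FALSE,
         replyToSN = character(0))  # empty, like chr(0) in the question

slotNames(s)          # names of all slots
s@text                # direct slot access with @
slot(s, "favorited")  # the same access by slot name, handy inside loops
length(s@replyToSN)   # 0: an empty slot has no value to put in a row
```

The zero-length slots are the likely reason a naive conversion to a data frame fails: a row needs exactly one value per column.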
These slots can be accessed directly, as I did above, by using the related functions (from the twitteR help: ?statusSource):
text          Returns the text of the status
favorited     Returns the favorited information for the status
replyToSN     Returns the replyToSN slot for this status
created       Retrieves the creation time of this status
truncated     Returns the truncated information for this status
replyToSID    Returns the replyToSID slot for this status
id            Returns the id of this status
replyToUID    Returns the replyToUID slot for this status
statusSource  Returns the status source for this status
As I mentioned, as far as I know you have to specify each field yourself in the output. Here is an example using two of the fields:
> head(ldply(searchTwitter("#rstats", n=100),
function(x) data.frame(text=text(x), favorited=favorited(x))))
text
1 @statalgo how does that actually work? does it share mem between #rstats and postgresql?
2 @jaredlander Have you looked at PL/R? You can call #rstats from PostgreSQL: http://www.joeconway.com/plr/.
3 @CMastication I was hoping for a cool way to keep data in a DB and run the normal #rstats off that. Maybe a translator from R to SQL code.
4 The distribution of online data usage: AT&T has recently announced it will no longer http://goo.gl/fb/eTywd #rstat
5 @jaredlander not that I know of. Closest is sqldf package which allows #rstats and sqlite to share mem so transferring from DB to df is fast
6 @CMastication Can #rstats run on data in a DB?Not loading it in2 a dataframe or running SQL cmds but treating the DB as if it wr a dataframe
favorited
1 FALSE
2 FALSE
3 FALSE
4 FALSE
5 FALSE
6 FALSE
If you plan to do this often, you can turn it into a function.
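One way such a function might look is sketched below. It is an illustration under assumptions, not part of twitteR: the helper name `s4_list_to_df` and the `mock_status` class are made up, and slot values are read with `slot()` rather than twitteR's accessor functions. Empty slots such as `chr(0)` become `NA`, so every object yields exactly one row.

```r
library(methods)

# Hypothetical class standing in for twitteR's status objects.
setClass("mock_status",
         representation(text = "character",
                        favorited = "logical",
                        replyToSN = "character"))

# Convert a list of S4 objects of the same class into a data frame,
# one row per object and one column per slot.
s4_list_to_df <- function(objs) {
  fields <- slotNames(objs[[1]])
  rows <- lapply(objs, function(obj) {
    # [1] on an empty atomic vector yields an NA of the same type,
    # so zero-length slots still contribute one value per row.
    values <- lapply(fields, function(f) slot(obj, f)[1])
    names(values) <- fields
    as.data.frame(values, stringsAsFactors = FALSE)
  })
  do.call("rbind", rows)
}

tweets <- list(
  new("mock_status", text = "puppy one", favorited = FALSE,
      replyToSN = character(0)),
  new("mock_status", text = "puppy two", favorited = TRUE,
      replyToSN = "puppy_ads")
)

df <- s4_list_to_df(tweets)
str(df)
```

The result is an ordinary data frame with one column per slot, which is the kind of object `sqlSave` asks for in the error message from the question.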