Han*_*ana 3 python unicode twitter tweepy python-2.7
When I retrieve Twitter data for a specific Arabic keyword, like this:
# -*- coding: utf-8 -*-
# imports
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener

# setting up the keys
consumer_key = '………….'
consumer_secret = '…………….'
access_token = '…………..'
access_secret = '……...'

class TweetListener(StreamListener):
    # A listener handles tweets as they are received from the stream.
    # This is a basic listener that just prints received tweets to standard output.
    def on_data(self, data):
        print(data)
        return True

    def on_error(self, status):
        print(status)

# printing all the tweets to the standard output
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
stream = Stream(auth, TweetListener())
stream.filter(track=['سوريا'])
I get this error message:
Traceback (most recent call last):
  File "/Users/Mona/Desktop/twitter.py", line 29, in <module>
    stream.filter(track=['سوريا'])
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tweepy/streaming.py", line 303, in filter
    encoded_track = [s.encode(encoding) for s in track]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd8 in position 0: ordinal not in range(128)
Any help would be appreciated!
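I looked at the tweepy source code and found the line in Stream that seems to cause the problem. It is in the filter method. When your code calls stream.filter(track=['سوريا']), Stream calls s.encode('utf-8') with s = 'سوريا' (if you look at the source of filter, you will see that utf-8 is the default encoding). That call is where the exception is raised. In Python 2, a plain '...' literal is a byte string, and calling encode() on a byte string first decodes it implicitly with the ASCII codec. The first byte of the UTF-8 data, 0xd8, is outside the ASCII range, hence the UnicodeDecodeError.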
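The implicit decode step can be made visible with a short sketch. It spells out by hand what Python 2 does behind the scenes for a byte string, so it also runs on Python 3. The helper name py2_style_encode is hypothetical and exists only for illustration; it is not part of tweepy.

```python
# -*- coding: utf-8 -*-
# Sketch: emulate what Python 2 does when .encode() is called on a byte string.

keyword = u'سوريا'                  # the keyword as text (unicode)
raw = keyword.encode('utf-8')       # the bytes a plain '...' literal holds in Python 2


def py2_style_encode(byte_string, encoding='utf-8'):
    """Hypothetical helper: implicit ASCII decode first, then the requested encode."""
    return byte_string.decode('ascii').encode(encoding)


try:
    py2_style_encode(raw)
    error_message = None
except UnicodeDecodeError as exc:
    # The ASCII decode fails on the very first byte (0xd8).
    error_message = str(exc)

print(error_message)

# Starting from text, no implicit decode is needed, so encoding succeeds.
encoded_ok = keyword.encode('utf-8')
```

The failure comes from the decode step, not from the encode that was actually requested, which is why the error names the 'ascii' codec even though the code only mentions utf-8.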
To fix this, pass a Unicode string (note the u prefix):
t = u"سوريا"
stream.filter(track=[t])
(For clarity, I just put your string into the variable t.)
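If the keywords come from a file or from user input rather than from a literal, the same idea can be applied before calling filter. This is only a sketch: the helper names to_text and normalize_track are hypothetical, and it assumes any byte strings are UTF-8 encoded.

```python
# -*- coding: utf-8 -*-
# Sketch: make sure every track term is text before it reaches stream.filter().


def to_text(term, encoding='utf-8'):
    """Hypothetical helper: decode byte strings, pass text through unchanged."""
    if isinstance(term, bytes):     # on Python 2, bytes is the plain str type
        return term.decode(encoding)
    return term


def normalize_track(terms, encoding='utf-8'):
    """Hypothetical helper: apply to_text to every keyword in the list."""
    return [to_text(term, encoding) for term in terms]


# One keyword given as UTF-8 bytes, one already given as text.
terms = normalize_track([u'سوريا'.encode('utf-8'), u'سوريا'])

# stream.filter(track=terms)  # with the stream object from the question
```

Decoding once at the boundary means the rest of the code only deals with Unicode strings, so the library's own encode call no longer triggers the implicit ASCII decode.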