Just posting this so I can search for it later, since it always seems to stump me:
$ python3.2
Python 3.2 (r32:88445, Oct 20 2012, 14:09:50)
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import curses
>>> print(curses.version)
b'2.2'
>>> print(str(curses.version))
b'2.2'
>>> print(curses.version.encode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'bytes' object has no attribute 'encode'
>>> print(str(curses.version).encode('utf-8'))
b"b'2.2'"
Question: how do I print a binary (bytes) string in Python 3 without the b' prefix?
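A minimal sketch of one common way to do this, assuming the bytes hold UTF-8 encoded text: decode them to str before printing. The variable names are illustrative, and the literal b'2.2' stands in for curses.version from the session above.

```python
# Stand-in for curses.version from the session above (an assumption for illustration).
version = b'2.2'

# str(version) returns the repr of the bytes object, which is where b'...' comes from.
as_repr = str(version)

# Decoding converts bytes to text; UTF-8 is assumed here.
as_text = version.decode('utf-8')

# str() with an explicit encoding decodes the same way.
also_text = str(version, 'utf-8')

print(as_text)
```

Note that bytes objects have no encode method, which explains the AttributeError in the session: encoding goes from str to bytes, and decoding goes from bytes to str.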
A bunch of tweets I'm importing have this problem; when read, they look like this:
b'I posted a new photo to Facebook'
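I gather the b indicates that it's a bytes object. But this is proving problematic, because in the CSV files I end up writing, the b doesn't go away and interferes with later code.
Is there a simple way to remove this b prefix from my lines of text?
Keep in mind that I seem to need the text encoded as utf-8, or tweepy has trouble pulling the tweets from the web.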
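As a rough sketch of how the prefix could be avoided in the CSV, assuming the tweet text arrives as UTF-8 bytes: decode each value to str before handing it to csv.writer. The helper to_text and the sample data are hypothetical, and an in-memory buffer is used in place of a real file.

```python
import csv
import io

# Hypothetical sample standing in for the encoded tweet text.
raw_tweets = [b'I posted a new photo to Facebook']


def to_text(value, encoding='utf-8'):
    """Return str unchanged; decode bytes using the given encoding (UTF-8 assumed)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# In-memory stand-in for the CSV file; newline='' leaves line endings to the csv module.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerows([[to_text(tweet)] for tweet in raw_tweets])

csv_output = buffer.getvalue()
print(csv_output)
```

Writing the bytes objects directly would make csv.writer store their str() form, b'...' wrapper included, which matches the symptom described above.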
Here is the content of the link I'm analyzing:
https://www.dropbox.com/s/sjmsbuhrghj7abt/new_tweets.txt?dl=0
new_tweets = 'content in the link'
outtweets = [[tweet.text.encode("utf-8").decode("utf-8")] for tweet in new_tweets]
print(outtweets)
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-21-6019064596bf> in <module>()
1 for screen_name in user_list:
----> 2 get_all_tweets(screen_name,"instance file")
<ipython-input-19-e473b4771186> in get_all_tweets(screen_name, mode)
99 with open(os.path.join(save_location,'%s.instance' % screen_name), 'w') as f:
100 writer = csv.writer(f)
--> 101 writer.writerows(outtweets)
102 else:
103 with open(os.path.join(save_location,'%s.csv' % screen_name), 'w') …
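The traceback is cut off before the error message, so the actual cause is not visible here. One possibility, and it is only an assumption, is that the file is opened with the platform's default encoding, which cannot represent some characters in the tweets. Under that assumption, a sketch of opening the file with an explicit encoding looks like this; the path and sample row are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical tweet text containing non-ASCII characters.
rows = [['caf\u00e9 \u2615 photo']]

# Temporary location standing in for save_location from the traceback.
path = os.path.join(tempfile.mkdtemp(), 'example.csv')

# An explicit encoding avoids depending on the platform default;
# newline='' is what the csv module documentation recommends for file objects.
with open(path, 'w', encoding='utf-8', newline='') as f:
    csv.writer(f).writerows(rows)

# Read the file back with the same encoding to confirm the text survived.
with open(path, encoding='utf-8', newline='') as f:
    read_back = list(csv.reader(f))

print(read_back)
```

With the file encoding handled by open(), the tweet.text.encode("utf-8").decode("utf-8") round trip is not needed, since it returns the original str unchanged.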