AttributeError: 'HTTPResponse' object has no attribute 'split'

Question

I am trying to get some information from Google Finance, but I get this error:

AttributeError: 'HTTPResponse' object has no attribute 'split'

Here is my Python code:

import urllib.request
import urllib
from bs4 import BeautifulSoup

symbolsfile = open("Stocklist.txt")

symbolslist = symbolsfile.read()

thesymbolslist = symbolslist.split("\n")

i=0


while i<len (thesymbolslist):
    theurl = "http://www.google.com/finance/getprices?q=" + thesymbolslist[i] + "&i=10&p=25m&f=c"
    thepage = urllib.request.urlopen (theurl)
    print(thesymbolslist[i] + " price is " + thepage.split()[len(thepage.split())-1])
    i= i+1


Answer 1

Aks*_*jan 5

Cause of the problem

This happens because urllib.request.urlopen(theurl) returns an object representing the connection (an HTTPResponse), not a string.
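The difference can be checked without touching the network by looking at which attributes each type exposes. This is only an illustrative sketch: it inspects the classes themselves rather than a live response.

```python
import http.client

# urlopen() returns an http.client.HTTPResponse for http:// and https:// URLs.
# The class offers read(), but no split(), which is why the original call fails.
print(hasattr(http.client.HTTPResponse, "read"))   # True
print(hasattr(http.client.HTTPResponse, "split"))  # False

# The bytes returned by read(), and the str obtained by decoding them,
# both provide split().
print(hasattr(bytes, "split"))  # True
print(hasattr(str, "split"))    # True
```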

Solution

To read the data from this connection, you need to call read() on it:

thepage = urllib.request.urlopen(theurl).read()


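The rest of the code should then follow naturally, once the bytes returned by read() have been decoded into a string (see the appendix to this solution).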
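Putting this together, the loop from the question might be rewritten roughly as below. This is a sketch under assumptions, not a tested fix: `last_token` and `print_prices` are hypothetical helper names, the body is assumed to be UTF-8 text, and the URL pattern is copied unchanged from the question (the service may no longer answer at that address).

```python
import urllib.request


def last_token(raw, encoding="utf-8"):
    """Decode the response body and return its last whitespace-separated token."""
    # Raises IndexError if the body is empty or contains only whitespace.
    return raw.decode(encoding).split()[-1]


def print_prices(symbols_path="Stocklist.txt"):
    """Print the last value returned for each symbol listed in the file."""
    with open(symbols_path) as symbols_file:
        symbols = symbols_file.read().split()

    for symbol in symbols:
        # Same URL pattern as in the question.
        url = ("http://www.google.com/finance/getprices?q="
               + symbol + "&i=10&p=25m&f=c")
        with urllib.request.urlopen(url) as response:
            raw = response.read()  # bytes, not str
        print(symbol + " price is " + last_token(raw))
```

Keeping the parsing in its own function means it can be checked on a literal such as `b"a\n1.5\n2.75\n"` without making any request.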

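Appendix to the solution

In Python 3, read() returns a bytestring (bytes) rather than a regular string, because the response arrives as raw bytes and Python does not know which character encoding they use.

The correct way to solve this is to find the right character encoding and use it to decode the bytestring into a regular string, as follows: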

thepage = urllib.request.urlopen(theurl)
# read the character encoding declared in the `Content-Type` response header
charset_encoding = thepage.info().get_content_charset()
# decode the raw bytes using that encoding
thepage = thepage.read().decode(charset_encoding)
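One caveat: get_content_charset() returns None when the server does not declare a charset, and decode(None) then raises a TypeError. A minimal sketch of a fallback is shown here; `decode_body` and `fetch_text` are hypothetical helper names, and defaulting to UTF-8 is an assumption rather than something the server guarantees.

```python
import urllib.request


def decode_body(raw, charset, fallback="utf-8"):
    """Decode raw bytes with the declared charset, or with the fallback if none was declared."""
    return raw.decode(charset or fallback)


def fetch_text(url, fallback="utf-8"):
    """Fetch a URL and return its body as str."""
    with urllib.request.urlopen(url) as response:
        # None if the Content-Type header carries no charset parameter.
        charset = response.info().get_content_charset()
        return decode_body(response.read(), charset, fallback)
```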


Sometimes it is safe to assume that the character encoding is utf-8, in which case

thepage = urllib.request.urlopen(theurl).read().decode('utf-8')


works more often than not. If nothing else, it is a statistically good guess.
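When the UTF-8 guess is wrong, decode() raises UnicodeDecodeError. If a slightly damaged string is acceptable, one option is to let Python substitute the replacement character for the bytes it cannot decode. `decode_utf8_lenient` is a hypothetical name used only for this sketch.

```python
def decode_utf8_lenient(raw):
    """Decode as UTF-8; undecodable bytes become U+FFFD instead of raising UnicodeDecodeError."""
    return raw.decode("utf-8", errors="replace")
```

This hides encoding problems rather than fixing them, so it suits quick scripts better than code where every character matters.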
