Posts by Hol*_*rel

Accessing the Github API with a personal access token using Python urllib2

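I am accessing the Github API v3. It worked fine until I hit the rate limit, so I created a personal access token from the Github settings page. I am trying to use the token with urllib2 and the following code: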

from urllib2 import urlopen, Request

url = "https://api.github.com/users/vhf/repos"
token = "my_personal_access_token"
headers = {'Authorization:': 'token %s' % token}
#headers = {}

request = Request(url, headers=headers)
response = urlopen(request)
print(response.read())

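If I uncomment the commented line, this code works fine (until I hit the rate limit of 60 requests per hour). But when I run the code as is, I get urllib2.HTTPError: HTTP Error 401: Unauthorized

What am I doing wrong?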
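One detail that stands out in the snippet above is the header name: it is written as 'Authorization:' with a trailing colon, so the request does not carry a header literally named Authorization. A minimal sketch of the same request with the plain header name follows. It assumes Python 3, where urllib2 has become urllib.request; build_request and fetch_repos are hypothetical helper names, and whether the colon is the only cause of the 401 has not been verified here.

```python
from urllib.request import Request, urlopen


def build_request(url, token):
    """Build a request whose header name is 'Authorization' (no trailing colon)."""
    headers = {"Authorization": "token %s" % token}
    return Request(url, headers=headers)


def fetch_repos(url, token):
    """Send the request and return the raw response body (needs network access)."""
    with urlopen(build_request(url, token)) as response:
        return response.read()
```

Calling fetch_repos("https://api.github.com/users/vhf/repos", token) with a real token would send the authenticated request; build_request alone can be inspected offline with request.get_header("Authorization").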

python api authorization github urllib2

Hol*_*rel

lucky-day

15
votes

2
answers

9019
views

Preserving &nbsp; entities with BeautifulSoup

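I want to scrape a table from the web and keep the &nbsp; entities intact so that I can republish it as HTML later. However, BeautifulSoup seems to convert them to spaces. Example: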

from bs4 import BeautifulSoup

html = "<html><body><table><tr>"
html += "<td>&nbsp;hello&nbsp;</td>"
html += "</tr></table></body></html>"

# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, "html.parser")
table = soup.find_all('table')[0]
row = table.find_all('tr')[0]
cell = row.find_all('td')[0]

print(cell)

Observed result:

<td> hello </td>

Desired result:

<td>&nbsp;hello&nbsp;</td>
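The observed output does not contain ordinary spaces: the parser decodes each &nbsp; into the non-breaking space character U+00A0. One possible workaround, sketched here with the standard library only, is to map such characters back to named entities after serializing. It assumes Python 3 (html.entities), and restore_entities is a hypothetical helper, not part of BeautifulSoup. BeautifulSoup's documentation also describes an output formatter (formatter="html") meant for this purpose, which is worth checking against the installed version.

```python
from html.entities import codepoint2name


def restore_entities(markup):
    """Replace non-ASCII characters that have a named HTML entity with that entity."""
    pieces = []
    for ch in markup:
        name = codepoint2name.get(ord(ch))
        # Only touch non-ASCII characters, so <, >, & and quotes in the markup stay as they are.
        if name is not None and ord(ch) > 127:
            pieces.append("&%s;" % name)
        else:
            pieces.append(ch)
    return "".join(pieces)


# Stand-in for str(cell): the "spaces" are really U+00A0 characters.
serialized = "<td>\xa0hello\xa0</td>"
print(restore_entities(serialized))  # <td>&nbsp;hello&nbsp;</td>
```

Note that this converts every non-ASCII character that has a named entity (for example é becomes &eacute;), not only non-breaking spaces.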

python beautifulsoup html-parsing html-entities web-scraping

Hol*_*rel

2016-12-20

10
votes

1
answer

2378
views

Cancelling a slow download in Python

I am downloading a file over HTTP and showing the progress using urllib and the following code, which works fine:

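import sys
from urllib import urlretrieve  # Python 2; on Python 3 this lives in urllib.request


def dlProgress(count, blockSize, totalSize):
    # Called by urlretrieve as blocks arrive; totalSize is -1 when the size is unknown.
    percent = int(count * blockSize * 100 / totalSize)
    sys.stdout.write("\r" + "progress" + "...%d%%" % percent)
    sys.stdout.flush()


# The hook has to be defined before it is passed to urlretrieve.
urlretrieve('http://example.com/file.zip', '/tmp/localfile', reporthook=dlProgress)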

Now I would also like to restart the download if it is too slow (say, less than 1 MB in 15 seconds). How can I do that?
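One way this could be approached is to let the report hook measure throughput and raise an exception when a time window passes with too little data, then retry from an outer loop. The sketch below is only that: TooSlow, SlowDownloadGuard and download_with_restart are hypothetical names, and the clock is injectable so the logic can be exercised without a network. Because the hook only runs when a block arrives, a connection that stalls completely would still need a socket timeout.

```python
import time


class TooSlow(Exception):
    """Raised by the hook when too little data arrived within one time window."""


class SlowDownloadGuard(object):
    """Report hook that aborts a transfer slower than min_bytes per window seconds."""

    def __init__(self, min_bytes=1024 * 1024, window=15.0, clock=time.time):
        self.min_bytes = min_bytes
        self.window = window
        self.clock = clock
        self.window_start = None
        self.bytes_at_start = 0

    def __call__(self, count, block_size, total_size):
        now = self.clock()
        received = count * block_size
        if self.window_start is None:
            # First call: start measuring from here.
            self.window_start = now
            self.bytes_at_start = received
            return
        if now - self.window_start >= self.window:
            if received - self.bytes_at_start < self.min_bytes:
                raise TooSlow("fewer than %d bytes in %.0f s" % (self.min_bytes, self.window))
            # Fast enough: open a new measurement window.
            self.window_start = now
            self.bytes_at_start = received


def download_with_restart(fetch, make_guard=SlowDownloadGuard, max_attempts=3):
    """Call fetch(hook) with a fresh guard, retrying when the guard raises TooSlow."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(make_guard())
        except TooSlow:
            if attempt == max_attempts:
                raise
```

With urllib this might be wired up as download_with_restart(lambda hook: urlretrieve(url, path, reporthook=hook)). Each retry starts the download from the beginning; resuming a partial file would need HTTP range requests, which this sketch does not attempt.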

python restart urllib download

Hol*_*rel

2012-08-23

8
votes

1
answer

1957
views

Breaking a long line of text into fixed-width lines in Python 2.7

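How can I break a long string at spaces where possible, insert a hyphen where that is not possible, and indent every line except the first?

So, for a working function breakup():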

splitme = "Hello this is a long string and it may contain an extremelylongwordlikethis bye!"
breakup(bigline=splitme, width=20, indent=4)

Output:

Hello this is a long
    string and it
    may contain an
    extremelylongwo-
    rdlikethis bye!
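The behaviour shown above can be sketched as a greedy loop: add words to the current line while they fit, start a new indented line when they do not, and hyphenate only when a word is too long for an otherwise empty line. This is one possible reading of the requirement, not a reference implementation; it assumes that the width includes the indent and that runs of whitespace may be collapsed.

```python
def breakup(bigline, width=20, indent=4):
    """Wrap bigline to width columns; lines after the first are indented."""
    if width - indent < 2:
        raise ValueError("width must exceed indent by at least 2")
    pad = " " * indent
    lines = []
    current = ""  # the first line starts without indentation
    for word in bigline.split():
        while word:
            has_text = bool(current.strip())
            candidate = current + " " + word if has_text else current + word
            if len(candidate) <= width:
                current = candidate
                word = ""
            elif has_text:
                # The word does not fit after the existing text: move to a new line.
                lines.append(current)
                current = pad
            else:
                # The word is too long even for an empty line: split it with a hyphen.
                room = width - len(current) - 1
                lines.append(current + word[:room] + "-")
                word = word[room:]
                current = pad
    if current.strip():
        lines.append(current)
    return "\n".join(lines)


splitme = "Hello this is a long string and it may contain an extremelylongwordlikethis bye!"
print(breakup(bigline=splitme, width=20, indent=4))
```

The hyphen is inserted wherever the line runs out of room, with no regard for syllable boundaries, which matches the sample output (extremelylongwo- / rdlikethis).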

python string format text python-2.7

Hol*_*rel

2013-05-01

2
votes

1
answer

2073
views

Python regexp for capturing numbers and dashes separated by spaces and commas

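I need to capture tokens such as 11, 12-, -13 and 14-15.

I want to reject any string containing invalid tokens not listed above, such as 12-- and 4-5-6. The tokens can be separated by any amount of whitespace, which may or may not contain a single comma. So for the string: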

43,5 67- -66,53-53 , 6

I want to get back:

('43', '5', '67-', '-66', '53-53', '6')
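To pin the requirement down, the intended behaviour can be written as two steps: first check that the whole string is a valid sequence of tokens and separators, then pull the tokens out with a separate findall pass. The sketch below assumes Python 3.4+ (re.fullmatch), numbers of one to four digits, and separators that are never empty; parse_tokens is a hypothetical name.

```python
import re

NUM = r'\d{1,4}'
# Order matters: try "-n" and "n-n" before the shorter "n-" and "n".
TOKEN = r'-{0}|{0}-{0}|{0}-|{0}'.format(NUM)
# A separator is never empty: a comma with optional whitespace, or whitespace alone.
SEP = r'\s*,\s*|\s+'
VALID = re.compile(r'\s*(?:{0})(?:(?:{1})(?:{0}))*\s*'.format(TOKEN, SEP))


def parse_tokens(text):
    """Return the tokens as a tuple, or None if the string is not valid."""
    if VALID.fullmatch(text) is None:
        return None
    return tuple(re.findall(TOKEN, text))


print(parse_tokens("43,5 67-  -66,53-53 , 6"))
```

Splitting validation from extraction sidesteps the fact that a repeated capturing group only keeps its last match. Requiring a non-empty separator is what rejects 4-5-6, which could otherwise be read as 4-5 followed directly by -6.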

Here is what I tried:

import re

num = r'\d{1,4}'
token = r'(?:-%s)|(?:%s-%s)|(?:%s-)|(?:%s)' % (num, num, num, num, num)
sep = r'\s*,?\s*'
valid = r'(%s)(?:%s(%s))*' % (token, sep, token)

test = re.compile(valid)
m = test.match("43,5 67-  -66,53-53 , 6")
print(m.groups())

But it only prints the first and the last number: