What causes the "urlopen error [Errno 13] Permission denied" error?

Red*_*ket 2 python beautifulsoup

I am trying to write a Python (version 2.7.5) CGI script on a CentOS 7 server. My script tries to download data from a LibriVox page such as https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/ and fails with this error:

<class 'urllib2.URLError'>: <urlopen error [Errno 13] Permission denied> 
      args = (error(13, 'Permission denied'),) 
      errno = None 
      filename = None 
      message = '' 
      reason = error(13, 'Permission denied') 
      strerror = None

I have shut down iptables, and I can run something like `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` without errors. Here is the code where the error occurs:

def output_html ( url, appname, doobb ):
        print "url is %s<br>" % url
        soup = BeautifulSoup(urllib2.urlopen( url ).read())

Update: thanks to Paul and alecxe, I updated my code:

def output_html ( url, appname, doobb ):
        #hdr = {'User-Agent':'Mozilla/5.0'}
        #print "url is %s<br>" % url
        #req = urllib2.Request(url, headers=hdr)
        # soup = BeautifulSoup(urllib2.urlopen( url ).read())
        headers = {'User-Agent':'Mozilla/5.0'}
        # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
        response = requests.get( url, headers=headers)

        soup = BeautifulSoup(response.content)

...and I get a slightly different error when

response = requests.get( url, headers=headers)

...is called:

<class 'requests.exceptions.ConnectionError'>: ('Connection aborted.', error(13, 'Permission denied')) 
      args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),) 
      errno = None 
      filename = None 
      message = ProtocolError('Connection aborted.', error(13, 'Permission denied')) 
      request = <PreparedRequest [GET]> 
      response = None 
      strerror = None

Interestingly, I wrote a command-line version of this script and it works fine. It looks like this:

def output_html ( url ):
        soup = BeautifulSoup(urllib2.urlopen( url ).read())

Very strange, don't you think?

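Update: this was flagged as "This question may already have an answer: urllib2.HTTPError: HTTP Error 403: Forbidden (2 answers)".

Those answers do not address this question.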

Red*_*ket 5

Finally figured it out...

# grep python /var/log/audit/audit.log | audit2allow -M mypol
# semodule -i mypol.pp
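
For anyone hitting the same symptom: the errno 13 in both tracebacks comes from the local system refusing the socket call, which is different from a remote server answering with HTTP 403. Below is a minimal sketch of how a script could detect that case. The helper name `has_errno` is hypothetical (it is not part of urllib2 or requests); it simply walks the `.reason` and `.args` of an exception looking for the errno, which matches the nesting shown in the two tracebacks in the question.

```python
import errno


def has_errno(exc, wanted=errno.EACCES, _depth=0):
    """Return True if exc, or an exception nested in it, carries errno == wanted.

    Hypothetical helper: urllib2.URLError keeps the socket error in .reason,
    while requests wraps it inside .args (ConnectionError -> ProtocolError -> error).
    """
    if _depth > 5:  # guard against unexpectedly deep or cyclic nesting
        return False
    if getattr(exc, "errno", None) == wanted:
        return True
    nested = [getattr(exc, "reason", None)] + list(getattr(exc, "args", ()))
    return any(
        isinstance(item, BaseException) and has_errno(item, wanted, _depth + 1)
        for item in nested
    )
```

In a CGI script this could be called inside an `except` block around `urllib2.urlopen(url)` or `requests.get(url)`; when it returns True, /var/log/audit/audit.log is a reasonable place to look next.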

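  • This put me on the right track and helped a lot. Thanks! SELinux on CentOS 7 was blocking Python calls to urllib/urllib2/requests when they came from a .py file, but not from the Python command line, and the error message was no help. It was driving me crazy. (2 upvotes)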
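
One way to see why the same code behaves differently as a CGI script and from a shell is to print the SELinux context the process runs in. The sketch below assumes a Linux system that exposes the context at /proc/self/attr/current; the function names are hypothetical, and the field layout (user:role:type:level) is an assumption about the usual context format. On a typical CentOS 7 setup the CGI process would be expected to report a confined web-server type while a login shell reports an unconfined one, but the exact values depend on the local policy.

```python
def current_selinux_context(path="/proc/self/attr/current"):
    """Return this process's SELinux context string, or None if unavailable."""
    try:
        with open(path) as handle:
            raw = handle.read()
    except (IOError, OSError):
        # File missing or unreadable, e.g. no SELinux support on this system
        return None
    context = raw.strip("\x00\n ")
    return context or None


def selinux_type(context):
    """Return the third field (the type/domain) of a user:role:type:level context."""
    parts = context.split(":")
    return parts[2] if len(parts) >= 3 else None
```

Printing `current_selinux_context()` from both the CGI script and the command-line version makes the difference between the two environments visible.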