Cloud Natural Language API返回socket.gaierror:nodename或servname在不时地执行Sentiment Analysis之后提供

imp*_*rme 7 python-3.x python-requests google-cloud-platform

我在Jupyter笔记本上运行代码,我修改了这个链接的代码,所以它从Jupyter笔记本而不是控制台获取它,并迭代一个文件列表.

"""Demonstrates how to make a simple call to the Natural Language API."""

import argparse
import requests
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def print_result(annotations, movie_review_filename):


    score = annotations.document_sentiment.score
    magnitude = annotations.document_sentiment.magnitude


    file_path_split = movie_review_filename.split("/")
    fileName = file_path_split[len(file_path_split) - 1][:-4]

    sentencelist = []  
    statuslist = []

    for index, sentence in enumerate(annotations.sentences):
        sentence_sentiment = sentence.sentiment.score
        singlesentence = [fileName, sentence.text.content, sentence.sentiment.magnitude, sentence_sentiment]
        sentencelist.append(singlesentence)


    outputdf = pd.DataFrame(sentencelist, columns = ['status_id', 'sentence', 'sentence_magnitude', 'sentence_sentiment'])        

    outputdf.to_csv("/Users/abhi/Desktop/RetrySentenceCSVs/" + fileName + ".csv", index = False)

    return 0


def analyze(movie_review_filename):
    """Run a sentiment analysis request on text within a passed filename."""
    client = language.LanguageServiceClient()

    with open(movie_review_filename, 'r') as review_file:
        # Instantiates a plain text document.
        content = review_file.read()

    document = types.Document(
        content=content,
        type=enums.Document.Type.PLAIN_TEXT)
    annotations = client.analyze_sentiment(document=document)

    # Print the results
    print_result(annotations, movie_review_filename)


if __name__ == '__main__':

    import glob
    csv_file_list = glob.glob("/Users/abhi/Desktop/mytxtfilepath/*.txt")
    for file in csv_file_list: #Iterate through a list of file paths

        analyze(file)
Run Code Online (Sandbox Code Playgroud)

代码运行正常,10%的文本文件(我有687),但一段时间后它开始抛出错误:

ERROR:root:AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x113b76588>" raised exception!
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/anaconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
    conn.connect()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 314, in connect
    conn = self._new_conn()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/anaconda3/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/requests.py", line 120, in __call__
    **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 513, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/grpc/_plugin_wrapping.py", line 77, in __call__
    callback_state, callback))
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/grpc.py", line 77, in __call__
    callback(self._get_authorization_headers(context), None)
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/grpc.py", line 65, in _get_authorization_headers
    headers)
  File "/anaconda3/lib/python3.6/site-packages/google/auth/credentials.py", line 122, in before_request
    self.refresh(request)
  File "/anaconda3/lib/python3.6/site-packages/google/oauth2/service_account.py", line 322, in refresh
    request, self._token_uri, assertion)
  File "/anaconda3/lib/python3.6/site-packages/google/oauth2/_client.py", line 145, in jwt_grant
    response_data = _token_endpoint_request(request, token_uri, body)
  File "/anaconda3/lib/python3.6/site-packages/google/oauth2/_client.py", line 106, in _token_endpoint_request
    method='POST', url=token_uri, headers=headers, body=body)
  File "/anaconda3/lib/python3.6/site-packages/google/auth/transport/requests.py", line 124, in __call__
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.TransportError: HTTPSConnectionPool(host='accounts.google.com', port=443): Max retries exceeded with url: /o/oauth2/token (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x113b840b8>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))
ERROR:root:AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x113b76588>" raised exception!
Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 171, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 56, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/anaconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
    conn.connect()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 314, in connect
    conn = self._new_conn()
  File "/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 180, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x113b84470>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:
...
Run Code Online (Sandbox Code Playgroud)

错误重复,然后在文件上运行SentimentAnalysis,然后多次显示,然后在文件上运行SentimentAnalysis然后最终停止并RendezVous出现错误(忘记捕获此消息)我想知道的是,怎么样它代码对某些文件集工作了一段时间并抛出错误消息,工作多一点,抛出错误消息然后在一点之后完全停止工作?

我重新编写代码,但发现它在文件夹中的一些随机数量的文件后返回socket.gaierror.因此,人们可以放心地看到文件内容不是问题.

EDIT1:文件只是.txt包含文字的任何文件.有人可以帮我解决这个问题吗?我还可以向您保证,我在所有680个文件中的所有文本总共有1400个请求,我根据Cloud Natural API根据请求的定义进行了非常细致的计算.所以我在我的极限之内.

编辑2:我已经尝试sleep(10)了一段时间似乎工作正常,但再次开始抛出错误..

imp*_*rme 3

我想到了。您不必一次读取全部 600 个文件,而是尝试分批读取 50 个文件。(创建 12 个文件夹,每个文件夹包含 50 个文件),并在每次扫描完文件夹后手动运行代码。我不确定为什么这看起来有效,但它确实有效。