Posts by Jor*_*les

Formatting OCR text annotations from the Cloud Vision API in Python

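I'm using the Google Cloud Vision API for Python in a small program I'm working on. The function works and I get OCR results, but I need to format those results before I can use them.

Here is the function: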

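# Imports assumed from the older google-cloud-vision client, where `types` is importable
from google.cloud import vision
from google.cloud.vision import types

# Call to OCR API
def detect_text_uri(uri):
    """Detects text in the file located in Google Cloud Storage or on the Web.
    """
    client = vision.ImageAnnotatorClient()
    image = types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations

    for text in texts:
        textdescription = "    " + text.description
        # returns during the first iteration, so only texts[0] is ever used
        return textdescription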


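Specifically, I need to split the text line by line and add four spaces at the start and a newline at the end of each line. At the moment this only works for the first line; the rest comes back as a single blob.

I have been going through the official documentation, but I haven't really understood the format of the API response.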
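One way the per-line formatting could look is sketched below. It assumes, as the Vision API documentation describes, that the first entry of `text_annotations` holds the whole detected text with lines separated by newlines, while the later entries are individual words. The helper name `format_ocr_text` and the sample string are hypothetical; in the function above the input would be `texts[0].description`.

```python
def format_ocr_text(full_text, indent="    "):
    """Indent every non-empty line and terminate each one with a newline."""
    formatted_lines = []
    for line in full_text.splitlines():
        stripped = line.strip()
        if stripped:  # skip blank lines in the OCR output
            formatted_lines.append(f"{indent}{stripped}\n")
    return "".join(formatted_lines)


# Hypothetical sample standing in for texts[0].description
sample = "First line\nSecond line\n"
print(format_ocr_text(sample), end="")
```

Working on the single full-text entry avoids looping over the word-level annotations, which carry no line breaks of their own.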

python google-cloud-platform google-cloud-vision

Jor*_*les

lucky-day

4
votes

1
answer

1553
views

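Getting a regex in Python to return the matched pattern (not the match object)

I have a list of emails mixed in with other content across several lines of text. I want my code to return only the matched pattern. The function is as follows: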

import re

def match_separator(s):
    mail = s.lower()
    mail = re.match(r"[^@]+@[a-z0-9]+(\.[a-z0-9]+){1,2}",s)
    print(mail)


It looks like it finds the emails correctly, but what it prints is a match object rather than the matched text, which is useless for my next steps.

I can't do anything with this output. Based on what I understood from similar documentation, I tried several things, such as print(mail.group(0)), but all I got was:

AttributeError: 'NoneType' object has no attribute 'group'


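Any ideas on how to do this? Getting the matched pattern out of a regex seems like it should be very simple (that's what most use cases are after, right?), but here I am.

EDIT: OK, thanks everyone, I was being dense. Here is the cause:

The first line passed to the function has no match, so re.match returned None and the program ended with an exception. This change solved my problem: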

import re

def match_separator(s):
    mail = s.lower()
    mail = re.match(r"[^@]+@[a-z0-9]+(\.[a-z0-9]+){1,2}",s)
    try:
        print(mail.group())
    except AttributeError:
        # mail is None when the line has no match; skip it
        pass


This skips the first line, which has no match, and returns only what I want.
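The same idea can be sketched without the try/except, assuming the goal is to pull the address out wherever it sits in the line. This version swaps re.match, which only matches at the start of the string, for re.search, and checks for None explicitly. The function name `extract_email`, the tightened pattern and the sample lines are hypothetical, not part of the original code.

```python
import re

# Hypothetical pattern: same shape as the one above, but the local part
# excludes whitespace so surrounding words are not swallowed, and
# IGNORECASE replaces the manual lower() call.
EMAIL_RE = re.compile(r"[^@\s]+@[a-z0-9]+(?:\.[a-z0-9]+){1,2}", re.IGNORECASE)


def extract_email(line):
    """Return the first email-like substring in line, or None if there is none."""
    match = EMAIL_RE.search(line)
    # search() returns None when nothing matches, so test before calling group()
    return match.group(0) if match is not None else None


lines = ["no address on this line", "contact: jane@example.com for details"]
for line in lines:
    email = extract_email(line)
    if email is not None:
        print(email)
```

Returning the string (or None) instead of printing inside the function leaves the caller free to decide what to do with lines that contain no address.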

python regex

Jor*_*les

2021-12-29

1
vote

1
answer

1159
views