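I am trying to deploy Firebase Cloud Functions but keep getting this error. The strangest part is that I had it working, and the error appeared as soon as I switched from having Firebase talk to Cloud Vision to having it talk to Google's document service. I have tried several different versions of firebase-tools and Node.js, but have not been able to resolve the problem. The error is below.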
[2023-02-28T17:29:15.917Z] Building nodejs source
[2023-02-28T17:29:15.922Z] Could not find functions.yaml. Must use http discovery
[2023-02-28T17:29:15.935Z] Found firebase-functions binary at 'C:\Users\crisb\source\repos\Javascriptcouldfunction4\functions\node_modules\.bin\firebase-functions'
[2023-02-28T17:29:17.570Z] Serving at port 8704
[2023-02-28T17:29:19.519Z] Got response from /__/functions.yaml Failed to generate manifest from function source: TypeError [ERR_INVALID_ARG_TYPE]: The "id" argument must be of type string. Received an instance of Object
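[2023-02-28T17:29:19.522Z] Failed to parse functions.yamlincomplete explicit mapping pair; a key node is missed; or followed by a …

I am using the Document OCR API to extract text from a PDF file, but part of the output is inaccurate. I suspect the cause is the presence of some Chinese characters.
Below is an example I made up: I cropped part of the region where the text was extracted incorrectly and added some Chinese characters to reproduce the problem.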

When I use the website version, I do not get the Chinese characters, but the remaining characters are correct.

When I extract the text with Python, I get the Chinese characters correctly, but some of the remaining characters are wrong.

The actual string I got.

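Do the website and the API use different versions of Document AI? How can I get all the characters correctly?
Update:
When I print detected_languages after printing the text, I get the following output. (I do not know why, but with lines = page.lines, detected_languages is an empty list for both lines, so I first had to switch to page.blocks or page.paragraphs.)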

Code:
from google.cloud import documentai_v1beta3 as documentai
project_id= 'secret-medium-xxxxxx'
location = 'us' # Format is 'us' or 'eu'
processor_id = 'abcdefg123456' # Create processor in Cloud Console
opts = {}
if location == "eu":
    opts = {"api_endpoint": "eu-documentai.googleapis.com"}
client = documentai.DocumentProcessorServiceClient(client_options=opts)
def get_text(doc_element: dict, document: dict):
    """
    Document AI …

python ocr google-api-python-client google-cloud-platform cloud-document-ai
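The `get_text` helper above is cut off in the source. As a rough illustration of the idea it appears to implement, the sketch below resolves a text anchor against the document's full text. It assumes the document has been exported to plain JSON, where an anchor looks like `{"textSegments": [{"startIndex": "0", "endIndex": "6"}]}` and `startIndex` is omitted when it is zero; the function name `text_for_anchor` is hypothetical and not part of any library.

```python
def text_for_anchor(full_text: str, text_anchor: dict) -> str:
    """Concatenate the slices of full_text that a text anchor points at.

    Assumes the JSON form of a Document AI text anchor (camelCase keys,
    indices serialized as strings, startIndex omitted when it is 0).
    """
    parts = []
    for segment in text_anchor.get("textSegments", []):
        start = int(segment.get("startIndex", 0))  # a missing start means offset 0
        end = int(segment["endIndex"])
        parts.append(full_text[start:end])
    return "".join(parts)


# Illustrative data only, not taken from the question.
sample_text = "Hello\nWorld\n"
first_line = text_for_anchor(sample_text, {"textSegments": [{"endIndex": "6"}]})
second_line = text_for_anchor(
    sample_text, {"textSegments": [{"startIndex": "6", "endIndex": "12"}]}
)
```

Printing the resolved text next to each element's detected languages makes it easier to see which spans were mis-recognized.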

I wasted several hours trying the Google Document AI Java sample from https://cloud.google.com/document-ai/docs/quickstart-client-libraries
If you enter your project ID, location, and processor ID like this
String projectId = "6493xxxxxxxx";
String location = "eu";
String processorId = "74451xxxxxx";
String filePath = "/Users/schube/Desktop/file.pdf";
and run the sample, all you get is an InvalidArgumentException:
Exception in thread "main" com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Request contains an invalid argument.
at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:49)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1133)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1277)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:463)
at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:427)
at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:460)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:557)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:69)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:738)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:717)
at …

I am training a model using Google's Document AI. Training fails with the following error (for simplicity I include only part of the JSON file, but the error is the same for every document in the dataset):
"trainingDatasetValidation": {
  "documentErrors": [
    {
      "code": 3,
      "message": "Invalid document.",
      "details": [
        {
          "@type": "type.googleapis.com/google.rpc.ErrorInfo",
          "reason": "INVALID_DOCUMENT",
          "domain": "documentai.googleapis.com",
          "metadata": {
            "num_fields": "0",
            "num_fields_needed": "1",
            "document": "5e88c5e4cc05ddb8.json",
            "annotation_name": "INCOME_ADJUSTMENTS",
            "field_name": "entities.text_anchor.text_segments"
          }
        }
      ]
    }
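What I understand from this error is that the model expects the INCOME_ADJUSTMENTS field to appear at least once in the document, but it found zero instances of it.
That would be understandable, except that I have already defined the INCOME_ADJUSTMENTS field as "Optional Once" in the schema, meaning the field may appear zero times or once.
Am I missing something? Why does the error persist even though the schema already allows for this?
P.S. I also tried "Optional Multiple" (as well as "Required Once" and "Required Multiple"), but the error persists.
Edit: As requested, here is what one of the JSON files looks like. Note that there is no PII here, since the details (names, SSNs, etc.) are synthetic data.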
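One possible reading of the error metadata, which the source does not confirm, is that `field_name` points at `entities.text_anchor.text_segments`. That would mean an INCOME_ADJUSTMENTS annotation exists but has no text segments, rather than the field being absent from the document. The sketch below scans a labeled document for that condition. It assumes the file uses the JSON form of a Document, with an `entities` list whose items have `type` and `textAnchor.textSegments` keys; the function name is hypothetical.

```python
def entities_without_segments(document: dict, entity_type: str) -> list:
    """Return the indices of entities of the given type that have no text segments."""
    missing = []
    for index, entity in enumerate(document.get("entities", [])):
        if entity.get("type") != entity_type:
            continue
        segments = entity.get("textAnchor", {}).get("textSegments", [])
        if not segments:  # covers both a missing anchor and an empty segment list
            missing.append(index)
    return missing


# Illustrative document, not the asker's data.
labeled = {
    "text": "Adjustments 100",
    "entities": [
        {"type": "INCOME_ADJUSTMENTS", "textAnchor": {"textSegments": [{"endIndex": "11"}]}},
        {"type": "INCOME_ADJUSTMENTS", "textAnchor": {}},
        {"type": "OTHER_FIELD"},
    ],
}
flagged = entities_without_segments(labeled, "INCOME_ADJUSTMENTS")
```

If a check like this flags entries in every training file, the labels themselves may be worth inspecting before changing the schema's occurrence settings.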

I am trying to run Google's Document OCR from a Node.js application, so I used the Node.js client library @google-cloud/documentai.
I did everything as in the documentation sample.
Here is my code:
const projectId = '*******';
const location = 'eu'; // Format is 'us' or 'eu'
const processor = '******'; // Create processor in Cloud Console
const keyFilename = './secret/******.json';
const { DocumentProcessorServiceClient } = require('@google-cloud/documentai').v1beta3;
const client = new DocumentProcessorServiceClient({projectId, keyFilename});
async function start(encodedImage) {
  console.log("Google AI Started")
  const name = `projects/${projectId}/locations/${location}/processors/${processor}`;
  const request = {
    name,
    document: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  }
  try {
    const …

node.js google-cloud-platform google-ai-platform cloud-document-ai
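The code above sets location to 'eu' but constructs the client without a regional endpoint, whereas the Python snippet earlier on this page passes `eu-documentai.googleapis.com` as `api_endpoint` when the location is "eu". Whether that mismatch is the cause here is an assumption, since the error is not shown. The sketch below makes the pairing of location and endpoint explicit; both function names are hypothetical, and the host pattern for locations other than "us" and "eu" is extrapolated from that snippet.

```python
def documentai_endpoint(location: str) -> str:
    """Return the API host for a processor location.

    Assumption: "us" uses the default host and any other location uses
    "<location>-documentai.googleapis.com", following the "eu" example.
    """
    if location == "us":
        return "documentai.googleapis.com"
    return f"{location}-documentai.googleapis.com"


def processor_name(project_id: str, location: str, processor_id: str) -> str:
    """Build the processor resource name in the format used by the code above."""
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"


# Placeholder identifiers for illustration.
endpoint = documentai_endpoint("eu")
name = processor_name("my-project", "eu", "abc123")
```

In the Python client the host goes into `client_options={"api_endpoint": endpoint}`, as in the earlier snippet. The Node.js client takes a corresponding `apiEndpoint` option in its constructor.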
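
I am following the tutorial at https://codelabs.developers.google.com/codelabs/docai-form-parser-v3-python#7 and I followed all the steps they specify...
I developed with the Cloud SDK as specified in the tutorial, but then
The code they give is as follows: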
project_id= 'YOUR_PROJECT_ID'
location = 'YOUR_PROJECT_LOCATION' # Format is 'us' or 'eu'
processor_id = 'YOUR_PROCESSOR_ID' # Create processor in Cloud Console
file_path = 'form.pdf' # The local file in your current working directory
from google.cloud import documentai_v1beta3 as documentai
from google.cloud import storage
def process_document(
    project_id=project_id, location=location, processor_id=processor_id, file_path=file_path
):
    # Instantiates a client
    client = documentai.DocumentProcessorServiceClient()
    # The full resource name of the processor, e.g.:
    # projects/project-id/locations/location/processor/processor-id
    # You must create new processors …

I am using the invoice parser from Google Cloud Document AI. The API response is a google.cloud.documentai_v1.types.Document object. I tried writing the following methods to convert it to JSON, but none of them worked:
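For reference, the message classes in Google's Python clients are built on proto-plus, which exposes `to_json` and `to_dict` on the message class rather than on the instance. The helper below is a minimal sketch of that call pattern. `document_to_dict` is a hypothetical name, and because the asker's own attempts are not shown in the source, this may overlap with what was already tried.

```python
import json


def document_to_dict(document):
    """Serialize a proto-plus message to JSON text, then parse it into a plain dict.

    Relies on the message class providing to_json(instance), as proto-plus
    message classes do; nothing here is specific to Document AI.
    """
    message_class = type(document)
    json_text = message_class.to_json(document)
    return json.loads(json_text)
```

With a real response this would be called as `document_to_dict(result.document)`, while `documentai.Document.to_json(result.document)` returns the JSON string directly.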

I get this error when trying to implement Document OCR from Google Cloud in Python, as described here: https://cloud.google.com/document-ai/docs/ocr
When I run
result = client.process_document(request=request)
I get this error
Traceback (most recent call last):
File "/Users/Niolo/Desktop/untitled/Desktop/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 73, in error_remapped_callable
return callable_(*args, **kwargs)
File "/Users/Niolo/Desktop/untitled/Desktop/lib/python3.8/site-packages/grpc/_channel.py", line 923, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/Users/Niolo/Desktop/untitled/Desktop/lib/python3.8/site-packages/grpc/_channel.py", line 826, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Request contains an invalid argument."
debug_error_string = "{"created":"@1614769280.332675000","description":"Error received from peer ipv4:142.250.180.138:443","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Request contains an invalid argument.","grpc_status":3}"
>
The above exception was the …