tess4j with Spring MVC

1 ocr spring tesseract spring-mvc jai

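I tried tess4j as a standalone Java program, and it produced the text output correctly.

Now I am trying to create a Spring MVC web project. I added the tess4j dependency to the pom, and I also added the tess4j source to my project.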

File imageFile = new File("D:/Data/jars/tess/eurotext.tif");
Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
// Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
try {
    String result = instance.doOCR(imageFile);
    System.out.println(result);
} catch (TesseractException e) {
    System.err.println(e.getMessage());
}

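The code above works fine when I run it as a standalone Java program inside the project, so the jar files are clearly on the build path.

But when I call the same code from a controller mapping or a service, it throws a runtime exception.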

    SEVERE: Unsupported image format. May need to install JAI Image I/O package.
https://java.net/projects/jai-imageio/
java.lang.RuntimeException: Unsupported image format. May need to install JAI Image I/O package.
https://java.net/projects/jai-imageio/
    at net.sourceforge.vietocr.ImageIOHelper.getIIOImageList(ImageIOHelper.java:324)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:173)
    at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:158)
    at com.ocr.tesseract.TesseractExample.getTextFromImage(TesseractExample.java:27)
    at com.cogz.tp.controller.HomeController.view(HomeController.java:51)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:214)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:132)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:748)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:689)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:83)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:945)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:876)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:931)
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:822)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:807)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:108)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:409)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1044)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
java.lang.RuntimeException: Unsupported image format. May need to install JAI Image I/O package.
https://java.net/projects/jai-imageio/

Please let me know what is missing. Thanks in advance.

Dar*_*rse 5

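I faced a similar problem using tess4j in a Dynamic Web Project, and @nguyenq's comment helped me get it working. tess4j mostly relies on a TIFF handler for optical recognition, and the dependency it needs is not available in the default ImageIO, so jai-imageio.jar is required. All I did was call ImageIO.scanForPlugins() before calling doOCR. I have the following jars in my library: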

tess4j.jar

jai_imageio.jar

ghost4j-0.3.1.jar

jna.jar

junit-4.10.jar
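To see whether the TIFF plugin is actually visible from inside the web application, a small check along these lines may help. It is only a sketch using standard javax.imageio calls; the class name TiffReaderCheck and the helper hasReader are made up for illustration and are not part of tess4j.

```java
import javax.imageio.ImageIO;

public class TiffReaderCheck {

    /** Returns true if ImageIO currently has at least one reader registered for the format name. */
    public static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // On older JDKs (before Java 9) a TIFF reader only exists if a plugin
        // such as jai_imageio.jar is on the classpath and has been registered.
        System.out.println("TIFF reader before scan: " + hasReader("tiff"));

        // Re-scan the classpath of the current context class loader for Image I/O plugins.
        ImageIO.scanForPlugins();

        System.out.println("TIFF reader after scan:  " + hasReader("tiff"));
    }
}
```

If the second line still reports false when the same check runs inside the controller, the jai_imageio jar is probably not reachable from the web application's class loader (for example, it is missing from WEB-INF/lib).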

Here is the sample code:

TessractOCR tessocr = new TessractOCR();
ImageIO.scanForPlugins();
String extractedString = tessocr.extractTextFromImage(binarizrImage);

The function:

public static String extractTextFromImage(BufferedImage image) {
    String result = null;
    try {
        // Save the image as a PNG file so it can be passed to doOCR(File)
        File outputfile = new File("saved.png");
        ImageIO.write(image, "png", outputfile);

        Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
        instance.setDatapath("E:\\OCR-data\\Tess4J-1.2-src\\Tess4J");

        result = instance.doOCR(outputfile);
        System.out.println(result);
    } catch (Exception e) {
        System.err.println(e.getMessage());
    }
    return result;
}

It works 100% :)
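One possible variation for a web application: the function above writes to a fixed relative path ("saved.png"), which resolves against the server's working directory and is shared by all requests. The sketch below writes each image to its own temporary file instead. It uses only standard Java APIs; the class name TempImageWriter and the method writeToTempPng are hypothetical names chosen for this example, and the OCR call itself is left out.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class TempImageWriter {

    /** Writes the image as PNG to a fresh temporary file and returns that file. */
    public static File writeToTempPng(BufferedImage image) throws IOException {
        File tmp = File.createTempFile("ocr-input-", ".png");
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(image, "png", tmp)) {
            tmp.delete();
            throw new IOException("No PNG writer available for this image type");
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        BufferedImage img = new BufferedImage(40, 20, BufferedImage.TYPE_INT_RGB);
        File f = writeToTempPng(img);
        try {
            System.out.println("Wrote " + f.length() + " bytes to " + f);
            // The file would be handed to instance.doOCR(f) at this point.
        } finally {
            f.delete(); // remove the temporary file once OCR is done
        }
    }
}
```

Deleting the file in a finally block keeps temporary files from piling up on the server when OCR fails.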