Zai*_*mir 1 java exception apache-tika
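I recently updated my existing Tika project to use Tika 1.13 instead of 1.10. The only thing I changed was the dependency version, from 1.10 to 1.13. The project builds successfully. However, whenever I try to run the application, I get the following exception: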
java.lang.RuntimeException: Unable to parse the default media type registry
at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:580)
at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:69)
at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:218)
at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:341)
at org.apache.tika.parser.AutoDetectParser.<init>(AutoDetectParser.java:51)
at com.app.tikamanager.MetaParser.<init>(MetaParser.java:54)
at com.app.services.MyService.HandleItemInThread(IntelligentDocumentsService.java:260)
at com.app.intelligentservicebase.ItemHandlerThread.run(ItemHandlerThread.java:41)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tika.mime.MimeTypeException: Invalid type configuration
at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:126)
at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:64)
at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:93)
at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:170)
at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:577)
... 10 more
Caused by: org.xml.sax.SAXNotRecognizedException: http://javax.xml.XMLConstants/feature/secure-processing
at org.apache.xerces.parsers.AbstractSAXParser.setFeature(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl.setFeatures(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl.<init>(Unknown Source)
at org.apache.xerces.jaxp.SAXParserFactoryImpl.newSAXParserImpl(Unknown Source)
at org.apache.xerces.jaxp.SAXParserFactoryImpl.setFeature(Unknown Source)
at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:119)
... 14 more
The exception is thrown from the constructor of my MetaParser class, which does nothing except initialize the AutoDetectParser:
private final AutoDetectParser _tikaExtractor;

public MetaParser()
{
    _tikaExtractor = new AutoDetectParser();
}
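I am running the application on Ubuntu 14.04 with Oracle JDK 1.8.0_91-b14.
I searched online and found this exception mentioned a few times. One suggested fix was to install OpenJDK, but that was for an older version of Tika, and since the old version worked fine on this same JDK, I don't think that is the problem.
Is there anything that needs to be done or initialized before calling the AutoDetectParser constructor?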
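Promoting a comment to an answer: you have a very old version of Xerces on your classpath. Your JVM is picking it up as the default XML parser, so when Tika says "Hi JVM, can I have a secure XML parser?" it fails.
(Between 1.10 and 1.13, Tika improved how it does XML parsing, including setting more secure defaults, which is why this has started happening.)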
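One way to confirm this diagnosis is to ask JAXP which `SAXParserFactory` it selects and whether that factory accepts the secure-processing feature named in the stack trace. The following is a minimal sketch using only standard JDK APIs; the class name `XmlParserProbe` and its method names are made up for illustration, and it has to run with the same classpath as the failing application to be meaningful.

```java
import java.security.CodeSource;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical diagnostic helper, not part of Tika or the original project.
public class XmlParserProbe {

    /** Class name of the SAXParserFactory that JAXP selects by default. */
    public static String factoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    /**
     * Location the factory class was loaded from. A null code source usually
     * means the class came from the JDK itself rather than a jar on the classpath.
     */
    public static String factoryLocation() {
        CodeSource source = SAXParserFactory.newInstance()
                .getClass().getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source, most likely bundled with the JDK)";
        }
        return source.getLocation().toString();
    }

    /** Tries to enable secure processing on the default factory. */
    public static boolean supportsSecureProcessing() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            return true;
        } catch (Exception e) {
            // An old parser may reject the feature, e.g. with SAXNotRecognizedException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Factory class:     " + factoryClassName());
        System.out.println("Loaded from:       " + factoryLocation());
        System.out.println("Secure processing: " + supportsSecureProcessing());
    }
}
```

If the reported class is `org.apache.xerces.jaxp.SAXParserFactoryImpl` (the one in the stack trace) and secure processing is not supported, the reported location should point at the jar that needs to be removed or upgraded.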
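You need to either remove the old Xerces jars, so that you start using the JVM-provided XML parser, or replace them with a newer Xerces version.
You may also find some of the suggestions in Error unmarshalling XML in Java 8 "secure-processing org.xml.sax.SAXNotRecognizedException" helpful, especially if you're struggling to find the pesky old Xerces jar in your build!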
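If the old jar is hard to track down, one option is to ask the class loader for every copy of the Xerces factory class it can see. This is a rough sketch, not something from the linked question; `XercesJarFinder` and `findCopies` are hypothetical names, and the resource path is taken from the class shown in the stack trace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for locating jars that bundle a given class.
public class XercesJarFinder {

    // Resource path of the factory class that appears in the stack trace.
    public static final String XERCES_FACTORY =
            "org/apache/xerces/jaxp/SAXParserFactoryImpl.class";

    /** Returns the URL of every copy of the resource visible to the class loader. */
    public static List<String> findCopies(String resourcePath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        List<String> locations = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resourcePath))) {
                locations.add(url.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return locations;
    }

    public static void main(String[] args) {
        List<String> copies = findCopies(XERCES_FACTORY);
        if (copies.isEmpty()) {
            System.out.println("No standalone Xerces found on the classpath.");
        }
        for (String location : copies) {
            System.out.println("Xerces found at: " + location);
        }
    }
}
```

Each printed URL includes the path of the jar that contains the class, which you can then trace back to a dependency in your build. In a container or application server the relevant class loader may differ, so it is best to run this from inside the application itself.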