MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence

Mun*_*ian 18 java xml apache-poi

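I have an XML file that contains Arabic characters. When I try to parse the file, I get the exception MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence. I am using POI DOM to parse the document.

The log is: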

2012-03-19 11:30:00,433 [ERROR] (com.infomindz.remitglobe.bll.remittance.BlackListBean) - Error 

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 2-byte UTF-8 sequence.

    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
    at com.infomindz.remitglobe.bll.remittance.BlackListBean.updateGeneralBlackListDetail(Unknown Source)
    at com.infomindz.remitglobe.bll.remittance.schedulers.BlackListUpdateScheduler.executeInternal(Unknown Source)
    at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)

The exception occurs only on Windows machines, not on Linux machines. How can I solve this problem? Any suggestions would be appreciated.

Mun*_*ian 14

I solved the problem by creating the XML file with UTF-8 encoding.

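// Wrap a FileOutputStream so the file is written as UTF-8 instead of the platform default encoding.
OutputStreamWriter bufferedWriter = new OutputStreamWriter(new FileOutputStream(filePath +
                        System.getProperty("file.separator") + fileName), "UTF8");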

After creating the file with the code above, the encoding problem was solved. Thanks to everyone who put effort in here.
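For anyone who wants to try this end to end, here is a minimal sketch of the same idea: write the XML with an explicit UTF-8 writer, then parse it back with the JDK's DOM parser. The class name, the temp file name and the `<name>` element are made up for illustration and are not from the original project; `StandardCharsets.UTF_8` is used in place of the `"UTF8"` string, which should behave the same way.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, only for illustration.
class Utf8XmlRoundTrip {

    // Writes the XML text as UTF-8 bytes, independent of the platform default charset.
    static void writeUtf8(File file, String xml) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.UTF_8)) {
            writer.write(xml);
        }
    }

    // Parses the file with the JDK's DOM parser and returns the root element's text.
    static String readRootText(File file) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(file);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("blacklist", ".xml"); // hypothetical file name
        file.deleteOnExit();

        String arabicName = "\u0645\u062D\u0645\u062F"; // sample Arabic text
        writeUtf8(file, "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>"
                + arabicName + "</name>");

        // Compare the parsed text with what was written.
        System.out.println("Round trip intact: " + arabicName.equals(readRootText(file)));
    }
}
```

The point of the design is that the bytes on disk match the encoding the parser assumes (UTF-8, either by default or from the XML declaration), so the result no longer depends on which machine wrote the file.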

  • This solution worked for me, but I had to make a small change: OutputStream os = new FileOutputStream(file); and OutputStreamWriter bufferedWriter = new OutputStreamWriter(os, "UTF8"); (3 upvotes)

小智 10

You can start your JVM with the argument -Dfile.encoding=utf-8.
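The flag matters because writers created without an explicit charset fall back to the JVM's default charset, which on older JDKs follows the operating system (from JDK 18 on the default is UTF-8 already). A rough sketch to check what your JVM is doing is below. The class name and the sample string are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class, only for illustration.
class DefaultCharsetCheck {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String sample = "<name>\u0645\u062D\u0645\u062F</name>"; // sample Arabic text
        Charset platformDefault = Charset.defaultCharset();
        System.out.println("Default charset: " + platformDefault);

        // Bytes as a writer without an explicit charset would encode them.
        System.out.println("Default-encoded bytes are valid UTF-8: "
                + isValidUtf8(sample.getBytes(platformDefault)));

        // Bytes as a writer with an explicit UTF-8 charset would encode them.
        System.out.println("UTF-8-encoded bytes are valid UTF-8: "
                + isValidUtf8(sample.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Run it once with and once without -Dfile.encoding=utf-8 and compare the first two lines. If the default is an Arabic Windows code page, the first check would be expected to fail, which is the same situation the parser complains about. If the default code page has no Arabic letters at all, the characters get replaced with '?' and the check passes even though the text is lost, so a passing check alone does not prove the data is fine.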