DataInputStream and UTF-8

jor*_*oya 3 java utf-8 character-encoding datainputstream

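I'm a new programmer, and I'm having a couple of problems with the code I'm working on.

Basically, the code receives a form from another JSP, reads the bytes with a DataInputStream, parses the data, and submits the results to SalesForce.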

    // Getting the parameters from the request
    String contentType = request.getContentType();
    DataInputStream in = new DataInputStream(request.getInputStream());
    int formDataLength = request.getContentLength();

    //System.out.println(formDataLength);
    byte[] dataBytes = new byte[formDataLength];
    int byteRead = 0;
    int totalBytesRead = 0;
    while (totalBytesRead < formDataLength)
    {
        // Ask only for the bytes that still fit after the current offset
        byteRead = in.read(dataBytes, totalBytesRead, formDataLength - totalBytesRead);
        if (byteRead == -1) break; // stream ended before Content-Length bytes arrived
        totalBytesRead += byteRead;
    }

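It works fine, but only when the code handles normal characters. Whenever it tries to handle special characters (such as French characters: àâäæçéèêëîïôùûü), I get the following gibberish as a result: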

ÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃ

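I understand the issue could lie with DataInputStream and how it doesn't return UTF-8 encoded text. Do you have any suggestions on how to tackle this?

All the .jsp files include <%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>, and Tomcat's settings are fine (URI = UTF-8, etc.). I tried adding: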

request.setCharacterEncoding("UTF-8");

response.setCharacterEncoding("UTF-8");

to no avail.

Here is a sample of how the data is parsed:

    // Getting the notes for the Case
    String notes = new String(dataBytes);
    System.out.println(notes);
    String savenotes = casetype.substring(notes.indexOf("notes"));
    //savenotes = savenotes.substring(savenotes.indexOf("\n"), savenotes.indexOf("---"));
    savenotes = savenotes.substring(savenotes.indexOf("\n")+1);
    savenotes = savenotes.substring(savenotes.indexOf("\n")+1);
    savenotes = savenotes.substring(0,savenotes.indexOf("name=\"datafile"));
    savenotes = savenotes.substring(0,savenotes.lastIndexOf("\n------"));
    savenotes = savenotes.trim();

Thanks in advance.

Bal*_*usC 7

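The problem isn't in the input streams, because they don't deal with characters, only with bytes. Your problem is at the point where those bytes are converted to characters. In this particular case, you need to specify the proper encoding in the String constructor.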

String notes = new String(dataBytes, "UTF-8");
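To make the bytes-versus-characters point concrete, here is a small standalone sketch (not the asker's code). The class name `DecodeDemo` and its two helper methods are hypothetical, and ISO-8859-1 is only assumed as a stand-in for whatever platform default charset `new String(dataBytes)` picked up on the server. It uses `StandardCharsets.UTF_8` (Java 7+) instead of the `"UTF-8"` string, which avoids the checked `UnsupportedEncodingException`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative only.
class DecodeDemo {

    // Decode the bytes as UTF-8, as recommended above.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode the same bytes as ISO-8859-1 to imitate a mismatched default charset (assumption).
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Three accented letters (a-grave, a-circumflex, e-acute), written as
        // escapes so the encoding of this source file does not matter.
        String original = "\u00e0\u00e2\u00e9";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        System.out.println(utf8Bytes.length);                              // 6: two bytes per character
        System.out.println(decodeUtf8(utf8Bytes).equals(original));        // true
        System.out.println(decodeLatin1(utf8Bytes).length());              // 6: one char per byte
        System.out.println(decodeLatin1(utf8Bytes).charAt(0) == '\u00c3'); // true: first char is 'Ã'
    }
}
```

Each of these accented letters is encoded in UTF-8 as the byte 0xC3 followed by a second byte, so decoding with a single-byte charset turns every letter into 'Ã' plus another character. That is consistent with the run of Ã characters in the question, although the charset actually in effect on the asker's server is not known.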


By the way, the DataInputStream has no additional value in this particular code snippet. You can just keep it as an InputStream.
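As a rough sketch of that last suggestion, the body can be read through a plain `InputStream` and decoded once at the end. The class `BodyReader` and the method `readUtf8` are hypothetical names, and a `ByteArrayInputStream` stands in for `request.getInputStream()` so the example runs outside a servlet container.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the servlet API.
class BodyReader {

    // Read the stream until end-of-stream, then decode all bytes as UTF-8.
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n); // keep only the bytes actually read
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the request body (assumption): a short UTF-8 encoded form value.
        byte[] body = "notes=\u00e9t\u00e9".getBytes(StandardCharsets.UTF_8);
        String text = readUtf8(new ByteArrayInputStream(body));
        System.out.println(text.equals("notes=\u00e9t\u00e9")); // true
    }
}
```

Decoding only after all bytes have arrived means a multi-byte character can never be split across two reads. If the form also uploads a binary file, as the `name="datafile"` part in the question suggests, decoding the whole body as text may mangle the file bytes, so this sketch only fits the text fields.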