Reading a UTF-8 encoded text file from the Internet in Java

Thi*_*ult 2 java utf-8

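I want to read an XML file from the Internet. You can find it here.
The problem is that it is encoded in UTF-8, and I need to store it in a file so that I can parse it later. I have read a lot of threads on this subject, and this is what I came up with: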

BufferedReader in;
String readLine;
try
{
    in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
    BufferedWriter out = new BufferedWriter(new FileWriter(file));

    while ((readLine = in.readLine()) != null)
        out.write(readLine+"\n");

    out.close();
}

catch (UnsupportedEncodingException e)
{
    e.printStackTrace();
}

catch (IOException e)
{
    e.printStackTrace();
}

This code works until it reaches this line: <title>Chérie FM</title>
When I debug, I get this instead: <title>Ch?rie FM</title>

Clearly there is something I am not understanding, but as far as I can tell I followed the code shown on several sites.

Mau*_*res 8

This file is not encoded as UTF-8; it is ISO-8859-1.
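To see why the wrong charset mangles the "é", here is a minimal sketch. The class name `DecodeDemo` and the helper `decode` are made up for illustration and are not part of the original code. In ISO-8859-1, "é" is the single byte 0xE9, which is not a valid sequence on its own in UTF-8, so a UTF-8 decoder substitutes the replacement character U+FFFD. Many consoles and debuggers display that character as "?".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Hypothetical helper: decode raw bytes using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // \u00e9 is "é"; encoded as ISO-8859-1 it becomes the single byte 0xE9.
        byte[] latin1Bytes = "<title>Ch\u00e9rie FM</title>"
                .getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the matching charset restores the original text.
        String right = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        // Decoding as UTF-8 hits a malformed byte and substitutes U+FFFD.
        String wrong = decode(latin1Bytes, StandardCharsets.UTF_8);

        System.out.println("Correctly decoded contains e-acute: "
                + right.contains("\u00e9"));
        System.out.println("Wrongly decoded contains U+FFFD: "
                + wrong.contains("\uFFFD"));
    }
}
```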

Change your code to:

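// Assumes `url` (java.net.URL) and `file` (java.io.File) are defined as in the question.
// Requires java.io.* and java.nio.charset.StandardCharsets.
// try-with-resources closes (and flushes) both streams, even if an exception is thrown.
try (BufferedReader in = new BufferedReader(
         new InputStreamReader(url.openStream(), StandardCharsets.ISO_8859_1));
     BufferedWriter out = new BufferedWriter(
         new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)))
{
    String readLine;

    // Decode the source as ISO-8859-1 and write each line back out as UTF-8.
    while ((readLine = in.readLine()) != null)
        out.write(readLine + "\n");
}
catch (IOException e)
{
    e.printStackTrace();
}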

You should then get the expected result.
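If the file is only being saved so it can be parsed later, another option is to copy the bytes without decoding them at all. This is a sketch and not part of the answer above; the class name `RawDownload` and the method `saveRaw` are hypothetical. It assumes the XML parser used later can work out the encoding by itself, for example from an `encoding` attribute in the XML declaration, which I have not checked for this particular file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class RawDownload {
    // Copies the stream to the target file byte for byte, so no charset is involved.
    // Returns the number of bytes written.
    static long saveRaw(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Convenience overload: opens the URL and makes sure the stream is closed.
    static long saveRaw(URL url, Path target) throws IOException {
        try (InputStream in = url.openStream()) {
            return saveRaw(in, target);
        }
    }
}
```

Because the bytes on disk are identical to what the server sent, any encoding declaration inside the XML still matches the file. With the transcoding approach, a declaration that names ISO-8859-1 would no longer match a file rewritten as UTF-8.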