Refactoring automatic detection of a file's encoding

naz*_*art 8 java refactoring encoding

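I need to check the encoding of files. This code works, but it is a bit long. How can I refactor this logic? Is there perhaps a different approach that would serve the same goal?

Code: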

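import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Checker is the poster's own interface; its definition is not shown here.
class CharsetDetector implements Checker {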

    Charset detectCharset(File currentFile, String[] charsets) {
        Charset charset = null;

        for (String charsetName : charsets) {
            charset = detectCharset(currentFile, Charset.forName(charsetName));
            if (charset != null) {
                break;
            }
        }

        return charset;
    }

    private Charset detectCharset(File currentFile, Charset charset) {
        // try-with-resources closes the stream even if reading or decoding fails
        try (BufferedInputStream input = new BufferedInputStream(
                new FileInputStream(currentFile))) {

            CharsetDecoder decoder = charset.newDecoder();
            decoder.reset();

            byte[] buffer = new byte[512];
            boolean identified = false;
            int bytesRead;
            // As in the original logic, stop at the first chunk that decodes cleanly
            while (!identified && (bytesRead = input.read(buffer)) != -1) {
                identified = identify(buffer, bytesRead, decoder);
            }

            return identified ? charset : null;

        } catch (Exception e) {
            return null;
        }
    }

    private boolean identify(byte[] bytes, int length, CharsetDecoder decoder) {
        try {
            // Decode only the bytes actually read, not stale buffer contents
            decoder.decode(ByteBuffer.wrap(bytes, 0, length));
        } catch (CharacterCodingException e) {
            return false;
        }
        return true;
    }

    @Override
    public boolean check(File fileChack) {
        if (charsetDetector(fileChack)) {
            return true;
        }
        return false;
    }

    private boolean charsetDetector(File currentFile) {
        String[] charsetsToBeTested = { "UTF-8", "windows-1253", "ISO-8859-7" };

        CharsetDetector charsetDetector = new CharsetDetector();
        Charset charset = charsetDetector.detectCharset(currentFile,
                charsetsToBeTested);

        if (charset != null) {
            // try-with-resources closes the reader on every path, including the early return
            try (InputStreamReader reader = new InputStreamReader(
                    new FileInputStream(currentFile), charset)) {

                // Reading one character is enough to confirm the file can be read
                if (reader.read() != -1) {
                    return true;
                }
            } catch (FileNotFoundException exc) {
                System.out.println("File not found!");
                exc.printStackTrace();
            } catch (IOException exc) {
                exc.printStackTrace();
            }
        } else {
            System.out.println("Unrecognized charset.");
            return false;
        }

        return true;
    }
}

Questions:

  • How can this program logic be refactored?
  • What other ways are there to detect the encoding (such as a UTF-16 byte sequence, etc.)?

rad*_*dai 5

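The best way to refactor this code is to bring in a third-party library that does the character detection for you. Such libraries will probably do a better job, and your code will end up smaller. See this question for a few options.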
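
If adding a dependency is not possible, the same idea from the question can be written much more compactly with the standard library alone. What follows is only a rough sketch, not a substitute for a real detection library: the class name EncodingSniffer and all of its method names are made up for illustration, and it assumes the file is small enough to read into memory. It first looks for a byte order mark (which covers the UTF-16 case asked about), then returns the first candidate charset that decodes the whole content without errors.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the original post.
final class EncodingSniffer {

    /** Returns the charset announced by a byte order mark, if the data starts with one. */
    static Optional<Charset> fromBom(byte[] data) {
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return Optional.of(StandardCharsets.UTF_8);
        if (startsWith(data, 0xFE, 0xFF)) return Optional.of(StandardCharsets.UTF_16BE);
        // Note: FF FE 00 00 (the UTF-32LE mark) is not distinguished in this sketch.
        if (startsWith(data, 0xFF, 0xFE)) return Optional.of(StandardCharsets.UTF_16LE);
        return Optional.empty();
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** True if every byte sequence in data is valid and mappable in the given charset. */
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** BOM first, then the first candidate that decodes all of the data without errors. */
    static Optional<Charset> detect(byte[] data, List<String> candidates) {
        Optional<Charset> bom = fromBom(data);
        if (bom.isPresent()) return bom;
        return candidates.stream()
                .map(Charset::forName)
                .filter(cs -> decodesCleanly(data, cs))
                .findFirst();
    }

    /** Convenience overload that reads the whole file into memory first. */
    static Optional<Charset> detect(Path file, List<String> candidates) throws IOException {
        return detect(Files.readAllBytes(file), candidates);
    }
}
```

The order of the candidates matters. UTF-8 is strict, so it should come first, whereas single-byte charsets such as windows-1253 and ISO-8859-7 accept nearly every byte sequence, so a match on them is only a guess. That limitation is the main reason a statistical detection library is likely to give better results.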