Java: Comparing, marking, and interpreting HTML text in Java

We *_*org 11 html java string string-comparison

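I am working on a Java project that includes an HTML editor. Users enter text in the editor (CKEditor), and the resulting HTML is saved in the database.

When the user comes back later and edits the same text, I want to compare the new version with the one in the database and show the differences between the two.

The most important problem I am facing is that, even when a comparison tool detects that the style changed from Italic to Bold, its output strikes through the word Italic and shows Bold as inserted in its place.

That does not convey the actual intent of the edit. The intent is that the user changed the style from Italic to Bold. What I am looking for is a tool that, instead of showing the word Italic as deleted and the word Bold as added in its place, shows that the word or sentence itself was first italic and is now bold.

I hope my meaning is clear. I have been trying to achieve this for a while. I tried diff_match_patch, daisydiff, and others, and none of them helped.

My attempts: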

/*

            String oldTextHtml = mnotes1.getMnotetext();
            String newTextHTML = mnotes.getMnotetext();


            oldTextHtml = oldTextHtml.replace("<br>","\n");
            oldTextHtml = Jsoup.clean(oldTextHtml, Whitelist.basic());
           oldTextHtml = Jsoup.parse(oldTextHtml).text();

            newTextHTML = newTextHTML.replace("<br>","\n");
            newTextHTML = Jsoup.clean(newTextHTML,Whitelist.basic());
            newTextHTML = Jsoup.parse(newTextHTML).text();


            diff_match_patch diffMatchPatch = new diff_match_patch();
            LinkedList<diff_match_patch.Diff> deltas = diffMatchPatch.diff_main(oldTextHtml, newTextHTML);
            diffMatchPatch.diff_cleanupSemantic(deltas);
            newText += diffMatchPatch.diff_prettyHtml(deltas);
            groupNoteHistory.setWhatHasChanged("textchange");
            groupNoteHistory.setNewNoteText(newText);
            noEdit = true;
*/


           List<String> oldTextList = Arrays.asList(mnotes1.getMnotetext().split("(\\.|\\n)"));
            List<String> newTextList = Arrays.asList(mnotes.getMnotetext().split("(\\.|\\n)"));
            if (oldTextList.size() == newTextList.size()) {

                for (int current = 0; current < oldTextList.size(); current++) {
                    if (isLineDifferent(oldTextList.get(current), newTextList.get(current))) {
                        noEdit = true;
                        diff_match_patch diffMatchPatch = new diff_match_patch();
                        LinkedList<diff_match_patch.Diff> deltas = diffMatchPatch.diff_main(oldTextList.get(current), newTextList.get(current));
                        diffMatchPatch.diff_cleanupSemantic(deltas);
                        newText += diffMatchPatch.diff_prettyHtml(deltas);
                        groupNoteHistory.setWhatHasChanged("textchange");
                        groupNoteHistory.setNewNoteText(newText);
                    }
                }
            } else {
                if (!(mnotes.getMnotetext().equals(mnotes1.getMnotetext()))) {
                    if (isLineDifferent(mnotes1.getMnotetext(), mnotes.getMnotetext())) {
                        diff_match_patch diffMatchPatch = new diff_match_patch();

                        LinkedList<diff_match_patch.Diff> deltas = diffMatchPatch.diff_main(mnotes1.getMnotetext(),
                                mnotes.getMnotetext());
                        diffMatchPatch.diff_cleanupSemantic(deltas);
                        newText += diffMatchPatch.diff_prettyHtml(deltas);
                        groupNoteHistory.setWhatHasChanged("textchange");
                        noEdit = true;
                    }
                    groupNoteHistory.setNewNoteText(newText);
                    groupNoteHistory.setWhatHasChanged("textchange");
                }
            }

If anyone knows how I can do this, please let me know. Thanks a lot. :-)

Edit

I was asked to provide a picture. An explanation first, then the image.

Old text : <style= bold>Hello</style>
new Text : <style = Italic>Hello</style>

Expected diff output:

As in this image.

Fra*_*dez 3

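Recently I did a proof of concept with an open-source library that implements the diff command in Java, along with many other features.

Basically, I compared two Java files and obtained the lines that were modified between them. With that information, I think it is easy to implement what you want.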
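
To make "obtaining the modified lines" concrete, here is a minimal, library-free sketch of a line-level diff. It is not the library's implementation: it swaps in a plain longest-common-subsequence table for illustration, and the class name LineDiffSketch and the "- " / "+ " prefixes are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of java-diff-utils.
final class LineDiffSketch {

    /** Returns every line prefixed with "  " (unchanged), "- " (deleted) or "+ " (inserted). */
    static List<String> diff(List<String> oldLines, List<String> newLines) {
        final int n = oldLines.size();
        final int m = newLines.size();

        // lcs[i][j] = length of the longest common subsequence of
        // oldLines[i..] and newLines[j..], filled from the end backwards.
        final int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (oldLines.get(i).equals(newLines.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }

        // Walk both lists from the start, following the table.
        final List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (oldLines.get(i).equals(newLines.get(j))) {
                out.add("  " + oldLines.get(i));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + oldLines.get(i));
                i++;
            } else {
                out.add("+ " + newLines.get(j));
                j++;
            }
        }
        while (i < n) {
            out.add("- " + oldLines.get(i++));
        }
        while (j < m) {
            out.add("+ " + newLines.get(j++));
        }
        return out;
    }

    public static void main(String[] args) {
        final List<String> oldLines = List.of("a", "<style= bold>Hello</style>", "c");
        final List<String> newLines = List.of("a", "<style = Italic>Hello</style>", "c");
        // Prints the unchanged "a", the old line as "- ", the new line as "+ ", then "c".
        diff(oldLines, newLines).forEach(System.out::println);
    }
}
```

A deleted line immediately followed by an inserted line is what a diff library would typically report as a single change, which is the case the question cares about.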

Basically, I have two Java files under my src/test/resources/files folder.

File 1

package com.onuba.car.javadiff;

import difflib.Chunk;
import difflib.Delta;
import difflib.DiffUtils;
import difflib.Patch;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FileComparator {

    private final File original;

    private final File revised;

    public FileComparator(File original, File revised) {
        this.original = original;
        this.revised = revised;
    }

    public List<Chunk> getChangesFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.CHANGE);
    }

    public List<Chunk> getInsertsFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.INSERT);
    }

    public List<Chunk> getDeletesFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.DELETE);
    }

    private List<Chunk> getChunksByType(Delta.TYPE type) throws IOException {
        final List<Chunk> listOfChanges = new ArrayList<Chunk>();
        final List<Delta> deltas = getDeltas();
        for (Delta delta : deltas) {
            if (delta.getType() == type) {
                listOfChanges.add(delta.getRevised());
            }
        }
        return listOfChanges;
    }

    private List<Delta> getDeltas() throws IOException {

        final List<String> originalFileLines = fileToLines(original);
        final List<String> revisedFileLines = fileToLines(revised);

        final Patch patch = DiffUtils.diff(originalFileLines, revisedFileLines);

        return patch.getDeltas();
    }

    private List<String> fileToLines(File file) throws IOException {
        final List<String> lines = new ArrayList<String>();
        String line;
        final BufferedReader in = new BufferedReader(new FileReader(file));
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }

        return lines;
    }

    <style= bold>Hello</style>

}

File 2

package com.onuba.car.javadiff;

import difflib.Chunk;
import difflib.Delta;
import difflib.DiffUtils;
import difflib.Patch;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FileComparator {

    private final File original;

    private final File revised;

    public FileComparator(File original, File revised) {
        this.original = original;
        this.revised = revised;
    }

    public List<Chunk> getChangesFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.CHANGE);
    }

    public List<Chunk> getInsertsFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.INSERT);
    }

    public List<Chunk> getDeletesFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.DELETE);
    }

    private List<Chunk> getChunksByType(Delta.TYPE type) throws IOException {
        final List<Chunk> listOfChanges = new ArrayList<Chunk>();
        final List<Delta> deltas = getDeltas();
        for (Delta delta : deltas) {
            if (delta.getType() == type) {
                listOfChanges.add(delta.getRevised());
            }
        }
        return listOfChanges;
    }

    private List<Delta> getDeltas(String nuevoParam) throws IOException {

        final List<String> originalFileLines = fileToLines(original);
        final List<String> revisedFileLines = fileToLines(revised);

        final Patch patch = DiffUtils.diff(originalFileLines, revisedFileLines);

        return patch.getDeltas();
    }

    private List<String> fileToLines(File file, String nuevoParam) throws IOException {
        final List<String> lines = new ArrayList<String>();
        String line;
        final BufferedReader in = new BufferedReader(new FileReader(file));
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }

        return lines;
    }

    <style = Italic>Hello</style>

    private void nuevoMetodoCool(File file) {

    }

}

A short FileComparator class (remember, it is a POC :D)

package com.onuba.car.javadiff;

import difflib.Chunk;
import difflib.Delta;
import difflib.DiffUtils;
import difflib.Patch;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FileComparator {

    private final File original;

    private final File revised;

    public FileComparator(File original, File revised) {
        this.original = original;
        this.revised = revised;
    }

    public List<Chunk> getChangesFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.CHANGE);
    }

    public List<Chunk> getInsertsFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.INSERT);
    }

    public List<Chunk> getDeletesFromOriginal() throws IOException {
        return getChunksByType(Delta.TYPE.DELETE);
    }

    private List<Chunk> getChunksByType(Delta.TYPE type) throws IOException {
        final List<Chunk> listOfChanges = new ArrayList<Chunk>();
        final List<Delta> deltas = getDeltas();
        for (Delta delta : deltas) {
            if (delta.getType() == type) {
                listOfChanges.add(delta.getRevised());
            }
        }
        return listOfChanges;
    }

    private List<Delta> getDeltas() throws IOException {

        final List<String> originalFileLines = fileToLines(original);
        final List<String> revisedFileLines = fileToLines(revised);

        final Patch patch = DiffUtils.diff(originalFileLines, revisedFileLines);

        return patch.getDeltas();
    }

    private List<String> fileToLines(File file) throws IOException {
        final List<String> lines = new ArrayList<String>();
        String line;
        final BufferedReader in = new BufferedReader(new FileReader(file));
        while ((line = in.readLine()) != null) {
            lines.add(line);
        }

        return lines;
    }

}

And a JUnit test

package com.onuba.car.javadiff.test;

import static org.junit.Assert.fail;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.junit.Test;

import com.onuba.car.javadiff.FileComparator;

import difflib.Chunk;

public class FileComparatorTest {

    private final File original = new File("./src/test/resources/files/FileComparatorv1.java");

    private final File revised = new File("./src/test/resources/files/FileComparatorv2.java");

    @Test
    public void shouldGetChangesBetweenFiles() {

        final FileComparator comparator = new FileComparator(original, revised);

        try {
            final List<Chunk> changesFromOriginal = comparator.getChangesFromOriginal();

            final int changeNum = changesFromOriginal.size();
            System.out.println("Tamaño de cambios: " + changeNum);

            for (int i = 0; i < changeNum; i++) {

                final Chunk change = changesFromOriginal.get(i);
                final int firstLineOfFirstChange = change.getPosition() + 1;
                final int changeSize = change.size();
                //final String changeText = change.getLines().get(0).toString();

                System.out.println("Cambio nº " + i);
                System.out.println("firstLineOfFirstChange: " + firstLineOfFirstChange);
                System.out.println("changeSize: " + changeSize);
                System.out.println("change text: ");
                showTest(change.getLines());

            }

            /*assertEquals(3, changesFromOriginal.size());

            final Chunk firstChange = changesFromOriginal.get(0);
            final int firstLineOfFirstChange = firstChange.getPosition() + 1;
            final int firstChangeSize = firstChange.size();
            assertEquals(2, firstLineOfFirstChange);
            assertEquals(1, firstChangeSize);
            final String firstChangeText = firstChange.getLines().get(0).toString();
            assertEquals("Line 3 with changes", firstChangeText);

            final Chunk secondChange = changesFromOriginal.get(1);
            final int firstLineOfSecondChange = secondChange.getPosition() + 1;
            final int secondChangeSize = secondChange.size();
            assertEquals(4, firstLineOfSecondChange);
            assertEquals(2, secondChangeSize);
            final String secondChangeFirstLineText = secondChange.getLines().get(0).toString();
            final String secondChangeSecondLineText = secondChange.getLines().get(1).toString();
            assertEquals("Line 5 with changes and", secondChangeFirstLineText);
            assertEquals("a new line", secondChangeSecondLineText);

            final Chunk thirdChange = changesFromOriginal.get(2);
            final int firstLineOfThirdChange = thirdChange.getPosition() + 1;
            final int thirdChangeSize = thirdChange.size();
            assertEquals(11, firstLineOfThirdChange);
            assertEquals(1, thirdChangeSize);
            final String thirdChangeText = thirdChange.getLines().get(0).toString();
            assertEquals("Line 10 with changes", thirdChangeText);*/

        } catch (IOException ioe) {
            fail("Error running test shouldGetChangesBetweenFiles " + ioe.toString());
        }
    }

    @Test
    public void shouldGetInsertsBetweenFiles() {

        final FileComparator comparator = new FileComparator(original, revised);

        try {
            final List<Chunk> insertsFromOriginal = comparator.getInsertsFromOriginal();

            final int changeNum = insertsFromOriginal.size();
            System.out.println("Tamaño de inserciones: " + changeNum);

            for (int i = 0; i < changeNum; i++) {

                final Chunk change = insertsFromOriginal.get(i);
                final int firstLineOfFirstChange = change.getPosition() + 1;
                final int changeSize = change.size();
                //final String changeText = change.getLines().get(0).toString();

                System.out.println("insercion nº " + i);
                System.out.println("firstLineOfFirstInsertion: " + firstLineOfFirstChange);
                System.out.println("insertion Size: " + changeSize);
                System.out.println("insertion text: ");
                showTest(change.getLines());

            }
        } catch (IOException ioe) {
            fail("Error running test shouldGetInsertsBetweenFiles " + ioe.toString());
        }
        /*try {
            final List<Chunk> insertsFromOriginal = comparator.getInsertsFromOriginal();
            assertEquals(1, insertsFromOriginal.size());

            final Chunk firstInsert = insertsFromOriginal.get(0);
            final int firstLineOfFirstInsert = firstInsert.getPosition() + 1;
            final int firstInsertSize = firstInsert.size();
            assertEquals(7, firstLineOfFirstInsert);
            assertEquals(1, firstInsertSize);
            final String firstInsertText = firstInsert.getLines().get(0).toString();
            assertEquals("new line 6.1", firstInsertText);

        } catch (IOException ioe) {
            fail("Error running test shouldGetInsertsBetweenFiles " + ioe.toString());
        }*/
    }

    @Test
    public void shouldGetDeletesBetweenFiles() {

        final FileComparator comparator = new FileComparator(original, revised);

        try {
            final List<Chunk> deletesFromOriginal = comparator.getDeletesFromOriginal();

            final int changeNum = deletesFromOriginal.size();
            System.out.println("Tamaño de deletes: " + changeNum);

            for (int i = 0; i < changeNum; i++) {

                final Chunk change = deletesFromOriginal.get(i);
                final int firstLineOfFirstChange = change.getPosition() + 1;
                final int changeSize = change.size();
                //final String changeText = change.getLines().get(0).toString();

                System.out.println("delete nº " + i);
                System.out.println("firstLineOfFirstDelete: " + firstLineOfFirstChange);
                System.out.println("delete Size: " + changeSize);
                System.out.println("delete text: ");
                showTest(change.getLines());

            }
        } catch (IOException ioe) {
            fail("Error running test shouldGetDeletesBetweenFiles " + ioe.toString());
        }

        /*try {
            final List<Chunk> deletesFromOriginal = comparator.getDeletesFromOriginal();
            assertEquals(1, deletesFromOriginal.size());

            final Chunk firstDelete = deletesFromOriginal.get(0);
            final int firstLineOfFirstDelete = firstDelete.getPosition() + 1;
            assertEquals(1, firstLineOfFirstDelete);

        } catch (IOException ioe) {
            fail("Error running test shouldGetDeletesBetweenFiles " + ioe.toString());
        }*/
    }

    private void showTest(List<?> texts) {

        if (texts != null) {
            for (Object s : texts) {
                System.out.println(s.toString());
            }
        }
    }
}

And finally, my pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.onuba.car</groupId>
    <artifactId>javadiffpoc</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>JavaDiff ::  POC</name>

    <url>http://maven.apache.org</url>

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>

        <!-- GUAVA -->
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>15.0</version>
        </dependency>

        <dependency>
            <groupId>com.googlecode.java-diff-utils</groupId>
            <artifactId>diffutils</artifactId>
            <version>1.2.1</version>
        </dependency>

        <!-- Logger -->
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.0.0</version>
        </dependency>
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-access</artifactId>
            <version>1.0.0</version>
        </dependency>
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-core</artifactId>
            <version>1.0.0</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.6.4</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-jar-plugin</artifactId>
                <version>2.4</version>
            </plugin>
        </plugins>
    </build>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
</project>

Sorry for the Spanish in some of the logs and a few other small things :D. Maybe with this you can achieve what you want.
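
One way to get from a changed line to the intent the question asks about is to compare the visible text and the markup separately. The following is only a rough sketch under that assumption: the class StyleChangeSketch, its methods, and the result labels are hypothetical names invented here, and the regex-based tag handling is a crude stand-in for a real HTML parser such as jsoup, which the question already uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class StyleChangeSketch {

    // Anything between '<' and '>' is treated as a tag (crude; a real parser is safer).
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Visible text with every tag removed. */
    static String visibleText(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    /** Opening tags in document order, whitespace stripped and lower-cased. */
    static List<String> openingTags(String html) {
        final List<String> tags = new ArrayList<>();
        final Matcher m = TAG.matcher(html);
        while (m.find()) {
            final String tag = m.group().replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
            if (!tag.startsWith("</")) {
                tags.add(tag);
            }
        }
        return tags;
    }

    /** Classifies the edit between two versions of the same line. */
    static String describe(String oldHtml, String newHtml) {
        final String oldText = visibleText(oldHtml);
        final String newText = visibleText(newHtml);
        if (!oldText.equals(newText)) {
            // The words themselves changed; style is not inspected in this sketch.
            return "TEXT_CHANGED: \"" + oldText + "\" -> \"" + newText + "\"";
        }
        final List<String> oldTags = openingTags(oldHtml);
        final List<String> newTags = openingTags(newHtml);
        if (!oldTags.equals(newTags)) {
            // Same words, different markup: report a style change instead of delete + insert.
            return "STYLE_CHANGED on \"" + oldText + "\": " + oldTags + " -> " + newTags;
        }
        return "UNCHANGED";
    }

    public static void main(String[] args) {
        // Prints: STYLE_CHANGED on "Hello": [<style=bold>] -> [<style=italic>]
        System.out.println(describe("<style= bold>Hello</style>", "<style = Italic>Hello</style>"));
    }
}
```

Calling describe for each old/new pair of changed lines would let the diff output say "style changed" for lines whose words are identical, and fall back to the usual strike-through rendering otherwise.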

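Library home page: https://code.google.com/p/java-diff-utils/ (at the end of the page there is a link to a tutorial, in Spanish)

Hope it helps!

UPDATE

I made a simple class that generates a file with the differences, striking through the removed lines, using the code below. (I do not know your required format very well; you can add more decorators if you need them.)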

package com.onuba.car.javadiff;

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

import difflib.Chunk;

public class Comparer {

    private final File original = new File("./src/test/resources/files/FileComparatorv1.java");

    private final File revised = new File("./src/test/resources/files/FileComparatorv2.java");

    public static void main(String[] args) {

        final Comparer comparer = new Comparer();

        comparer.createDiffFile();
    }

    private void createDiffFile() {

        PrintWriter diffFile = null;
        //RandomAccessFile diffFile = null;
        RandomAccessFile oldFile = null;

        try {

            //diffFile = new RandomAccessFile(new File("./diffFile_" + System.currentTimeMillis()), "rw");
            diffFile = new PrintWriter("./diffFile_" + System.currentTimeMillis(), "UTF-8");
            oldFile = new RandomAccessFile(original, "r");

            final FileComparator comparator = new FileComparator(original, revised);

            final List<Chunk> changesFromOriginal = comparator.getChangesFromOriginal();

            final int changeNum = changesFromOriginal.size();
            System.out.println("Tamaño de cambios: " + changeNum);

            final List<Integer> changesIndex = new ArrayList<Integer>();

            for (Chunk change : changesFromOriginal) {

                changesIndex.add(change.getPosition());
            }

            String line = oldFile.readLine();
            int lineIndex = 0;
            while (line != null) {

                if (changesIndex.contains(lineIndex)) {

                    String strikeLine = "From: <strike-through color=yellow>" + line + "</strike-through>";
                    diffFile.print(strikeLine + " To: <strong>");

                    for (Object s : changesFromOriginal.get(changesIndex.indexOf(lineIndex)).getLines()) {
                        diffFile.println(s.toString());
                    }
                    diffFile.print("</strong>");

                } else {

                    diffFile.println(line);
                }

                line = oldFile.readLine();
                lineIndex++;
            }

        } catch (IOException e) {

        } finally {
            try {
                if (diffFile != null) {
                    diffFile.close();
                }

                if (oldFile != null) {
                    oldFile.close();
                }
            } catch (IOException