Differences within a line

Question

use*_*394 145 command-line diff

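I have some SQL dumps and I am looking at the differences between them. diff can obviously show me which lines differ, but I am driving myself nuts trying to find which values in a long list of comma-separated values are actually the ones that make the lines different.

What tool can I use to point out the exact character differences between two lines in certain files?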

Answer 1

ale*_*lex 114

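There is wdiff, the word-level diff, for exactly this.

On the desktop, meld can highlight the differences within a line for you.

For color, install [colordiff](http://www.colordiff.org/), then run: `wdiff a b | colordiff` (61 upvotes)
Colored wdiff: `wdiff -w "$(tput bold;tput setaf 1)" -x "$(tput sgr0)" -y "$(tput bold;tput setaf 2)" -z "$(tput sgr0)" file1 file2` (12 upvotes)
`wdiff -n a b | colordiff`, as `man colordiff` advises. (5 upvotes)
There is also the `dwdiff` tool, which is mostly compatible with `wdiff` but also supports colored output and a few other features. It is more readily available in some Linux distributions, such as Arch. (2 upvotes)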

Answer 2

小智 54

Another method, using git-diff:

git diff -U0 --word-diff --no-index -- foo bar | grep -v ^@@

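The grep -v is there in case you are not interested in the positions of the differences.

This is exactly the behavior I was trying to mimic; I had not realized I could use git-diff without one of the files being indexed. (4 upvotes)
--word-diff is the key option here. Thanks! (3 upvotes)
One drawback is that git diff [does not work with process substitution](/sf/ask/1589470011/). `wdiff <(cmd1) <(cmd2)` works, but git would just compare the pipe IDs. (3 upvotes)
--no-index is only needed if you are inside a git working directory and foo and bar are in it as well. (2 upvotes)
On the plus side, git diff is far more powerful than wdiff. For example, you can use "--word-diff-regex=." for a character-by-character comparison. (2 upvotes)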
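
The character-by-character variant from the last comment can be sketched roughly as follows. This is a minimal sketch that assumes git is installed; `char_diff` is a hypothetical wrapper name, and `--word-diff=plain` and `--no-color` are added here only so the output stays predictable when it is piped or captured.

```shell
# char_diff is a hypothetical wrapper name, not part of git.
# It compares two files character by character by giving git's
# word-diff a regex that treats every single character as a "word".
char_diff() {
    git diff --no-index --no-color -U0 \
        --word-diff=plain --word-diff-regex=. -- "$1" "$2" |
        grep -v '^@@'    # drop the hunk headers, as in the answer above
}

# Usage: write two sample lines to temporary files and compare them.
tmpdir=$(mktemp -d)
printf '1,foo,3\n' > "$tmpdir/a.csv"
printf '1,bar,3\n' > "$tmpdir/b.csv"
char_diff "$tmpdir/a.csv" "$tmpdir/b.csv"
rm -rf "$tmpdir"
```

In plain word-diff output, removed text is wrapped in `[-...-]` and added text in `{+...+}`, so the changed characters can be read off directly within the line.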

Answer 3

Mar*_*try 25

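I have used vimdiff for this.

Here is a screenshot (not mine) in which one or two subtle character differences stand out clearly. There is also a quick tutorial.

Much as I like vim and vimdiff, the algorithm vimdiff uses to highlight differences within a line is pretty basic. It seems to simply strip the common prefix and suffix and highlight everything in between as different. That works if all the changed characters are grouped together, but it works poorly if they are spread out. It is also bad for word-wrapped text. (3 upvotes)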
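
The behavior that comment describes (strip the common prefix and suffix, then mark everything in between) can be illustrated with a short sketch. This is not vim's actual code, only a model of what the comment reports; `changed_span` is a hypothetical name used for illustration.

```shell
# changed_span is a hypothetical helper, not taken from vim.
# It removes the longest common prefix and suffix of two strings and
# prints what is left of each, i.e. the span that would be highlighted.
changed_span() {
    a=$1
    b=$2
    # Drop matching leading characters one at a time.
    while [ -n "$a" ] && [ -n "$b" ] && [ "${a%"${a#?}"}" = "${b%"${b#?}"}" ]; do
        a=${a#?}
        b=${b#?}
    done
    # Drop matching trailing characters one at a time.
    while [ -n "$a" ] && [ -n "$b" ] && [ "${a#"${a%?}"}" = "${b#"${b%?}"}" ]; do
        a=${a%?}
        b=${b%?}
    done
    printf '[%s] vs [%s]\n' "$a" "$b"
}

changed_span '1,2,3,4,5' '1,9,3,8,5'   # prints "[2,3,4] vs [9,3,8]"
```

Only the second and fourth values changed, yet the unchanged `3` ends up inside the marked span. That is the weakness with scattered changes that the comment points out.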

Answer 4

Pet*_*r.O 8

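Here is a "..hair of the dog that bit you" method...
diff got you this far; use it to take you further...

Here is the output for the sample line pairs... ☻ indicates a tab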

Paris in the     spring 
Paris in the the spring 
             vvvv      ^

A ca t on a hot tin roof.
a cant on a hot  in roof 
?   v           ^       ^

the quikc brown box jupps ober the laze dogs 
The☻qui ckbrown fox jumps over the lazy dogs 
?  ?   ^ ?      ?     ?    ?          ?     ^

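Here is the script.. You just need to ferret out the line pairs somehow.. (I had used diff only once (twice?) before today, so I do not know its many options, and sorting out the options for this script was enough for me for one day :) .. I think it should be simple enough, but I am due for a coffee break....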

#!/bin/bash
#
# Name: hair-of-the-diff
# Note: This script hasn't been extensively tested, so beware the alpha bug :) 
#   
# Brief: Uses 'diff' to identify the differences between two lines of text
#        $1 is a filename of a file which contains line pairs to be processed
#
#        If $1 is null "", then the sample pairs are processed (see below: Paris in the spring)
#          
# ? = changed character
# ^ = exists in first line, but not in second 
# v = exists in second line, but not in first

bname="$(basename "$0")"
workd="/tmp/$USER/$bname"; [[ ! -d "$workd" ]] && mkdir -p "$workd"

# Use $1 as the input file-name, else use this Test-data
# Note: this test loop expands \t \n etc ...(my editor auto converts \t to spaces) 
if [[ "$1" == '' ]] ;then
  ifile="$workd/ifile"
{ while IFS= read -r line ;do echo -e "$line" ;done <<EOF
Paris in the spring 
Paris in the the spring
A cat on a hot tin roof.
a cant on a hot in roof
the quikc brown box jupps ober the laze dogs 
The\tquickbrown fox jumps over the lazy dogs
EOF
} >"$ifile"
else
  ifile="$1"
fi
#
[[ -f "$ifile" ]] || { echo "ERROR: Input file NOT found:" ;echo "$ifile" ;exit 1 ; }
#  
# Check for balanced pairs of lines
ilct=$(<"$ifile" wc -l)
((ilct%2==0)) || { echo "ERROR: Uneven number of lines ($ilct) in the input." ;exit 2 ; }
#
ifs="$IFS" ;IFS=$'\n' ;set -f
ix=0 ;left=0 ;right=1
while IFS= read -r line ;do
  pair[ix]="$line" ;((ix++))
  if ((ix%2==0)) ;then
    # Change \x20 to \x02 to simplify parsing diff's output,
    #+   then change \x02 back to \x20 for the final output. 
    # Change \x09 to \x01 to simplify parsing diff's output, 
    #+   then change \x01 into ☻ U+263B (BLACK SMILING FACE) 
    #+   to keep the final display columns in line. 
    #+   '☻' is hopefully unique and obvious enough (otherwise change it) 
    diff --text -yt -W 19  \
         <(echo "${pair[0]}" |sed -e "s/\x09/\x01/g" -e "s/\x20/\x02/g" -e "s/\(.\)/\1\n/g") \
         <(echo "${pair[1]}" |sed -e "s/\x09/\x01/g" -e "s/\x20/\x02/g" -e "s/\(.\)/\1\n/g") \
     |sed -e "s/\x01/☻/g" -e "s/\x02/ /g" \
     |sed -e "s/^\(.\) *\x3C$/\1 \x3C  /g" \
     |sed -n "s/\(.\) *\(.\) \(.\)$/\1\2\3/p" \
     >"$workd/out"
     # (gedit "$workd/out" &)
     <"$workd/out" sed -e "s/^\(.\)..$/\1/" |tr -d '\n' ;echo
     <"$workd/out" sed -e "s/^..\(.\)$/\1/" |tr -d '\n' ;echo
     <"$workd/out" sed -e "s/^.\(.\).$/\1/" -e "s/|/?/" -e "s/</^/" -e "s/>/v/" |tr -d '\n' ;echo
    echo
    ((ix=0))
  fi
done <"$ifile"
IFS="$ifs" ;set +f
exit
#

Answer 5

Has*_*own 6

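Using @Peter.O's solution as a basis, I rewrote it to make a number of changes.

It prints each line only once, using color to show you the differences.
It does not write any temp files; everything is piped instead.
You can provide two filenames and it will compare the corresponding lines in each file. ./hairOfTheDiff.sh file1.txt file2.txt
Otherwise, if you use the original format (a single file in which every second line is compared with the line before it), you can now simply pipe it in; no file needs to exist to be read. Take a look at demo in the source; this may open the door to fancy piping, using paste and multiple file descriptors, so that two separate inputs do not need files either.

No highlight means the character is in both lines, highlight means it is in the first line only, and red means it is in the second line only.

The colors can be changed through the variables at the top of the script, and you can even drop color entirely and use plain characters to express the differences.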

#!/bin/bash

same='-' #unchanged
up='^' #exists in first line, but not in second 
down='v' #exists in second line, but not in first
reset=''

reset=$'\e[0m'
same=$reset
up=$reset$'\e[1m\e[7m'
down=$reset$'\e[1m\e[7m\e[31m'

timeout=1


if [[ "$1" != '' ]]
then
    paste -d'\n' "$1" "$2" | "$0"
    exit
fi

function demo {
    "$0" <<EOF
Paris in the spring 
Paris in the the spring
A cat on a hot tin roof.
a cant on a hot in roof
the quikc brown box jupps ober the laze dogs 
The quickbrown fox jumps over the lazy dogs
EOF
}

# Change \x20 to \x02 to simplify parsing diff's output,
#+   then change \x02 back to \x20 for the final output. 
# Change \x09 to \x01 to simplify parsing diff's output, 
#+   then change \x01 into 🅃 U+1F143 (Squared Latin Capital Letter T)
function input {
    sed \
        -e "s/\x09/\x01/g" \
        -e "s/\x20/\x02/g" \
        -e "s/\(.\)/\1\n/g"
}
function output {
    sed -n \
        -e "s/\x01/🅃/g" \
        -e "s/\x02/ /g" \
        -e "s/^\(.\) *\x3C$/\1 \x3C  /g" \
        -e "s/\(.\) *\(.\) \(.\)$/\1\2\3/p"
}

ifs="$IFS"
IFS=$'\n'
demo=true

while IFS= read -t "$timeout" -r a
do
    demo=false
    IFS= read -t "$timeout" -r b
    if [[ $? -ne 0 ]]
    then
        echo 'No corresponding line to compare with' > /dev/stderr
        exit 1
    fi

    diff --text -yt -W 19  \
        <(echo "$a" | input) \
        <(echo "$b" | input) \
    | \
    output | \
    {
        type=''
        buf=''
        while read -r line
        do
            if [[ "${line:1:1}" != "$type" ]]
            then
                if [[ "$type" = '|' ]]
                then
                    type='>'
                    echo -n "$down$buf"
                    buf=''
                fi

                if [[ "${line:1:1}" != "$type" ]]
                then
                    type="${line:1:1}"

                    echo -n "$type" \
                        | sed \
                            -e "s/[<|]/$up/" \
                            -e "s/>/$down/" \
                            -e "s/ /$same/"
                fi
            fi

            case "$type" in
            '|')
                buf="$buf${line:2:1}"
                echo -n "${line:0:1}"
                ;;
            '>')
                echo -n "${line:2:1}"
                ;;
            *)
                echo -n "${line:0:1}"
                ;;
            esac
        done

        if [[ "$type" = '|' ]]
        then
            echo -n "$down$buf"
        fi
    }

    echo -e "$reset"
done

IFS="$ifs"

if $demo
then
    demo
fi

Answer 6

ant*_*ony 5

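wdiff is actually a very old method of comparing files word by word. It works by reformatting the files, then using diff to find the differences and passing the result back again. I myself suggested adding context, so that rather than a plain word-by-word comparison, each word is compared surrounded by other "context" words. That lets diff synchronize itself on common passages in the files much better, especially when the files are mostly different with only a few blocks of common words, for example when checking text for plagiarism or reuse.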
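
The reformat-then-diff idea described above can be sketched in a few lines. This only illustrates the principle and is not wdiff's implementation; `one_word_per_line` and `naive_wdiff` are hypothetical names, and the context-word refinement is left out.

```shell
# Hypothetical helpers illustrating the idea; this is not wdiff's code.
# Put every whitespace-separated word of a file on its own line.
one_word_per_line() {
    tr -s '[:space:]' '\n' < "$1"
}

# Diff two files word by word and print only the changed words:
# "<" marks words found only in the first file, ">" only in the second.
naive_wdiff() {
    work=$(mktemp -d) || return 1
    one_word_per_line "$1" > "$work/a"
    one_word_per_line "$2" > "$work/b"
    diff "$work/a" "$work/b" | grep '^[<>]'
    rm -rf "$work"
}

# Usage with two small sample files.
sample=$(mktemp -d)
printf 'the quick brown fox\n' > "$sample/old.txt"
printf 'the quick red fox\n'   > "$sample/new.txt"
naive_wdiff "$sample/old.txt" "$sample/new.txt"
rm -rf "$sample"
```

Because each word sits on its own line, an ordinary line-based diff ends up reporting word-level changes; mapping those changes back onto the original layout is the part the real tools add on top.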

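dwdiff was later created from wdiff. But dwdiff puts that text-reformatting function to good use in dwfilter. This is a great development: it means you can reformat one text to match another and then compare them with any line-by-line graphical diff viewer. For example, using it with the "diffuse" graphical diff...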

dwfilter file1 file2 diffuse -w

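This reformats file1 to the format of file2 and hands the result to diffuse for a visual comparison. file2 is unmodified, so you can edit it directly in diffuse. If you want to edit file1 instead, you can add -r to reverse which file is reformatted. Try it and you will find it is extremely powerful!

My preferred graphical diff (as shown above) is diffuse, because it feels far cleaner and more useful. It is also a standalone python program, which means it is easy to install and distribute to other UNIX systems.

Other graphical diffs seem to have a lot of dependencies, but they can also be used (your choice). These include kdiff3 and xxdiff.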

Answer 7

小智 5

Here is a simple one-liner:

diff -y <(sed -e 's/,/\n/g' a.txt) <(sed -e 's/,/\n/g' b.txt)

The idea is to replace the commas (or whatever delimiter you want to use) with newlines using sed. diff then takes care of the rest.
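
The same idea can be wrapped in a small function for comparing two individual lines. This is a minimal sketch: `field_diff` is a hypothetical name, it uses `tr` instead of sed (which restricts it to a single-character delimiter), and it uses plain `diff` rather than `diff -y` so that the hunk header shows the position of the differing field.

```shell
# field_diff is a hypothetical helper name used for illustration.
# Usage: field_diff LINE1 LINE2 [DELIMITER]   (delimiter defaults to a comma)
# Each field is placed on its own line, so the line numbers that diff
# reports are field numbers within the original line.
field_diff() {
    delim=${3:-,}
    tmp=$(mktemp -d) || return 1
    printf '%s\n' "$1" | tr "$delim" '\n' > "$tmp/a"
    printf '%s\n' "$2" | tr "$delim" '\n' > "$tmp/b"
    diff "$tmp/a" "$tmp/b"
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# diff exits with status 1 when the inputs differ, hence the "|| true".
field_diff '10,alice,NULL,42' '10,alice,0,42' || true
```

For the sample call, diff reports the hunk `3c3` with `< NULL` and `> 0`, meaning the third value is the one that differs.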

Answer 8

rfe*_*urg 0

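If I am reading your question correctly, I use diff -y for this kind of thing.

The side-by-side comparison makes it much easier to find which lines contain the differences.

This does not highlight the differences within the line. If your lines are long, spotting the difference is painful. wdiff, git diff --word-diff, vimdiff, meld, kdiff3 and tkdiff can all do this. (3 upvotes)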

Archived:	14 years, 6 months ago
Views:	82,740
Last activity:	4 years, 2 months ago