Append the part of a line between the first closing parenthesis and the first question mark to the end of the line

Fat*_*iet 1 bash awk for-loop sed

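I am trying to take a section from near the start of each line in a file and append it to the end of that line. Currently, the file is formatted as follows: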

1.1) This is a sample question? Yes it is a sample question
1.2) Are you quite sure it is a sample question? I am quite sure
...

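What I want to do is append the question at the start of each line (but not the number) to the end of the line, producing a file formatted like this: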

1.1) This is a sample question? Yes it is a sample question This is a sample question
1.2) Are you quite sure it is a sample question? I am quite sure Are you quite sure it is a sample question
...

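I have already done a fair amount of restructuring of the original text file, including removing all question marks other than the one at the end of the relevant question, and all closing parentheses other than the one at the end of each line's number.

My reasoning was to use the closing parenthesis as a marker for where the section to repeat starts, and the question mark as a marker for where it ends. However, when I actually tried to implement this, I got nowhere.

I assume I need a for loop that goes through each line, activates when it sees a ), and appends every space-separated word after that to the end of the line until it sees a ?, at which point it stops and moves on to the next line; however, I am struggling to implement this in bash.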

mar*_*rkp 5

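Assumptions:

  • all lines of interest have the format: <text> + ) + <text2copy> + ? + <more_text>
  • for lines of interest, we want to append to the end of the line: <space> + <text2copy>
  • all other lines are to be left as-is

Sample data: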

$ cat questions.dat
1.1) This is a sample question? Yes it is a sample question
ignore this line and do nothing to it
1.2) Are you quite sure it is a sample question? I am quite sure

One idea using awk:

awk '
/[)].*[?]/ { split($0,arr,"[)?]")   # if line contains ")" + <text> + "?" then split
                                    # the line using ")" and "?" as delimiters, placing
                                    # results into array "arr[]"; bracket expressions
                                    # avoid an unmatched ")" that some awks reject
             $0 = $0 arr[2]         # append 2nd element of array (it starts with a
                                    # space) to end of line
           }
1                                   # print current line
' questions.dat

This generates:

1.1) This is a sample question? Yes it is a sample question This is a sample question
ignore this line and do nothing to it
1.2) Are you quite sure it is a sample question? I am quite sure Are you quite sure it is a sample question
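
To see why no explicit space is needed when appending arr[2], it can help to print what split() produces. This is a minimal sketch on a single hard-coded sample line (the variable names line and pieces are just for illustration):

```shell
# One sample line, hard-coded so the snippet is self-contained
line='1.1) This is a sample question? Yes it is a sample question'

# Split on ")" or "?" and show every piece between angle brackets
pieces=$(printf '%s\n' "$line" | awk '{
    n = split($0, arr, "[)?]")
    for (i = 1; i <= n; i++) printf "arr[%d]=<%s>\n", i, arr[i]
}')
printf '%s\n' "$pieces"
# arr[1]=<1.1>
# arr[2]=< This is a sample question>
# arr[3]=< Yes it is a sample question>
```

arr[2] keeps the space that followed the ")", so $0 arr[2] already contains the separator.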

Another idea, using sed with a capture group:

$ sed -E 's/^[^)]*[)] ([^?]*)[?].*/& \1/' questions.dat

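Where:

  • -E - enable extended regex support
  • ^[^)]*[)]<space> - match start of line (^) + any characters other than ) + ) + <space>
  • ([^?]*) - [1st capture group] match everything up to, but not including, the ?
  • [?].* - match from the ? to the end of the line
  • & \1 - print what our regex matched (in this case the whole line) + <space> + 1st capture group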
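
To check what the capture group picks up on its own, one option is to replace the whole match with just \1 and print only the lines where the substitution succeeded. A small sketch on one hard-coded sample line (line and captured are illustrative names):

```shell
# One sample line, hard-coded so the snippet is self-contained
line='1.2) Are you quite sure it is a sample question? I am quite sure'

# -n suppresses the default output; the trailing "p" prints only lines
# on which the substitution matched
captured=$(printf '%s\n' "$line" | sed -nE 's/^[^)]*[)] ([^?]*)[?].*/\1/p')
printf '%s\n' "$captured"
# Are you quite sure it is a sample question
```

Lines that do not match the pattern produce no output here, which makes this a quick way to confirm which lines the real command would modify.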

This generates:

1.1) This is a sample question? Yes it is a sample question This is a sample question
ignore this line and do nothing to it
1.2) Are you quite sure it is a sample question? I am quite sure Are you quite sure it is a sample question
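
For comparison, the line-by-line loop described in the question can be sketched in plain shell with parameter expansion instead of a word-by-word scan. This is only a sketch under the same assumptions as above; append_question is a made-up helper name, and it will be noticeably slower than awk or sed on large files:

```shell
# Hypothetical helper: reads lines on stdin, writes results to stdout
append_question() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *') '*'?'*)                   # line has ") " followed later by "?"
                rest=${line#*') '}        # drop everything up to the first ") "
                question=${rest%%'?'*}    # keep everything before the first "?"
                line="$line $question"
                ;;
        esac
        printf '%s\n' "$line"             # other lines pass through unchanged
    done
}

result=$(append_question <<'EOF'
1.1) This is a sample question? Yes it is a sample question
ignore this line and do nothing to it
1.2) Are you quite sure it is a sample question? I am quite sure
EOF
)
printf '%s\n' "$result"
```

With the real file, the call would be append_question < questions.dat.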