For each line of a text file, uppercase the first character after a semicolon

Question

I have a text file like this:

John Doe;john Doe is ...;he lives in ...
Mike Nelson;mike Nelson works for ...;he makes ...
Marcy William;marcy's mother is ...;marcy travels a lot...

I want to convert the first character after each semicolon to uppercase, so that the final result is:

John Doe;John Doe is ...;He lives in ...
Mike Nelson;Mike Nelson works for ...;He makes ...
Marcy William;Marcy's mother is ...;Marcy travels a lot...

Everything else should be left intact.

The file contains accented letters and is encoded in UTF-8.

Answer 1

Qua*_*odo 10

GNU sed:

sed 's/;[[:lower:]]/\U&/g' file

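For every lowercase character that follows a semicolon (;[[:lower:]]), we use the \U special sequence to make the match uppercase. The g flag replaces all occurrences on a line.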
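To make the role of the g flag concrete, here is a minimal Python sketch of the same substitution. It is an illustration, not part of the original answer: the helper name `upper_after_semicolon` and its `replace_all` parameter are hypothetical, and `str.islower()` stands in for the POSIX `[[:lower:]]` class, which Python's `re` module does not provide.

```python
import re

def upper_after_semicolon(line, replace_all=True):
    """Uppercase lowercase characters that directly follow a semicolon.

    replace_all=True behaves like the g flag (every occurrence);
    replace_all=False changes only the first occurrence on the line.
    """
    replaced = 0

    def repl(match):
        nonlocal replaced
        ch = match.group(0)
        # Leave the character alone if it is not lowercase, or if one
        # replacement was already made and only the first was requested.
        if not ch.islower() or (replaced and not replace_all):
            return ch
        replaced += 1
        return ch.upper()

    # (?<=;). matches any single character preceded by a semicolon.
    return re.sub(r"(?<=;).", repl, line)

line = "John Doe;john Doe is ...;he lives in ..."
print(upper_after_semicolon(line))                     # with "g"
print(upper_after_semicolon(line, replace_all=False))  # without "g"
```

With `replace_all=False` only `john` is capitalized and `he` stays lowercase, which is what the sed command would do without the g flag.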

If GNU sed is not available, a POSIX-compliant alternative is to use ex:

printf '%s\n' '%s/;[[:lower:]]/\U&/g' '%p' | ex file

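The substitute command is the same, but it must be prefixed with % so that it applies to all lines. %p prints the result. If you want to modify the file in place, replace %p with x.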

Answer 2

use*_*777 8

Not awk or sed, but perl:

perl -C -pe 's/;(.)/;\u$1/g'

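The -C option turns UTF-8 I/O on or off depending on your locale environment variables (LC_ALL etc.); if you want it to assume UTF-8 input and output unconditionally, change it to -CSD.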

Note that Unicode casing is tricky. This will turn ihsan into Ihsan rather than the correct İhsan (in the Turkish name, the i keeps its dot even when uppercase).
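The same pitfall shows up in other languages that use the default, locale-independent case mapping. The Python sketch below is only an illustration and not part of the original answer; the helper `turkish_upper_first` is hypothetical and handles just this one letter, whereas a real solution would need a locale-aware library.

```python
def turkish_upper_first(word):
    """Uppercase the first letter, mapping i to dotted capital I (U+0130)."""
    if not word:
        return word
    first = "\u0130" if word[0] == "i" else word[0].upper()
    return first + word[1:]

default = "ihsan"[:1].upper() + "ihsan"[1:]
turkish = turkish_upper_first("ihsan")

# Print the code point of the first letter to make the difference visible.
print("U+%04X" % ord(default[0]))  # U+0049, plain capital I
print("U+%04X" % ord(turkish[0]))  # U+0130, capital I with dot above
```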

Answer 3

ter*_*don 5

Here is one way to do it with perl:

perl -C -pe 's/;(.)/";" . uc($1)/eg' file

Since you did not show any accented characters in your input file, I used this file for testing:

$ cat file
John Doe;john Doe is ...;he lives in ...
Mike Nelson;mike Nelson works for ...;he makes ...
Émilie du Châtelet;émilie du Châtelet;works for ...;she makes ...
Marcy William;marcy's mother is ...;marcy travels a lot...
???? ????????;????'s brother is ...; ???? likes fish

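Explanation

-C: (see man perlrun for details) essentially, this enables UTF-8.
-pe: read the input file line by line and print each line after applying the script given by -e.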

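The work happens in the substitution operator, whose general format is s/old/new/flags. It replaces old with new, and flags control how it does so. Here, the flags used are e, which allows Perl code in the replacement, and g, which means "apply to all matches on the line".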

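The ;(.) matches a ; followed by any single character and captures that character as $1. We then replace the match with a ; followed by the captured character converted to uppercase (uc($1)).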
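For readers more familiar with Python, the capture-and-evaluate pattern explained above can be sketched as follows. This is an illustration rather than the answer's own code: the function name `capitalize_after_semicolons` is hypothetical, the callable passed to `re.sub` plays the role of Perl's e flag, and `re.sub` replaces every match by default, which corresponds to g.

```python
import re

def capitalize_after_semicolons(line):
    # ;(.) captures the character after each semicolon as group 1,
    # much like $1 in Perl. The lambda is called once per match and
    # returns the replacement text for that match.
    return re.sub(r";(.)", lambda m: ";" + m.group(1).upper(), line)

lines = [
    "John Doe;john Doe is ...;he lives in ...",
    "Marcy William;marcy's mother is ...;marcy travels a lot...",
]
for line in lines:
    print(capitalize_after_semicolons(line))

# Python 3 strings are Unicode, so accented letters are uppercased too:
# \u00e9 is e-acute and \u00c9 is its uppercase form.
assert capitalize_after_semicolons("x;\u00e9milie") == "x;\u00c9milie"
```

When reading from a real file, opening it with `encoding="utf-8"` would be the counterpart of Perl's -C option here.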
