How do I display Polish diacritics correctly in groff?

atm*_*cre 8 pdf unicode groff

I'm playing with groff, and I want to generate a PDF from the following test.ms:

.TL
Tytuł
.AU
Imię Nazwisko
.NH
Wstęp
.PP
Pierwszy paragraf. Jakieś informacje, żeby były polskie znaki.
.PP
Drugi paragraf. Reszta znaków:

??????ó????????Ó??
.NH
Bla bla bla
.PP
safsdsdfsasdds

As you can see, it contains Polish diacritics. After compiling it with groff -ms test.ms -T pdf > test.pdf, we get this mess: horrible!

My first guess was to compile it again with UTF-8 support.

$ groff -Kutf8 -ms test.ms -T pdf > test.pdf
test.ms:4: warning: can't find special character `u0065_0328'
test.ms:8: warning: can't find special character `u0073_0301'
test.ms:8: warning: can't find special character `u00A0'
test.ms:8: warning: can't find special character `u007A_0307'
test.ms:12: warning: can't find special character `u0061_0328'
test.ms:12: warning: can't find special character `u006E_0301'
test.ms:12: warning: can't find special character `u007A_0301'
test.ms:12: warning: can't find special character `u0041_0328'
test.ms:12: warning: can't find special character `u0045_0328'
test.ms:12: warning: can't find special character `u004E_0301'
test.ms:12: warning: can't find special character `u0053_0301'
test.ms:12: warning: can't find special character `u005A_0307'
test.ms:12: warning: can't find special character `u005A_0301'

Groff simply skips most of the characters, and the PDF looks like this:

Still awful.

After some googling, I found this:

groff -Kutf8 -Tdvi -mec -ms test.ms > test.dvi
dvipdfm -cz 9 test.dvi

Yes, it still fails (although it is better, with only one character skipped):

$ groff -Kutf8 -Tdvi -mec -ms test.ms > test.dvi
test.ms:8: warning: can't find special character `u00A0'

How can I make it work?

Edit: here is the output of locale:

LANG=pl_PL.UTF-8
LANGUAGE=
LC_CTYPE="pl_PL.UTF-8"
LC_NUMERIC="pl_PL.UTF-8"
LC_TIME="pl_PL.UTF-8"
LC_COLLATE="pl_PL.UTF-8"
LC_MONETARY="pl_PL.UTF-8"
LC_MESSAGES="pl_PL.UTF-8"
LC_PAPER="pl_PL.UTF-8"
LC_NAME="pl_PL.UTF-8"
LC_ADDRESS="pl_PL.UTF-8"
LC_TELEPHONE="pl_PL.UTF-8"
LC_MEASUREMENT="pl_PL.UTF-8"
LC_IDENTIFICATION="pl_PL.UTF-8"
LC_ALL=

L. *_*rel 2

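The character A0 is a non-breaking space (U+00A0). It appears to sit between "Jakieś" and "informacje". Replace it with a normal space in your editor and you should be good to go.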
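If the non-breaking space is hard to spot by eye, a short script can locate it. The following is only a minimal sketch in Python; the helper names `find_nbsp` and `replace_nbsp` are made up for illustration and are not part of groff or any library.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE, the u00A0 named in the groff warning


def find_nbsp(text):
    """Return 1-based (line, column) pairs for every U+00A0 in text."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == NBSP:
                hits.append((line_no, col_no))
    return hits


def replace_nbsp(text):
    """Return text with every U+00A0 replaced by an ordinary space."""
    return text.replace(NBSP, " ")


# Small in-memory sample with a non-breaking space after "Jakieś".
sample = ".PP\nPierwszy paragraf. Jakieś\u00a0informacje.\n"
print(find_nbsp(sample))
print(replace_nbsp(sample))
```

To apply this to the real file, one could read test.ms with `open("test.ms", encoding="utf-8")`, pass its contents through `replace_nbsp`, and write the result to a new file, assuming the file is in fact UTF-8 encoded.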


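Suggestion: I have set up my editors (emacs, vim) to highlight non-breaking spaces, because I sometimes type one unintentionally with AltGr+space when I hit space right after typing a character that requires AltGr.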


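The warnings after your first attempt seem to indicate that some characters (ę, ś, ż...) are encoded as a base letter plus a combining diacritic rather than as a single precomposed character. For example, ę == e (hex 65) + combining ogonek (hex 328) rather than "e with ogonek" (hex 119). How do you edit your source file? You can use the Compose key to produce "standalone" letters with diacritics, e.g. Compose e , gives "ę".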
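One way to check whether the source file really contains decomposed characters, and to convert them to their precomposed forms, is Unicode NFC normalization. This is a sketch using Python's standard `unicodedata` module; the helper names `combining_marks` and `to_nfc` are hypothetical. It only helps if the file actually stores the letters in decomposed form, which is a guess based on the warnings.

```python
import unicodedata


def combining_marks(text):
    """Return (index, Unicode name) for each combining character in text."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.combining(ch)
    ]


def to_nfc(text):
    """Compose base letter + combining mark pairs into single code points."""
    return unicodedata.normalize("NFC", text)


# "e" followed by COMBINING OGONEK, as in the u0065_0328 warning.
decomposed = "e\u0328"
composed = to_nfc(decomposed)

print(combining_marks(decomposed))
print([hex(ord(ch)) for ch in composed])
```

If `combining_marks` returns an empty list for the contents of test.ms, the file already uses precomposed letters and normalization will not change it.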
