BSD - Recursively remove non-ASCII characters from all file names in a directory

Dan*_*Dan 6 scripting bash rename bsd freenas

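I'm trying to migrate a bunch of files (300GB+) from a FAT32 drive to my FreeNAS ZFS file system, but every command I throw at it (tar, pax, mv, cp) fails when it encounters a non-ASCII file name. These are usually files created under Windows, and the name reads something like "foo?s bar.mp3...", where the ? might be an apostrophe or something similar.

Can anyone help with a few lines of code that recursively walk the directory tree and rename the files to remove the offending characters?

Much appreciated.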

Jus*_*tin 7

rename can do this.

Try something like

find dir -depth -exec rename -n 's/[^[:ascii:]]/_/g' {} \; | cat -v

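You may need the cat -v to display any odd characters properly without messing up your terminal.

If that prints acceptable substitutions, change the -n to -v.

That said, it sounds like the character set on your file system is wrong (mount -o utf8 ?), because this sort of thing really should just work...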
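
If the Perl rename utility is not available (it is not part of the FreeBSD base system), the same idea can be sketched in plain sh. This is an alternative under stated assumptions, not the command above: the function names sanitize_name and rename_tree are made up for illustration, every non-ASCII byte becomes one "_" (so a multi-byte UTF-8 character turns into several underscores), and only the last path component is changed, so that -depth can rename children before their parent directories. It is a dry run unless apply is passed as the second argument, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Sketch: replace non-ASCII bytes in file and directory names with "_".

# Print "$1" with every byte outside 0x00-0x7F replaced by "_".
# LC_ALL=C makes tr work on single bytes regardless of the current locale.
sanitize_name() {
    printf '%s' "$1" | LC_ALL=C tr -c '\000-\177' '_'
}

# rename_tree DIR [apply]
# Walks DIR depth-first (children before parents) and renames only the
# last path component. Without "apply" it just prints what it would do.
rename_tree() {
    root=$1
    mode=${2:-dry-run}
    find "$root" -depth | while IFS= read -r path; do
        # Leave the starting directory itself alone.
        if [ "$path" = "$root" ]; then continue; fi
        dir=${path%/*}
        base=${path##*/}
        new=$(sanitize_name "$base")
        if [ "$base" = "$new" ]; then continue; fi
        target=$dir/$new
        # Never overwrite an existing entry.
        if [ -e "$target" ] || [ -L "$target" ]; then
            printf 'skipped, target exists: %s\n' "$target" >&2
            continue
        fi
        if [ "$mode" = apply ]; then
            mv -- "$path" "$target"
        else
            printf '%s -> %s\n' "$path" "$target"
        fi
    done
}

# Run only when a directory is given on the command line.
if [ "$#" -gt 0 ]; then rename_tree "$@"; fi
```

Saved as, say, fix-names.sh (a hypothetical file name), `sh fix-names.sh dir | cat -v` previews the changes and `sh fix-names.sh dir apply` performs them.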


Den*_*son 0

Try mounting the file system with the iocharset option set to the encoding it uses.


From man mount, under the section "Mount options for fat":

   iocharset=value
          Character set to use for converting between 8 bit characters and
          16 bit Unicode characters. The default is iso8859-1.  Long
          filenames are stored on disk in Unicode format.

Also see the section "Mount options for vfat":

   uni_xlate
          Translate unhandled Unicode characters to special escaped
          sequences.  This lets you backup and restore filenames that are
          created with any Unicode characters. Without this option, a '?'
          is used when no translation is possible. The escape character is
          ':' because it is otherwise illegal on the vfat filesystem.  The
          escape sequence that gets used, where u is the unicode
          character, is: ':', (u & 0x3f), ((u>>6) & 0x3f), (u>>12).

   utf8   UTF8 is the filesystem safe 8-bit encoding of Unicode that is
          used by the console. It can be enabled for the filesystem
          with this option or disabled with utf8=0, utf8=no or utf8=false.
          If `uni_xlate' gets set, UTF8 gets disabled.

Edit:


Sorry, that was Linux; this is for BSD (from man mount_msdosfs):

 -L locale
     Specify locale name used for file name conversions for DOS and
     Win'95 names.  By default ISO 8859-1 assumed as local character
     set.

 -D DOS_codepage
     Specify the MS-DOS code page (aka IBM/OEM code page) name used
     for file name conversions for DOS names.
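
As a rough illustration of the -L option quoted above, the sketch below only assembles and prints a mount_msdosfs command line so it can be reviewed before running it as root. The helper name fat_mount_cmd, the device /dev/da0s1, the mount point /mnt/fat32 and the locale en_US.UTF-8 are all placeholders rather than values from the question, and whether a given locale is accepted depends on the FreeBSD release and the locales installed.

```shell
#!/bin/sh
# Sketch: print a mount_msdosfs command that sets the file-name locale.

# fat_mount_cmd DEVICE MOUNTPOINT [LOCALE]
# Prints the command instead of running it, so nothing gets mounted here.
fat_mount_cmd() {
    dev=$1
    mnt=$2
    loc=${3:-en_US.UTF-8}   # assumed default; pick the locale that matches the names
    # -L selects the locale used to convert DOS and Win'95 (long) file names.
    printf 'mount_msdosfs -L %s %s %s\n' "$loc" "$dev" "$mnt"
}

# Placeholder device and mount point.
fat_mount_cmd /dev/da0s1 /mnt/fat32
```

If the printed command looks right, it can be run as root; file names that still come out wrong would suggest trying a different locale or adding -D with the matching DOS code page.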