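I have a problem with Zalgo text.

Text like the sample below is messing up my imageboard. Is there a way to block these characters and "fix" or clean up the text?

Sample text: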

ALL IS LOS\xcc\x81\xcc\x8f\xcd\x84\xcd\x96\xcc\xa9\xcd\x87\xcc\x97\xcc\xaaT ALL I\xe2\x80\x8bS LOST pon\xcc\xb7y he comes he c\xcc\xb6\xcc\xaeomes he comes ich\xe2\x80\x8bor permeates my face my face \xe1\xb5\x92h god no NOO\xcc\xbcO\xe2\x80\x8bO N\xce\x98 stop an\xe2\x80\x8b*\xcd\x91\xcc\xbe\xcc\xbe\xcc\xb6\xe2\x80\x8b\xcc\x85\xcd\xab\xcd\x8f\xcc\x99\xcc\xa4g\xcd\x9b\xcd\x86\xcc\xbe\xcd\xab\xcc\x91\xcd\x86\xcd\x87\xcc\xabl\xcc\x8d\xcd\xab\xcd\xa5\xcd\xa8\xcd\x96\xcd\x89\xcc\x97\xcc\xa9\xcc\xb3\xcc\x9fe\xcc\x85\xcc\xa0s\xcd\x8ea\xcc\xa7\xcd\x88\xcd\x96r\xcc\xbd\xcc\xbe\xcd\x84\xcd\x92\xcd\x91e n\xe2\x80\x8bot re\xcc\x80\xcc\x91\xcd\xa7\xcc\x8ca\xcd\xa8l\xcc\x83\xcd\xa4\xcd\x82\xcc\xbe\xcc\x86\xcc\x98\xcc\x9d\xcc\x99 ZA\xcd\xa0\xcc\xa1\xcd\x8a\xcd\x9dLG\xce\x8c IS\xcd\xae\xcc\x82\xd2\x89\xcc\xaf\xcd\x88\xcd\x95\xcc\xb9\xcc\x98\xcc\xb1 TO\xcd\x85\xcd\x87\xcc\xb9\xcc\xba\xc6\x9d\xcc\xb4\xc8\xb3\xcc\xb3 TH\xcc\x98E\xcd\x84\xcc\x89\xcd\x96\xcd\xa0P\xcc\xaf\xcd\x8d\xcc\xadO\xcc\x9a\xe2\x80\x8bN\xcc\x90Y\xcc\xa1 H\xcd\xa8\xcd\x8a\xcc\xbd\xcc\x85\xcc\xbe\xcc\x8e\xcc\xa1\xcc\xb8\xcc\xaa\xcc\xafE\xcc\xbe\xcd\x9b\xcd\xaa\xcd\x84\xcc\x80\xcc\x81\xcc\xa7\xcd\x98\xcc\xac \xcc\xa9 \xcd\xa7\xcc\xbe\xcd\xac\xcc\xa7\xcc\xb6\xcc\xa8\xcc\xb1\xcc\xb9\xcc\xad\xcc\xafC\xcd\xad\xcc\x8f\xcd\xa5\xcd\xae\xcd\x9f\xcc\xb7\xcc\x99\xcc\xb2\xcc\x9d\xcd\x96O\xcd\xae\xcd\x8f\xcc\xae\xcc\xaa \xcc\x9d\xcd\x8dM\xcd\x8a\xcc\x92\xcc\x9a\xcd\xaa\xcd\xa9\xcd\xac\xcc\x9a\xcd\x9c\xcc\xb2\xcc\x96E\xcc\x91\xcd\xa9\xcd\x8c\xcd\x9d\xcc\xb4\xcc\x9f\xcc\x9f\xcd\x99\xcc\x9eS\xcd\xaf\xcc\xbf\xcc\x94\xcc\xa8 \xcd\x80\xcc\xa5\xcd\x85\xcc\xab\xcd\x8e\xcc\xad

I tried using this solution:

    $cleanMessage = preg_replace("/[^\x20-\xAD\x7F]/", "", $input_lines);

Taken from here: Remove special characters that mess with formatting

But it only works for Latin characters. Can anyone help me?
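
One possible direction, sketched here under assumptions and not taken from the linked answer: the pattern above has no `u` modifier, so PCRE matches byte by byte and deletes every byte above 0xAD. All UTF-8 lead bytes fall in that range, which would explain why non-Latin text gets mangled. Matching on Unicode properties instead of byte ranges may avoid that. `stripZalgo` is a hypothetical helper name, and the code assumes valid UTF-8 input and a PCRE build with Unicode support.

```php
<?php
// Hypothetical helper (not from the original post): strips Zalgo-style
// decorations from a UTF-8 string.
// Assumes valid UTF-8 input and PCRE built with Unicode support.
function stripZalgo(string $text): string
{
    // The "u" modifier makes PCRE work on code points instead of bytes,
    // so multi-byte characters such as Chinese are left intact.
    // \p{Mn} = non-spacing marks, \p{Me} = enclosing marks,
    // U+200B = zero-width space.
    $clean = preg_replace('/[\p{Mn}\p{Me}\x{200B}]+/u', '', $text);

    // preg_replace() returns null on failure (for example invalid UTF-8);
    // fall back to the untouched input in that case.
    return $clean ?? $text;
}

// Usage: "c" + U+0336 + U+032E + "omes" next to Chinese text.
echo stripZalgo("他c\u{0336}\u{032E}omes"), PHP_EOL; // prints: 他comes
```

Note that this removes every combining mark, including legitimate ones in scripts that rely on them (and accents stored in decomposed form), so it is only a starting point. Capping the number of consecutive marks instead of removing them all may be a gentler variant.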