Detecting broken UTF-8 characters in MySQL

The*_*tor 37 mysql utf-8

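I have a database with a bunch of broken utf8 characters scattered across several tables. The list of characters isn't very extensive AFAIK (áéíúóÁÉÍÓÚÑñ)

Fixing a given table is very straightforward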

update orderItem set itemName=replace(itemName,'Ã¡','á');

But I cannot find a way to detect the broken characters. If I do something like

SELECT * FROM TABLE WHERE field LIKE "%Ã%";

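because of the collation (Ã = a) I get almost every row. So far all the broken characters start with "Ã". The database is in Spanish, so this particular character is not used

The list of broken characters I have so far is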

Ã¡ = á
Ã© = é
Ã­- = í
Ã³ = ó
Ã± = ñ
Ã = Á (the second byte, 0x81, is not printable)

Any idea how to make this SELECT work as intended? (a binary search or something like that)
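
To see why every broken character starts with "Ã", it may help to reproduce the damage outside MySQL. The Python sketch below is only an illustration and assumes the usual cause: the text was stored as UTF-8 and then read back as cp1252, which is what MySQL calls latin1. The helper name `break_text` is made up for this example.

```python
def break_text(text: str) -> str:
    """Return the mojibake form of text: its UTF-8 bytes read as cp1252.

    Assumption: this mirrors how the rows were damaged; the real cause
    in a given database may differ.
    """
    return text.encode("utf-8").decode("cp1252")


# Print each Spanish character next to its broken form.
for ch in "áéóúñ":
    print(ch, "->", break_text(ch))
```

Every character from À to ÿ encodes to a UTF-8 pair whose first byte is 0xC3, and 0xC3 is "Ã" in cp1252, hence the common prefix. Python's cp1252 codec rejects the bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D, which MySQL's latin1 accepts, so letters such as Á and Í would raise an error in this sketch and are left out of the loop.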

Tha*_*lin 57

I fixed it with

UPDATE wp_zcs9ck_posts_copy SET post_title = 
    CONVERT(BINARY CONVERT(post_title USING latin1) USING utf8);

Full solution: http://jonisalonen.com/2012/fixing-doubly-utf-8-encoded-text-in-mysql/

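  • Best solution! Thank you very much. (4 upvotes)
  • I used an unnecessary subquery in my comment above. Please use this instead: `SELECT name, CONVERT(BINARY CONVERT(name USING latin1) USING utf8) AS converted FROM table WHERE LENGTH(name) != CHAR_LENGTH(name)` (3 upvotes)
  • Back up your database first... this deleted the first half of the sentences in my wp_posts -> post_content field! (2 upvotes)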
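
The same round trip can be tried on a few sample values before running the UPDATE. Here is a minimal Python sketch, assuming the column holds UTF-8 text that was double-encoded through cp1252 (MySQL's latin1). `repair` is a hypothetical helper for illustration, not something MySQL provides. Values that do not survive the round trip are returned unchanged, which is one way to avoid damaging rows that were already correct; that may be related to the lost text reported in the last comment, though that is only a guess.

```python
def repair(text: str) -> str:
    """Undo one round of double encoding, or return text unchanged."""
    try:
        # Same idea as CONVERT(BINARY CONVERT(col USING latin1) USING utf8):
        # go back to the raw bytes, then read those bytes as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside cp1252 or its bytes are
        # not valid UTF-8, so it was probably not double-encoded.
        return text


print(repair("canciÃ³n"))  # a double-encoded value
print(repair("canción"))   # an already correct value
print(repair("cancion"))   # plain ASCII
```

Unlike MySQL's latin1, Python's cp1252 codec has no mapping for the bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D, so broken forms of letters such as Á or Í are left untouched by this sketch.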

Anonymous 39

UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ã¡','á');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ã¤','ä');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ã©','é');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'í©','é');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ã³','ó');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'íº','ú');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ãº','ú');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ã±','ñ');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'í‘','Ñ');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Ã','í');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'â€“','–');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€™','\'');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€¦','...');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€“','-');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€œ','"');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€','"');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€˜','\'');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€¢','-');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name`,'â€¡','c');
UPDATE `table_name` SET `column_name` = REPLACE(`column_name` ,'Â','');


Anonymous 16

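Thanks for your answer!!

I used this to fix my table and wanted to share the complete list of changes. Note that it also fixes HTML-encoded characters in addition to the Latin characters; it was a real mess: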

update `table` set `field` = replace(`field`, 'â€', '"');
update `table` set `field` = replace(`field`, 'â€“', '–');
update `table` set `field` = replace(`field`, 'â€¢', '-');
update `table` set `field` = replace(`field`, 'â€œ', '"');

update `table` set `field` = replace(`field`, 'Â¡', '¡');
update `table` set `field` = replace(`field`, 'Â¢', '¢');
update `table` set `field` = replace(`field`, 'Â£', '£');
update `table` set `field` = replace(`field`, 'Â¤', '¤');
update `table` set `field` = replace(`field`, 'Â¥', '¥');
update `table` set `field` = replace(`field`, 'Â¦', '¦');
update `table` set `field` = replace(`field`, 'Â§', '§');
update `table` set `field` = replace(`field`, 'Â¨', '¨');
update `table` set `field` = replace(`field`, 'Â©', '©');
update `table` set `field` = replace(`field`, 'Âª', 'ª');
update `table` set `field` = replace(`field`, 'Â«', '«');
update `table` set `field` = replace(`field`, 'Â¬', '¬');
# This one looks like it's missing a character, but it's there. 0xad
update `table` set `field` = replace(`field`, '­', '­');
update `table` set `field` = replace(`field`, 'Â®', '®');
update `table` set `field` = replace(`field`, 'Â¯', '¯');
update `table` set `field` = replace(`field`, 'Â°', '°');
update `table` set `field` = replace(`field`, 'Â±', '±');
update `table` set `field` = replace(`field`, 'Â²', '²');
update `table` set `field` = replace(`field`, 'Â³', '³');
update `table` set `field` = replace(`field`, 'Â´', '´');
update `table` set `field` = replace(`field`, 'Âµ', 'µ');
update `table` set `field` = replace(`field`, 'Â¶', '¶');
update `table` set `field` = replace(`field`, 'Â·', '·');
update `table` set `field` = replace(`field`, 'Â¸', '¸');
update `table` set `field` = replace(`field`, 'Â¹', '¹');
update `table` set `field` = replace(`field`, 'Âº', 'º');
update `table` set `field` = replace(`field`, 'Â»', '»');
update `table` set `field` = replace(`field`, 'Â¼', '¼');
update `table` set `field` = replace(`field`, 'Â½', '½');
update `table` set `field` = replace(`field`, 'Â¾', '¾');
update `table` set `field` = replace(`field`, 'Â¿', '¿');

update `table` set `field` = replace(`field`, 'Ã€', 'À');
# This one looks like it's missing a character, but it's there. 0x81
update `table` set `field` = replace(`field`, 'Ã', 'Á');
update `table` set `field` = replace(`field`, 'Ã‚', 'Â');
update `table` set `field` = replace(`field`, 'Ãƒ', 'Ã');
update `table` set `field` = replace(`field`, 'Ã„', 'Ä');
update `table` set `field` = replace(`field`, 'Ã…', 'Å');
update `table` set `field` = replace(`field`, 'Ã†', 'Æ');
update `table` set `field` = replace(`field`, 'Ã‡', 'Ç');
update `table` set `field` = replace(`field`, 'Ãˆ', 'È');
update `table` set `field` = replace(`field`, 'Ã‰', 'É');
update `table` set `field` = replace(`field`, 'ÃŠ', 'Ê');
update `table` set `field` = replace(`field`, 'Ã‹', 'Ë');
update `table` set `field` = replace(`field`, 'ÃŒ', 'Ì');
# This one looks like it's missing a character, but it's there. 0x8d
update `table` set `field` = replace(`field`, 'Ã', 'Í');
update `table` set `field` = replace(`field`, 'ÃŽ', 'Î');
# This one looks like it's missing a character, but it's there. 0x8f
update `table` set `field` = replace(`field`, 'Ã', 'Ï');
# This one looks like it's missing a character, but it's there. 0x90
update `table` set `field` = replace(`field`, 'Ã', 'Ð');
update `table` set `field` = replace(`field`, 'Ã‘', 'Ñ');
update `table` set `field` = replace(`field`, 'Ã’', 'Ò');
update `table` set `field` = replace(`field`, 'Ã“', 'Ó');
update `table` set `field` = replace(`field`, 'Ã”', 'Ô');
update `table` set `field` = replace(`field`, 'Ã•', 'Õ');
update `table` set `field` = replace(`field`, 'Ã–', 'Ö');
update `table` set `field` = replace(`field`, 'Ã—', '×');
update `table` set `field` = replace(`field`, 'Ã˜', 'Ø');
update `table` set `field` = replace(`field`, 'Ã™', 'Ù');
update `table` set `field` = replace(`field`, 'Ãš', 'Ú');
update `table` set `field` = replace(`field`, 'Ã›', 'Û');
update `table` set `field` = replace(`field`, 'Ãœ', 'Ü');
# This one looks like it's missing a character, but it's there. 0x9d
update `table` set `field` = replace(`field`, 'Ã', 'Ý');
update `table` set `field` = replace(`field`, 'Ãž', 'Þ');
update `table` set `field` = replace(`field`, 'ÃŸ', 'ß');
update `table` set `field` = replace(`field`, 'à', 'à');
update `table` set `field` = replace(`field`, 'Ã¡', 'á');
update `table` set `field` = replace(`field`, 'Ã¢', 'â');
update `table` set `field` = replace(`field`, 'Ã£', 'ã');
update `table` set `field` = replace(`field`, 'Ã¤', 'ä');
update `table` set `field` = replace(`field`, 'Ã¥', 'å');
update `table` set `field` = replace(`field`, 'Ã¦', 'æ');
update `table` set `field` = replace(`field`, 'Ã§', 'ç');
update `table` set `field` = replace(`field`, 'Ã¨', 'è');
update `table` set `field` = replace(`field`, 'Ã©', 'é');
update `table` set `field` = replace(`field`, 'Ãª', 'ê');
update `table` set `field` = replace(`field`, 'Ã«', 'ë');
update `table` set `field` = replace(`field`, 'Ã¬', 'ì');
# This one looks like it's missing a character, but it's there. 0xad
update `table` set `field` = replace(`field`, '­­Ã', 'í');
update `table` set `field` = replace(`field`, 'Ã®', 'î');
update `table` set `field` = replace(`field`, 'Ã¯', 'ï');
update `table` set `field` = replace(`field`, 'Ã°', 'ð');
update `table` set `field` = replace(`field`, 'Ã±', 'ñ');
update `table` set `field` = replace(`field`, 'Ã²', 'ò');
update `table` set `field` = replace(`field`, 'Ã³', 'ó');
update `table` set `field` = replace(`field`, 'Ã´', 'ô');
update `table` set `field` = replace(`field`, 'Ãµ', 'õ');
update `table` set `field` = replace(`field`, 'Ã¶', 'ö');
update `table` set `field` = replace(`field`, 'Ã·', '÷');
update `table` set `field` = replace(`field`, 'Ã¸', 'ø');
update `table` set `field` = replace(`field`, 'Ã¹', 'ù');
update `table` set `field` = replace(`field`, 'Ãº', 'ú');
update `table` set `field` = replace(`field`, 'Ã»', 'û');
update `table` set `field` = replace(`field`, 'Ã¼', 'ü');
update `table` set `field` = replace(`field`, 'Ã½', 'ý');
update `table` set `field` = replace(`field`, 'Ã¾', 'þ');
update `table` set `field` = replace(`field`, 'Ã¿', 'ÿ');

update `table` set `field` = replace(`field` ,'&ccedil;','ç');
update `table` set `field` = replace(`field` ,'&atilde;','ã');
update `table` set `field` = replace(`field` ,'&aacute;','á');
update `table` set `field` = replace(`field` ,'&acirc;','â');
update `table` set `field` = replace(`field` ,'&eacute;','é');
update `table` set `field` = replace(`field` ,'&iacute;','í');
update `table` set `field` = replace(`field` ,'&otilde;','õ');
update `table` set `field` = replace(`field` ,'&uacute;','ú');
update `table` set `field` = replace(`field` ,'&ccedil;','ç');
update `table` set `field` = replace(`field` ,'&Aacute;','Á');
update `table` set `field` = replace(`field` ,'&Acirc;','Â');
update `table` set `field` = replace(`field` ,'&Eacute;','É');
update `table` set `field` = replace(`field` ,'&Iacute;','Í');
update `table` set `field` = replace(`field` ,'&Otilde;','Õ');
update `table` set `field` = replace(`field` ,'&Uacute;','Ú');
update `table` set `field` = replace(`field` ,'&Ccedil;','Ç');
update `table` set `field` = replace(`field` ,'&Atilde;','Ã');
update `table` set `field` = replace(`field` ,'&Agrave;','À');
update `table` set `field` = replace(`field` ,'&Ecirc;','Ê');
update `table` set `field` = replace(`field` ,'&Oacute;','Ó');
update `table` set `field` = replace(`field` ,'&Ocirc;','Ô');
update `table` set `field` = replace(`field` ,'&Uuml;','Ü');
update `table` set `field` = replace(`field` ,'&atilde;','ã');
update `table` set `field` = replace(`field` ,'&agrave;','à');
update `table` set `field` = replace(`field` ,'&ecirc;','ê');
update `table` set `field` = replace(`field` ,'&oacute;','ó');
update `table` set `field` = replace(`field` ,'&ocirc;','ô');
update `table` set `field` = replace(`field` ,'&uuml;','ü');
update `table` set `field` = replace(`field` ,'&amp;','&');
update `table` set `field` = replace(`field` ,'&gt;','>');
update `table` set `field` = replace(`field` ,'&lt;','<');
update `table` set `field` = replace(`field` ,'&circ;','ˆ');
update `table` set `field` = replace(`field` ,'&tilde;','˜');
update `table` set `field` = replace(`field` ,'&uml;','¨');
update `table` set `field` = replace(`field` ,'&acute;','´');
update `table` set `field` = replace(`field` ,'&cedil;','¸');
update `table` set `field` = replace(`field` ,'&quot;','"');
update `table` set `field` = replace(`field` ,'&ldquo;','“');
update `table` set `field` = replace(`field` ,'&rdquo;','”');
update `table` set `field` = replace(`field` ,'&lsquo;','‘');
update `table` set `field` = replace(`field` ,'&rsquo;','’');
update `table` set `field` = replace(`field` ,'&lsaquo;','‹');
update `table` set `field` = replace(`field` ,'&rsaquo;','›');
update `table` set `field` = replace(`field` ,'&laquo;','«');
update `table` set `field` = replace(`field` ,'&raquo;','»');
update `table` set `field` = replace(`field` ,'&ordm;','º');
update `table` set `field` = replace(`field` ,'&ordf;','ª');
update `table` set `field` = replace(`field` ,'&ndash;','–');
update `table` set `field` = replace(`field` ,'&mdash;','—');
update `table` set `field` = replace(`field` ,'&macr;','¯');
update `table` set `field` = replace(`field` ,'&hellip;','…');
update `table` set `field` = replace(`field` ,'&brvbar;','¦');
update `table` set `field` = replace(`field` ,'&bull;','•');
update `table` set `field` = replace(`field` ,'&para;','¶');
update `table` set `field` = replace(`field` ,'&sect;','§');
update `table` set `field` = replace(`field` ,'&sup1;','¹');
update `table` set `field` = replace(`field` ,'&sup2;','²');
update `table` set `field` = replace(`field` ,'&sup3;','³');
update `table` set `field` = replace(`field` ,'&frac12;','½');
update `table` set `field` = replace(`field` ,'&frac14;','¼');
update `table` set `field` = replace(`field` ,'&frac34;','¾');
update `table` set `field` = replace(`field` ,'&#8539;','⅛');
update `table` set `field` = replace(`field` ,'&#8540;','⅜');
update `table` set `field` = replace(`field` ,'&#8541;','⅝');
update `table` set `field` = replace(`field` ,'&#8542;','⅞');
update `table` set `field` = replace(`field` ,'&gt;','>');
update `table` set `field` = replace(`field` ,'&lt;','<');
update `table` set `field` = replace(`field` ,'&plusmn;','±');
update `table` set `field` = replace(`field` ,'&minus;','−');
update `table` set `field` = replace(`field` ,'&times;','×');
update `table` set `field` = replace(`field` ,'&divide;','÷');
update `table` set `field` = replace(`field` ,'&lowast;','∗');
update `table` set `field` = replace(`field` ,'&frasl;','⁄');
update `table` set `field` = replace(`field` ,'&permil;','‰');
update `table` set `field` = replace(`field` ,'&int;','∫');
update `table` set `field` = replace(`field` ,'&sum;','∑');
update `table` set `field` = replace(`field` ,'&prod;','∏');
update `table` set `field` = replace(`field` ,'&radic;','√');
update `table` set `field` = replace(`field` ,'&infin;','∞');
update `table` set `field` = replace(`field` ,'&asymp;','≈');
update `table` set `field` = replace(`field` ,'&cong;','≅');
update `table` set `field` = replace(`field` ,'&prop;','∝');
update `table` set `field` = replace(`field` ,'&equiv;','≡');
update `table` set `field` = replace(`field` ,'&ne;','≠');
update `table` set `field` = replace(`field` ,'&le;','≤');
update `table` set `field` = replace(`field` ,'&ge;','≥');
update `table` set `field` = replace(`field` ,'&there4;','∴');
update `table` set `field` = replace(`field` ,'&sdot;','⋅');
update `table` set `field` = replace(`field` ,'&middot;','·');
update `table` set `field` = replace(`field` ,'&part;','∂');
update `table` set `field` = replace(`field` ,'&image;','ℑ');
update `table` set `field` = replace(`field` ,'&real;','ℜ');
update `table` set `field` = replace(`field` ,'&prime;','′');
update `table` set `field` = replace(`field` ,'&Prime;','″');
update `table` set `field` = replace(`field` ,'&deg;','°');
update `table` set `field` = replace(`field` ,'&ang;','∠');
update `table` set `field` = replace(`field` ,'&perp;','⊥');
update `table` set `field` = replace(`field` ,'&nabla;','∇');
update `table` set `field` = replace(`field` ,'&oplus;','⊕');
update `table` set `field` = replace(`field` ,'&otimes;','⊗');
update `table` set `field` = replace(`field` ,'&alefsym;','ℵ');
update `table` set `field` = replace(`field` ,'&oslash;','ø');
update `table` set `field` = replace(`field` ,'&Oslash;','Ø');
update `table` set `field` = replace(`field` ,'&isin;','∈');
update `table` set `field` = replace(`field` ,'&notin;','∉');
update `table` set `field` = replace(`field` ,'&cap;','∩');
update `table` set `field` = replace(`field` ,'&cup;','∪');
update `table` set `field` = replace(`field` ,'&sub;','⊂');
update `table` set `field` = replace(`field` ,'&sup;','⊃');
update `table` set `field` = replace(`field` ,'&sube;','⊆');
update `table` set `field` = replace(`field` ,'&supe;','⊇');
update `table` set `field` = replace(`field` ,'&exist;','∃');
update `table` set `field` = replace(`field` ,'&forall;','∀');
update `table` set `field` = replace(`field` ,'&empty;','∅');
update `table` set `field` = replace(`field` ,'&not;','¬');
update `table` set `field` = replace(`field` ,'&and;','∧');
update `table` set `field` = replace(`field` ,'&or;','∨');
update `table` set `field` = replace(`field` ,'&crarr;','↵');

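  • This is a great list, but the ordering is problematic: for example, by the time 'â€¦' is searched for, its leading 'â€' part will already have been replaced. (2 upvotes)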

Dav*_*vid 13

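No text replacement is a general solution, because you can always forget some characters. A more appropriate fix for double-converted characters is to:

  1. Convert back to latin1
  2. Convert to binary
  3. Convert to utf8

Like this: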

alter table descriptions modify name VARCHAR(2000) character set latin1;
alter table descriptions modify name blob;
alter table descriptions modify name VARCHAR(2000) character set utf8;


Ada*_*nch 12

The SELECT statement you need is the following:

SELECT * FROM TABLE WHERE LENGTH(name) != CHAR_LENGTH(name);

This returns all rows that contain multibyte characters.

This assumes name is the field (or fields) where the odd characters can be found.
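
In MySQL, LENGTH counts bytes and CHAR_LENGTH counts characters, so the query matches any row with a multibyte character, including rows that are encoded correctly. A rough Python sketch of the same test, plus a stricter check that only flags values which can be repaired by a cp1252 to UTF-8 round trip, might look like this. Both function names are invented for the example, and the stricter check assumes the damage came from double encoding through cp1252 (MySQL's latin1).

```python
def has_multibyte(text: str) -> bool:
    """Rough analogue of LENGTH(name) != CHAR_LENGTH(name) on a utf8 column."""
    return len(text.encode("utf-8")) != len(text)


def looks_double_encoded(text: str) -> bool:
    """True if the cp1252 bytes of text decode as UTF-8 to something different."""
    try:
        repaired = text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != text


# One broken value, one correct accented value, one plain ASCII value.
rows = ["camiÃ³n", "camión", "camion"]
for row in rows:
    print(row, has_multibyte(row), looks_double_encoded(row))
```

The correctly encoded "camión" passes the length test but not the stricter one, which is the kind of row the plain length comparison cannot tell apart from a broken one.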


Pab*_*eco 10

This saved my life

UPDATE ohp_posts SET post_content = CONVERT(CAST(CONVERT(post_content USING latin1) AS BINARY) USING utf8)

I found it here: http://stanis.net/2014/04/replacing-latin-1-with-utf-8-characters-in-mysql/


wds*_*wds 7

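How about a different approach, namely converting the column back and forth to get the correct character set? You can convert it to binary, then to utf-8, and then to iso-8859-1 or whatever else you are using. See the manual for details.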


Med*_*dhi 6

To convert all the Latin characters to their correct accented forms, try the following on MySQL:

UPDATE your_table SET your_column = CONVERT(CAST(CONVERT(your_column USING latin1) AS BINARY) USING utf8)