MySQL treats 'außer' as equal to 'auser'

Fan*_*Lin 3 mysql sql encoding

I am trying to migrate some tables from one MySQL database to another, but I get this error:

ERROR 1062 (23000) at line 108: Duplicate entry 'außer' for key 'PRIMARY'

To find out why, I ran the following on the target database:

mysql> select 'außer' = 'auser';
+--------------------+
| 'außer' = 'auser'  |
+--------------------+
|                  1 |
+--------------------+
1 row in set (0.07 sec)

As you can see, MySQL considers the two strings equal. I then checked the configuration variables:

mysql> show variables like 'coll%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+

mysql> show variables like 'character%';
+--------------------------+------------------------------------------+
| Variable_name            | Value                                    |
+--------------------------+------------------------------------------+
| character_set_client     | utf8                                     |
| character_set_connection | utf8                                     |
| character_set_database   | utf8                                     |
| character_set_filesystem | binary                                   |
| character_set_results    | utf8                                     |
| character_set_server     | utf8                                     |
| character_set_system     | utf8                                     |
| character_sets_dir       | /rdsdbbin/mysql-5.5.8.R1/share/charsets/ |
+--------------------------+------------------------------------------+

Then I went back to the source database and tried the same thing:

mysql> select 'außer' = 'auser';
+--------------------+
| 'außer' = 'auser'  |
+--------------------+
|                  0 |
+--------------------+
1 row in set (0.00 sec)

mysql> show variables like 'coll%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)

mysql> show variables like 'haracter%';
Empty set (0.00 sec)

mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

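The source server runs MySQL 5.0.77 and the migration target runs 5.5.8. The character set and collation variables are identical on both, so I don't understand how this can happen. Why do the two servers compare strings differently, and how can I fix it? Thanks.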

fak*_*ker 9

As documented at http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-sets.html, this seems to be the correct behavior:

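utf8_general_ci also is satisfactory for both German and French, except that "ß" is equal to "s", and not to "ss". If this is acceptable for your application, you should use utf8_general_ci because it is faster. Otherwise, use utf8_unicode_ci because it is more accurate.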
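
To see the effect of the collation directly, the comparison from the question can be repeated with an explicit COLLATE clause. The following is a minimal sketch, assuming a 5.5 server and a utf8 connection as shown in the question. The table name words_demo and its column word are hypothetical, since the question does not show the real schema. The different result on the 5.0.77 server is consistent with the handling of "ß" in utf8_general_ci having been changed during the 5.1 series, which would explain why identical variables give different answers.

```sql
-- Compare the same strings under explicit collations.
-- Per the manual text quoted above, utf8_general_ci treats 'ß' as 's',
-- while utf8_unicode_ci treats 'ß' as 'ss'; utf8_bin compares code points.
SELECT 'außer' = 'auser'  COLLATE utf8_general_ci AS general_vs_s,
       'außer' = 'auser'  COLLATE utf8_unicode_ci AS unicode_vs_s,
       'außer' = 'ausser' COLLATE utf8_unicode_ci AS unicode_vs_ss,
       'außer' = 'auser'  COLLATE utf8_bin        AS bin_vs_s;

-- Hypothetical table: declare the key column with a collation that
-- keeps 'außer' and 'auser' distinct, so both rows fit under one PRIMARY KEY.
CREATE TABLE words_demo (
  word VARCHAR(64) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (word)
);

INSERT INTO words_demo (word) VALUES ('außer'), ('auser');

-- Clean up the demonstration table.
DROP TABLE words_demo;
```

Which collation to pick depends on the data. With utf8_unicode_ci, 'außer' and 'ausser' compare as equal, so a table that holds both would still raise a duplicate-key error. utf8_bin keeps every distinct byte sequence apart, at the cost of case-sensitive and accent-sensitive comparisons. In a real migration the same COLLATE clause would go on the key column in the dump's CREATE TABLE statement, or be applied with ALTER TABLE before loading the data.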