【举例】波兰语Śląskie中的Ś和ą字母显示的是?
1.修改配置文件my.ini里的:
default-character-set = utf8
character-set-server = utf8
2.命令行设置:
mysql> show variables like 'character%';
+--------------------------+----------------------------------------------------
-----+
| Variable_name | Value
|
+--------------------------+----------------------------------------------------
-----+
| character_set_client | utf8
|
| character_set_connection | utf8
|
| character_set_database | utf8
|
| character_set_filesystem | binary
|
| character_set_results | utf8
|
| character_set_server | utf8
|
| character_set_system | utf8
|
| character_sets_dir | C:\Program Files\MySQL\MySQL Server 5.5\share\chars
ets\ |
+--------------------------+----------------------------------------------------
-----+
8 rows in set (0.05 sec)
mysql> show variables like 'collation%';
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_general_ci |
| collation_server | utf8_general_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
mysql>
3.针对我创建的数据库,设置字符集为utf8:
alter database 数据库名字 character set utf8;
4.数据库表的字符集也是utf8
mysql> show create table customer_info;
------------------------------------------------------+
| Table | Create Table
|
------------------------------------------------------+
| customer_info | CREATE TABLE customer_info
(CUST_ID
int(11) NOT NULL DEFAULT '0',CUST_CODE
varchar(32) NOT NULL DEFAULT '',CUST_NAME
varchar(100) NOT NULL DEFAULT '',CUST_CATE
char(1) DEFAULT 'N',CURRENCY
char(3) DEFAULT 'EUR',CONTINENT
char(1) DEFAULT 'E',REGION
char(2) NOT NULL DEFAULT '',PROVINCE
varchar(30) DEFAULT '',CITY
varchar(30) DEFAULT '',ADDRESS
varchar(150) DEFAULT '',ZIP
varchar(10) DEFAULT '',EMAIL
varchar(100) DEFAULT '',WEBSITE
varchar(100) DEFAULT '',LINKMAN
varchar(50) DEFAULT '',PHONE
varchar(20) DEFAULT '',FAX
varchar(20) DEFAULT '',SP_RATE
decimal(12,5) DEFAULT '1.00000',DATE_FORMAT
varchar(20) DEFAULT 'dd MMM, yy',STATUS
char(1) DEFAULT 'N',REMARKS
varchar(200) DEFAULT '',EDITOR
varchar(32) NOT NULL DEFAULT '',LAST_UPDATE
datetime DEFAULT NULL,SHIP_REGION
char(2) NOT NULL DEFAULT '',SHIP_PROVINCE
varchar(30) DEFAULT '',SHIP_CITY
varchar(30) DEFAULT '',SHIP_ADDRESS
varchar(150) DEFAULT '',SHIP_ZIP
varchar(10) DEFAULT '',TAX_ID
varchar(50) DEFAULT '',OTHER_BRANDS
varchar(150) DEFAULT '',COMP_TYPE
char(1) DEFAULT 'C',SHOP_TYPE
char(1) DEFAULT 'S',COMMENTS
text,STORE_IMAGE
varchar(50) DEFAULT '',
PRIMARY KEY (CUST_ID
),
UNIQUE KEY CUSTOMER_INFO_CODE
(CUST_CODE
)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
------------------------------------------------------+
1 row in set (0.02 sec)
mysql>
5.在.jsp文件里:
contentType也设置了字符集为utf8;
pageEncoding也设置为utf8
[color=red][size=medium]基本上能设置的都设置了,还有什么需要设置的么?[/size][/color]
附:
1.mysql版本是5.5
2.
mysql> show character set;
+----------+-----------------------------+---------------------+--------+
| Charset | Description | Default collation | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5 | Big5 Traditional Chinese | big5_chinese_ci | 2 |
| dec8 | DEC West European | dec8_swedish_ci | 1 |
| cp850 | DOS West European | cp850_general_ci | 1 |
| hp8 | HP West European | hp8_english_ci | 1 |
| koi8r | KOI8-R Relcom Russian | koi8r_general_ci | 1 |
| latin1 | cp1252 West European | latin1_swedish_ci | 1 |
| latin2 | ISO 8859-2 Central European | latin2_general_ci | 1 |
| swe7 | 7bit Swedish | swe7_swedish_ci | 1 |
| ascii | US ASCII | ascii_general_ci | 1 |
| ujis | EUC-JP Japanese | ujis_japanese_ci | 3 |
| sjis | Shift-JIS Japanese | sjis_japanese_ci | 2 |
| hebrew | ISO 8859-8 Hebrew | hebrew_general_ci | 1 |
| tis620 | TIS620 Thai | tis620_thai_ci | 1 |
| euckr | EUC-KR Korean | euckr_korean_ci | 2 |
| koi8u | KOI8-U Ukrainian | koi8u_general_ci | 1 |
| gb2312 | GB2312 Simplified Chinese | gb2312_chinese_ci | 2 |
| greek | ISO 8859-7 Greek | greek_general_ci | 1 |
| cp1250 | Windows Central European | cp1250_general_ci | 1 |
| gbk | GBK Simplified Chinese | gbk_chinese_ci | 2 |
| latin5 | ISO 8859-9 Turkish | latin5_turkish_ci | 1 |
| armscii8 | ARMSCII-8 Armenian | armscii8_general_ci | 1 |
| utf8 | UTF-8 Unicode | utf8_general_ci | 3 |
| ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 |
| cp866 | DOS Russian | cp866_general_ci | 1 |
| keybcs2 | DOS Kamenicky Czech-Slovak | keybcs2_general_ci | 1 |
| macce | Mac Central European | macce_general_ci | 1 |
| macroman | Mac West European | macroman_general_ci | 1 |
| cp852 | DOS Central European | cp852_general_ci | 1 |
| latin7 | ISO 8859-13 Baltic | latin7_general_ci | 1 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 |
| cp1251 | Windows Cyrillic | cp1251_general_ci | 1 |
| utf16 | UTF-16 Unicode | utf16_general_ci | 4 |
| cp1256 | Windows Arabic | cp1256_general_ci | 1 |
| cp1257 | Windows Baltic | cp1257_general_ci | 1 |
| utf32 | UTF-32 Unicode | utf32_general_ci | 4 |
| binary | Binary pseudo charset | binary | 1 |
| geostd8 | GEOSTD8 Georgian | geostd8_general_ci | 1 |
| cp932 | SJIS for Windows Japanese | cp932_japanese_ci | 2 |
| eucjpms | UJIS for Windows Japanese | eucjpms_japanese_ci | 3 |
+----------+-----------------------------+---------------------+--------+
39 rows in set (0.00 sec)
mysql> show collation
-> ;
+--------------------------+----------+-----+---------+----------+---------+
| Collation | Charset | Id | Default | Compiled | Sortlen |
+--------------------------+----------+-----+---------+----------+---------+
| big5_chinese_ci | big5 | 1 | Yes | Yes | 1 |
| big5_bin | big5 | 84 | | Yes | 1 |
| dec8_swedish_ci | dec8 | 3 | Yes | Yes | 1 |
| dec8_bin | dec8 | 69 | | Yes | 1 |
| cp850_general_ci | cp850 | 4 | Yes | Yes | 1 |
| cp850_bin | cp850 | 80 | | Yes | 1 |
| hp8_english_ci | hp8 | 6 | Yes | Yes | 1 |
| hp8_bin | hp8 | 72 | | Yes | 1 |
| koi8r_general_ci | koi8r | 7 | Yes | Yes | 1 |
| koi8r_bin | koi8r | 74 | | Yes | 1 |
| latin1_german1_ci | latin1 | 5 | | Yes | 1 |
| latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 |
| latin1_danish_ci | latin1 | 15 | | Yes | 1 |
| latin1_german2_ci | latin1 | 31 | | Yes | 2 |
| latin1_bin | latin1 | 47 | | Yes | 1 |
| latin1_general_ci | latin1 | 48 | | Yes | 1 |
| latin1_general_cs | latin1 | 49 | | Yes | 1 |
| latin1_spanish_ci | latin1 | 94 | | Yes | 1 |
| latin2_czech_cs | latin2 | 2 | | Yes | 4 |
| latin2_general_ci | latin2 | 9 | Yes | Yes | 1 |
| latin2_hungarian_ci | latin2 | 21 | | Yes | 1 |
| latin2_croatian_ci | latin2 | 27 | | Yes | 1 |
| latin2_bin | latin2 | 77 | | Yes | 1 |
| swe7_swedish_ci | swe7 | 10 | Yes | Yes | 1 |
| swe7_bin | swe7 | 82 | | Yes | 1 |
| ascii_general_ci | ascii | 11 | Yes | Yes | 1 |
| ascii_bin | ascii | 65 | | Yes | 1 |
| ujis_japanese_ci | ujis | 12 | Yes | Yes | 1 |
| ujis_bin | ujis | 91 | | Yes | 1 |
| sjis_japanese_ci | sjis | 13 | Yes | Yes | 1 |
| sjis_bin | sjis | 88 | | Yes | 1 |
| hebrew_general_ci | hebrew | 16 | Yes | Yes | 1 |
| hebrew_bin | hebrew | 71 | | Yes | 1 |
| tis620_thai_ci | tis620 | 18 | Yes | Yes | 4 |
| tis620_bin | tis620 | 89 | | Yes | 1 |
| euckr_korean_ci | euckr | 19 | Yes | Yes | 1 |
| euckr_bin | euckr | 85 | | Yes | 1 |
| koi8u_general_ci | koi8u | 22 | Yes | Yes | 1 |
| koi8u_bin | koi8u | 75 | | Yes | 1 |
| gb2312_chinese_ci | gb2312 | 24 | Yes | Yes | 1 |
| gb2312_bin | gb2312 | 86 | | Yes | 1 |
| greek_general_ci | greek | 25 | Yes | Yes | 1 |
| greek_bin | greek | 70 | | Yes | 1 |
| cp1250_general_ci | cp1250 | 26 | Yes | Yes | 1 |
| cp1250_czech_cs | cp1250 | 34 | | Yes | 2 |
| cp1250_croatian_ci | cp1250 | 44 | | Yes | 1 |
| cp1250_bin | cp1250 | 66 | | Yes | 1 |
| cp1250_polish_ci | cp1250 | 99 | | Yes | 1 |
| gbk_chinese_ci | gbk | 28 | Yes | Yes | 1 |
| gbk_bin | gbk | 87 | | Yes | 1 |
| latin5_turkish_ci | latin5 | 30 | Yes | Yes | 1 |
| latin5_bin | latin5 | 78 | | Yes | 1 |
| armscii8_general_ci | armscii8 | 32 | Yes | Yes | 1 |
| armscii8_bin | armscii8 | 64 | | Yes | 1 |
| utf8_general_ci | utf8 | 33 | Yes | Yes | 1 |
| utf8_bin | utf8 | 83 | | Yes | 1 |
| utf8_unicode_ci | utf8 | 192 | | Yes | 8 |
| utf8_icelandic_ci | utf8 | 193 | | Yes | 8 |
| utf8_latvian_ci | utf8 | 194 | | Yes | 8 |
| utf8_romanian_ci | utf8 | 195 | | Yes | 8 |
| utf8_slovenian_ci | utf8 | 196 | | Yes | 8 |
| utf8_polish_ci | utf8 | 197 | | Yes | 8 |
| utf8_estonian_ci | utf8 | 198 | | Yes | 8 |
| utf8_spanish_ci | utf8 | 199 | | Yes | 8 |
| utf8_swedish_ci | utf8 | 200 | | Yes | 8 |
| utf8_turkish_ci | utf8 | 201 | | Yes | 8 |
| utf8_czech_ci | utf8 | 202 | | Yes | 8 |
| utf8_danish_ci | utf8 | 203 | | Yes | 8 |
| utf8_lithuanian_ci | utf8 | 204 | | Yes | 8 |
| utf8_slovak_ci | utf8 | 205 | | Yes | 8 |
| utf8_spanish2_ci | utf8 | 206 | | Yes | 8 |
| utf8_roman_ci | utf8 | 207 | | Yes | 8 |
| utf8_persian_ci | utf8 | 208 | | Yes | 8 |
| utf8_esperanto_ci | utf8 | 209 | | Yes | 8 |
| utf8_hungarian_ci | utf8 | 210 | | Yes | 8 |
| utf8_sinhala_ci | utf8 | 211 | | Yes | 8 |
| utf8_general_mysql500_ci | utf8 | 223 | | Yes | 1 |
| ucs2_general_ci | ucs2 | 35 | Yes | Yes | 1 |
| ucs2_bin | ucs2 | 90 | | Yes | 1 |
| ucs2_unicode_ci | ucs2 | 128 | | Yes | 8 |
| ucs2_icelandic_ci | ucs2 | 129 | | Yes | 8 |
| ucs2_latvian_ci | ucs2 | 130 | | Yes | 8 |
| ucs2_romanian_ci | ucs2 | 131 | | Yes | 8 |
| ucs2_slovenian_ci | ucs2 | 132 | | Yes | 8 |
| ucs2_polish_ci | ucs2 | 133 | | Yes | 8 |
| ucs2_estonian_ci | ucs2 | 134 | | Yes | 8 |
| ucs2_spanish_ci | ucs2 | 135 | | Yes | 8 |
| ucs2_swedish_ci | ucs2 | 136 | | Yes | 8 |
| ucs2_turkish_ci | ucs2 | 137 | | Yes | 8 |
| ucs2_czech_ci | ucs2 | 138 | | Yes | 8 |
| ucs2_danish_ci | ucs2 | 139 | | Yes | 8 |
| ucs2_lithuanian_ci | ucs2 | 140 | | Yes | 8 |
| ucs2_slovak_ci | ucs2 | 141 | | Yes | 8 |
| ucs2_spanish2_ci | ucs2 | 142 | | Yes | 8 |
| ucs2_roman_ci | ucs2 | 143 | | Yes | 8 |
| ucs2_persian_ci | ucs2 | 144 | | Yes | 8 |
| ucs2_esperanto_ci | ucs2 | 145 | | Yes | 8 |
| ucs2_hungarian_ci | ucs2 | 146 | | Yes | 8 |
| ucs2_sinhala_ci | ucs2 | 147 | | Yes | 8 |
| ucs2_general_mysql500_ci | ucs2 | 159 | | Yes | 1 |
| cp866_general_ci | cp866 | 36 | Yes | Yes | 1 |
| cp866_bin | cp866 | 68 | | Yes | 1 |
| keybcs2_general_ci | keybcs2 | 37 | Yes | Yes | 1 |
| keybcs2_bin | keybcs2 | 73 | | Yes | 1 |
| macce_general_ci | macce | 38 | Yes | Yes | 1 |
| macce_bin | macce | 43 | | Yes | 1 |
| macroman_general_ci | macroman | 39 | Yes | Yes | 1 |
| macroman_bin | macroman | 53 | | Yes | 1 |
| cp852_general_ci | cp852 | 40 | Yes | Yes | 1 |
| cp852_bin | cp852 | 81 | | Yes | 1 |
| latin7_estonian_cs | latin7 | 20 | | Yes | 1 |
| latin7_general_ci | latin7 | 41 | Yes | Yes | 1 |
| latin7_general_cs | latin7 | 42 | | Yes | 1 |
| latin7_bin | latin7 | 79 | | Yes | 1 |
| utf8mb4_general_ci | utf8mb4 | 45 | Yes | Yes | 1 |
| utf8mb4_bin | utf8mb4 | 46 | | Yes | 1 |
| utf8mb4_unicode_ci | utf8mb4 | 224 | | Yes | 8 |
| utf8mb4_icelandic_ci | utf8mb4 | 225 | | Yes | 8 |
| utf8mb4_latvian_ci | utf8mb4 | 226 | | Yes | 8 |
| utf8mb4_romanian_ci | utf8mb4 | 227 | | Yes | 8 |
| utf8mb4_slovenian_ci | utf8mb4 | 228 | | Yes | 8 |
| utf8mb4_polish_ci | utf8mb4 | 229 | | Yes | 8 |
| utf8mb4_estonian_ci | utf8mb4 | 230 | | Yes | 8 |
| utf8mb4_spanish_ci | utf8mb4 | 231 | | Yes | 8 |
| utf8mb4_swedish_ci | utf8mb4 | 232 | | Yes | 8 |
| utf8mb4_turkish_ci | utf8mb4 | 233 | | Yes | 8 |
| utf8mb4_czech_ci | utf8mb4 | 234 | | Yes | 8 |
| utf8mb4_danish_ci | utf8mb4 | 235 | | Yes | 8 |
| utf8mb4_lithuanian_ci | utf8mb4 | 236 | | Yes | 8 |
| utf8mb4_slovak_ci | utf8mb4 | 237 | | Yes | 8 |
| utf8mb4_spanish2_ci | utf8mb4 | 238 | | Yes | 8 |
| utf8mb4_roman_ci | utf8mb4 | 239 | | Yes | 8 |
| utf8mb4_persian_ci | utf8mb4 | 240 | | Yes | 8 |
| utf8mb4_esperanto_ci | utf8mb4 | 241 | | Yes | 8 |
| utf8mb4_hungarian_ci | utf8mb4 | 242 | | Yes | 8 |
| utf8mb4_sinhala_ci | utf8mb4 | 243 | | Yes | 8 |
| cp1251_bulgarian_ci | cp1251 | 14 | | Yes | 1 |
| cp1251_ukrainian_ci | cp1251 | 23 | | Yes | 1 |
| cp1251_bin | cp1251 | 50 | | Yes | 1 |
| cp1251_general_ci | cp1251 | 51 | Yes | Yes | 1 |
| cp1251_general_cs | cp1251 | 52 | | Yes | 1 |
| utf16_general_ci | utf16 | 54 | Yes | Yes | 1 |
| utf16_bin | utf16 | 55 | | Yes | 1 |
| utf16_unicode_ci | utf16 | 101 | | Yes | 8 |
| utf16_icelandic_ci | utf16 | 102 | | Yes | 8 |
| utf16_latvian_ci | utf16 | 103 | | Yes | 8 |
| utf16_romanian_ci | utf16 | 104 | | Yes | 8 |
| utf16_slovenian_ci | utf16 | 105 | | Yes | 8 |
| utf16_polish_ci | utf16 | 106 | | Yes | 8 |
| utf16_estonian_ci | utf16 | 107 | | Yes | 8 |
| utf16_spanish_ci | utf16 | 108 | | Yes | 8 |
| utf16_swedish_ci | utf16 | 109 | | Yes | 8 |
| utf16_turkish_ci | utf16 | 110 | | Yes | 8 |
| utf16_czech_ci | utf16 | 111 | | Yes | 8 |
| utf16_danish_ci | utf16 | 112 | | Yes | 8 |
| utf16_lithuanian_ci | utf16 | 113 | | Yes | 8 |
| utf16_slovak_ci | utf16 | 114 | | Yes | 8 |
| utf16_spanish2_ci | utf16 | 115 | | Yes | 8 |
| utf16_roman_ci | utf16 | 116 | | Yes | 8 |
| utf16_persian_ci | utf16 | 117 | | Yes | 8 |
| utf16_esperanto_ci | utf16 | 118 | | Yes | 8 |
| utf16_hungarian_ci | utf16 | 119 | | Yes | 8 |
| utf16_sinhala_ci | utf16 | 120 | | Yes | 8 |
| cp1256_general_ci | cp1256 | 57 | Yes | Yes | 1 |
| cp1256_bin | cp1256 | 67 | | Yes | 1 |
| cp1257_lithuanian_ci | cp1257 | 29 | | Yes | 1 |
| cp1257_bin | cp1257 | 58 | | Yes | 1 |
| cp1257_general_ci | cp1257 | 59 | Yes | Yes | 1 |
| utf32_general_ci | utf32 | 60 | Yes | Yes | 1 |
| utf32_bin | utf32 | 61 | | Yes | 1 |
| utf32_unicode_ci | utf32 | 160 | | Yes | 8 |
| utf32_icelandic_ci | utf32 | 161 | | Yes | 8 |
| utf32_latvian_ci | utf32 | 162 | | Yes | 8 |
| utf32_romanian_ci | utf32 | 163 | | Yes | 8 |
| utf32_slovenian_ci | utf32 | 164 | | Yes | 8 |
| utf32_polish_ci | utf32 | 165 | | Yes | 8 |
| utf32_estonian_ci | utf32 | 166 | | Yes | 8 |
| utf32_spanish_ci | utf32 | 167 | | Yes | 8 |
| utf32_swedish_ci | utf32 | 168 | | Yes | 8 |
| utf32_turkish_ci | utf32 | 169 | | Yes | 8 |
| utf32_czech_ci | utf32 | 170 | | Yes | 8 |
| utf32_danish_ci | utf32 | 171 | | Yes | 8 |
| utf32_lithuanian_ci | utf32 | 172 | | Yes | 8 |
| utf32_slovak_ci | utf32 | 173 | | Yes | 8 |
| utf32_spanish2_ci | utf32 | 174 | | Yes | 8 |
| utf32_roman_ci | utf32 | 175 | | Yes | 8 |
| utf32_persian_ci | utf32 | 176 | | Yes | 8 |
| utf32_esperanto_ci | utf32 | 177 | | Yes | 8 |
| utf32_hungarian_ci | utf32 | 178 | | Yes | 8 |
| utf32_sinhala_ci | utf32 | 179 | | Yes | 8 |
| binary | binary | 63 | Yes | Yes | 1 |
| geostd8_general_ci | geostd8 | 92 | Yes | Yes | 1 |
| geostd8_bin | geostd8 | 93 | | Yes | 1 |
| cp932_japanese_ci | cp932 | 95 | Yes | Yes | 1 |
| cp932_bin | cp932 | 96 | | Yes | 1 |
| eucjpms_japanese_ci | eucjpms | 97 | Yes | Yes | 1 |
| eucjpms_bin | eucjpms | 98 | | Yes | 1 |
+--------------------------+----------+-----+---------+----------+---------+
197 rows in set (0.00 sec)
还有很多地方涉及编码:
比如Java或者JSP代码文件本身的编码。
AP服务器设定的GET或者POST传输的编码。
JDBC驱动连接时候指定的编码。
如果内容是从外部文件读进来的,外部文件的编码也有影响。
。。。
如果能debug的话,可以看看在Java程序那边保存数据库之前在内存里面是否乱码了。
如果没有,基本可以肯定是JDBC驱动连接的写法或者数据库编码的问题。
如果已经乱了,可以从前台刚刚post到后台的时候debug看看数据是否乱码(或者是从文件里面读出来的时候),应该就是这时候乱了。这就是提交的数据没有正确编码。
如果数据不是提交的,是写死在程序里面的固定字符串,那么乱了的话就是程序文件本身编码不是UTF8。
不能debug的话就比较麻烦了,不过可以通过在上述地方加log来检测。
代码根本不让动就没招了。
mysql字符集编码
字符集和整理
整理 描述
armscii8 (ARMSCII-8 Armenian)
armscii8_bin 亚美尼亚语, 二进制
armscii8_general_ci 亚美尼亚语, 不区分大小写
ascii (US ASCII)
ascii_bin 西欧 (多语言), 二进制
ascii_general_ci 西欧 (多语言), 不区分大小写
big5 (Big5 Traditional Chinese)
big5_bin 繁体中文, 二进制
big5_chinese_ci 繁体中文, 不区分大小写
binary (Binary pseudo charset)
binary 二进制
cp1250 (Windows Central European)
cp1250_bin 中欧 (多语言), 二进制
cp1250_croatian_ci 克罗地亚语, 不区分大小写
cp1250_czech_cs 捷克语, 区分大小写
cp1250_general_ci 中欧 (多语言), 不区分大小写
cp1251 (Windows Cyrillic)
cp1251_bin 西里尔语 (多语言), 二进制
cp1251_bulgarian_ci 保加利亚语, 不区分大小写
cp1251_general_ci 西里尔语 (多语言), 不区分大小写
cp1251_general_cs 西里尔语 (多语言), 区分大小写
cp1251_ukrainian_ci 乌克兰语, 不区分大小写