I need to get UTF-8 working in my Java webapp (servlets + JSP, no framework used) to support äöå etc. for regular Finnish text and Cyrillic alphabets like ЦжФ for special cases.
My setup is the following:
Database used: MySQL 5.x
Users mainly use Firefox2 but also Opera 9.x, FF3, IE7 and Google Chrome are used to access the site.
How can I achieve this?
Reposted from: https://stackoverflow.com/questions/138948/how-to-get-utf-8-working-in-java-webapps
I think you summed it up quite well in your own answer.
While making everything UTF-8 from end to end, you might also want to make sure Java itself is using UTF-8. Pass -Dfile.encoding=utf-8 as a parameter to the JVM (this can be configured in catalina.bat).
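To confirm that the JVM option actually took effect, a small check like the following may help. It is only a sketch: the class name DefaultCharsetCheck is made up for illustration, and it simply prints what the running JVM reports.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Returns the name of the charset the JVM falls back to
    // whenever no charset is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The system property as set by -Dfile.encoding=... (or by the platform).
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        // The charset the JVM actually uses by default.
        System.out.println("default charset = " + defaultCharsetName());
    }
}
```

Running it once with and once without -Dfile.encoding=utf-8 shows whether the setting reaches the JVM that Tomcat starts.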
This is for Greek encoding in MySQL tables when we want to access them using Java:
Use the following connection setup in your JBoss connection pool (mysql-ds.xml):
<connection-url>jdbc:mysql://192.168.10.123:3308/mydatabase</connection-url>
<driver-class>com.mysql.jdbc.Driver</driver-class>
<user-name>nts</user-name>
<password>xaxaxa!</password>
<connection-property name="useUnicode">true</connection-property>
<connection-property name="characterEncoding">greek</connection-property>
If you don't want to put this in a JNDI connection pool, you can configure it in the JDBC URL, as the next line illustrates:
jdbc:mysql://192.168.10.123:3308/mydatabase?characterEncoding=greek
For me and Nick, so we never forget it and waste time anymore.....
If you have not specified the encoding in the connection pool (mysql-ds.xml), you can open the connection in your Java code as follows:
DriverManager.registerDriver(new com.mysql.jdbc.Driver());
Connection conn = DriverManager.getConnection(
"jdbc:mysql://192.168.1.12:3308/mydb?characterEncoding=greek",
"Myuser", "mypass");
Nice, detailed answer. I just wanted to add one more thing that will help others see UTF-8 encoding on URLs in action.
Follow the steps below to enable UTF-8 encoding on URLs in Firefox.
Type "about:config" in the address bar.
Use the filter input to search for the "network.standard-url.encode-query-utf8" property.
UTF-8 encoding on URLs works by default in IE6/7/8 and Chrome.
I also want to add that this part, from here, solved my UTF problem:
runtime.encoding=<encoding>
I have a similar problem, but with the filenames of files I'm compressing with Apache Commons. I resolved it with this command:
convmv --notest -f cp1252 -t utf8 * -r
It works very well for me. Hope it helps someone ;)
In my case of displaying Unicode characters from message bundles, I don't need to apply the "JSP page encoding" section to display Unicode on my JSP page. All I need is the "CharsetFilter" section.
To add to kosoant's answer, if you are using Spring, rather than writing your own Servlet filter, you can use the class org.springframework.web.filter.CharacterEncodingFilter they provide, configuring it like the following in your web.xml:
<filter>
<filter-name>encoding-filter</filter-name>
<filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>UTF-8</param-value>
</init-param>
<init-param>
<param-name>forceEncoding</param-name>
<param-value>FALSE</param-value>
</init-param>
</filter>
<filter-mapping>
<filter-name>encoding-filter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
One other point that hasn't been mentioned relates to Java Servlets working with Ajax. I have situations where a web page picks up UTF-8 text from the user and sends it to a JavaScript file, which includes it in a URI sent to the Servlet. The Servlet queries a database, captures the result and returns it as XML to the JavaScript file, which formats it and inserts the formatted response into the original web page.
In one web app I was following an early Ajax book's instructions for wrapping up the JavaScript that constructs the URI. The example in the book used the escape() method, which I discovered (the hard way) is wrong. For UTF-8 you must use encodeURIComponent().
Few people seem to roll their own Ajax these days, but I thought I might as well add this.
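To see what the Servlet side should expect, here is a small Java sketch of UTF-8 percent-encoding. The class name Utf8UrlDemo and its helper methods are made up for illustration. For these characters, encodeURIComponent() produces the same UTF-8 byte escapes, whereas escape() produces single-byte escapes such as %E4 that a UTF-8 decoder cannot read back. Note that URLEncoder is a form encoder, so it differs from encodeURIComponent() in details such as encoding a space as + instead of %20.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class Utf8UrlDemo {
    // Percent-encodes the text using its UTF-8 bytes.
    static String encodeUtf8(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Reverses the percent-encoding, interpreting the bytes as UTF-8.
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "äöå" written as Unicode escapes so the source file encoding does not matter.
        String text = "\u00e4\u00f6\u00e5";
        String encoded = encodeUtf8(text);
        System.out.println(encoded);                          // %C3%A4%C3%B6%C3%A5
        System.out.println(decodeUtf8(encoded).equals(text)); // true
    }
}
```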
About the CharsetFilter mentioned in @kosoant's answer: there is a built-in filter in Tomcat's web.xml (located at conf/web.xml). The filter is named setCharacterEncodingFilter and is commented out by default. You can uncomment it (please remember to uncomment its filter-mapping too).
Also, there is no need to set jsp-config in your web.xml (I have tested this with Tomcat 7+).
Sometimes you can solve the problem through the MySQL Administrator wizard. In Startup variables > Advanced, set Def. char Set: utf8. This change may require restarting MySQL.
The previous responses didn't solve my problem. It only occurred in production, with Tomcat and Apache mod_proxy_ajp: non-ASCII characters in the POST body were replaced by ?. The cause turned out to be the JVM default charset (US-ASCII in a default installation; check it with Charset dfset = Charset.defaultCharset();), so the solution was to start the Tomcat server with an option that makes the JVM use UTF-8 as its default charset:
JAVA_OPTS="$JAVA_OPTS -Dfile.encoding=UTF-8"
(Add this line to catalina.sh and run service tomcat restart.)
You may also need to change the Linux system locale variables (edit ~/.bashrc and ~/.profile for a permanent change; see https://perlgeek.de/en/article/set-up-a-clean-utf8-environment):
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
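The ? replacement described in this answer can be reproduced in isolation. The following is only a sketch (the class name CharsetLossDemo is hypothetical): it encodes and decodes the same text with US-ASCII and with UTF-8, which is roughly what happens when a request body passes through a JVM whose default charset cannot represent the characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetLossDemo {
    // Encodes the text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // "äöå" written as Unicode escapes so the source file encoding does not matter.
        String text = "\u00e4\u00f6\u00e5";
        // US-ASCII has no mapping for these characters, so each becomes '?'.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII)); // ???
        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```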