JSPs
- With the JSP @page directive you can specify the desired encoding by specifying both the page encoding and the content type:
<%@ page pageEncoding="UTF-8" contentType="text/html;charset=UTF-8" language="java" %>
pageEncoding specifies in which encoding the JSP page has been saved. contentType defines what content type should be sent in the response to the browser.
- It is further recommended to provide the content type through the meta tag within the head tag of the HTML document:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
...
</head>
...
</html>
- To be complete, you can also specify the @charset directive at the top of every external CSS file you are using:
@charset "utf-8";
...
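To see why the declared encoding has to match the encoding the page was actually saved in, here is a small sketch in plain Java (no JSP container involved). The class name MojibakeDemo and the sample text are made up for illustration. It only shows what happens when UTF-8 bytes are decoded with a different charset, which is roughly what a browser does when the declared charset is wrong.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any JSP or servlet API.
final class MojibakeDemo {

    // Encode the text with one charset and decode the bytes with another,
    // the way a client does when the declared charset does not match.
    static String saveAndRead(String text, Charset savedAs, Charset readAs) {
        byte[] bytes = text.getBytes(savedAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "café", written with an escape so the encoding of this source file does not matter
        String original = "caf\u00e9";

        // Saved and read as UTF-8: the text survives
        System.out.println(saveAndRead(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8));

        // Saved as UTF-8 but read as ISO-8859-1: the two UTF-8 bytes of "é" become two characters
        System.out.println(saveAndRead(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

The first call returns the text unchanged, the second one returns the familiar "cafÃ©" garbage.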
Servlets
- With every request you get in a Servlet, you'll have to set the encoding on the Request object, before any parameter is read:
public void doXXXX(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    request.setCharacterEncoding("UTF-8");
    ...
}
Beware that if you use a filter that already reads from the Request, you will need to set the character encoding in the filter instead, because setting it after the first read has no effect.
- At the same time, when you build your response, for example to return XML or JSON data, you set the content type of the Response object:
response.setContentType("application/json; charset=UTF-8");
This response header has to be set before you obtain the writer and start writing your data.
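To illustrate what the request encoding actually changes, the sketch below decodes a percent-encoded form value by hand with java.net.URLDecoder. This is not servlet code and FormDecodingDemo is a made-up name. The assumption is that the browser submitted the form as UTF-8, and that the container decodes parameters in a comparable way using whatever encoding is set on the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; a servlet container does this decoding for you.
final class FormDecodingDemo {

    // Decode a percent-encoded value using the given charset name.
    static String decode(String rawValue, String encoding) {
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // "café" as a browser sends it when the form is submitted in UTF-8
        String raw = "caf%C3%A9";

        // Decoded with the encoding the browser used: correct text
        System.out.println(decode(raw, "UTF-8"));

        // Decoded as ISO-8859-1 (assumed here as the fallback encoding): garbled text
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

The same bytes give two different strings, which is why the encoding has to be set before the first parameter is read.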
Database
- MySQL supports Unicode as of version 4.1, which makes it possible to store data in the database in UTF-8 encoding. You activate UTF-8 on a MySQL table when you create it, by specifying a CHARACTER SET and a COLLATION:
CREATE TABLE `USER` (
...
) ENGINE=MyISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
For more information about character sets in MySQL, you can read the document Character Set Support.
- At least your data is now stored in UTF-8, but it doesn't end there. In Glassfish, you still have to create a JDBC Connection Pool with the correct settings, so that the JDBC driver actually reads and writes your data in UTF-8. In the Admin Console you select the desired Connection Pool and then navigate to Additional Properties. You will already see a number of properties filled in (like DatabaseName, url, username and password). To enable UTF-8 support for JDBC, you'll have to add two extra properties:
useUnicode = true
characterEncoding = utf8
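If you are not going through the Glassfish Admin Console, the same two properties can be passed to the MySQL driver as parameters on the JDBC URL. The sketch below only builds such a URL as a string. JdbcUrlDemo, the host, the port and the database name mydb are placeholders I made up, and no connection is opened.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; it only assembles a URL string and never connects.
final class JdbcUrlDemo {

    // Append the given driver properties as query parameters to a MySQL JDBC URL.
    static String mysqlUrl(String host, int port, String database, Map<String, String> properties) {
        String base = "jdbc:mysql://" + host + ":" + port + "/" + database;
        if (properties.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return base + "?" + query;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the properties in insertion order
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("useUnicode", "true");
        properties.put("characterEncoding", "utf8");

        // prints jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=utf8
        System.out.println(mysqlUrl("localhost", 3306, "mydb", properties));
    }
}
```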
E-mail
- To send e-mail messages encoded in UTF-8, you need to specify the encoding on both the subject and the content of the message. Finally, you also need to set the content type of the e-mail message itself:
MimeMessage msg = new MimeMessage(session);
msg.setSubject(subject, "UTF-8");
msg.setText(body, "UTF-8");
if (asHtml) {
    // Replace the plain-text content with HTML, still declared as UTF-8
    msg.setContent(body, "text/html; charset=UTF-8");
    msg.setHeader("Content-Type", "text/html; charset=UTF-8");
} else {
    msg.setHeader("Content-Type", "text/plain; charset=UTF-8");
}
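The reason the subject needs its own encoding is that mail headers are ASCII only, so non-ASCII text has to travel as a so-called encoded word. The sketch below builds one by hand with the Base64 variant, using only the JDK. EncodedWordDemo is a made-up name and this is not what the mail library literally runs: the library may choose a different transfer encoding and it also folds long headers, which this sketch ignores.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; use the mail library for real messages.
final class EncodedWordDemo {

    // Wrap the UTF-8 bytes of the subject in an encoded word: =?charset?B?base64?=
    static String encodeSubject(String subject) {
        byte[] utf8 = subject.getBytes(StandardCharsets.UTF_8);
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(utf8) + "?=";
    }

    public static void main(String[] args) {
        // "héllo", written with an escape so the encoding of this source file does not matter
        System.out.println(encodeSubject("h\u00e9llo")); // prints =?UTF-8?B?aMOpbGxv?=
    }
}
```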
And that's it. You should now have a website that has no problems displaying and storing your UTF-8 content. Below you can see a screenshot of my web application that shows data in Georgian (username), in Japanese (Location) and even in the Runic alphabet (Tags):