Solving UTF-8 Problems With Solr and Tomcat

I came across an issue with searching for UTF-8 characters in Solr today. The search itself worked just as it should (probably because we're using a phonetic field to search), but our facets and limitations did not. The problem appeared as soon as a value contained a non-ASCII character (a code point above 127), in our case the Norwegian letters Æ, Ø or Å.

The solution was presented by Charlie Jackson on the solr-user mailing list, and it is quite simple: add URIEncoding="UTF-8" to the appropriate connector in the Tomcat server.xml file. This is also documented on the Solr on Tomcat page in the Solr Wiki.
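
As a rough sketch of what the change might look like, here is a connector entry with the attribute added. The port, protocol, timeout and redirect values below are assumptions based on a stock Tomcat install, not values from our setup; keep whatever your own connector already has and only add the URIEncoding attribute.

```xml
<!-- conf/server.xml (excerpt) -->
<!-- Assumed example connector: port, protocol, connectionTimeout and
     redirectPort are placeholders; keep your existing values. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
<!-- URIEncoding tells Tomcat to decode the request URI, including
     query string parameters, as UTF-8. -->
```

Tomcat reads server.xml at startup, so it has to be restarted before the change takes effect. If you run more than one connector (for example an AJP connector behind Apache), the attribute most likely needs to go on whichever one actually receives the Solr requests.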
