Solving UTF-8 Problems With Solr and Tomcat

I came across an issue with searching for UTF-8 characters in Solr today. The search itself worked just as it should (probably because we’re using a phonetic field to search), but our facets and limitations did not. The problem appeared as soon as a value contained a non-ASCII character (a code point above 127), in our case the Norwegian letters Æ, Ø or Å.

The solution was presented by Charlie Jackson on the solr-user mailing list and is quite simple: add URIEncoding="UTF-8" to the appropriate connector in the Tomcat server.xml file. This is also documented on the Solr on Tomcat page in the Solr Wiki.
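
For reference, here is a rough sketch of what the connector might look like after the change. Only the URIEncoding attribute comes from the fix described above. The port, protocol, timeout and redirect values are assumptions based on a typical stock server.xml, so keep whatever your own connector already has. As far as I understand it, older Tomcat versions decode the request URI and query string as ISO-8859-1 unless told otherwise, which would explain why multi-byte UTF-8 values in facet and filter parameters got mangled.

```xml
<!-- Fragment of conf/server.xml (inside the <Service> element). -->
<!-- Assumed values: port, protocol, connectionTimeout and redirectPort
     are typical defaults, not taken from the original setup. -->
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
<!-- URIEncoding="UTF-8" is the actual fix: it makes Tomcat decode
     percent-encoded bytes in the URI and query string as UTF-8. -->
```

Tomcat reads server.xml at startup, so it needs a restart before the change takes effect.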


3 Responses to “Solving UTF-8 Problems With Solr and Tomcat”

  1. janpjens Says:

    Thanks for being a year and a half in front of me AGAIN Mats. And thanks for providing the solution through Google and your blog when away :-)

  2. Dmitri Says:

    Thanks! It works!

  3. Niranjan Says:

    Thanks a lot!!
