Escaping Characters in a Solr Query / Solr URL
We’re using our own Solr library at Derdubor at the moment, but until now we’ve only used it for indexing content. The query part was never standardized in our common library because we usually used an alternative output format, but that has changed over the last few days. We now have a parser for the default XML output, and we also support facets and field queries (or constraints, as our library abstracts them).
This means that we’re feeding content into the query that may contain foreign characters, in particular those that have a special meaning in a Solr query. You can find the complete list of characters that need to be escaped in a Solr or Lucene query in the Lucene manual.
To escape the characters we use this very simple and stupid PHP method:
static public function escapeSolrValue($string)
{
    // The backslash must come first: str_replace() works through the array in
    // order, so the backslashes added by later entries are not escaped again.
    $match = array('\\', '+', '-', '&', '|', '!', '(', ')', '{', '}', '[', ']', '^', '~', '*', '?', ':', '"', ';', ' ');
    $replace = array('\\\\', '\\+', '\\-', '\\&', '\\|', '\\!', '\\(', '\\)', '\\{', '\\}', '\\[', '\\]', '\\^', '\\~', '\\*', '\\?', '\\:', '\\"', '\\;', '\\ ');
    $string = str_replace($match, $replace, $string);

    return $string;
}
We used a regular expression first, but the sheer number of backslashes made it a regular ... hell ... to read. So to make things easier for the people maintaining this in the future, we took the easy-to-read, easy-to-maintain road for this one.
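To make the intended use a bit more concrete, here is a minimal sketch of how the method above might be called when building a field query. The standalone function is just a copy of the method so the example runs on its own, and the field name and input string are made up for illustration.

```php
<?php
// Standalone copy of the method above so this sketch runs on its own.
function escapeSolrValue(string $string): string
{
    $match = array('\\', '+', '-', '&', '|', '!', '(', ')', '{', '}', '[', ']', '^', '~', '*', '?', ':', '"', ';', ' ');
    // Build the replacement list by prefixing every character with a backslash.
    $replace = array_map(function ($c) { return '\\' . $c; }, $match);

    return str_replace($match, $replace, $string);
}

// Hypothetical field name and user input, for illustration only.
$field = 'title';
$userInput = 'C++ (intro): part 1';

// Only the value is escaped; the ':' between field and value is query syntax.
$query = $field . ':' . escapeSolrValue($userInput);

echo $query, "\n"; // title:C\+\+\ \(intro\)\:\ part\ 1
```

The escaped value still has to be URL-encoded before it goes into the request; the escaping here only deals with the Solr query syntax.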
Tags: Apache, escaping, PHP, Solr, str_replace

January 22nd, 2010 at 22:04
You can verify this function against Solr’s Java client ClientUtils.escapeQueryChars. See http://svn.apache.org/repos/asf/lucene/solr/trunk/src/solrj/org/apache/solr/client/solrj/util/ClientUtils.java
January 23rd, 2010 at 23:44
Good point. I’ve used SolrJ quite a bit before, but I never thought about validating it against the same behaviour. SolrJ also escapes " and ; which were missing from my list. I’ve added them now.
Thanks for the update!
January 30th, 2010 at 23:24
[...] up on the previous post about escaping values in a Solr query string, it’s important to note that you should not escape spaces in the query itself. The reason for [...]
November 5th, 2010 at 18:19
The ClientUtils class also escapes spaces. For example:
Input: hello there
Expected: hello there
Actual: hello\ there
This causes a problem, because the final string becomes hello\+there when sent over HTTP.
Regards,
Satish.
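One possible way around the problem described in this comment is to leave whitespace alone: split the input into terms, escape each term separately, and join them with plain spaces again before URL-encoding. The sketch below assumes that approach; the function names escapeSolrTerm and escapeSolrTerms are hypothetical, and the character list is the one from the post minus the space.

```php
<?php
// Escapes Solr special characters in a single term; whitespace is not in the list.
function escapeSolrTerm(string $term): string
{
    $match = array('\\', '+', '-', '&', '|', '!', '(', ')', '{', '}', '[', ']', '^', '~', '*', '?', ':', '"', ';');
    $replace = array_map(function ($c) { return '\\' . $c; }, $match);

    return str_replace($match, $replace, $term);
}

// Hypothetical helper: escape each whitespace-separated term on its own,
// so the spaces between terms stay unescaped.
function escapeSolrTerms(string $input): string
{
    $terms = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);

    return implode(' ', array_map('escapeSolrTerm', $terms));
}

$q = escapeSolrTerms('hello there (draft)');

echo $q, "\n";               // hello there \(draft\)
echo rawurlencode($q), "\n"; // hello%20there%20%5C%28draft%5C%29
```

With this, Solr receives two separate terms instead of one term containing an escaped space. Whether that is what you want depends on the query; for a phrase you would quote the value instead.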
August 18th, 2011 at 08:10
Very helpful post.
@Shalin Shekhar Mangar ClientUtils was good food too ;)
December 7th, 2011 at 14:41
This also causes errors for date facets.
For example, your method turns the query into this:
http://localhost:8080/test/select?q=fqdn\:b\*&facet=on&facet.date.start=NOW&facet.date.end=2012\-02\-05T13\:37\:29\+00\:00Z&facet.date=ending&facet.date.gap=\+7DAY&rows=25&wt=json
Which makes Solr write this to the error log:
INFO: [] webapp=/test path=/select params={facet.date.start=NOW&facet=on&q=fqdn\:b\*&facet.date=ending&facet.date.gap=\+7DAY&wt=json&facet.date.end=2012\-02\-05T13\:36\:21\+00\:00Z&rows=25} hits=0 status=400 QTime=1
07-Dec-2011 13:37:29 org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: date facet 'end' is not a valid Date string: 2012\-02\-05T13\:37\:29\ 00\:00Z
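The log above suggests two separate things went wrong: the escaping was applied to every parameter rather than just to the value inside q, and the + signs were not URL-encoded, so Solr saw them as spaces. A rough sketch of how the same request could be built is shown below. The prefix variable is hypothetical, the URL and parameter names are taken from the comment, and the date format is an assumption (Solr dates in UTC, in the form YYYY-MM-DDThh:mm:ssZ).

```php
<?php
// Standalone copy of the post's escaping function (same character list).
function escapeSolrValue(string $string): string
{
    $match = array('\\', '+', '-', '&', '|', '!', '(', ')', '{', '}', '[', ']', '^', '~', '*', '?', ':', '"', ';', ' ');
    $replace = array_map(function ($c) { return '\\' . $c; }, $match);

    return str_replace($match, $replace, $string);
}

// Hypothetical user input: the prefix to look for in the fqdn field.
$prefix = 'b';

$params = array(
    // Only the user-supplied value is escaped. The ':' separator and the
    // trailing '*' wildcard are query syntax and stay unescaped.
    'q' => 'fqdn:' . escapeSolrValue($prefix) . '*',
    'facet' => 'on',
    'facet.date' => 'ending',
    'facet.date.start' => 'NOW',
    // Assumption: a UTC timestamp ending in Z, with no numeric offset.
    'facet.date.end' => gmdate('Y-m-d\TH:i:s\Z', gmmktime(13, 37, 29, 2, 5, 2012)),
    'facet.date.gap' => '+7DAY',
    'rows' => 25,
    'wt' => 'json',
);

// http_build_query() percent-encodes the values, so '+' is sent as %2B
// instead of being decoded as a space on the server side.
$url = 'http://localhost:8080/test/select?' . http_build_query($params, '', '&');

echo $url, "\n";
```

The facet parameters are not Lucene query strings, so they only need URL encoding, not backslash escaping.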
April 13th, 2012 at 06:32
[...] our request, escaping any Solr special characters. For this, I use a function taken from this post http://e-mats.org/2010/01/escaping-characters-in-a-solr-query-solr-url/ A lot of this code is for the pager. I’m not going to explain that here, but it’s not [...]
July 3rd, 2012 at 07:41
In a URL, you can escape special characters by putting '\' in front of them.
Example: q = solr\-Indexing // Here '-' is the special character.
You can find more detailed information here:
http://antguider.blogspot.com/2012/06/solr-search.html
December 5th, 2012 at 15:47
There’s an escape function in Apache_Solr_Service, if you are using that to connect from PHP:
$string = Apache_Solr_Service::escape($string);
For phrases:
$phrase = Apache_Solr_Service::escapePhrase($phrase);
Or, for a bit of convenience, this will create the phrase and escape it:
$phrase = Apache_Solr_Service::phrase($string);
January 17th, 2013 at 23:12
[...] programmer Mats Lindh has solved this pretty well, using str_replace. str_replace is a convenient, general-purpose string replacement function that [...]
January 17th, 2013 at 23:19
Hey Mats!
Thanks for sharing your solution. I’ve blogged about a solution to this same problem in Python that I recently faced. It’s slightly harder because Python doesn’t really give you a direct equivalent to str_replace.
May 28th, 2013 at 14:11
Shalin Shekhar Mangar posted a great link to Solr’s own ClientUtils.escapeQueryChars function. The link has moved to here:
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java
For convenience, the function is here:
/**
 * See: {@link org.apache.lucene.queryparser.classic queryparser syntax}
 * for more information on Escaping Special Characters
 */
public static String escapeQueryChars(String s) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        // These characters are part of the query syntax and must be escaped
        if (c == '\\' || c == '+' || c == '-' || c == '!' || c == '(' || c == ')' || c == ':'
            || c == '^' || c == '[' || c == ']' || c == '\"' || c == '{' || c == '}' || c == '~'
            || c == '*' || c == '?' || c == '|' || c == '&' || c == ';' || c == '/'
            || Character.isWhitespace(c)) {
            sb.append('\\');
        }
        sb.append(c);
    }
    return sb.toString();
}
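For PHP readers, the Java function above can be ported fairly directly. The following is only a sketch of such a port, not something taken from SolrJ or from the post: it checks ASCII whitespace only, whereas Character.isWhitespace also covers some Unicode whitespace, and it works byte by byte, which is safe for UTF-8 because all the special characters are single ASCII bytes.

```php
<?php
// Rough PHP port of ClientUtils.escapeQueryChars (a sketch, ASCII whitespace only).
function escapeQueryChars(string $s): string
{
    // Same special characters as in the Java version, including '/'.
    $special = '\\+-!():^[]"{}~*?|&;/';
    // Simplification: space, tab, newline, carriage return, vertical tab, form feed.
    $whitespace = " \t\n\r\x0B\x0C";

    $out = '';
    $length = strlen($s);
    for ($i = 0; $i < $length; $i++) {
        $c = $s[$i];
        // Prefix every special or whitespace character with a backslash.
        if (strpos($special, $c) !== false || strpos($whitespace, $c) !== false) {
            $out .= '\\';
        }
        $out .= $c;
    }

    return $out;
}

echo escapeQueryChars('hello there'), "\n"; // hello\ there
echo escapeQueryChars('a/b:c'), "\n";       // a\/b\:c
```

Compared to the str_replace version in the post, this one also escapes the forward slash and other whitespace characters such as tabs.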