One of the final steps in my current Solr adventure was to make it possible to remove a large number of documents from the index at the same time. As we’re currently using Solr to store phone information, we may have to remove several thousand records in one large update. The examples on the Solr Wiki show how to remove a single document by posting a simple XML document, or how to remove documents by query. I would rather avoid beating our Solr server with 300k single delete requests, so I tried the obvious tactics: submitting several IDs in one document, putting several <delete> elements in one document and so on, but nothing worked as I wanted it to.
After a bit of searching and stumbling around with Google, I finally found this very useful tip from Erik Hatcher. The trick is to simply rewrite the delete request as a delete by query, and then submit all the IDs to be removed as a simple OR query. On our development machine, Solr removed 1000 documents in around 900 ms. Needless to say, that’s more than fast enough and solved our problem.
To sum it up: write a delete-by-query statement as:
id:(123123 OR 13371337 OR 42424242 .. )
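In Solr’s XML update format, that query goes inside a <delete><query> element. Here is a minimal sketch of building such messages in Java from a list of IDs. The class and method names and the batchSize parameter are my own inventions for illustration, and the sketch assumes the unique key field is called id and that the IDs are plain values (numbers, for instance) that need no query or XML escaping. Splitting into batches is optional, but a single OR query with a very large number of IDs can run into Solr’s boolean clause limit (maxBooleanClauses, 1024 by default as far as I know).

```java
import java.util.ArrayList;
import java.util.List;

class DeleteQueryBuilder {

    // Builds one <delete><query>id:(a OR b OR ...)</query></delete> message
    // per batch of at most batchSize IDs.
    // Assumption: the unique key field is named "id" and the IDs contain no
    // characters that are special in Solr query syntax or in XML.
    static List<String> buildBatchedDeletes(List<String> ids, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            List<String> chunk = ids.subList(i, Math.min(i + batchSize, ids.size()));
            messages.add("<delete><query>id:(" + String.join(" OR ", chunk)
                    + ")</query></delete>");
        }
        return messages;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("123123", "13371337", "42424242");
        // With a batch size of 2 this prints two delete messages.
        for (String message : buildBatchedDeletes(ids, 2)) {
            System.out.println(message);
        }
    }
}
```

Each resulting string is one complete update message, so 300k IDs with a batch size of 1000 would mean 300 requests instead of 300k.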
Thanks intarwebs!
Hi. I tried using this delete by query to delete multiple documents in Solr. When I pass a single ID it is deleted, but when I pass multiple IDs it throws an exception: HTTP response 505, version not supported. What am I doing wrong?
This is the code snippet:
String urlString = SOLR_INDEX_URL + "/update?stream.body=id:(" + qryString + ") &commit=true";
URL url = new URL(urlString);
URLConnection connection = url.openConnection();
connection.connect();
// Get the response
BufferedReader rd = new BufferedReader(new InputStreamReader(connection.getInputStream()));
while ((line = rd.readLine()) != null) {
builder.append(line);
}
rd.close();
I’m using the GET method and don’t know if that has an effect. I would also appreciate it if you could guide me on how to use the POST method with Solr.
Thanks,
Jel
Hi! I’ve been on vacation for some time, so sorry about the late reply.
It’s hard to say exactly what your problem is unless you also supply the complete URL you’re calling (and then try to open that in your browser to see the result). When it comes to using POST, you’ll have to tell the connection object that you want it to perform a POST request; have a look at the setRequestMethod method (it is defined on HttpURLConnection, so you may have to cast the connection, and you will also need setDoOutput(true) before writing a body). You supply the content of the POST request by writing to the OutputStream of the connection object, which you can fetch by calling getOutputStream() on the connection.
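As a rough sketch of the POST approach described above, something like the following could work. The class name, the postXml helper and the localhost URL in the usage text are hypothetical and not taken from the code in this thread. Sending the query in the request body also means the spaces around OR never end up in the URL. Unencoded spaces in a GET URL are one plausible cause of a 505 response, since whatever follows the first space may be read as the HTTP version, but that is a guess rather than something confirmed here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class SolrDeletePost {

    // POSTs an XML update message to the given URL and returns the response body.
    // updateUrl is assumed to point at a Solr update handler, for example
    // something like http://localhost:8983/solr/update?commit=true (hypothetical).
    static String postXml(String updateUrl, String xml) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(updateUrl).openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true); // required before writing a request body
        connection.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");

        // The delete message travels in the body, so it needs no URL encoding.
        try (OutputStream out = connection.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }

        // Read the response, as in the original snippet.
        StringBuilder builder = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(new InputStreamReader(
                connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) {
                builder.append(line);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java SolrDeletePost <update-url>");
            return;
        }
        String xml = "<delete><query>id:(123123 OR 13371337 OR 42424242)</query></delete>";
        System.out.println(postXml(args[0], xml));
    }
}
```

If you would rather stay with GET, the parameter values would at least need to be URL-encoded, for instance with java.net.URLEncoder.encode, before being appended to the URL.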
Hope that helps or at least points you in the right direction!
Can I confirm you added 1000 document IDs to your delete request and it chewed through it in 900 ms? That’s a long request (in number of characters). If so, WOW. How good is Solr!