Using Solrj – A short guide to getting started with Solrj

As Solrj (the Java interface for Solr) is slated for release together with Solr 1.3, it’s time to take a closer look! Solrj is the preferred, easiest way of talking to a Solr server from Java (unless you’re using Embedded Solr). You get everything in a neat little package and can avoid parsing and working with XML directly. Everything is tucked neatly away under a few classes, and since the web generally lacks a good example of how to use SolrJ, I’m going to share a small class I wrote for testing the data we were indexing at work. As Solr 1.2 is currently the most recent version available at apache.org, you’ll have to take a look at the Apache Solr Nightly Builds website and download the latest version. The documentation is also contained in the archive, so if you’re going to do any serious solrj development, that is the place to look.

Oh well, enough of that, let’s cut to the chase. We start by creating a CommonsHttpSolrServer instance, passing the URL of our Solr server as the only constructor argument. You may also provide your own parsers, but I’ll leave that for those who need it. I don’t. The example assumes Solr is running on port 8080 (the Tomcat default) under the solr directory; the Jetty example bundled with Solr uses port 8983 instead, so you’ll have to accommodate your own setup here. I’ve included the complete source file for download.

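    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.FacetField;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.SolrDocumentList;

    class SolrjTest
    {
        public void query(String q)
        {
            CommonsHttpSolrServer server = null;

            try
            {
                // Adjust host, port and path to match your own Solr setup.
                server = new CommonsHttpSolrServer("http://localhost:8080/solr/");
            }
            catch(Exception e)
            {
                e.printStackTrace();
            }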

The next thing we’re going to do is to actually create the query we’re about to send to the Solr server, which means building a SolrQuery object. We simply instantiate the object and then set the query values to what we’re looking for. The setQueryType call can be dropped to use the default query handler, but as we currently use dismax, this is what I’ve used here. You can then also turn on faceting (to create navigators/facets) and add the fields you want facets for.

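            SolrQuery query = new SolrQuery();
            query.setQuery(q);
            // Use the dismax handler; drop this line to use the default handler.
            query.setQueryType("dismax");
            // Turn on faceting for the name fields, skipping values with fewer than two hits.
            query.setFacet(true);
            query.addFacetField("firstname");
            query.addFacetField("lastname");
            query.setFacetMinCount(2);
            // Include the relevance score of each hit in the response.
            query.setIncludeScore(true);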
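To make it a bit more concrete what these setters amount to: as far as I can tell, they simply end up as ordinary Solr request parameters. Here’s a rough sketch in plain Java (no SolrJ involved) of the query string I’d expect for the query above. The QueryStringSketch class and its methods are made up for illustration, and the exact fl value produced by setIncludeScore is an assumption that may differ between versions.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SolrJ: it only illustrates the kind of
// query string the SolrQuery setters above roughly translate to.
class QueryStringSketch
{
    // Parameters may repeat (facet.field does), so keep an ordered list of pairs.
    private final List<String[]> params = new ArrayList<String[]>();

    QueryStringSketch add(String name, String value)
    {
        params.add(new String[] { name, value });
        return this;
    }

    // Joins all pairs as name=value, URL-encoded and separated by '&'.
    String build()
    {
        StringBuilder sb = new StringBuilder();
        for (String[] p : params)
        {
            if (sb.length() > 0)
            {
                sb.append('&');
            }
            sb.append(encode(p[0])).append('=').append(encode(p[1]));
        }
        return sb.toString();
    }

    private static String encode(String s)
    {
        try
        {
            return URLEncoder.encode(s, "UTF-8");
        }
        catch (UnsupportedEncodingException e)
        {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Mirrors the setter calls in the article; the "fl" value is an assumption.
    static String forNameSearch(String q)
    {
        return new QueryStringSketch()
            .add("q", q)
            .add("qt", "dismax")
            .add("facet", "true")
            .add("facet.field", "firstname")
            .add("facet.field", "lastname")
            .add("facet.mincount", "2")
            .add("fl", "*,score")
            .build();
    }

    public static void main(String[] args)
    {
        System.out.println(forNameSearch(args.length > 0 ? args[0] : "john smith"));
    }
}
```

Run without arguments, this prints q=john+smith&qt=dismax&facet=true&facet.field=firstname&facet.field=lastname&facet.mincount=2&fl=*%2Cscore, which is handy to paste after /select? in a browser when you want to compare against what your own Solr server returns.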

Then we simply query the server by calling server.query, which takes our parameters, builds the query URL, sends it to the server and parses the response for us.

            try
            {
                QueryResponse qr = server.query(query);

The result can then be fetched by calling getResults() on the QueryResponse object, qr.

                SolrDocumentList sdl = qr.getResults();

We then output the information fetched in the query. You can change this to print all fields or other stuff, but as this is a simple application for searching a database of names, we just copy the fields of each entry into a map and print its display name and phone number. Before we do that, we print a small header containing information about the query, such as the number of elements found and which element we started on.

                System.out.println("Found: " + sdl.getNumFound());
                System.out.println("Start: " + sdl.getStart());
                System.out.println("Max Score: " + sdl.getMaxScore());
                System.out.println("--------------------------------");

                ArrayList<HashMap<String, Object>> hitsOnPage = new ArrayList<HashMap<String, Object>>();

                for(SolrDocument d : sdl)
                {
                    // Copy every field of the document into a plain map.
                    HashMap<String, Object> values = new HashMap<String, Object>();

                    for(Iterator<Map.Entry<String, Object>> i = d.iterator(); i.hasNext(); )
                    {
                        Map.Entry<String, Object> e2 = i.next();
                        values.put(e2.getKey(), e2.getValue());
                    }

                    hitsOnPage.add(values);
                    System.out.println(values.get("displayname") + " (" + values.get("displayphone") + ")");
                }
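The Found and Start values in the header are also what you’d need if you want to page through a result set. Here’s a minimal sketch of the arithmetic, assuming a fixed page size (Solr returns 10 rows by default unless you ask for something else). The PageInfo class is hypothetical and not part of SolrJ.

```java
// Hypothetical helper, not part of SolrJ: derives paging information from
// the numFound and start values of a result plus the requested page size.
class PageInfo
{
    final long numFound; // total number of hits, as reported by getNumFound()
    final long start;    // offset of the first hit on this page, as reported by getStart()
    final int rows;      // page size that was requested (assumed fixed)

    PageInfo(long numFound, long start, int rows)
    {
        if (rows <= 0)
        {
            throw new IllegalArgumentException("rows must be positive");
        }
        this.numFound = numFound;
        this.start = start;
        this.rows = rows;
    }

    // Pages are numbered from 1.
    long currentPage()
    {
        return start / rows + 1;
    }

    // Rounds up so that a partially filled last page is counted.
    long totalPages()
    {
        return (numFound + rows - 1) / rows;
    }

    boolean hasNext()
    {
        return start + rows < numFound;
    }

    // Offset to request for the following page.
    long nextStart()
    {
        return start + rows;
    }

    public static void main(String[] args)
    {
        PageInfo p = new PageInfo(95, 20, 10);
        System.out.println("Page " + p.currentPage() + " of " + p.totalPages()
            + ", next start: " + p.nextStart());
    }
}
```

With 95 hits, a start of 20 and 10 rows per page, this prints "Page 3 of 10, next start: 30". In SolrJ I’d expect the next page to be requested by passing that offset to setStart and the page size to setRows on the SolrQuery before querying again.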

After this we output the facets and their information, just so you can see how you’d go about fetching this information from Solr too:

                List<FacetField> facets = qr.getFacetFields();

                for(FacetField facet : facets)
                {
                    // Each facet field holds a list of (value, count) pairs.
                    List<FacetField.Count> facetEntries = facet.getValues();

                    for(FacetField.Count fcount : facetEntries)
                    {
                        System.out.println(fcount.getName() + ": " + fcount.getCount());
                    }
                }
            }
            catch (SolrServerException e)
            {
                e.printStackTrace();
            }
        }

        public static void main(String[] args)
        {
            // The search string is expected as the first command line argument.
            if (args.length < 1)
            {
                System.err.println("Usage: java SolrjTest <query>");
                return;
            }

            SolrjTest solrj = new SolrjTest();
            solrj.query(args[0]);
        }
    }
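If facets are new to you: conceptually, a facet on a field is just a count of how many of the matching documents carry each value of that field, and the minimum count of 2 used above drops values seen only once. The sketch below does the same counting in plain Java on a made-up list of first names, with no Solr involved. The FacetSketch class is hypothetical, and sorting by count (highest first) is my assumption for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not part of SolrJ: counts how often each value
// occurs and keeps only the values seen at least minCount times.
class FacetSketch
{
    static Map<String, Integer> facetCounts(List<String> values, int minCount)
    {
        // First pass: count the occurrences of every value.
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String v : values)
        {
            Integer c = counts.get(v);
            counts.put(v, c == null ? 1 : c + 1);
        }

        // Sort by count descending, then by name to keep the order stable.
        List<Map.Entry<String, Integer>> entries =
            new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>()
        {
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b)
            {
                int byCount = b.getValue().compareTo(a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            }
        });

        // Keep only the values that reach the minimum count.
        Map<String, Integer> result = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Integer> e : entries)
        {
            if (e.getValue() >= minCount)
            {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> firstnames = Arrays.asList("Anna", "Bob", "Anna", "Carl", "Bob", "Anna");
        for (Map.Entry<String, Integer> e : facetCounts(firstnames, 2).entrySet())
        {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

This prints "Anna: 3" and "Bob: 2"; Carl only occurs once and is left out, which is the same shape of output the facet loop in the example produces from a real Solr response.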

And there you have it, a very simple application to just test the interface against Solr. You’ll need to add the jar-files from the lib/-directory in the solrj archive (and from the solr library itself) to compile and run the example.

Download: SolrTest.java


35 Responses to “Using Solrj – A short guide to getting started with Solrj”

  1. Justin Beck Says:

    This is great, thanks for posting this… I don’t suppose you know how to add content to the index… I’m digging through the API but there is no documentation (until the release of 1.3 I presume).

    Thanks again…

  2. Mats Says:

    Adding content to the index can be performed by simply POST-ing a suitable XML document to the index by using a regular HTTP POST. You can see this in the regular Solr Tutorial: http://lucene.apache.org/solr/tutorial.html

    If you want to use SolrJ for this, there is a very, very simple example in the Solrj Wiki now, check out:

    http://wiki.apache.org/solr/Solrj#head-0adf51b414cbf44c692bcadad4b12326df56d298

  3. Joel Says:

    Nice guide! I have a newbie question though; how do I add the required .jar files when I compile?

  4. Mats Says:

    Simply provide them together with the -classpath directive to javac and java, or set the CLASSPATH environment variable.

    If you’re using an IDE like Netbeans or Eclipse, you can add the libraries by right clicking on your project and selecting add -> library (or something like that, it’s been a while since I added things manually).

    Hope that helps!

  5. Mario Says:

    Mats,

    Thanks for the tutorial!

    Question: are the CommonsHttpSolrServer and the EmbeddedSolrServer classes thread-safe?

  6. Mats Says:

    The CommonsHttpSolrServer class represents a client connection to the server, so that should be thread-safe. I have no experience with the EmbeddedSolrServer class, so I’d suggest you post that question to the Solr development list or do a Google search for the issue instead.

    If my memory serves me right (which it very well may not do), the EmbeddedSolrServer is considered to be an inferior way of running Solr compared to the full stack.

  7. aida Says:

    Hi All,

    I want to index the document fields in an XML file using solrj. I
    know how to index the document fields using doc.addField(), but I don’t know
    how to post the XML document instead of adding each field in solrj.

    Can I index an XML file using solrj? Can anyone help me with how to do this?

    Thanks,

  8. Sergey Says:

    Very helpful tutorial!

    Trying to follow it, I wrote a small app that uses Solr through Solrj. Everything works fine except for the fact that I don’t get the results I expect. :) Probably because I can’t find where the indexed data is kept. The Solr documentation says that it should go to the solr/data directory, which is made automatically by Solr. But it’s not there.

    Does anybody know the answer?

    Thanks a lot.

  9. Mats Says:

    @Sergey: Remember to commit after adding the documents, otherwise they will not be added to the index.

    @aida: To index xml-files directly, just submit the XML documents through a regular POST operation to the /update-handler. This is what solrj does in the background for you.

  10. Mark Bennett Says:

    A very helpful tutorial, thanks!

    BTW, I think this line should be tweaked:
    List facetEntries = facet.getValues();

    To:
    List<FacetField.Count> facetEntries = facet.getValues();

    At least with my compiler setup I was getting a warning about missing semicolon in the first version.

  11. jordi Says:

    Hi,
    Does anybody know how to query a specific core with solrj?
    I have a core0 configured but I didn’t find how to query it with solrj.

    Thanks

  12. William Jackson Says:

    I am trying to run a SolrJ test program. I am having problems with the Tomcat 6.0 configuration for SolrJ. Sorry for posting the Exception Trace. What does it mean:

    org.apache.solr.client.solrj.SolrServerException: Error executing query
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:96)
    at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:109)
    at SolrJQuery.query(SolrJQuery.java:64)
    at SolrJQuery.main(SolrJQuery.java:112)
    Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.SocketTimeoutException: Read timed out
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:391)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
    at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
    … 3 more
    Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at com.sun.net.ssl.internal.ssl.InputRecord.readFully(InputRecord.java:293)
    at com.sun.net.ssl.internal.ssl.InputRecord.read(InputRecord.java:331)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:789)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1112)
    at com.sun.net.ssl.internal.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:623)
    at com.sun.net.ssl.internal.ssl.AppOutputStream.write(AppOutputStream.java:59)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
    at org.apache.commons.httpclient.HttpConnection.flushRequestOutputStream(HttpConnection.java:828)
    at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.flushRequestOutputStream(MultiThreadedHttpConnectionManager.java:1565)
    at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2116)
    at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:335)
    … 5 more

  13. Mats Says:

    The exception indicates that the request timed out while trying to get results from the Solr server. This can be caused by the Solr server not being available, locking up or other issues. Try issuing the same query through the web interface to the solr server or use Wireshark to look at the traffic between your application and the Solr server.

  14. Andre Says:

    Hi Mats,

    My application already has Lucene indexes, and I want to use them with Solr by passing the path where the indexes are stored. How can I do this?

    Thanks a lot.
    André

  15. Mats Says:

    Read about how to use an existing Lucene index in Solr at the solr-user mailing list. Hopefully that helps!

  16. Kams Says:

    :)
    Hi All,
    Very helpful tutorial!
    Thanks a lot.

  17. Andre Says:

    Hi all,

    I am a newbie at using Solr. I am indexing data from a database using the Solr data import handler. My question is: once everything is indexed, how can I query using any keyword? Using *:* gives all the results. However, if I search using a keyword that is already indexed, the search returns nothing even though the keyword is indexed. Do I have to use solrj to add the fields in the schema.xml in a client application?

  18. Neha Says:

    Where do I have to store this SolrTest file?

  19. Mats Lindh Says:

    You store the file anywhere you want – as long as you’re able to find it again and compile it with javac. Usually you can do this with just javac followed by the file name, but it may require adding some libraries to the classpath if they’re not already there (for SolrJ).

  20. Ahmed Says:

    Just wanted to say that I stumbled across this post last year, and it is indeed the best example of SolrJ that I have found thus far.

    Just wondering, have you ever done any MoreLikeThis examples using SolrJ? I’m currently experimenting with it to see how far I can get…

  21. How can I use Solr since my Java application? - Quora Says:

    [...] Solr API   Dmitry Kan you can use solrj, google it. First hit for solrj tutorial: http://e-mats.org/2008/04/using-… [...]

  22. sumit Says:

    hey.. I am a newbie to all this.. I was using the class GeoHashFunction and it required ValueSource type objects as parameters.. but ValueSource is an abstract class.. Can you tell me how to use this ValueSource class and add LAT and LNG values to it?
    hoping to get a reply soon..

  23. Mats Lindh Says:

    Sorry for the late answer Sumit, but I really don’t have any experience using the GeoHashFunction. A ValueSource is usually provided by using one of the classes that inherit from ValueSource. This seems to be internal stuff in Solr that you really shouldn’t have to do much work with on your own.

    I’ll get back to an article about proper geo searching through Solr later as the standard seems to have stabilised.

  24. Dean Hiller Says:

    Great article,

    Don’t suppose the example could include Paging where a cursor was maintained in solr so the second page does not have to go through all the nodes in the index tree again that the first page went through.

    ie. kind of like ScrollableResultSet from hibernate vs. Query…Query gets slower and slower the more and more pages where ScrollableResultSet stays linear.

    thanks,
    Dean

  25. On Solr « sowmyawrites …. Says:

    [...] cite this because, I had a tough time in finding an example on using SolrJ to read Solr results . Mats Lindh’s post was also a very useful one to use [...]

  26. Anil Says:

    Hi Mats,

    I want to read the data from Solr in Json format. Is there any way to directly read the Json string (instead of reading the data as Beans & then converting them to Json)?

    Thanks,
    Anil.

  27. Mats Lindh Says:

    Hi Anil,

    You can use the SolrJSON output writer to get the output from Solr directly as JSON.

    Hope that helps!

    –mats

  28. Anil Says:

    Thanks a ton Mats, for the lightning fast reply.

    From what I got from that page, it only mentions that if we append “wt=json” to the URL we will get the response as JSON. But my problem is how do I get the same JSON in my Java code. Currently I am doing something like this.

    QueryResponse rsp = solrServer.query(query, SolrRequest.METHOD.POST);
    List results = rsp.getBeans(AbstractSolrEntity.class);
    String jsonResponse = convertBeansToJson(results);

    Instead of doing all this, is there a way to directly get the jsonResponse from the solrServer/QueryResponse?

    Thanks,
    Anil.

  29. Mats Says:

    Ah, sorry.

    No, I don’t think there’s a way of directly getting the JSONResponse through SolrJ. As far as I can see there’s no way of getting the stream to read from the query or the raw output from the query before parsing in SolrJ.

  30. Anil Says:

    When I do rsp.toString() it is giving me the result in javabin format. Following is what I am getting.

    {responseHeader={status=0,QTime=2,params={start=0,q=agencyId:1,wt=[javabin, javabin],rows=1000000,version=2}},response={numFound=1,start=0,docs=[SolrDocument[{agencyId=1, agencyName=Agency One, hostId=2, hostName=Host Two, subOrgId=3, subOrgName=SubOrg Three}]]}}

    I want the same string in JSON format. Even though I have done query.set("wt", "json"), I am getting it in this format. Am I doing anything wrong here?

  31. Eduardo Says:

    Very nice introduction, I learned a lot :)

    I will now be looking on other tutorials/references for more advanced features! Can you please give some pointers of other documentation? I did not find much about SolrJ.

    Just a quick correction on a code snippet you show on the tutorial. There is a variable name (facetEntries) out of place in:

    List facetEntries = facet.getValues();

    This is just fine in the file for download.

    Thank you! Cheers!

  32. Mats Says:

    Thank you for the update, the example has been corrected now! I don’t have any suggestions of other resources, but if there’s anything particular you want me to dig into, it’d be great to hear what people are missing. I’ve been wanting to do a part 2, but have never found the time.

  33. Eduardo Says:

    After playing a little bit more with SolrJ and also checking some other documentation (for example http://www.solrtutorial.com/, which gives a good overview of how to configure schema.xml and solrconfig.xml), I think there are not many other things to present as an introduction to SolrJ.

    However, I am working with Solr for 1 week, and for sure I will encounter some other problems… When this happens I will let you know…

    Cheers!

  34. Is there a good tutorial or resource available on SolrJ? [closed] | Q&A System Says:

    [...] are numerous blogs like these which would get you began with SolrJ but no remarkable [...]

  35. bing Says:

    Hi, Mats,

    I have been using Solr for several months, but only recently I want to use SolrJ to access Solr.
    I can understand most parts of your post, and I really want to try the source code you posted, probably changing it somewhat to cater to my needs. My problem is that I am not sure how to get the source code running. Say I have downloaded apache-solr-3.5.0-src, SolrJ included; where should I put the source code in the package? Can you give some hints or a description of the steps? Thanks a lot.
