Making Solr Requests with urllib2 in Python

When making XML requests to Solr (A fulltext document search engine) for indexing, committing, updating or deleting documents, the request is submitted as an HTTP POST containg an XML document to the server. urllib2 supports submitting POST data by using the second parameter to the urlopen() call:

f = urllib2.urlopen("http://example.com/", "key=value")

The first attempt involved simply adding the XML data as the second parameter, but that made the Solr Webapp return a “400 – Bad Request” error. The reason for Solr barfing is that the urlopen() function sets the Content-Type to application/x-www-form-urlencoded. We can solve this by changing the Content-Type header:

solrReq = urllib2.Request(updateURL, '')
solrReq.add_header("Content-Type", "text/xml")
solrPoster = urllib2.urlopen(solrReq)
response = solrPoster.read()
solrPoster.close()

Other XML-based Solr requests, such as adding and removing documents from the index, will also work by changing the Content-Type header.

The same code will also allow you to use urllib to submit SOAP, XML-RPC-requests and use other protocols that require you to change the complete POST body of the request.