Python, httplib and Empty Content for 200/201 Responses

While hacking together a client for Imbo in Python, I wasn’t able to read the response from a connection initiated with httplib. If the request errored out (HTTP response code 400/403/404) everything worked as it should, but if the response code was 200 or 201, the body read from the response returned by getresponse() was empty.

Turns out the issue was related to calling close() on the connection before reading the response. This apparently works if there’s an error (which means that the response should be rather small), but not if there’s a regular “OK” response from the server. Just retrieving the HTTPResponse object isn’t enough; you have to call read() on it before closing the connection.

connection.request(method, path, data)
data = connection.getresponse().read()
connection.close()

(Compared to the previous solution, which retrieved the HTTPResponse object, closed the connection and then read the response.)
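A minimal sketch of the same ordering, wrapped so that the connection is always closed, but only after the body has been read. It assumes Python 3, where httplib is named http.client; the fetch() helper and the tiny local test server are made up for illustration and are not part of the Imbo client.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server that always answers 200 with a short body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # keep the example quiet
        pass


def fetch(host, port, method, path, data=None):
    """Send a request and read the whole body before closing the connection."""
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request(method, path, data)
        response = connection.getresponse()
        # read() has to happen while the connection is still open
        return response.status, response.read()
    finally:
        connection.close()


# port 0 lets the operating system pick a free port
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, body = fetch("127.0.0.1", server.server_port, "GET", "/")

server.shutdown()
server.server_close()
```

The try/finally keeps the close() call from ever running before read(), whichever way the request ends.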

Evolution & Exchange: Unable to retrieve message

Some time after upgrading to Ubuntu 11.10 I ended up with the dreaded “Unable to retrieve message” in Evolution (which I use for Exchange connectivity). This has usually corrected itself by simply restarting Evolution, but this time nothing would help. I stumbled across a thread that provided a few ways to possibly solve the issue, but the .evolution directory it referred to didn’t contain the live data on Ubuntu.

Turns out the directory is:

.local/share/evolution

As both my mailstore and address book live on the Exchange server, I decided to just move the evolution directory to a new name and recreate the evolution directory from scratch. This takes a bit of time while Evolution indexes everything, but after a while everything was back to normal.
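The move itself can be sketched like this. The set_aside() helper and the timestamped backup name are assumptions for illustration, and the demonstration runs against a throwaway temporary directory rather than the real .local/share/evolution.

```python
import os
import tempfile
import time


def set_aside(directory):
    """Rename a directory to a timestamped backup name and return the new path."""
    backup = directory + ".bak-" + time.strftime("%Y%m%d%H%M%S")
    os.rename(directory, backup)
    return backup


# Demonstration on a temporary directory standing in for the real one
root = tempfile.mkdtemp()
demo = os.path.join(root, "evolution")
os.mkdir(demo)

backup = set_aside(demo)
print(backup)
```

Renaming rather than deleting means the old data is still around if something turns out to be missing afterwards.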

Solr Response Empty from PHP, but Works in Browser or CURL?

Weird issue that I think I’ve stumbled upon earlier, but which yet again reared its head yesterday. Certain application containers (possibly Jetty in this case) will for some reason not produce any output from Solr (or other applications, I’d guess) if the request is made with HTTP/1.0 as the version identifier (“GET /…/ HTTP/1.0” as the first line of the request). The native HTTP support in PHP identifies itself as HTTP/1.0 as it doesn’t support chunked transfer encoding, which then turns into a magical problem where requests that used to work don’t work any longer (the response is just zero bytes in size, while all other headers are identical) but still work as expected if you open them in your browser.

The solution is to either gamble on the server not sending any chunked responses and set protocol_version to 1.1 in the stream context that you pass to the file retrieving function (see the list of HTTP wrapper settings; I don’t think it’s a good idea to define protocol_version as a float, but .. well), or use cURL instead. The Solr PECL extension uses cURL internally, so it’s not affected by this issue.
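To check whether a server really answers differently depending on the version identifier, the request can be sent by hand with each version. This is a rough Python sketch rather than PHP; build_request() and raw_get() are made-up helper names, the path in the example is hypothetical, and the body is returned as-is (a chunked HTTP/1.1 response is not decoded).

```python
import socket


def build_request(host, path, version="1.1"):
    """Build a bare GET request with the given HTTP version in the first line."""
    lines = [
        "GET %s HTTP/%s" % (path, version),
        "Host: %s" % host,
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def raw_get(host, port, path, version="1.1"):
    """Send the request over a plain socket and return (headers, raw body)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_request(host, path, version))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    return head.decode("latin-1"), body


# Hypothetical Solr path, only used to show the generated request line
request = build_request("localhost", "/solr/select?q=*:*", "1.0")
print(request.decode("ascii"))
```

Calling raw_get() once with "1.0" and once with "1.1" against the same URL and comparing the body lengths should show whether the version identifier is what makes the difference.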

E:Error, pkgProblemResolver::Resolve generated breaks

While attempting to upgrade to Ubuntu 11.10 (Oneiric) from 11.04, do-release-upgrade refused to do anything useful. The only message it felt like delivering was “E:Error, pkgProblemResolver::Resolve generated breaks”. Googling didn’t turn up much, but a forum thread (which I seem to have lost now) suggested (among other attempts) to remove any references to external (3rd party) APT repositories. I thought do-release-upgrade did this by itself, but apparently not …

Commenting out the external repositories in /etc/apt/sources.list and in /etc/apt/sources.list.d/* solved the problem (I had Spotify, Dropbox and Google Chrome there), allowing do-release-upgrade to do its thing.
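For anyone with more than a handful of entries, the commenting-out can be sketched as a small text transformation. comment_out_external() is a made-up helper, the repository lines in the sample are illustrative rather than copied from a real system, and the sketch only works on a string; it does not touch anything under /etc/apt.

```python
def comment_out_external(text, external_markers):
    """Prefix active deb/deb-src lines that mention an external marker with '# '."""
    result = []
    for line in text.splitlines():
        stripped = line.lstrip()
        is_active = stripped.startswith(("deb ", "deb-src "))
        if is_active and any(marker in stripped for marker in external_markers):
            result.append("# " + line)
        else:
            result.append(line)
    return "\n".join(result) + "\n"


# Illustrative sources.list content (assumed, not from a real installation)
sample = (
    "deb http://archive.ubuntu.com/ubuntu natty main\n"
    "deb http://repository.spotify.com stable non-free\n"
)

cleaned = comment_out_external(sample, ["spotify"])
print(cleaned)
```

Lines that are already commented out are left alone, so running it twice does not stack up extra "#" characters.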

Checking Status of a Background Task in python-gearman

After stumbling over a question on Stack Overflow about how you’d use python-gearman for checking the current status of a running background task, I decided to dig a bit deeper into python-gearman and .. well, answer how you’d do just that.

It turns out it wasn’t as straightforward as it should have been, but at least I managed to solve it by using the current API. First of all you’ll have to keep track of which Gearman server gets your task, and what handle it has assigned to the task. These two values identify a currently running task, and since the handle isn’t globally unique, you’ll also have to keep track of the server (so you know where to ask).

To request the current status of a long running task you’ll have to create appropriate instances of the GearmanJob and GearmanJobRequest yourself.

Here’s a small example of how you can do this:

import gearman
    
client = gearman.GearmanClient(['localhost'])
result = client.submit_job('reverse', 'this is a string', background=True)

The connection information is available through result.job.connection (.gearman_host and .gearman_port), while the handle is available through result.job.handle.
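Since the status check usually happens in a different process than the one that submitted the job, these values have to be stored somewhere in between. A small sketch of collecting them into something that can be serialized: task_reference() is a made-up helper, and the stand-in object (including its handle value) only mimics the attributes named above so that the example runs without a Gearman server.

```python
import json
from types import SimpleNamespace


def task_reference(result):
    """Collect what is needed to find the task again: the server and the handle."""
    job = result.job
    return {
        "host": job.connection.gearman_host,
        "port": job.connection.gearman_port,
        "handle": job.handle,
    }


# Stand-in for the object returned by submit_job() (assumed values)
fake_result = SimpleNamespace(
    job=SimpleNamespace(
        connection=SimpleNamespace(gearman_host="localhost", gearman_port=4730),
        handle="H:localhost:1",
    )
)

reference = task_reference(fake_result)
stored = json.dumps(reference)
print(stored)
```

The JSON string can go into a database row or a session, and the process that checks the status later can load it again to know which server to ask and which handle to ask about.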

To check the status of a currently running job you create a GearmanClient, but only supply the server you want to query for the current state:

client = gearman.GearmanClient(['localhost'])

# configure the job to request status for - the last four are not needed for status requests.
j = gearman.job.GearmanJob(client.connection_list[0], result.job.handle, None, None, None, None)

# create a job request 
jr = gearman.job.GearmanJobRequest(j)
jr.state = 'CREATED'

# request the state from gearmand
res = client.get_job_status(jr)

# the res structure should now be filled with the status information about the task
print(str(res.status.numerator) + " / " + str(res.status.denominator))

That should at least solve the problem until python-gearman gets an easier API to do these kinds of requests.
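If the numerator and denominator are going to be shown to a user, a small formatting helper can be handy. This is only a sketch: format_progress() is a made-up name, it takes the two numbers as plain integers, and it treats a zero denominator as "unknown" instead of dividing by it.

```python
def format_progress(numerator, denominator):
    """Format task progress as 'n / d (p%)', or 'unknown' if there is no denominator."""
    if not denominator:
        # avoids a ZeroDivisionError when no total has been reported
        return "unknown"
    percent = 100.0 * numerator / denominator
    return "%d / %d (%.0f%%)" % (numerator, denominator, percent)


print(format_progress(3, 10))
print(format_progress(0, 0))
```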

Update: I’ve also added a convenience function to my python-gearman fork on GitHub.