I spent yesterday evening playing around a bit more with Gearman, a system for farming out tasks to workers across several servers. As my workstation at home still runs Windows, the only PHP library available to me is Net_Gearman from PEAR. Net_Gearman supports tasks (something to do), sets (a collection of tasks), workers (the processes that perform the tasks) and clients (which request tasks to be performed). The gearman protocol supports retrieving the current status of a task from the gearman server (which contains information about how the worker is progressing, reported by the worker itself), but Net_Gearman did not.
The reason for ‘did not’ is that I’ve created a small patchset that adds this functionality to Net_Gearman. All internal methods and properties are still used as they were before, but I’ve added two helper methods for retrieving the socket connection for a particular gearman server (Net_Gearman usually just picks a random server, but we need to contact the server that’s responsible for the task) and a getStatus(server, handle) method to the Gearman client. I’ve also added a property to the Task class that holds the address of the server the task was assigned to.
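To give an idea of what a status request looks like on the wire, here is a rough sketch of the GET_STATUS request and the STATUS_RES payload as laid out in the Gearman protocol description. This is not the code from the patchset; the function names and the array keys are made up for illustration, and Net_Gearman's own connection class takes care of this framing for you.

```php
<?php
// Packet type numbers as listed in the Gearman protocol description.
const GEARMAN_GET_STATUS = 15;
const GEARMAN_STATUS_RES = 20;

/**
 * Build a binary GET_STATUS request for a job handle.
 * (Hypothetical helper, not part of Net_Gearman.)
 */
function buildGetStatusPacket($handle)
{
    // Header: magic "\0REQ", then packet type and payload length,
    // both as 32-bit big-endian integers, followed by the handle.
    return "\0REQ" . pack('NN', GEARMAN_GET_STATUS, strlen($handle)) . $handle;
}

/**
 * Parse the payload of a STATUS_RES response: five NUL-separated fields
 * (handle, known, running, numerator, denominator).
 * The key names are illustrative and may differ from what Net_Gearman returns.
 */
function parseStatusResPayload($payload)
{
    $parts = explode("\0", $payload);
    if (count($parts) !== 5) {
        throw new InvalidArgumentException('Unexpected STATUS_RES payload');
    }
    list($handle, $known, $running, $numerator, $denominator) = $parts;

    return array(
        'handle'      => $handle,
        'known'       => $known === '1',
        'running'     => $running === '1',
        'numerator'   => (int) $numerator,
        'denominator' => (int) $denominator,
    );
}
```

The request has to be sent to the server that issued the handle, which is why the patch needs a way to get the socket for one specific server.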
After submitting a task to be performed in the background (you do not need this to get the status for foreground tasks, as you can provide a callback to handle that), your Task object will have its handle and server properties set. These can be used to retrieve status information about the task later. You’ll still need to provide the possible servers to the Gearman client when creating the client (through the constructor).
Example of creating a task and retrieving the server / handle pair after starting the task:
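require_once 'Net/Gearman/Client.php';

// The list of possible gearman servers is passed to the constructor.
$client = new Net_Gearman_Client(array('host:4730'));

// Create a background task for the 'Reverse' job.
$task = new Net_Gearman_Task('Reverse', range(1, 5));
$task->type = Net_Gearman_Task::JOB_BACKGROUND;

// Tasks are submitted as part of a set.
$set = new Net_Gearman_Set();
$set->addTask($task);
$client->runSet($set);

// After submission the task carries its handle and the server it was assigned to.
print("Status information: \n");
print($task->handle . "\n");
print($task->server . "\n");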
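Since the status is usually requested later, often from a different request or script, the server / handle pair has to be kept somewhere in the meantime. One way to do that, as a small sketch: pack the pair into a JSON string that can go into a session, a database row or a URL parameter. The two helper functions are hypothetical and not part of Net_Gearman, and the literal values stand in for $task->server and $task->handle.

```php
<?php
/**
 * Serialize the (server, handle) pair so a later request can look up the status.
 * (Hypothetical helper, not part of Net_Gearman.)
 */
function encodeTaskReference($server, $handle)
{
    return json_encode(array('server' => $server, 'handle' => $handle));
}

/**
 * Restore the pair; returns array($server, $handle).
 */
function decodeTaskReference($json)
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['server'], $data['handle'])) {
        throw new InvalidArgumentException('Not a valid task reference');
    }
    return array($data['server'], $data['handle']);
}

// Literal values used here in place of $task->server and $task->handle.
$reference = encodeTaskReference('host:4730', 'H:mats-ubuntu:1');
list($server, $handle) = decodeTaskReference($reference);
print($server . ' ' . $handle . "\n");
```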
Retrieving the status:
require_once 'Net/Gearman/Client.php';

$client = new Net_Gearman_Client(array('host:4730'));

// Server and handle are the values read from the Task object after submission.
$status = $client->getStatus('host:4730', 'H:mats-ubuntu:1');
var_dump($status);
The array returned from getStatus() is the same array as returned from the gearman server and contains information about the current status (numerator, denominator, finished, etc.; var_dump it to see the exact structure). I’ve also added the patchset to the issue tracker for Net_Gearman at github.
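The numerator and denominator are what the worker itself reports as its progress, so they can be turned into a percentage. A minimal sketch, assuming the array has 'numerator' and 'denominator' keys (check the var_dump output for the actual key names); progressPercent() is a made-up helper, not something the patch provides.

```php
<?php
/**
 * Turn a status array into a percentage, or null when no progress is known.
 * The 'numerator' and 'denominator' keys are assumptions about the array layout.
 */
function progressPercent(array $status)
{
    if (!isset($status['numerator'], $status['denominator'])) {
        return null;
    }

    $denominator = (int) $status['denominator'];
    if ($denominator <= 0) {
        // The worker has not reported any progress (or the job is unknown).
        return null;
    }

    return round(100 * (int) $status['numerator'] / $denominator, 1);
}

// Stand-in array; with the patched client this would come from $client->getStatus().
$percent = progressPercent(array('numerator' => 3, 'denominator' => 12));
print(($percent === null ? 'unknown' : $percent . '%') . "\n");
```

Returning null instead of 0 keeps "no progress reported yet" apart from "0% done", which matters if the value ends up in a progress bar.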
The patchset (created from the current master branch at github) can be downloaded here: GearmanGetStatusSupport.tar.gz.
UPDATE: I’ve finally gotten around to creating my own fork of Net_Gearman on github too. This fork includes the patch mentioned above.