A few Keepitclose Updates

I’ve added a few new features to the Keepitclose code base. The cached version of a resource now also keeps, and reproduces, the headers from the original response. This lets content types such as text/xml work as they should, and it preserves encoding information and other vital details carried in the HTTP headers. I’ve also added urlencode() support to the path identifier of the request, so that requests with UTF-8 or other encodings in parts of the URL are handled correctly.
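As a rough illustration of the two changes described above, here is a minimal sketch of how a body and its headers could be stored together and how a path could be made key-safe with urlencode(). The function names, the 'kic_' key prefix and the serialized layout are assumptions made up for this example; they are not taken from the Keepitclose source.

```php
<?php
// Hypothetical sketch, not the actual Keepitclose code.

// Build a cache key from a request path. urlencode() turns slashes and
// non-ASCII bytes into %XX escapes, so the key only contains safe characters.
function cacheKeyForPath($path)
{
    return 'kic_' . urlencode($path);
}

// Bundle the body with the origin's header lines so both are stored as one value.
function packEntry(array $headers, $body)
{
    return serialize(array('headers' => $headers, 'body' => $body));
}

// Restore a stored entry. When serving it, each header line would be
// passed to header() before the body is echoed.
function unpackEntry($stored)
{
    return unserialize($stored);
}

// "\xC3\xA5" and "\xC3\xA6" are the UTF-8 bytes for "å" and "æ".
$key    = cacheKeyForPath("/feeds/bl\xC3\xA5b\xC3\xA6r.xml");
$stored = packEntry(array('Content-Type: text/xml; charset=utf-8'), '<feed/>');
$entry  = unpackEntry($stored);
```

Storing the headers next to the body means a cache hit can replay the original Content-Type without guessing it from the payload.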

I have a few other features in the pipeline too, but I’m not sure I’ll find the time to add them until after the weekend. Be sure to run svn update if you’ve checked out the code!

Releasing Keepitclose Alpha

I’ve just created a small project site over at Google Code for Keepitclose. Keepitclose is a local HTTP caching web service written in PHP, meant to cache resources retrieved from external and internal web services. It currently supports both an APC backend (using shared memory in the web server itself) and a Memcache backend (which allows you to have several frontends using one or several backend memcache servers).

Missing features are currently:

  • Keeping HTTP headers
  • Adding cache information to HTTP headers
  • A file based backend
  • Everything else you can think of

The server does, however, support:

  • Time to live (TTL) for a cached resource when stored
  • TTL Window, for example:

    A new version should be fetched at regular intervals, such as every sixth hour, with new data available at 00:00, 06:00, 12:00 and 18:00. Use ttlWindowSize=21600 to get a six-hour window. Use ttlWindowOffset to add an offset; ttlWindowOffset=900 shifts each boundary by fifteen minutes, so new content is retrieved at 00:15, 06:15, 12:15 and 18:15.
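The window arithmetic can be sketched as follows. This is only an illustration of the idea, assuming the windows are aligned to UTC and counted from the Unix epoch; the function name nextWindowBoundary() is made up for this example, and the real server may compute the expiry differently.

```php
<?php
// Hypothetical sketch of the TTL window calculation, assuming UTC-aligned windows.

// Return the Unix timestamp of the next window boundary after $now.
function nextWindowBoundary($now, $windowSize, $windowOffset = 0)
{
    // Seconds elapsed since the current (offset-shifted) window started.
    $elapsed = ($now - $windowOffset) % $windowSize;
    // Step back to the window start, then one full window ahead.
    return $now - $elapsed + $windowSize;
}

// A request arriving at 07:00 UTC with six-hour windows shifted by 15 minutes.
$now     = gmmktime(7, 0, 0, 1, 1, 2024);
$expires = nextWindowBoundary($now, 21600, 900);
$ttl     = $expires - $now;

echo gmdate('H:i', $expires), "\n"; // 12:15
echo $ttl, "\n";                    // 18900 seconds, i.e. 5 hours 15 minutes
```

A resource stored at 07:00 therefore lives until 12:15, while one stored at 12:14 expires a minute later; the TTL shrinks as the boundary approaches.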

If you want to implement your own backend, implement the Keepitclose_Storage_Interface and add an entry for your module to the config.php file, e.g. 'file' => array('path' => '/tmp').

The current version also supports simple access control using the allowedHosts entry in the config file. This entry contains one row for each allowed host as a regular expression:

    'allowedHosts' => array(
        '^127\\.0\\.0\\.1$',
    ),

This will only allow requests from localhost. To allow a subnet, you could write '^127\\.0\\.0\\.[0-9]+$'.
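To show how such entries behave, here is a minimal sketch of a check against the client address. The helper isHostAllowed() and the choice of '#' as the regex delimiter are assumptions for this example; how Keepitclose actually wraps and evaluates the patterns is not shown in this post.

```php
<?php
// Hypothetical helper, not the actual Keepitclose access check.

// Return true if $remoteAddr matches at least one configured pattern.
function isHostAllowed($remoteAddr, array $allowedHosts)
{
    foreach ($allowedHosts as $pattern) {
        // The config entries carry no delimiters, so wrap them here.
        // This assumes a pattern never contains a literal '#'.
        if (preg_match('#' . $pattern . '#', $remoteAddr) === 1) {
            return true;
        }
    }
    return false;
}

$exact  = array('^127\\.0\\.0\\.1$');
$subnet = array('^127\\.0\\.0\\.[0-9]+$');

$a = isHostAllowed('127.0.0.1', $exact);   // true
$b = isHostAllowed('127.0.0.42', $exact);  // false: the pattern is anchored
$c = isHostAllowed('127.0.0.42', $subnet); // true
$d = isHostAllowed('10.0.0.1', $subnet);   // false
```

In a live request the first argument would typically be $_SERVER['REMOTE_ADDR']. The ^ and $ anchors matter: without them, '127\\.0\\.0\\.1' would also match an address such as 127.0.0.10.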

You can also add several memcache servers; a request will then poll a random server to retrieve the resource. If it is not found, the content is retrieved from the web and stored on all memcache servers. This is useful for environments with a very heavy read load.

'storage' => array(
    'memcache' => array(
        array(
            'host' => 'tcp://127.0.0.1',
        ),
    ),
),

To add more memcache servers, simply add another array() entry to the memcache list:

'memcache' => array(
    array(
        'host' => 'tcp://127.0.0.1',
    ),
    array(
        'host' => 'tcp://127.0.0.2',
    ),
),
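The read-from-one, write-to-all behaviour with several servers can be sketched roughly like this. FakeServer is an in-memory stand-in invented for the example so that it runs without a memcache daemon, and fetchThroughCache() is a hypothetical name; neither comes from the Keepitclose source.

```php
<?php
// Hypothetical sketch of "read from a random server, write to all on a miss".

// In-memory stand-in for one memcache connection; only get/set are modelled.
class FakeServer
{
    private $data = array();

    public function get($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : false;
    }

    public function set($key, $value)
    {
        $this->data[$key] = $value;
    }
}

function fetchThroughCache(array $servers, $key, $fetchFromOrigin)
{
    // Ask one randomly chosen server, which spreads the read load.
    $server = $servers[array_rand($servers)];
    $value  = $server->get($key);
    if ($value !== false) {
        return $value;
    }

    // Miss: fetch once from the origin and store the result on every server.
    $value = call_user_func($fetchFromOrigin, $key);
    foreach ($servers as $target) {
        $target->set($key, $value);
    }
    return $value;
}

$servers     = array(new FakeServer(), new FakeServer());
$originCalls = 0;
$fetch = function ($key) use (&$originCalls) {
    $originCalls++;
    return 'body of ' . $key;
};

$first  = fetchThroughCache($servers, 'http://example.com/feed', $fetch);
$second = fetchThroughCache($servers, 'http://example.com/feed', $fetch);
// The origin is contacted once; the second call is a hit on whichever server is picked.
```

Writing to every server costs more on a miss, but it means any server can answer later reads, which is the trade-off that suits read-heavy setups.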

You can also set 'port' and 'persistent' for the memcache connections.
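An entry using both options might look like the fragment below. The key names come from the text above; the value types are assumptions (an integer port, here memcached's default of 11211, and a boolean for persistent), so check the project's config.php for the exact form.

```php
'memcache' => array(
    array(
        'host'       => 'tcp://127.0.0.1',
        'port'       => 11211, // memcached's default port
        'persistent' => true,  // assumed to take a boolean
    ),
),
```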

Please leave a comment here or create an issue ticket on the Google Code page if you need any help or have any suggestions. All patches are welcome.