Getting Dots to Work in PHP and GET / POST / COOKIE Variable Names

One of the oldest and ugliest relics of the register_globals era of PHP are the fact that all dots in request variable names gets replaced with “_”. If your variable was named “foo.bar”, PHP will serve it to you as “foo_bar”. You cannot turn this off, you cannot use extract() or parse_str() to avoid it and you’re mostly left out in the dark. Luckily the QUERY_STRING enviornment (in _SERVER if you’re running mod_php, etc) contains the raw string, and this string contains the dots.

The following “”parser”” is a work in progress and does currently not support the array syntax for keys that PHP allow, but it solves the issue for regular vars. I will try to extend this later on to do actually replicate the functionality of the regular parser.

Here’s the code. No warranties. Ugly hack. You’re warned. Leave a comment if you have any good suggestions regarding this (.. or know of an existing library doing the same..).

function http_demolish_query($queryString)
{
    $result = array();
    $segments = explode("&", $queryString);

    foreach($segments as $segment)
    {
        $parts = explode('=', $segment);

        $key = urldecode(array_shift($parts));
        $value = null;

        if ($parts)
        {
            $value = urldecode(join('=', $parts));
        }

        $result[$key] = $value;
    }

    return $result;
}

(OK, that’s not the real function name, but it’s aptly named to be the nemesis of http_build_query)

Retrieving URLs in Parallel With CURL and PHP

As we’ve recently added support for querying Solr servers in parallel, one of the things we added was a simple class to allow us to query several servers at the same time. The CURL library (which has a PHP extension) even provides an abstraction layer for doing the nitty gritty work for you, as long as you keep track of the resources. The code beneath is based on examples in the documentation and a few tweaks of my own.

The code beneath is licensed under a MIT license. You can also download the file (gzipped).

class Footo_Content_Retrieve_HTTP_CURLParallel
{
    /**
     * Fetch a collection of URLs in parallell using cURL. The results are
     * returned as an associative array, with the URLs as the key and the
     * content of the URLs as the value.
     *
     * @param array $addresses An array of URLs to fetch.
     * @return array The content of each URL that we've been asked to fetch.
     **/
    public function retrieve($addresses)
    {
        $multiHandle = curl_multi_init();
        $handles = array();
        $results = array();

        foreach($addresses as $url)
        {
            $handle = curl_init($url);
            $handles[$url] = $handle;

            curl_setopt_array($handle, array(
                CURLOPT_HEADER => false,
                CURLOPT_RETURNTRANSFER => true,
            ));

            curl_multi_add_handle($multiHandle, $handle);
        }

        //execute the handles
        $result = CURLM_CALL_MULTI_PERFORM;
        $running = false;

        // set up and make any requests..
        while ($result == CURLM_CALL_MULTI_PERFORM)
        {
            $result = curl_multi_exec($multiHandle, $running);
        }

        // wait until data arrives on all sockets
        while($running && ($result == CURLM_OK))
        {
            if (curl_multi_select($multiHandle) > -1)
            {
                $result = CURLM_CALL_MULTI_PERFORM;

                // while we need to process sockets
                while ($result == CURLM_CALL_MULTI_PERFORM)
                {
                    $result = curl_multi_exec($multiHandle, $running);
                }
            }
        }

        // clean up
        foreach($handles as $url => $handle)
        {
            $results[$url] = curl_multi_getcontent($handle);

            curl_multi_remove_handle($multiHandle, $handle);
            curl_close($handle);
        }

        curl_multi_close($multiHandle);

        return $results;
    }
}

Download the file.

Escaping Characters in a Solr Query / Solr URL

We’re using our own Solr library at Derdubor at the moment, but we’ve only been using it for indexing content. The query part was never standardized in our common library as we usually used an alternative output format, but during the last days that has changed. We now have a parser for the default XML outputter and we’re also supporting facets and field queries (or constraints as they’re abstracted as in our library).

This means that we’re feeding content into the query that may contain foreign characters, in particular those who have special meaning in a Solr query. You can find the complete list of characters that need to be escaped in a SOLR or Lucene query in the Lucene manual.

To escape the characters we use this very simple and stupid PHP method:

    static public function escapeSolrValue($string)
    {
        $match = array('\\', '+', '-', '&', '|', '!', '(', ')', '{', '}', '[', ']', '^', '~', '*', '?', ':', '"', ';', ' ');
        $replace = array('\\\\', '\\+', '\\-', '\\&', '\\|', '\\!', '\\(', '\\)', '\\{', '\\}', '\\[', '\\]', '\\^', '\\~', '\\*', '\\?', '\\:', '\\"', '\\;', '\\ ');
        $string = str_replace($match, $replace, $string);

        return $string;
    }

We used a regular expression first, but the sheer amount of backslashes made it a regular .. hell … to read. So to make it easier for the persons maintaining this in the future, we went the easy to read / easy to maintain road for this one.

PHP: Fatal error: Can’t use method return value in write context

Just a quick post to help anyone struggling with this error message, as this issue gets raised from time to time on support forums.

The reason for the error is usually that you’re attempting to use empty or isset on a function instead of a variable. While it may be obvious that this doesn’t make sense for isset(), the same cannot be said for empty(). You simply meant to check if the value returned from the function was an empty value; why shouldn’t you be able to do just that?

The reason is that empty($foo) is more or less syntactic sugar for isset($foo) && $foo. When written this way you can see that the isset() part of the statement doesn’t make sense for functions. This leaves us with simply the $foo part. The solution is to actually just drop the empty() part:

Instead of:

if (empty($obj->method()))
{
}

Simply drop the empty construct:

if ($obj->method())
{
}

Missed Schedule for Posts in WordPress

As I started queuing the posts for the previous run of “Ready for 2010“-articles, I came across a problem with my WordPress installation. The scheduled articles didn’t show up when they were scheduled, and the only thing shown in the WordPress administration interface were a message about “Missed Schedule”. No shit, sherlock.

The reason behind the message is that the wp-cron.php file didn’t run as it should. WordPress usually tries to run this every now and then by inserting a reference to the file through the web site. Apparently this behavior was borked on my blog. I have a perfectly working cron implementation on my server, so instead of relying on WordPress to do some kind of magic to insert a reference to the file and kick off the processing with a web request, I added a reference to wp-cron.php in my usual crontab.

I have no idea how often wp-cron really should be run, but decided that a five minute resolution was enough for my use. The crontab entry is included here:

*/5 * * * * cd <directory of blog> && php wp-cron.php

This runs the cron script from the proper directory, and seems to work fine.

Writing a Munin Plugin

I have to admit something. I’ve become addicted.

One of the things I finally got around to doing while living the quiet life over the christmas holiday was to dive a bit further into Munin – a simple framework for collecting information from your computers and servers and making nice graphs that you can watch while you’re bored.

I’m not going to write a lot about how you can create your own Munin plugin to create your own graphs, as they have a very simple tutorial giving you all the basics about writing Munin plugins themselves. The only thing you need to remember are these two tidbits:

  1. When Munin first registers your plugin, it runs your script with config as the only argument. This provides Munin with the name of the graph, the labels and names (keys) of the graphs you’re providing values for, information about the axis, etc.
  2. When Munin runs your script without the config argument, it expects you to give it values for the keys you provided it in the configuration.

You enable and disable plugins by creating symlinks in /etc/munin/plugins (at least under debian / ubuntu), and plugins are usually stored in /usr/share/munin/plugins.

I keep my plugins archived together with the rest of the repository for my web projects, and then either symlink the content into the plugins-directory or create a simple wrapper script that changes the current directory to the location of the script and then invokes it (to make the current working directory be correct).

A very simple bash script that does this – and passes through any parameters given to the script:

#!/bin/bash
cd  && php ./