Keeping Your Code in Check: Part 1 – DRY

DRY – Don’t Repeat Yourself – is a central principle for everyone who wants to keep their code maintainable and in a clean and pristine state. It will save you from tedious tasks in the future, although it requires you to do a bit more thinking up front. The clue is to recognize when you’re about to commit the first programming sin: repeating parts of code.

Lets first be completely clear: not repeating code does not mean your code should be as general as possible. Doing that will only end up in a horrible mess that your only hope of selling is to label it “Enterprise” and push all the configuration options into XML. If your way to success is to craft a new buzzword and get the consulting business to sell your product, that might be how you do it. If you’re interesting in writing maintainable and solid software that other people can play around with and get the grip of quite quickly; don’t.

Keep it specialized to do what you need it do easily, but keep it general enough to allow you to reuse it for similar tasks.

The reason why I’m posting this now is unsurprisingly enough that I ran into this issue while coding a library at work. This issue is so common that any programmer will run into it several times a day, but the issue at hand today illustrates the problem perfectly. In a class that I was writing to return a set of data to the user, I decided to add three methods for sorting the data in different manners before returning it. The problem is that the data is kept in different array keys, and it’s the arrays themselves I want to sort. Example:

['displayValue' => 'Aloha Beach', 'count' => 13, 'value' => 'aloha']
['displayValue' => 'Norway', count => 3, 'value' => 'norway']
['displayValue' => 'Sweden Swapparoo', count => 7, 'value' => 'sweden']

To sort this in PHP, you’d use the usort function with a callback function / method that sorts the arrays by the value of the right key. My first version was something like:

private function sortByValue($a, $b)
{
    if (!isset($a['value'], $b['value']))
    {
        if (isset($a['value']))
        {
            return -1;
        }

        if (isset($b['value']))
        {
            return 1;
        }
    
        return 0;
    }

    return strcmp($a['value'], $b['value']);
}

private function sortByCount($a, $b)
{
    if (!isset($a['count'], $b['count']))
    {
        if (isset($a['count']))
        {
            return -1;
        }

        if (isset($b['count']))
        {
            return 1;
        }
    
        return 0;
    }

    return ($b['count'] - $a['count']);
}

As I typed the second if (isset()) in the second method, I finally realized that I were sitting at work and writing the exact same function twice, with just a little twist between them in regards to which field to sort by – and how to determine the sort. One field was numeric, the other was a string. Our goal is to re-use as much as the common code from both methods without typing it up — or much more important, maintaining — it twice.

As you can see, almost the whole function is identicial, except for the key name (‘value’ vs ‘count’) and how the comparison is done (numeric vs string). We need to handle these two issues to be able to use the same function for both purposes, so we rework the two functions into one function for general use for sorting on the value of a key:

private function sortByArrayField($a, $b, $field)
{
    if (!isset($a[$field], $b[$field]))
    {
        if (isset($a[$field]))
        {
            return -1;
        }

        if (isset($b[$field]))
        {
            return 1;
        }
    
        return 0;
    }

    if (is_numeric($a[$field]) && is_numeric($b[$field]))
    {
        return ($b[$field] - $a[$field]);
    }

    return strcmp($a[$field], $b[$field]);
}

This way we handle both the comparison method (if both values are numeric, we do the comparison as a numeric value by simply subtracting the first from the second) and the field to sort (as a third parameter).

Sadly usort does not provide that third parameter for us automagically, but by creating a few simple helper methods for using our refactored function, we get all “configuration” set in those three methods. The code base only contains one implementation of the method itself, but several ways to use it. Example of these three helper methods:

private function sortByValue($a, $b)
{
    return $this->sortByArrayField($a, $b, 'value');
}

private function sortByCount($a, $b)
{
    return $this->sortByArrayField($a, $b, 'count');
}

private function sortByDisplayValue($a, $b)
{
    return $this->sortByArrayField($a, $b, 'displayValue');
}

If someone now comes along and decides to add another column to our array, such as ‘price’, we can simply add another sorting callback to accomodate this:

private function sortByPrice($a, $b)
{
    return $this->sortByArrayField($a, $b, 'price');
}

The sortByArrayField method is already well proven and tried from our previous usage, and by simply changing the field we’re sorting by, we still get the power of the callback, get a completely new sort criteria and the maintainability of just one method.