Follow-up on the Missing Statistics in OpenX

After my previous post about the OpenX statistics that went missing because of crashed MySQL tables, I got a very nice and helpful comment from one of the OpenX developers. To put it in one single word: awesome. If you’re ever going to run a company and have to look after your customers (even if you release your project as open source), simply do that. People will feel that someone is looking out for them.

Anyways, as promised, this was supposed to be a follow-up. We didn’t manage to get the impression statistics back, but the missing clicks returned after repairing the tables. The tip from Arlen didn’t help either, but I have a few suggestions for how to make the script easier to use.

I was kind of perplexed about how to specify the dates for the time interval the script was going to rebuild the statistics for. The trick was to change two define()-s at the top of the code. Not very user friendly, so I made a small change to use $argc and $argv instead. That way I could do:

    php regenerateAdServerStatistics.php "2008-06-01 10:00:00" "2008-06-01 10:59:59"

instead of having to edit the file and change the defines every time. After this simple change, I could also write a small helper script that runs the regenerateAdServerStatistics.php file for all the operation intervals within a larger interval (an operation interval is an hour, while my interval was several days).
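As an aside, the $argc/$argv change itself can be sketched roughly like this. This is not the exact patch I applied: the helper name intervalFromArgs() is made up for illustration, and the constant names INTERVAL_START and INTERVAL_END are assumptions, so check the define()-s at the top of your own copy of regenerateAdServerStatistics.php.

```php
<?php
// Sketch only. intervalFromArgs() is a hypothetical helper, and the constant
// names used in the comment at the bottom are assumptions, not confirmed
// OpenX names.

/**
 * Returns array(start, end) taken from the command line arguments,
 * or null if an argument is missing or cannot be parsed as a date.
 */
function intervalFromArgs(array $args)
{
    // $args[0] is the script name, so two dates means three entries.
    if (count($args) < 3)
    {
        return null;
    }

    if (strtotime($args[1]) === false || strtotime($args[2]) === false)
    {
        return null;
    }

    return array($args[1], $args[2]);
}

// In regenerateAdServerStatistics.php, the two hard-coded define()-s could
// then be replaced with something along these lines:
//
//     $interval = intervalFromArgs($argv);
//     if ($interval === null)
//     {
//         exit("Please give a start and an end date.\n");
//     }
//     define('INTERVAL_START', $interval[0]);
//     define('INTERVAL_END', $interval[1]);
```

Keeping the check in a small function makes it easy to reject bad input before the script touches any statistics.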

So, here it is, regenerateForPeriod.php:

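    <?php
    // Both a start and an end date are required.
    if ($argc < 3)
    {
        exit("Usage: php regenerateForPeriod.php <start> <end>\n");
    }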

    $start = $argv[1];
    $end = $argv[2];

    $start_ts = strtotime($start);
    $end_ts = strtotime($end);

    if (!$start_ts || !$end_ts || ($start_ts >= $end_ts))
    {
        exit("Invalid dates.");
    }

    // Round the start time down to the beginning of its hour.
    $current_ts = mktime(date('H', $start_ts), 0, 0, date('m', $start_ts), date('d', $start_ts), date('Y', $start_ts));

    while($current_ts < $end_ts)
    {
        system('php regenerateAdServerStatistics.php "' . date('Y-m-d H', $current_ts) . ':00:00" "' . date('Y-m-d H', $current_ts) . ':59:59"');
        $current_ts += 3600;
    }
?>

This runs the regenerateAdServerStatistics.php script once for each operation interval. If your ad server uses an operation interval longer than 3600 seconds, change the 3600 in the loop to a more appropriate value, and adjust the hard-coded :59:59 end time to match. Before doing this, you’ll want to remove the sleep(10) and the warning in regenerateAdServerStatistics.php, so that you don’t have to wait 10 seconds for each invocation of the script. I removed the warning and sleep altogether, but hopefully someone will commit a command line parameter to regenerateAdServerStatistics.php that removes the delay. I didn’t have time to clean up the code and submit an official patch today, but if there is interest, leave a comment and I’ll consider it.
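If your operation interval isn’t one hour, one way to avoid hard-coding the hour is to compute both ends of each interval from the timestamp. The following is only a sketch: operationIntervals() and its $seconds parameter are names I made up for illustration, not anything from OpenX, and it assumes the start timestamp is already aligned to the beginning of an operation interval.

```php
<?php
// Sketch only. operationIntervals() is a hypothetical helper; $seconds is
// the assumed length of one operation interval.

/**
 * Splits [$start_ts, $end_ts) into consecutive operation intervals and
 * returns them as array(start, end) pairs of 'Y-m-d H:i:s' strings.
 * Assumes $start_ts already sits on an operation interval boundary.
 */
function operationIntervals($start_ts, $end_ts, $seconds = 3600)
{
    $intervals = array();

    for ($ts = $start_ts; $ts < $end_ts; $ts += $seconds)
    {
        $intervals[] = array(
            date('Y-m-d H:i:s', $ts),
            // The last second that still belongs to this interval.
            date('Y-m-d H:i:s', $ts + $seconds - 1),
        );
    }

    return $intervals;
}
```

Each pair can then be passed as the two arguments to regenerateAdServerStatistics.php, in the same way as the system() call in the helper script does.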