Followup on The Missing Statistics in OpenX

After my previous post about the missing OpenX statistics because of crashed MySQL-tables, I got a very nice and helpful comment from one of the OpenX developers. To put it one single word: awesome. If you’re ever going to run a company and have to look after your customers (even if you release your project as open source), simply do that. People will feel that someone are looking out for them.

Anyways, as promised, this were supposed to be a follow up. We didn’t manage to get the impressions statistics back, but the missing clicks returned after repairing the tables. The tip from Arlen didn’t help either, but I have a few suggestions for how to make the script easier to use.

I were kind of perplexed about how I could give the dates for the time interval it was going to rebuild the statistics. The trick was to change two define()-s in the top of the code. Not very user friendly, so I made a small change to use $argc and $argv instead. That way I could do:

    php regenerateAdServerStatistics.php "2008-06-01 10:00:00" "2008-06-01 10:59:59"

instead of having to edit the file and changing the defines every time. After doing this simple change, I could also write a small helper script that ran the regenerateAdServerStatistics.php file for all the operation intervals within the larger interval (an operation interval is an hour, while my interval were several days).

So, here it is, regenerateForPeriod.php:

 ");
    }

    $start = $argv[1];
    $end = $argv[2];

    $start_ts = strtotime($start);
    $end_ts = strtotime($end);

    if (!$start_ts || !$end_ts || ($start_ts >= $end_ts))
    {
        exit("Invalid dates.");
    }

    $current_ts = mktime(date('H', $start_ts), 0, 0, date('m', $start_ts), date('d', $start_ts), date('y', $start_ts));

    while($current_ts < $end_ts)
    {
        system('php regenerateAdServerStatistics.php "' . date('Y-m-d H', $current_ts) . ':00:00" "' . date('Y-m-d H', $current_ts) . ':59:59"');
        $current_ts += 3600;
    }
?>

This runs the renegerateAdServerStatistics.php script for each operation interval. If your ad server uses a larger interval than 3600 seconds, change the setting to a more appropriate value. Before doing this, you’ll want to remove the sleep(10) and the warning in regenerateAdServerStatistics.php, so that you don’t have to wait 10 seconds for each invocation of the script. I removed the warning and sleep altogheter, but hopefully someone will commit a command line parameter to regenerateAdServerStatistics.php that removes the delay. I didn’t have time to clean up the code and submit an official patch today, but if there is interest, leave a comment and I’ll consider it.

12 thoughts on “Followup on The Missing Statistics in OpenX”

  1. Thanks for this. Probable small error in the code:

    date(‘h’) should presumably be date(‘h’, $start_ts)

  2. Good catch. It’s a while since I wrote this, but I think you’re right. I’ve updated the code above to reflect this. Thanks!

  3. The file was created in the same directory as regenerateAdServerStatistics.php which is part of the standard OpenX distribution.

    If you want to keep the file in another location you have to change the path in the system() call.

  4. Running the script for 95 000 000 rows in phpads_data_raw_ad_impression takes a while… =)

  5. Neat solution – you should definitely submit to the OpenX team for consideration if it’s not already been done, the frequency of table crashes with OpenX/MySQL makes it worthwhile having this tool to hand so that recovery is as quick as possible..

  6. Haven’t submitted any patches to OpenX, but if I get some spare time I’ll try to get one together later. Feel free to submit the change and claim the credit, tho. :-)

  7. It is a piece of crap. Has been, is, and will be. These guys operate in some kind of geek fog with no perception of the end user. They don’t even realize that Openx is pure feces. Time is of no value to them, because it is our time. Better scripts come from high school kids.

  8. Hey guys,

    I know this is a bit late answer but I couldn’t solve the problem with these scripts and my problem wasn’t a new admin account.

    I couldn’t find the problem but the solution for me was going in to ox_data_summary_ad_hourly table and filter the results for the banner id.

    Then I could export that data to excel and send it to the customer. Fastest way I could find. :)

Leave a Reply

Your email address will not be published. Required fields are marked *