The Results of our Weekend Challenge

The weekend PHP size competiton I mentioned on friday has come to an end, with the results being as follows:

      CNU (253 bytes)
      Helge (261 bytes)
      Ymgve (267 bytes)
      Me (365 bytes)
      dibon&zep (which never delivered, but had working solutions)

For those who are interested in the strategies and tactics employed by the contestants, I’ve included a small write-up and analysis of the various contributions. We decided early on that using eval() and gzinflate on the content were something that everyone could apply, so the size would be counted for the decompressed code. If any contestant implemented their own decompressor in the user space code themselves, we would accept that. It would not be any advantage anyways, as the decompressor code would eat up more space than the best contributions used in total.

I’ve included the writers own comments if they had any. The biggest change in size happened when people started using one dimensional strings instead of arrays.

CNU’s contribution

$b='';foreach(file('maps.txt',2)as$l){if(!$l){$q[1]=strpos($b,'.');for($b=str_split($b);$p=&$q[++$l];)foreach(array($p-$w,$p+$w,$p-1,$p+1)as$x){$c=$b[$p]+1;$d=&$b[$x];if($d=='X')echo$c.'
';if($d==' ')$q[]=$x;$d=$c;}$q=$b='';}$b.=$l;$w=strlen($l);}

Helge’s contribution

$m="";foreach(file("maps.txt")as$l){if($l!="
"&&($m.=$l)+$w=strlen($l))continue;for($y=$d[$v[]=strpos($m,'.')]=0;strpos($m,'X')-$u=$v[$y++];)for($x=-$w;$x<=$w;$x+=$x+1?$w-1:2)in_array($n=$u+$x,$v)|$m[$n]=="#"||$d[$v[]=$n]=$d[$u]+1;echo"$d[$u]
";$m=$v="";}

Helge also included an annotated version with comments in norwegian:

/** 
 * $l = én linje (av maps.txt)
 * $m = ett map
 * $w = vidden til et kart
 * $d = distansen til et vilkårlig punkt
 * $v = open/closed-array (putter både squares 
        of interest og visited squares her)
 * $u = "current square", utgangspunktet til å 
        generere naboer
 * $n = nabo
 * $x = hjelpevariabel for å generere $n
 * $y = hjelpevariabel for å generere $u
 */


/* Klarte aldri å finne en måte å skippe initialisering av 
   $m her.. (den resettes på siste linja og.. føles waste)
   * En idé var å bruke to for(;;)'s istedenfor en foreach 
   og continue, men jeg tror jeg hadde økt i kode da.. */
$m="";

/* Foreach var nyttig fant jeg ut.. den stopper når filen er 
   ferdig lest av seg selv, så slipper kode på sjekking av det, 
   $f[$y++], osv. */
foreach(file("maps.txt") as $l) {
    if(
        /* Når $l=="\n" (som betyr slutten på et kart) skal 
           vi slutte å appende til $m og heller breake if'en 
           så vi kan starte pathfinding */
        $l != "\n" &&

        /* Bruker en 1-dimensjonal string istedenfor array 
           (= win!) */
        ($m .= $l) +

        /* Trenger vidden når jeg skal regne ut naboene til et 
           gitt felt. */
        $w = strlen($l)
    )
        /* Gjør så $l blir feedet en ny linje av kartet vårt til 
           kartet er ferdig */
        continue;

    /* Pathfinding start! */
    for(
        /* Setter $y og $d[starten] lik 0, og putter startposisjon 
           inn i $v i samma slengen. */
        $y = $d[ $v[] = strpos($m, '.') ] = 0;

        /**
         * Gjør så for(;;) breaker når sluttposisjon-$u blir 0 
         * (synonymt med $u == strpos($m, 'X');
         * Tar også neste $u fra $v i samma slengen */
        strpos($m, 'X') - $u = $v[ $y++ ];
        
    )
        /* For hver nabo */
        for(
            /* Genererer naboer.
             * Den starter på -width (som blir opp), går så til -1 
               (som blir venstre), 1 (som blir høyre) og tilslutt 
               width (som blir ned) */
            $x = -$w;
            $x <= $w;
            $x += $x+1 ? $w-1 : 2
        )
            /* Om naboen (summen av $u og $x) er i $v: continue */
            in_array($n = $u+$x, $v) | /* fancy bitwise or som kan 
                                          brukes fordi det er snakk 
                                          om 2 booleans */

            /* Om naboen er en '#' på kartet: continue */
            $m[$n] == "#" ||

            /* Hvis ikke, sett distansen til naboen lik distansen 
               til $u + 1 og putt naboen inn i $v */
            $d[ $v[] = $n ] = $d[$u]+1;

    /* Skriv ut distansen til $u (skal være sluttpunktet nå) */
    echo "$d[$u]\n";

    /* Disse må resettes.. */
    $m=$v="";
}

Ymgve's contribution

foreach(file("maps.txt")as$l){for($i=0;-75<$e=ord($l[$i++])-88;$m[]=$e+56?$e:1e6,$z=array(-1,1,-$i,$i))$e+42||$q[0]=count($m);if($i<3){for($d=$k=$c=0;;$c++-$k||$i++&$k=$d)for($j=4;$j;$g>$i&&($g=$i)&$q[++$d]=$p)if(!$g=&$m[$p=$q[$c]+$z[--$j]])break 2;echo"$i
";}}

My own contribution

Breadth-first Search, Delivered Version, 365 bytes:


Prettyprinted:


This is mostly your usual run-of-the-mill BFS, where the search itself is implemented as a simple queue.

We loop through all the lines in the file (observe that you can remove the space character on both sides of as, saving you two bytes!):

foreach(file('maps.txt')as$l)

The part within the else condition builds the map to search for the exit point:

    // for each line in the file, we do this
    for($x=0;$l[$x++]!="
";$m[$y][$x]=$l[$x]=="#"?0:$l[$x])
        if($l[$x]=='.')$q[]=array($y,$x,0);

The for() loops through each character on the line, until it hits the newline marker at the end. Instead of entering "\n", you simply use the actual newline. This way you save another byte. The map ($m) is then populated using the current y,x coordinate (y being the line, x being the character on that line).

$m[$y][$x]=$l[$x]=="#"?0:$l[$x] can be expanded to:

if ($l[$x] == "#")
{
    $m[$y][$x] = 0;
}
else
{
    $m[$y][$x] = $l[$x];
}

Meaning that if the character is a wall, we store the value zero; if any other value, we store the character itself. We'll use this attribute when searching to determine when we've reached our goal. All of this is kept inside the for()-construct itself. The only thing contained in the for()-loop, is the statement if($l[$x]=='.')$q[]=array($y,$x,0);. This checks if the current character is the starting point, and if that's the case, adds it to our queue of points to check (making it the starting point of our BFS).

When we hit an empty line in the file (through the if("
"==$l) check), we execute our BFS.

    /* while $q is valid (this handles empty labyrinths), we fetch the 
        current y and x coordinates, in addition to d, the distance to 
        the location we're fetching from the queue. No {} here, as the 
        for only contains the if() statement below. */
    for(;$q&&list($y,$x,$d)=array_shift($q);)

        /* if this spot has a value (remember that we assigned 0 to walls, 
            while we kept the character value for other fields when we 
            parsed the file. We use empty() as the coordinates may not be 
            valid indexes into the array, as we don't do any checks when 
            we add them. one call to empty() is better than four ifs and 
            duplicate array references). */
        if(!empty($m[$y][$x]))
        {
            // create a reference (saves one byte instead of referencing $m[$y][$x] twice.
            $k=&$m[$y][$x];

            /* if we've found the X, echo the distance to the X. Under any 
                under circumstance, you'd break; out of the for/while here, 
                but as bytes count more than speed, we save six bytes by 
                searching the whole labyrinth instead of ending early. Funny 
                detail: as echo is a language construct, you can skip the space. 
                If the character is not 'X', we simply echo an empty string 
                instead of using a conditional echo. Saves a couple of bytes. */
            echo$k=='X'?"$d
":'';

            // unset the value, so that we don't visit this node again.
            $k=0;

            /* add all neighbours to our queue. We pre-increment $d, instead 
                of adding +1 in all the adds. */
            $q[]=array($y,$x-1,++$d);
            $q[]=array($y-1,$x,$d);
            $q[]=array($y,$x+1,$d);
            $q[]=array($y+1,$x,$d);
        }
        
        /* Reset the values. PHP generates a warning if a value has never
            been set before when used, but will happily accept null values 
            as empty arrays or 0. The double assignment is also a trick worth 
            remembering to save bytes. */
        $m=$y=null;

Some of tricks used in the competition

  • Multi-variable assignment: $a=$b=0;
  • Actual newline when detecting a newline: if ($a=="
    "){}
  • Using null to be able to reset both integers and arrays in one statement.
  • The more code you can fit into a for() statement, the better. You get three "free" line endings by doing this. for(;;) is the same length as while(), but you can get several statements evaluated.
  • You can skip the spaces in foreach().
  • Create references to avoid using long array indices several times.
  • The ternary operator (?:) is your friend!
  • Remember that {}'s can be dropped when you have a single statement that contains all subsequent code (as long as you don't get any conflicts with else etc. on different levels):
    for(;;) <-- no { here.
    if ()
    {

    }

  • Simulating arrays with strings

The scripts were run with a set of pre-made labyrinths and a collection of 20 random labyrinths (more like maps, but they still test the code the same). The script for creating the labyrinths is available here. All delivered contributions passed all tests.

Translating Drizzle to Norwegian

As Monty asked for help with translations of the current strings available in Drizzle on his blog yesterday, I sat down a couple of hours yesterday and a couple of hours today to at least attempt to contribute something to the project. As my primary language is Norwegian and I have some experience writing, I decided to tackle the Norwegian (Bokmål, not Nynorsk) translation of Drizzle. I’ve currently finished the 358 available messages, but I’d really appreciate it if someone spent a couple of minutes / hours to read through them and confirm that my assumptions are sane.

The most troubling part when it comes to definitions are the issue of MySQL/Drizzle’s ‘relay log’ which I translated into ‘replikasjonslogg’ – which mainly means “replication log”. This sounds much better in Norwegian, but suddenly the code mentioned both a “replication log” and a “relay log”. I tried finding out what the semantic difference in MySQL were, but were unable to grok anything from the MySQL manual or through a Google search. If anyone has any advice here, it’d be very appreciated. I also made a few notes of where there are obvious errors in the original english strings:

Error on close of '%'s (Errcode: %d)
 - Located in mysys/errors.c:28 

Errcode:
Can't read value for symlink '%s' (Error %d)
 - Located in mysys/errors.c:47 
Can't create symlink '%s' pointing at '%s' (Error %d)
 - Located in mysys/errors.c:48
Copy text Error on realpath() on '%s' (Error %d)
 - Located in mysys/errors.c:49 

%*s(Defaults to on; use --skip-%s to disable.)
 - Missing space
 - Located in mysys/my_getopt.c:1170 

The event could not be processed no other hanlder error happened
 - hanlder
 -  Located in mysys/my_handler_errors.h:118 

SSL information in the master info file ('%s') are ignored because this MySQL slave was compiled without SSL support.
 - MySQL
 - Located in drizzled/rpl_mi.cc:276 

Slave I/O thread killed while waitnig to reconnect after a failed registration on master
 - waitnig
 - Located in drizzled/slave.cc:90 

Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog'..
 - mysqlbinlog
 - Located in drizzled/slave.cc:1864 

Found wrong key definition in %s; Please do "ALTER TABLE '%s' FORCE " to fix it!
 - feil spaceplass
 - Located in drizzled/table.cc:1162 

Table '%-.64s' was created with a different version of MySQL and cannot be read
 - MySQL
 - Located in drizzled/table.cc:1818 

I could probably submit a patch for this, but seeing as the source is very much in flux these days, I think I’ll wait until it settles down a bit — unless someone is interesting in reviewing and committing an “unimportant” patch at this stage.

BTW: Launchpad worked great for doing translations, so I’m going to look into using gettext and Launchpad for doing translations for pwned.no and my other services in the future.

Finding Your Way Around for the Weekend

If you’re interested in a bit of a weekend challenge, we’re currently hosting a small PHP coding competition at one of the IRC channels I’m active in. The challenge’s deadline is this sunday (10th of August, 2008) at midnight (if you’re in another timezone, just use your own sunday’s midnight), but the challenge should always prove interesting later too.

If you’re interested in trying to solve labyrinths with the smallest amount of source code, this might be for you!

The End of an Era

An era has come to an end as we say goodbye to PHP4 today! The official support for PHP4 has now been dropped, and the branch was ended with the release of PHP 4.4.9 yesterday. Reading through Derick’s post about the end of PHP4, I actually got quite a bit nostalgic. All the main release points for the PHP4 series (except for PHP 4.4.x which I’ve never spent any time worth mentioning with) includes things that we rely on each and every day today.

Here’s to the same trend for PHP5 when we look back when PHP6 has been released and spent a couple of years in the wild (although I can say quite a few things I’d never live without from PHP5 already..).

BTW: Christian also notes why you should be afraid if you’re considering installing the new PHP version.

I Made It!

After a cruelling 81 kilometers and 4 hours and 47 minutes, I finally reached Halden from Strömstad! Magne finished at 4 hours and 4 minutes, 43 minutes ahead of me. Every possible muscle feels sore at the moment, but it was a great experience. Wouldn’t do it again tomorrow, but there’s a good possibility that we’ll do the race next year too. It rained all morning, but just before we were about to start, the rain magically stopped. The gravel roads where however very wet and soggy, which meant that you had to pedal all the time, all the way. The bike just wouldn’t roll, as the tires got loads of traction and stayed down on the ground. Anyways, weeee’re done!

.. and now I’m doing Birkebeinerrittet in the last weekend of August. Not really looking forward to it at the moment, but that’ll change in a few days.

Photo by Aleksander Toppe!

Mouse Stuck in Lower Right Corner in Ubuntu Intrepid under VMWare?

If your mouse pointer is stuck in the lower right corner after upgrading to the current alpha version of Ubuntu, Intrepid, the reason is that the vmmouse driver got borked somewhere between Hardy and the current version of Intrepid. This means that you’ll only see “Moving to desktop 2” (or whatever desktop you have in the lower right corner). The fix is to change the reference to ‘vmmouse‘ in /etc/X11/xorg.conf to ‘mouse‘. This is not as good (you’ll not be able to move your mouse pointer from your host to your vmware machine seamless any longer), but at least it works.

You can track the current progress over at trackpad: Ubuntu Bug #248521.

Going Cross Countries – Grenserittet 2008

Well, the day TOMORROW is approaching fast (just an hour left..) – the day when Magne and I are participating in Grenserittet 2008. Grenserittet (or “Cross Countries” as it’s called in english) is an annual bike race from Strömstad in Sweden to Halden in Norway. It’s a total of 81 kilometers in length, with a few downhills and uphills along the way. Nothing major, but it’s still going to be a good challenge after I picked up biking again in May. Our goal is to finish in under 4 1/2 hour, so we’ll see how that goes. You can search for my name at grenserittet.com and follow our (or at least my) progress along the route.

I’m going to let my Garmin Edge 705 record the run, so hopefully I’ll be able to provide a few graphs and paths after I’ve finished. Here’s to good weather and a not too warm day!

Rounding Up The Remaining Database Posts

To finally be able to close my now-ready-to-be-archived Firefox-window, I’m rounding up the three other posts I were going to post about in one single batch here:

Ulf Wendel has a post up about PDO_MYSQLND: The new features of PDO_MYSQL in PHP 5.3. Besides being yet another introduction to how MYSQLND differs from the regular libmysqlclient, Ulf writes in detail about how mysqlnd brings other speedups to PDO in general, by allowing the drivers to return zvals directly. This allows the driver to return data without requiring an explicit copy by the overlying architecture. Interesting stuff and well worth a read for anyone, regardless if you actually know what a zval is.

Nicklas Westerlund has a post about MySQL Back to Basics: Analyze, Check, Optimize, and Repair on the pythian blog, featuring a overview of the useful – and abused – methods of rescuing and keeping your data intact. Do regular and good backups. It’s as easy as that. This might however help when you’re in a hurry or needs to fix a corruption that has occured. Read it.

The last item on today’s list is Sphinx – a free open-source full-text search engine. Sphinx uses it’s own indexing and retrieval system, while Solr is built on top of Lucene. Haven’t had much time to play with it yet, but it’s worth checking out. A native PHP module has also popped up (and that’s where I read about it just now), so if you need a fast and native PHP interface to a full search engine without blowing the big bucks, this may be what you’re looking for.

The Day of Four Letter Abbreviations: BASE vs ACID

Any person that has spent some time with databases will have heard the acronym ACID some time or the other. ACID is what transactions are aimed to achieve, so that you’re able to define a set of operations that are connected to each other. ACID is a definition of what these transactions should be able to provide to the user of the RDBMS.

Via xarpb I got hold of an article from ACM Queue, titled BASE: An ACID Alternative:

If ACID provides the consistency choice for partitioned databases, then how do you achieve availability instead? One answer is BASE (basically available, soft state, eventually consistent).

BASE is diametrically opposed to ACID. Where ACID is pessimistic and forces consistency at the end of every operation, BASE is optimistic and accepts that the database consistency will be in a state of flux. Although this sounds impossible to cope with, in reality it is quite manageable and leads to levels of scalability that cannot be obtained with ACID.

Proves to be a very interesting read, and something that’s very in time with how we’re already doing large partitioning of large databases, where the application tier is responsible for a lot of the distributed “transactions”.

Complaining About “Having To Be Commercial” in the Time of Digital Democracy

Norway’s largest newspaper had an article yesterday featuring Christer Falck where he complains that indie music (in the form of music that has a more narrow public appeal) is not getting the attention it should from the major music television networks. Here in Norway that means The Voice and MTV (and a few shows at NRK), which mainly is considered with RnB (The Voice) and a bit more general commercial appealing music (MTV).

The reason why Christer is in the news now is that he’s launching a new artist on his label, Hilde Marie Kjersem. VG provides him with amble news space where he’s able to provide a link to her fresh music video. This is of course a shameless promotion stunt from Christer to be able to get his newly released artist out on the front page of Norway’s largest newspaper, but his complaint is still interesting.

The problem is that it’s an invalid complaint. Yes. The major music tv networks is prioritizing music with a broad commercial appeal more than niche music, but after all, they’re commercial entries. If twice as many people see the commercials they run between the music if they play certain songs, they’re going to play certain songs. That’s probably no surprise. But what Christer seems to forget, is that it’s NEVER been easier to actually make a buzz by yourself and to get your own music out to millions of people without having an enormous budget. Just a couple of years ago we’d never be able to just upload our video to a site which would host the video for FREE, which works with almost all available file formats and which allows you to embed the video on your OWN PAGE. We have myspace where thousands of unsigned and signed artists share their music, where you can listen to huge amounts of good — and even more — bad music. If you decide to create music that you want to share with the world, the possibilities are there! You don’t NEED the music networks to play your music to make it known to people.

I’d certainly download and listen to an album distributed online, and I’d happily pay a couple of dollars to get the 192kbps MP3s upgraded to lossless FLACs. Make the MP3s available for free, set up a playlist with your album hosted on youtube, on myspace or make your own service. Just make it available. If I can send a link to a friend and ask him or her to check out a new album I just found, I’d be more than happy to share good music.

So here’s where we’re going to say something that never will happen:

Make the entire album available online for free. Host the download through a bittorrent network, most people are more than happy to contribute bandwidth to legitimate issues. Make it possible to pay for a simple lossless download for those of us who treasure music quality (and while I’m probably not going to hear any difference between 320kbps mp3 and FLAC, the size difference isn’t worth a degraded copy any longer), and let us pay a REASONABLE amount. The $20 that an album costs in Norway today isn’t reasonable. I’ve never bought more music than when allofmp3 was still available and you could buy whole albums in FLAC for $2-3. I spent more money then than I’ve done ever since.

There are ways of getting huge amounts of publicity when you’re doing things the reasonable way.

I’m actually thinking about the concept of setting up a site where people can make full album downloads available in good quality. Imagine if you could get your music into the vast amounts of media centers available in the world, without a single distribution cost. Together with a simple “buy this in lossless quality” link that charges a few dollars for the service and a specific hosted version of the file. Bandwidth is a commodity that’s getting cheaper and cheaper.

So Christer, while it’s a good tactic for getting some attention for a release, I dare you to instead take on the fight to the complete established music industry. And if you’re going to make it without the help of the networks, you’ve got to get a better website for C+C Records than what you currently have.