Unit-Testing Code Which Uses a Database

How to unit-test code that interacts with a database appeared on the blog of Baron Schwartz, and to be really boring, I agree with what he’s writing. Unit-testing database connectivity and storage is not hard. If it is, it might be a good time to redo that architecture you’ve been talking about.

An important point that Baron mentions is that you _NEVER_ _EVER_ run your tests on your production servers. That will of course be disastrous, as your tests need a predefined state of the database to be valid. The solution I’ve been using is to always set up my environment to use another database when running the tests. This way, you’ll never end up running the tests on a live database by accident. I handle this in my AllTests.php file, where the test suites and shared fixtures are set up. We dump the contents of the developer database (databasename), create a new database (databasename_test) and insert all the current table structures and indexes. This way we get an accurate copy of the table definitions currently defined by the developer (so that we don’t run the tests against an old set of tables), and we test that the code works as it should with the active definitions.

The simplest way to do this is to use mysqldump and mysql through a call to exec. If you’re not in a trusted environment, please, please, please add the appropriate shell argument escaping (escapeshellarg() in PHP). It can however be argued that if you’re allowing random people to change your database login information, you probably have bigger problems than unit testing.

exec('mysqldump -u ' . $username . ' -p' . $password . ' ' . $dbname . ' | mysql -u ' . $username . ' -p' . $password . ' ' . $dbname . '_test');
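
For a less trusted environment, the same call with escaping might look like the sketch below. The function name buildCloneCommand is made up for illustration. The --no-data flag is an assumption on my part that only the table structures and indexes should be copied, not the rows; drop it if the data should follow along.

```php
<?php
// Sketch only: buildCloneCommand() is a hypothetical helper, not part of any library.
function buildCloneCommand(string $username, string $password, string $dbname): string
{
    $user = escapeshellarg($username);
    $pass = escapeshellarg($password);

    // --no-data dumps table definitions and indexes without the rows
    // (assumption: only the structure is wanted in the test database).
    $dump = 'mysqldump --no-data -u ' . $user . ' -p' . $pass . ' ' . escapeshellarg($dbname);
    $load = 'mysql -u ' . $user . ' -p' . $pass . ' ' . escapeshellarg($dbname . '_test');

    return $dump . ' | ' . $load;
}

// The target database (databasename_test) must already exist before this runs.
// exec(buildCloneCommand($username, $password, $dbname));
```

The command is built in a separate function so it can be inspected or logged before being handed to exec.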

It would be very interesting to get more information about which measures Baron advocates for detecting a production system. We have configuration settings for our applications which also define whether this is a development or production system, in addition to the fact that our testing code only touches databases whose names end in _test.
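
The "_test only" rule can be enforced with a small guard before any fixture touches the database. This is a minimal sketch; assertTestDatabase is a hypothetical name, not something from our actual test setup.

```php
<?php
// Hypothetical guard: refuse to continue unless the database name ends in "_test".
function assertTestDatabase(string $dbname): void
{
    if (substr($dbname, -5) !== '_test') {
        throw new RuntimeException("Refusing to run tests against '$dbname'");
    }
}

// assertTestDatabase($dbname . '_test');   // passes
// assertTestDatabase($dbname);             // throws before any test gets to run
```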

The Results of our Weekend Challenge

The weekend PHP size competition I mentioned on Friday has come to an end, with the results being as follows:

      CNU (253 bytes)
      Helge (261 bytes)
      Ymgve (267 bytes)
      Me (365 bytes)
      dibon&zep (which never delivered, but had working solutions)

For those who are interested in the strategies and tactics employed by the contestants, I’ve included a small write-up and analysis of the various contributions. We decided early on that using eval() and gzinflate on the content was something that everyone could apply, so the size would be counted for the decompressed code. If any contestant implemented their own decompressor in user space, we would accept that. It would not be any advantage anyway, as the decompressor code would eat up more space than the best contributions used in total.

I’ve included the writers’ own comments if they had any. The biggest change in size happened when people started using one-dimensional strings instead of arrays.

CNU’s contribution

$b='';foreach(file('maps.txt',2)as$l){if(!$l){$q[1]=strpos($b,'.');for($b=str_split($b);$p=&$q[++$l];)foreach(array($p-$w,$p+$w,$p-1,$p+1)as$x){$c=$b[$p]+1;$d=&$b[$x];if($d=='X')echo$c.'
';if($d==' ')$q[]=$x;$d=$c;}$q=$b='';}$b.=$l;$w=strlen($l);}

Helge’s contribution

$m="";foreach(file("maps.txt")as$l){if($l!="
"&&($m.=$l)+$w=strlen($l))continue;for($y=$d[$v[]=strpos($m,'.')]=0;strpos($m,'X')-$u=$v[$y++];)for($x=-$w;$x<=$w;$x+=$x+1?$w-1:2)in_array($n=$u+$x,$v)|$m[$n]=="#"||$d[$v[]=$n]=$d[$u]+1;echo"$d[$u]
";$m=$v="";}

Helge also included an annotated version (the comments were written in Norwegian and are translated here):

/** 
 * $l = one line (of maps.txt)
 * $m = one map
 * $w = the width of a map
 * $d = the distance to an arbitrary point
 * $v = open/closed array (holds both squares 
        of interest and visited squares)
 * $u = "current square", the starting point for 
        generating neighbours
 * $n = neighbour
 * $x = helper variable for generating $n
 * $y = helper variable for generating $u
 */


/* Never managed to find a way to skip the initialisation of 
   $m here (it is reset on the last line too, feels like a waste).
   * One idea was to use two for(;;)'s instead of a foreach 
   and continue, but I think the code would have grown then. */
$m="";

/* foreach turned out to be useful: it stops by itself when the 
   file has been read, so no code is needed for checking that, 
   $f[$y++], etc. */
foreach(file("maps.txt") as $l) {
    if(
        /* When $l=="\n" (which means the end of a map) we 
           stop appending to $m and instead break out of the if 
           so that the pathfinding can start */
        $l != "\n" &&

        /* Uses a one-dimensional string instead of an array 
           (= win!) */
        ($m .= $l) +

        /* Need the width when calculating the neighbours of a 
           given square. */
        $w = strlen($l)
    )
        /* Makes $l get fed a new line of our map until the 
           map is complete */
        continue;

    /* Pathfinding start! */
    for(
        /* Sets $y and $d[start] to 0, and puts the start position 
           into $v at the same time. */
        $y = $d[ $v[] = strpos($m, '.') ] = 0;

        /**
         * Makes the for(;;) break when end position - $u becomes 0 
         * (equivalent to $u == strpos($m, 'X')).
         * Also takes the next $u from $v at the same time */
        strpos($m, 'X') - $u = $v[ $y++ ];
        
    )
        /* For each neighbour */
        for(
            /* Generates neighbours.
             * Starts at -width (which is up), then goes to -1 
               (which is left), 1 (which is right) and finally 
               width (which is down) */
            $x = -$w;
            $x <= $w;
            $x += $x+1 ? $w-1 : 2
        )
            /* If the neighbour (the sum of $u and $x) is in $v: continue */
            in_array($n = $u+$x, $v) | /* fancy bitwise or, which can 
                                          be used because both operands 
                                          are booleans */

            /* If the neighbour is a '#' on the map: continue */
            $m[$n] == "#" ||

            /* Otherwise, set the distance of the neighbour to the 
               distance of $u + 1 and put the neighbour into $v */
            $d[ $v[] = $n ] = $d[$u]+1;

    /* Print the distance to $u (should be the end point now) */
    echo "$d[$u]\n";

    /* These have to be reset. */
    $m=$v="";
}

Ymgve's contribution

foreach(file("maps.txt")as$l){for($i=0;-75<$e=ord($l[$i++])-88;$m[]=$e+56?$e:1e6,$z=array(-1,1,-$i,$i))$e+42||$q[0]=count($m);if($i<3){for($d=$k=$c=0;;$c++-$k||$i++&$k=$d)for($j=4;$j;$g>$i&&($g=$i)&$q[++$d]=$p)if(!$g=&$m[$p=$q[$c]+$z[--$j]])break 2;echo"$i
";}}

My own contribution

Breadth-first Search, Delivered Version, 365 bytes:


Prettyprinted:


This is mostly your usual run-of-the-mill BFS, where the search itself is implemented as a simple queue.

We loop through all the lines in the file (observe that you can remove the space character on both sides of as, saving you two bytes!):

foreach(file('maps.txt')as$l)

The part within the else condition builds the map to search for the exit point:

    // for each line in the file, we do this
    for($x=0;$l[$x++]!="
";$m[$y][$x]=$l[$x]=="#"?0:$l[$x])
        if($l[$x]=='.')$q[]=array($y,$x,0);

The for() loops through each character on the line, until it hits the newline marker at the end. Instead of entering "\n", you simply use the actual newline. This way you save another byte. The map ($m) is then populated using the current y,x coordinate (y being the line, x being the character on that line).

$m[$y][$x]=$l[$x]=="#"?0:$l[$x] can be expanded to:

if ($l[$x] == "#")
{
    $m[$y][$x] = 0;
}
else
{
    $m[$y][$x] = $l[$x];
}

Meaning that if the character is a wall, we store the value zero; if any other value, we store the character itself. We'll use this attribute when searching to determine when we've reached our goal. All of this is kept inside the for()-construct itself. The only thing contained in the for()-loop, is the statement if($l[$x]=='.')$q[]=array($y,$x,0);. This checks if the current character is the starting point, and if that's the case, adds it to our queue of points to check (making it the starting point of our BFS).

When we hit an empty line in the file (through the if("
"==$l) check), we execute our BFS.

    /* while $q is valid (this handles empty labyrinths), we fetch the 
        current y and x coordinates, in addition to d, the distance to 
        the location we're fetching from the queue. No {} here, as the 
        for only contains the if() statement below. */
    for(;$q&&list($y,$x,$d)=array_shift($q);)

        /* if this spot has a value (remember that we assigned 0 to walls, 
            while we kept the character value for other fields when we 
            parsed the file. We use empty() as the coordinates may not be 
            valid indexes into the array, as we don't do any checks when 
            we add them. one call to empty() is better than four ifs and 
            duplicate array references). */
        if(!empty($m[$y][$x]))
        {
            // create a reference (saves one byte instead of referencing $m[$y][$x] twice).
            $k=&$m[$y][$x];

            /* if we've found the X, echo the distance to the X. Under any 
                other circumstance, you'd break; out of the for/while here, 
                but as bytes count more than speed, we save six bytes by 
                searching the whole labyrinth instead of ending early. Funny 
                detail: as echo is a language construct, you can skip the space. 
                If the character is not 'X', we simply echo an empty string 
                instead of using a conditional echo. Saves a couple of bytes. */
            echo$k=='X'?"$d
":'';

            // unset the value, so that we don't visit this node again.
            $k=0;

            /* add all neighbours to our queue. We pre-increment $d, instead 
                of adding +1 in all the adds. */
            $q[]=array($y,$x-1,++$d);
            $q[]=array($y-1,$x,$d);
            $q[]=array($y,$x+1,$d);
            $q[]=array($y+1,$x,$d);
        }
        
        /* Reset the values. PHP generates a warning if a value has never
            been set before when used, but will happily accept null values 
            as empty arrays or 0. The double assignment is also a trick worth 
            remembering to save bytes. */
        $m=$y=null;

Some of the tricks used in the competition

  • Multi-variable assignment: $a=$b=0;
  • Actual newline when detecting a newline: if ($a=="
    "){}
  • Using null to be able to reset both integers and arrays in one statement.
  • The more code you can fit into a for() statement, the better. You get three "free" line endings by doing this. for(;;) is the same length as while(), but you can get several statements evaluated.
  • You can skip the spaces in foreach().
  • Create references to avoid using long array indices several times.
  • The ternary operator (?:) is your friend!
  • Remember that {}'s can be dropped when you have a single statement that contains all subsequent code (as long as you don't get any conflicts with else etc. on different levels):
    for(;;) <-- no { here.
    if ()
    {

    }

  • Simulating arrays with strings
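
To show what "simulating arrays with strings" buys you, here is a readable (and definitely not size-optimized) sketch of a breadth-first search over a map kept as one string. It is not any of the delivered entries. The function name shortestPath is made up, and I assume that '#' is a wall, '.' the start, 'X' the exit and that all rows have the same length.

```php
<?php
// Readable sketch, not a competition entry. Assumes '#' is a wall, '.' the start,
// 'X' the exit, and that every row of the map has the same length.
function shortestPath(array $rows): int
{
    $width = strlen($rows[0]) + 1;          // +1 for the newline that separates rows
    $map   = implode("\n", $rows);          // the whole map as one string
    $start = strpos($map, '.');
    if ($start === false) {
        return -1;                           // no starting point on the map
    }

    $dist  = array($start => 0);            // distance from the start to each visited index
    $queue = array($start);

    for ($i = 0; $i < count($queue); $i++) {
        $pos = $queue[$i];
        if ($map[$pos] === 'X') {
            return $dist[$pos];
        }
        // Up, down, left and right are plain offsets into the string.
        foreach (array(-$width, $width, -1, 1) as $step) {
            $next = $pos + $step;
            if ($next < 0 || $next >= strlen($map)) {
                continue;                    // outside the map
            }
            $cell = $map[$next];
            if ($cell === '#' || $cell === "\n" || isset($dist[$next])) {
                continue;                    // wall, row separator or already visited
            }
            $dist[$next] = $dist[$pos] + 1;
            $queue[]     = $next;
        }
    }

    return -1;                               // no route to the exit
}

echo shortestPath(array('#####', '#. X#', '#####')), "\n";   // 2
```

With a single string, a neighbour is just the current index plus or minus 1 (left/right) or plus or minus the row width (up/down), which is where the golfed entries save most of their bytes compared to two-dimensional arrays.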

The scripts were run with a set of pre-made labyrinths and a collection of 20 random labyrinths (more like maps, but they still test the code the same). The script for creating the labyrinths is available here. All delivered contributions passed all tests.

Finding Your Way Around for the Weekend

If you’re interested in a bit of a weekend challenge, we’re currently hosting a small PHP coding competition in one of the IRC channels I’m active in. The deadline is this Sunday (10th of August, 2008) at midnight (if you’re in another timezone, just use your own Sunday’s midnight), but the challenge should prove interesting later too.

If you’re interested in trying to solve labyrinths with the smallest amount of source code, this might be for you!

The End of an Era

An era has come to an end as we say goodbye to PHP4 today! The official support for PHP4 has now been dropped, and the branch ended with the release of PHP 4.4.9 yesterday. Reading through Derick’s post about the end of PHP4, I actually got quite a bit nostalgic. All the main release points for the PHP4 series (except for PHP 4.4.x, which I’ve never spent any time worth mentioning with) include things that we rely on each and every day.

Here’s to the same trend for PHP5 when we look back when PHP6 has been released and spent a couple of years in the wild (although I can say quite a few things I’d never live without from PHP5 already..).

BTW: Christian also notes why you should be afraid if you’re considering installing the new PHP version.

PHP, ImageMagick and Cropping to GIF: Digging into GIFs again!

Christer had an interesting case today, where he tried to resize and crop an image with the Imagick extension for PHP. Everything went as planned, the image was cropped and resized as it should be, but after writing it to disk and opening it again, the image’s size was the same as if he hadn’t done the crop. The content of the image outside the crop area was removed (simply set as transparent), but the image was still returned in its uncropped size.

The PHP module for binding ImageMagick is quite simple (simply marshalling between the ImageMagick methods and the PHP user space), so my guess is that this is a weird behaviour with a good enough reason somewhere down in ImageMagick. It might be a bug, but I haven’t had the time to attempt to reproduce it with convert or mogrify yet. If anyone wants to attempt that, feel free. Christer has posted the code, so simply attempt to recreate the same symptoms by using one of these two tools.

Anyways, this post was not to be about the issue itself, as Christer has done a neat analysis and write-up of that, but I’ll give a more detailed look at the issue within the GIF file itself. As chance would have it, I recently participated in a competition at the Norwegian demoscene IRC hangout where the goal was to recreate the Norwegian flag in an HTML page in the smallest space possible. This ended up being a competition to see who could molest and optimize GIF images the most, while browsers were still able to display them. From this experience I had quite good knowledge of how GIF files are built internally, and I was able to make a good guess at what the actual issue in the resulting file could be.

Since GIF files can be animated, a single file may contain several “images” (which would be the frames in the animation). These images can have their own size and position within the “larger image”:

 _________
| im1     |
|    _____|
|   |     |
|   | im2 |
|___|_____|

im1 may then represent the first image and im2 the second image in the file. The second image will only update the area that it covers, and this will leave the rest of the image “as it is”. Since a GIF image may contain a large number of these images, a “global” size is defined for the image. This global size covers all the images, and is the total area that these images will be drawn into. If an image is drawn outside of this area (in part or whole), it will be clipped against the viewport.

This should provide enough background to at least give a general feeling about what COULD be the problem here, but to actually find out what’s happening, we’ll dig into a GIF file format specification and the file that was created. This simple reference provides a general layout of the GIF file, and we’ll use that to take a look at what values the file we ended up with had:

On the left we have the actual byte values in hex and on the right we have the corresponding ASCII character represented by that value. As you can see, the first six bytes of the file (0x47 0x49 0x46 0x38 0x39 0x61, where 0x is the usual prefix for numbers that should be interpreted as hexadecimal) correspond to “GIF89a”. You can do this exercise yourself armed with this ASCII table: simply look up 47 in the Hx column, then 49, and so on. Those six bytes are what we call the signature of a GIF file (although the version part can be different, e.g. GIF87a, depending on the version used).

The next fields in the specification read:

Offset   Length   Contents
  6      2 bytes  Logical Screen Width
  8      2 bytes  Logical Screen Height

So byte 6-7 and byte 8-9 should tell us the logical size of the whole gif file (which the images will be drawn onto). In our test file here, that’s represented as:

Width: 0x67 0x01
Height: 0x70 0x00

The byte order here is Little Endian, which means that the least significant byte is stored first. Since we have two bytes for each value, we can calculate the decimal value of the width like this:

0x67 0x01 = 6 * 16 + 7 + (0 * 16 + 1) * 256 = 359
                                        ^-- Since we're in the next byte, we multiply with 256.

You can also do this with the Windows calculator, by entering 167 while in hex mode, then selecting dec (for decimal). The reason for multiplying the second byte by 256 is that this byte provides the value of the “next 8 bits”, while the first byte provides the value of the first 8 bits. If we look at the bits themselves:

0x67 | 0x01: 0110 0111 | 0000 0001

Little Endian says that the least significant byte comes first, so to get the value in the usual order (most significant bit first), we swap the two bytes:

0000 0001 0110 0111

As you can see, the second byte (0x01) ends up in the upper 8 bits, which is why its value is multiplied by 256 (2 to the power of 8).

We can also calculate the height:

0x70 0x00 = 7 * 16 + 0 + (0 * 16 + 0) * 256 = 112
                          ^-- both digits in the second byte are zero
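
The same arithmetic as a tiny helper, in case you want to check other values. The name le16 is made up for this sketch.

```php
<?php
// le16() is a hypothetical helper: combine two bytes, least significant byte first.
function le16(int $low, int $high): int
{
    return $low + $high * 256;   // same as $low | ($high << 8)
}

echo le16(0x67, 0x01), "\n";   // width:  103 + 1 * 256 = 359
echo le16(0x70, 0x00), "\n";   // height: 112 + 0 * 256 = 112
```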

So the global header of the GIF image that was generated says that the size of the image is 359×112, which is why the image is rendered larger than it should have been. We then take a look at the Image section of the GIF file (all GIF files should contain at least one), which is defined as:

Offset   Length   Contents
  0      1 byte   Image Separator (0x2c)
  1      2 bytes  Image Left Position
  3      2 bytes  Image Top Position
  5      2 bytes  Image Width
  7      2 bytes  Image Height

Armed with this information, we examine the area where the image section starts:

The start of the Image section is the “Image Separator”, a byte value of 0x2c, shown highlighted in the image above. This is where the image section starts, and the offsets in the table are relative to this location. The next four bytes tell us where in the global viewport the upper left corner of this image should be drawn. The values here are 0x01 0x00 twice, simply meaning (1,1), or one pixel down and out from the upper left corner (which is also related to the issue posted by Christer, but we ignore that one for now). The next values are however the ones we are interested in, as they provide the Image Width and Image Height:

Width:
0x73 0x00 = 7 * 16 + 3 + (0 * 16 + 0) * 256 = 115
Height:
0x6F 0x00 = 6 * 16 + 15 + (0 * 16 + 0) * 256 = 111

This means that the dimension of the image that’s actually supplied in the GIF file, is 115×111 pixels and should be drawn beginning one pixel down and one pixel out (as given by 0x01 0x00 in the x,y-fields above). Compare this to the reported global size of the image (359×112), and we can see where our transparent space is coming from. The browsers (and other image viewers) create a canvas the size of 359×112 pixels, while only drawing an image into the 115 leftmost pixels. The rest is left transparent, but they’re still there as the file says that’s the size of the viewport. If we manually change the size of the viewport to 0x74 0x00 in the GIF header itself, the image displays properly. To illustrate with another great ascii drawing:


               viewport
 _____________________________________
|           |                         |
|  actual   |                         |
|  image    |                         |
|  drawn    |                         |
|           |                         |
|           |                         |
|           |                         |
|___________|_________________________|
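
The manual header patch described above can also be done from PHP instead of a hex editor. This is a minimal sketch that works on the GIF as a string of bytes. The two function names are made up, and the "file" here is only a fake ten-byte header built from the values discussed in this post.

```php
<?php
// Sketch: read and overwrite the logical screen size in a GIF held as a string of bytes.
// Offsets 6-7 hold the width and 8-9 the height, both little endian.
function readLogicalScreen(string $gif): array
{
    // "v" means unsigned 16-bit, little endian.
    return unpack('vwidth/vheight', substr($gif, 6, 4));
}

function setLogicalScreenWidth(string $gif, int $width): string
{
    // pack('v', 116) produces the two bytes 0x74 0x00.
    return substr_replace($gif, pack('v', $width), 6, 2);
}

// A fake header for illustration: signature plus the size fields from the file above.
$header  = "GIF89a" . "\x67\x01" . "\x70\x00";
$patched = setLogicalScreenWidth($header, 116);
print_r(readLogicalScreen($patched));   // width 116, height 112
```

On a real file you would read the bytes with file_get_contents(), patch them and write them back, but keep a copy of the original around.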

The solution to the problem here was to call the setImagePage method of the image object, as that allows us to set the values for the global image ourselves (and we know how wide the image was supposed to be).
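
A sketch of what that could look like, assuming the Imagick extension is loaded. This is not Christer's actual code: the function name, file names and sizes are placeholders, and passing 0,0 as the page offset is my assumption of what "resetting" the page should mean here.

```php
<?php
// Sketch, assuming the Imagick extension is available. All names and sizes are placeholders.
function cropToGif(string $source, string $target, int $width, int $height, int $x, int $y): void
{
    $image = new Imagick($source);
    $image->cropImage($width, $height, $x, $y);

    // Set the page (the GIF logical screen) to the cropped size with no offset,
    // so the written file does not keep the dimensions of the uncropped image.
    $image->setImagePage($width, $height, 0, 0);

    $image->writeImage($target);
}

// cropToGif('input.gif', 'cropped.gif', 115, 111, 0, 0);
```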

Bonus knowledge: This issue did not occur when saving to a JPEG file, as JPEG files do not have the same capability of storing several subimages inside one file, and do not have the same rendering subsystem as GIF files. ImageMagick knows this, and does not use the page values when rendering the file.

Hopefully this has provided a minor introduction to how files are structured and what you can learn armed with a hex editor and a file format specification, as well as a few insights into what you can do when you’re faced with a very weird problem.

Fatal error: Exception thrown without a stack frame in Unknown on line 0

While Christer was updating his profile on one of our current projects, he suddenly got this rather cryptic message. We tossed a few ideas around, before just leaving it for the day. We came back to the issue earlier today, and I suddenly realized that it had to have something to do with the session handling. Changing our session handler from memcache to the regular file based cache did nothing, so it had to be something in the code itself.

The only thing we have in our session is an object representing the currently logged in user, so it had to be something related to that. After a bit of debugging I found the culprit: a reference to an object which contained a reference to a PDO object. PDO objects cannot be serialized, and this exception was being thrown after the script had finished running. The session is committed to the session storage handler when the script terminates, and is therefore run outside of the regular script context (which is why you get “in Unknown on line 0”). It would be very helpful if the PHP engine had been able to provide at least the message from the exception, but that’s how it currently is.

Hopefully someone else will get a Eureka! moment when reading this!

The solution we ended up with was to remove the references to the objects containing the unserializable structures. This was done by implementing the magic __sleep method, which returns a list of all the fields that should be kept when serializing the object (we miss the option of instead just unsetting the fields that need to be unset and then letting the object be serialized as it would have been if the __sleep method wasn’t implemented). Our __sleep method removes our references before returning all the fields contained in the object:

public function __sleep()
{
    $this->manager = null;
 
    return array_keys(get_object_vars($this));
}
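
An alternative sketch that leaves the live object untouched: return every property name except the ones that cannot be serialized. The class and property names are hypothetical, and a closure stands in for the PDO-holding object, since closures cannot be serialized either.

```php
<?php
// Hypothetical class for illustration; $manager stands in for the object holding the PDO handle.
class User
{
    public $name;
    public $manager;

    public function __construct($name, $manager)
    {
        $this->name    = $name;
        $this->manager = $manager;
    }

    public function __sleep()
    {
        // Serialize every property except the ones listed here.
        return array_values(array_diff(array_keys(get_object_vars($this)), array('manager')));
    }
}

// A closure cannot be serialized either, so it works as a stand-in for PDO here.
$user = new User('christer', function () {});
$copy = unserialize(serialize($user));
var_dump($copy->name, $copy->manager);   // string(8) "christer" and NULL
```

The original object keeps its manager, while the unserialized copy comes back with the property set to its default (null), so it has to be re-attached after the session has been loaded.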

And there you have it, one solved problem!

Google Releases Their Protocol Buffers

Fresh from the Google Open Source Blog comes news that Google has released their Protocol Buffers specification and accompanying libraries. The code and specification have been released at Protocol Buffers on Google Code.

Protocol Buffers is a data format for fast exchange and parsing of data and messages between computers. It is similar to simple uses of XML in this manner, but the size of the messages on the wire and their parsing time are very much optimized for busy sites. There is no need to spend loads of time doing XML parsing when you instead could do something useful. It’s very easy to interact with the messages through the generated classes (for C++, Java and Python), and future versions of the same schema are compatible with old versions (as new fields are just ignored by older parsers).

Still no PHP implementation available, so guess it’s time to get going and lay down some code during the summer. Anyone up for the job?

Using Apache httpd as Your Caching Solution

In this article I’m going to describe a novel solution for making cached versions of dynamic content available, while attempting to strike a balance between flexibility, performance and the origin of dynamic content. This solution may not be suited for very dynamic content (where the updates are better triggered by rewriting the cached version when the content changes), but it fits those situations where the dynamic content is built from a very large dataset on request from the users. I have two use cases detailing applications I’ve been involved in building where I have applied this strategy. This could also be implemented with a caching service in front of the main service, but that would require installing a custom service, hardware etc. for that service.

The WMS Cache

WMS (Web Map Service) is an OGC (Open Geospatial Consortium) specification which details a common set of parameters for how to query a web service which returns a raster map image (a regular png/jpg/bmp file) for an area. The parameters include the bounding box (left,bottom,right,upper), the layers (roads,rivers,etc) and the size of the resulting image. The usual approach is to add a caching layer in the WMS itself, so any generated image is simply stored to disk. Each request then checks whether the file exists on disk before retrieving the data and rendering the image (and if it exists, just returns the image data from disk instead). This will increase the rate of requests the WMS can answer and will take load off the server for the most common requests. We are still left with the overhead of parsing the request, checking for the cached file and most notably, loading our dynamic language of choice and responding to the request. An example of such a small and naive PHP application is included:


The next request which arrives with the identical set of GET-parameters, will be served with the overhead of loading PHP, parsing the PHP-script (which is less if you have APC or a similar cache installed), sorting the GET-parameters (so that bbox=..&x=.. is the same as x=..&bbox=..), serializing the response, checking that the file exists on disk (you could simplify this to just doing a read and checking if the read succeeded), copying the data from disk to memory and then outputting the data to the client (you could also use fpassthru() and friends which may be more optimized for simple reading and output of data, but that's not the main point here).
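
The steps just listed could look roughly like the sketch below. It is a stand-in and not the original listing: cacheKey and fetchTile are hypothetical names, using md5 for the file name is my own choice, and $render is whatever produces the image data.

```php
<?php
// Stand-in sketch of a naive cache in front of a renderer. cacheKey() and fetchTile()
// are hypothetical names, and $render is whatever produces the image data.
function cacheKey(array $params): string
{
    ksort($params);                       // bbox=..&x=.. and x=..&bbox=.. give the same key
    return md5(serialize($params));
}

function fetchTile(array $params, string $cacheDir, callable $render): string
{
    $file = $cacheDir . '/' . cacheKey($params);

    if (file_exists($file)) {
        return file_get_contents($file);  // cache hit: no rendering needed
    }

    $data = $render($params);             // cache miss: render and store for next time
    file_put_contents($file, $data);

    return $data;
}

// echo fetchTile($_GET, '/tmp/wms-cache', 'renderMap');   // renderMap() is hypothetical
```

Even on a cache hit, all of this runs inside PHP, which is exactly the overhead the rest of this article tries to get rid of.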

To relate this to our use case of the WMS, we need to take a closer look at how map services are used today. Before Google showed the world what a good map solution could look like with modern web technology, a map application presented an image to the user, allowed the user to click or drag the image to zoom or move, and then reloaded the entire page to generate the new image. If it took 0.5s to generate the image, that was not really a problem, as the data set is not updated very often and it is very easy to do these operations in parallel across a cluster. When Google introduced Google Maps, they loaded 9 visible images (tiles) in the first view, and then started loading other tiles in the background (so that when you scroll the map, it looks like the images are already in place). If you run an interface similar to Google Maps against a regular WMS, most WMS servers would explode and take the whole 42U rack with them. Not a very desirable situation. The easy solution if you have an unlimited set of resources, disk space and money is to simply generate all the available tiles up front, in the same way as Google has done it. This will require disk space for all the tiles, and will not allow your users to choose which layers they want included in the map (this will change as map services are starting to build each layer as a separate tile and then superimposing them in the user interface).

The problem is that most of us (actually, n - 1) are not Google, but most of us do not build map services either. For those of us who do, we needed a way of living somewhere in between having to render our complete dataset to image tiles up front and running everything through the WMS. While working with Gunnar Misund at Østfold University College, I designed a simple scheme to allow compatible clients to fetch cached tiles automagically, while those tiles which did not exist yet were generated on the fly from the background WMS. The idea was to let Apache httpd handle the delivery of already generated and cached content, while our WMS could serve those areas which were viewed for the very first time (or where the layer selection was new). It would not be as fast as Google Maps for non-cached content, but it wouldn't require us to run through our complete service to generate images either.

The solution was to let the javascript client request images through a custom URL:

http://example.com/300/400/10/59.205278/10.95/rivers,roads/image.jpg

(This is just an example, and only contains the center point of the image). This is decomposed into:

http://example.com/x_width/y_height/zoomlevel/centerlat/centerlon/layers/image.fileformat
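
On the server side, the same decomposition might be done like this. This is a minimal sketch: parseTilePath is a hypothetical helper and the array keys are my own naming, not part of the original implementation.

```php
<?php
// Sketch: split a tile URL path into its parts. parseTilePath() is a hypothetical helper.
function parseTilePath(string $path): ?array
{
    $parts = explode('/', trim($path, '/'));
    if (count($parts) !== 7) {
        return null;                      // not a tile request
    }

    list($width, $height, $zoom, $lat, $lon, $layers, $file) = $parts;

    return array(
        'width'  => (int) $width,
        'height' => (int) $height,
        'zoom'   => (int) $zoom,
        'lat'    => (float) $lat,
        'lon'    => (float) $lon,
        'layers' => explode(',', $layers),
        'format' => pathinfo($file, PATHINFO_EXTENSION),
    );
}

print_r(parseTilePath('/300/400/10/59.205278/10.95/rivers,roads/image.jpg'));
```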

This is all good as long as image.jpg exists in the local path provided, so that Apache can just serve the image as it is from the location. Apache httpd (or lighttpd and other "serve files fast!"-httpds) are able to serve these static files in large numbers (it's what they were written for, you know..) with a minimum overhead. The problem is what to do when the file actually does not exist, which will happen each time a resource is requested for the first time, and we do not have a cache yet. The solution lies in assigning a PHP-file as the handler for any 404 error (file not found). This is a well known trick used all over the field (such as handling www.php.net/functionname direct lookup). In PHP you can use $_SERVER['REQUEST_URI'] to get the complete path of the request that ended in the 404.

The .htaccess file of the application is as simple as cake:

ErrorDocument 404 /wms/handler.php
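
What the handler then has to do is render the image, save it at the exact path that was requested and send it to the client. The sketch below shows the saving part. storeTile and the $render callback are hypothetical, and the status header in the usage comment is there on the assumption that the response would otherwise keep its 404 status.

```php
<?php
// Sketch of what /wms/handler.php could do. storeTile() and the $render callback are
// hypothetical; a real handler would ask the WMS to produce the image data.
function storeTile(string $docRoot, string $requestUri, callable $render): string
{
    $path = parse_url($requestUri, PHP_URL_PATH);

    // Never let a request write outside the document root.
    if (!is_string($path) || strpos($path, '..') !== false) {
        throw new InvalidArgumentException('Invalid request path');
    }

    $target = rtrim($docRoot, '/') . $path;
    if (!is_dir(dirname($target))) {
        mkdir(dirname($target), 0775, true);
    }

    $data = $render($path);
    file_put_contents($target, $data);    // the next request is served by Apache directly

    return $data;
}

// In the handler itself (not run here):
// header('HTTP/1.1 200 OK');             // assumption: the status would otherwise stay 404
// header('Content-Type: image/jpeg');
// echo storeTile($_SERVER['DOCUMENT_ROOT'], $_SERVER['REQUEST_URI'], $renderFromWms);
```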

I've enclosed a simple specification which was written as a description of the implementation when the project was done in 2005.

Thumbnail generation

Generating thumbnails can also be transformed into the same problem set. In the case where you need several different sizes of thumbnails (and different rescales are needed for different applications), you can apply the same strategy. Instead of handing all the information to a resize script with the file name etc. as the argument, simply have the xsize and the ysize as part of the URL. If the file exists in the path, it's served directly with no overhead, otherwise the 404 handler is invoked as in the previous example. The thumbnail can then be generated and saved in the proper location, and the world can continue to rotate at its regular pace.

This application can then be extended by adding new parameters in the url, such as the resize method, if the image should be stretched, zoomed and other options.

Conclusions

This is a very simple scheme that does not require any custom hardware or server software installed, and places itself neatly in between having a caching front end server between the client and the application and the hassle of generating the same file each and every time. It allows you to remove the overhead of invoking the script (PHP in this case) for each request, which means that you can serve files at a much greater rate and let your hardware do other, more interesting things instead.

PHP Vikinger Notes

Just a few notes from PHP Vikinger, which was arranged by Derick Rethans in Norway today. Things went mostly smoothly and people in general seemed to have a very good time. These are just some of the random notes I made during the sessions.

All in all it was a good unconference, with a friendly and laid back tone and hopefully people got what they came for. Next time I’ll try to prepare a simple presentation on some interesting and hopefully not too familiar topic and actually contribute something too. We drove from Halden and Fredrikstad to Skien in the morning and back in the evening, which worked out quite OK, except for .. well, the lack of sleep in the morning. But everyone survived and managed to stay awake, so I conclude that the trip was a great success.

To sum it all up: a banana is a fruit and a tomato is a berry. You probably had to be there for that one.

Thanks for the unconference, and hopefully I’ll be able to attend more events in the future too.

UPDATE: Derick also has a writeup online from PHP Vikinger.

Derick and Sebastian Readying a Presentation

A Redirect Does Not Stop Execution

This is just a public service announcement for all the inexperienced developers who are writing redirects in PHP by issuing a call to header("Location: <new url>"). I see the same mistake over and over again, so just to try to make sure that people actually remember this:

A Call to Header (even with Location:) Does NOT Stop The Execution of the Current Application!

A simple example that illustrates this:

 /* DO NOT COPY THIS EXAMPLE. */

if (empty($_SESSION['authed']))
{
    header('Location: http://example.com/');
}

if (!empty($_POST['text']))
{
    /* insert into database */
}

/* Do other admin stuff */

The problem here is that the developer does not stop script execution after issuing the redirect. The result when testing this code will be as expected (a redirect happens when the user is not logged in and tries to access the link). There is however a gaping security hole here, hidden not in what’s in the file, but in what’s missing. Since the developer does not terminate the execution of the script after doing the redirect, the script will continue to run and do whatever the user asks of it. If the user submits a POST request with data (by sending the request manually), the information will be inserted into the database, regardless of whether the user is logged in or not. The end result will still be a redirect, but the program will execute all regular execution paths based on the request.
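
A minimal sketch of the fix is to stop the script right after sending the header. The helper names redirect and isAuthed are made up for this example; a plain exit; after the header() call does the same job.

```php
<?php
// Sketch of the fix: stop the script right after sending the Location header.
function redirect(string $url): void
{
    header('Location: ' . $url);
    exit;                                 // nothing after the call site runs any more
}

function isAuthed(array $session): bool
{
    return !empty($session['authed']);
}

// Usage at the top of the admin script:
// if (!isAuthed($_SESSION)) {
//     redirect('http://example.com/');
// }
```

Wrapping the header and the exit in one function makes it hard to forget the second half.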