Yahoo!, SearchMonkey and Microformats

Both Rasmus and Sara have posts up about a new feature of Yahoo! Search which actually seems to be a step forward in terms of search engine functionality. This will make it possible for 3rd party developers to run code on Yahoo!’s servers to enhance the search results for their own pages.

The first examples show how they’ve used Microformats to give a better presentation of the businesses available. I’ve previously implemented the hCard microformat at Derdubor, where we have a local directory search for businesses in Norway. All our search hits and profile pages are tagged up in microformats, so that a compliant parser is able to fetch business information and provide it to our users in a proper way. It’s simply great to see Yahoo! add this kind of support for new formats, and I’m already looking forward to playing with it to give better results for pwned.no and a few other projects I’m playing around with.

Christer Goes Bug Hunting

Christer had the pleasure of hunting down a bug in Zend Framework a couple of days ago, and he has just posted a nice article about the bug in Zend MVC Components and how he debugged it. If you’ve never used a debugger before, this article is probably also going to be a bit helpful, and it gives a little insight into how the Zend Framework MVC-components work.

The Power of Micro Languages

A “micro language” is a small, domain specific language. These small languages are often invented on a project basis or carried over from previous projects (or in some cases, a standardized language is used). The power of this comes from being able to move the rules of the application out of the application logic and maintain them as a separate set of rules, independent of the application. This way you can add and remove business rules without having to re-test or rewrite your application. The concept has been around for years, and it’s in use in more places than you could imagine, but over and over again I see people who do not see the benefits of separating the rules from the application itself.

These micro languages can be described in a markup language (such as XML), as a subset of other existing languages, or as your own simple-to-parse language. In this small introduction I’ll use an example from the current codebase of pwned.no, a site for arranging tournaments in several different online games.

When writing the engine that powers pwned.no I wanted to give the users several different kinds of tournaments to create. You have the regular single elimination tournaments, where winning a match moves you on to the next level; you have the double elimination events, where your team is not out before they’ve lost two matches; and you have the possibility of a group stage where the best teams (and possibly the worst) move on to the finals. Even after this, you realize that you might have situations where you want to combine these different kinds of tournaments into a large one (example: elimination -> group stage -> double elimination). Embedding this logic into the application would create an unmaintainable mess of special cases and special handling, and each time I wanted to add a tournament format, there would be a real risk of breaking some other format.

Instead I decided to implement a small language that describes tournaments, along with a parser that loads and processes the language and stores the result in an underlying database structure describing how the tournament should progress when the next round starts. This meant that the job up front demanded a bit more planning and sketching, but the solution makes the system more flexible and more stable. When someone requests a new tournament format now, I simply create a small file describing the flow in the tournament, and everything works as it should. No code edits, no unit tests that need to be run, nothing at all, as long as the file is a valid tournament description file.

A tournament description file looks like this (for a tournament with a group stage, consisting of eight teams and with a single elimination stage for the two best teams from each group):

ID GROUP8
TEAMS 8
TOURNAMENTSTAGE 3 FINAL
MATCH
.ID FINALE
TOURNAMENTSTAGE 2 SEMIFINALS
MATCH
.ID SEMIFINALE1
.WIN FINALE
MATCH
.ID SEMIFINALE2
.WIN FINALE
TOURNAMENTSTAGE 1 PRELIMINARY_GROUPPLAY
GROUP
.ID GROUP1
.PLACETOMATCH 1 SEMIFINALE1
.PLACETOMATCH 2 SEMIFINALE2
GROUP
.ID GROUP2
.PLACETOMATCH 1 SEMIFINALE2
.PLACETOMATCH 2 SEMIFINALE1

The ID is an internal identifier which is resolved to a string in the currently active language, the TEAMS identifier tells the parser how many teams can sign up for this tournament, and the rest of the lines are descriptions of the different stages of the tournament. The file can be parsed top-to-bottom, as the identifiers referenced later in the file already exist higher up in the hierarchy. The .WIN commands tell the parser where the winning team should move next, and the .PLACETOMATCH commands indicate where the different places in the group stage should move next. If I wanted to add a lower bracket too, I’d simply add .PLACETOMATCH 3 LOWERBRACKET1 and a MATCH with .ID LOWERBRACKET1.
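To make the top-to-bottom parsing concrete, here is a minimal sketch of what such a parser could look like. It is not the parser that runs pwned.no: the function name parseTournamentDescription(), the layout of the returned array and the error handling are all assumptions made for illustration, and it assumes PHP 7.1 or later. It only handles the keywords shown in the example file above.

```php
<?php
// Hypothetical sketch: the function name and array layout are assumptions,
// not the actual pwned.no parser.
function parseTournamentDescription(string $text): array
{
    $tournament = ['id' => null, 'teams' => null, 'stages' => []];
    $knownIds = [];   // MATCH/GROUP ids seen so far, used to validate targets
    $stage = null;    // index of the stage currently being filled
    $entry = null;    // index of the MATCH/GROUP currently being filled

    // Targets must be declared higher up in the file; that is what makes a
    // single top-to-bottom pass sufficient.
    $requireKnown = function (string $id, int $lineNo) use (&$knownIds): void {
        if (!isset($knownIds[$id])) {
            throw new InvalidArgumentException("Line $lineNo: unknown target '$id'");
        }
    };

    foreach (preg_split('/\R/', $text) as $index => $rawLine) {
        $lineNo = $index + 1;
        $parts = preg_split('/\s+/', trim($rawLine), -1, PREG_SPLIT_NO_EMPTY);
        if ($parts === []) {
            continue; // skip blank lines
        }
        $keyword = array_shift($parts);

        // Dotted keywords are attributes of the most recent MATCH or GROUP.
        if ($keyword[0] === '.' && $entry === null) {
            throw new InvalidArgumentException("Line $lineNo: $keyword outside MATCH/GROUP");
        }

        switch ($keyword) {
            case 'ID':
                $tournament['id'] = $parts[0];
                break;
            case 'TEAMS':
                $tournament['teams'] = (int) $parts[0];
                break;
            case 'TOURNAMENTSTAGE':
                $tournament['stages'][] = [
                    'number' => (int) $parts[0],
                    'name' => $parts[1],
                    'entries' => [],
                ];
                $stage = count($tournament['stages']) - 1;
                $entry = null;
                break;
            case 'MATCH':
            case 'GROUP':
                if ($stage === null) {
                    throw new InvalidArgumentException("Line $lineNo: $keyword outside a stage");
                }
                $tournament['stages'][$stage]['entries'][] = [
                    'type' => $keyword,
                    'id' => null,
                    'win' => null,
                    'places' => [],
                ];
                $entry = count($tournament['stages'][$stage]['entries']) - 1;
                break;
            case '.ID':
                $tournament['stages'][$stage]['entries'][$entry]['id'] = $parts[0];
                $knownIds[$parts[0]] = true;
                break;
            case '.WIN':
                $requireKnown($parts[0], $lineNo);
                $tournament['stages'][$stage]['entries'][$entry]['win'] = $parts[0];
                break;
            case '.PLACETOMATCH':
                $requireKnown($parts[1], $lineNo);
                $tournament['stages'][$stage]['entries'][$entry]['places'][(int) $parts[0]] = $parts[1];
                break;
            default:
                throw new InvalidArgumentException("Line $lineNo: unknown keyword '$keyword'");
        }
    }

    return $tournament;
}
```

Because every .WIN and .PLACETOMATCH target is checked against the identifiers already seen, a description file that refers to a match before declaring it is rejected instead of being stored half-resolved. In this sketch, adding a new keyword is a matter of adding another case to the switch.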

You could of course use XML for this instead, but this very simple and very easy to parse language has so far created a total of almost 25,000 tournaments, and everything has worked without a hitch. After resolving the first issues with the parser in my development version, several new tournament formats have been added (such as a tournament with 256 teams) in the years after the application was released.

This was just a very small introduction to the concept of micro languages, and you’ll find loads more information about them online. You may ask what the difference between a micro language and a configuration file is, and the truth is that they’re quite similar. But where the configuration file describes different settings for the application, the micro language describes different rules for the application. You may of course have configuration files that also contain rules, but you’ll need a well-defined and expressive language to define your rules. The domain where this language is used is small and limited (there are only so many different concepts that are used in tournaments), so a simple language such as this one fits. If you’re in a situation where the micro language evolves into a full-fledged programming language, you’d probably be better off embedding an existing scripting language (such as Python), or moving the rules out into separate modules in your application.

David Cummins on Fulltext Search as a Webservice

David Cummins has a neat little post up about replicating some of Solr’s features in a PHP based solution. As the title “Fulltext search as a webservice” suggests, the approach should sound familiar to anyone who knows Solr, and David describes how they built a similar solution on top of Zend_Search_Lucene (Solr also uses Lucene in the backend). Seems like it would be easier to just set up a dedicated Solr cluster instead, but hey, how often has “it would be easier to do something else” sparked innovation?

I’d also like to note that the coming Solr 1.3 supports PHP serialization as an output format, so you can just unserialize() the response from Solr. That should provide for even easier integration between PHP and Solr in the future. While on the subject, I’d like to suggest reading Stemming in Zend_Search_Lucene too, an introduction to adding filters to Zend_Search_Lucene. Also worth a look is the Search Tools in PHP presentation from phplondon.
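As a rough sketch of what consuming the serialized output could look like: the helper name decodeSolrResponse(), the host, the query and the document fields below are all made up for illustration, and a hand-built response stands in for a real Solr server so the example runs on its own. It assumes PHP 7 or later for the options argument to unserialize().

```php
<?php
// Hypothetical helper: decodes the body of a Solr response that was
// requested with the serialized PHP output format (wt=phps).
function decodeSolrResponse(string $body): array
{
    // allowed_classes => false keeps unserialize() from instantiating objects,
    // which is a sensible precaution for data fetched over the network.
    $data = unserialize($body, ['allowed_classes' => false]);
    if (!is_array($data) || !isset($data['response']['docs'])) {
        throw new UnexpectedValueException('Not a serialized Solr response');
    }

    return $data['response'];
}

// In real use the body would come from a request along the lines of
//   file_get_contents('http://localhost:8983/solr/select?q=lucene&wt=phps');
// (hypothetical host and query). A hand-built stand-in is used here instead.
$body = serialize([
    'responseHeader' => ['status' => 0, 'QTime' => 1],
    'response' => [
        'numFound' => 1,
        'start' => 0,
        'docs' => [
            ['id' => 'doc-1', 'title' => 'Stemming in Zend_Search_Lucene'],
        ],
    ],
]);

$response = decodeSolrResponse($body);
foreach ($response['docs'] as $doc) {
    echo $doc['id'], ': ', $doc['title'], "\n";
}
```

The appeal is that no XML or JSON parsing step is needed on the PHP side; the trade-off is that unserialize() should only ever be pointed at a server you trust.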

Web Frontends for Xdebug

David Coallier has a post up about a Google Summer of Code project that has been launched to finally get a web frontend for Xdebug, something many people have been requesting for quite some time. Simultaneously Webgrind has been released, another web based frontend that replicates some of the features of kcachegrind. Here we go several years without a decent front end, and then two get announced in the same week. Neat! You can check out Webgrind at their Google Code page.

PHP 5.2.6 Released!

Ilia just noted that PHP 5.2.6 was released yesterday! Lucky for me, as I have been away for a couple of days at a trackday in Sweden (pictures coming to my flickr page within the night). Among the notable fixes are added metadata for the reflection API in the DOM classes and a few PDO fixes for PostgreSQL. Several security vulnerabilities were also fixed, and a complete list can be seen in the PHP 5.2.6 Changelog.

Marco’s Five PHP5 Features You Can’t Ignore

Marco has a neat list up with five different features of PHP5 which people are still not quite catching on to, and I agree with every single item that made it to the list. I’ve been using SimpleXML myself, and except for a few cryptic issues regarding namespaces and iterators (SimpleXML does quite a bit of magic..) it’s a breeze to work with. For simple .. XML .. parsing, it’s ingenious.
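One namespace quirk of the kind mentioned above, as a small sketch: the sample document, its element names and the author value are invented for illustration, and this is just one example of the behaviour rather than a complete list of the pitfalls.

```php
<?php
// Made-up sample document with one plain and one namespaced child element.
$xml = <<<XML
<feed xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item>
    <title>First post</title>
    <dc:creator>Example Author</dc:creator>
  </item>
</feed>
XML;

$feed = simplexml_load_string($xml);
$item = $feed->item[0];

// Elements without a namespace are available through plain property access.
$title = (string) $item->title;

// Namespaced children are not: this silently yields an empty string.
$plain = (string) $item->creator;

// They have to be requested explicitly, by namespace URI, through children().
$creator = (string) $item->children('http://purl.org/dc/elements/1.1/')->creator;

echo $title, ' by ', $creator, "\n";
```

The silent empty result from the plain property access is what makes this easy to miss; nothing fails, the value is simply not there.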

As for PDO, we’ve already moved all our projects to it, and it’s been my preferred method of accessing databases for at least a year and a half already. Great stuff. We’ve also used the json module for quite some time, which neatly ties into jQuery, mootools and other JavaScript APIs. I still haven’t used SPL that much, but that might change soon. Anyways, a good read for anyone who still lives in the PHP4 world..

Mikko Posts About Imagick in PHP and Fill Patterns

Mikko is quite active with the development of the Imagick-extension in PHP, possibly the best thing to hit PHP since its birth over ten years ago. There’s nothing like the Imagick-extension to make you realize how much you’ve been missing from the GD extension (kindly reworded for Pierre) :-) Anyways: Mikko has a new post up about how to use an image as a fill pattern in Imagick under PHP. Well worth a read!

You should also check out all the other interesting Imagick related posts Mikko has made.

Christer Puts on His Problem Solver Hat

I’m not sure how much sugar Christer got into his system today, but he’s been completely on fire with his blog posts. This one shows how to write a custom Zend_Form_Decorator_Label, which he wrote after someone asked how to do something in particular on the Zend MVC mailing list. That guy is amazing.

XRPC_Server – A PHP XML-RPC Server Class

XRPC_Server is an as-simple-as-possible XML-RPC server component I wrote for a project a year or two ago, and it is a good alternative if you want to stay away from large frameworks or complex components. The component is licensed under the MIT License, so you’re pretty much free to do whatever you want with it. All patches are welcome! The server requires PHP5 with reflection enabled.

Usage is as simple as requiring the php-file into your "gateway" page (the URL you’ll be calling from your XML-RPC clients), and then creating the server object with the functions you want to expose as the argument to the constructor:

$server = new XRPC_Server(array(
    'multiply' => 'test_xmlrpc_multiply',
    'dateTest' => 'test_xmlrpc_date_test',
    'assoc' => 'test_xmlrpc_assoc',
    'array' => 'test_xmlrpc_array',
));

A simple file illustrating the usage is also included, XRPC_Server_Test.php.

You can see the source code of the server at XRPC_Server.phps.

The class can be downloaded from this site: XRPC_Server.tar.bz2