
An Update to jQDynamicFontSize

When we released jQDynamicFontSize a couple of weeks ago, we hoped that others would find the plugin useful and keep it around as one of the many tools in their toolbox. We also had a small hope that people would find it worth extending and maybe submit a patch or two back to us.

And lo and behold, during the weekend the first patch arrived in my mailbox. Thanks to the patch, written by Vegard Andreas Larsen, we now also support scaling against the width of a container. I hadn’t even thought about this, and suddenly we have working code. The power of open source!

This adds two new options when initializing jqDFS:

limitWidth: Uses the width of the element to determine the size instead of the height. Defaults to false.
allowUpscaling: Allows the element to grow instead of shrink to fit the provided area. Currently only works when limitWidth is active. Defaults to false.

The original scale method should probably be rewritten to also use allowUpscaling, so if anyone feels slightly hackish tonight, just send the patch my way!

Introducing jQDynamicFontSize – A jQuery Plugin for Dynamic Font Size

We’ve released the first version of jQDynamicFontSize – a jQuery Plugin for dynamically adjusting the font size of an element to fit a number of lines. The plugin was written to allow us to resize a headline to make the headline fit on one line, sacrificing text size to avoid breaking text into two lines.

To paraphrase from the INSTALL file:

In a production system, use jquery-dynamicfontsize.min.js. For debugging
or developing, use jquery-dynamicfontsize.js.

Usage:

Include a reference to the script after loading jQuery:

<script src="jquery-dynamicfontsize.min.js"></script>

Then call:

$("#idOfElement").dynamicFontSize();

This will attempt to scale the font size of the element down by 10% per
iteration for up to 3 iterations, stopping as soon as a value has been found
that allows the element to fit on one text line.

$("h1").dynamicFontSize();

This will attempt to scale all h1 elements. Other jQuery selectors will also
work.

Options supported:

    * squeezeFactor: A float value that will be used as the squeeze factor
      for each step. 0.1 means that we'll attempt to scale the font-size down
      10% for each iteration. Defaults to 0.1.
    * lines: The number of lines we'll attempt to fit the text to. When the
      text fits this (or a smaller) amount of lines, we'll stop scaling.
      Defaults to 1.
    * tries: The number of iterations we'll try before we give up and go with
      the last result. Defaults to 3.
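
The way the three options interact can be sketched as a small loop. This is only an illustration of the behaviour described above, not the plugin's actual source: `fitFontSize` and the `measureLines` callback are hypothetical names, and the assumption that each step shrinks the current size (rather than the original size) is mine.

```javascript
// Sketch of the shrink loop described by the options above.
// `measureLines` is a hypothetical callback: given a font size, it returns
// the number of text lines the element would occupy at that size.
function fitFontSize(startSize, measureLines, options) {
  const settings = Object.assign(
    { squeezeFactor: 0.1, lines: 1, tries: 3 },
    options
  );
  let size = startSize;
  for (let attempt = 0; attempt < settings.tries; attempt++) {
    if (measureLines(size) <= settings.lines) {
      break; // the text fits on the wanted number of lines, stop scaling
    }
    // Assumption: each step shrinks the current size by squeezeFactor.
    size = size * (1 - settings.squeezeFactor);
  }
  return size; // after `tries` steps we go with the last result
}
```

In a browser, `measureLines` could for instance divide the element height by its line height after applying the candidate size.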

We currently do not take the line-height of elements into account, so if you’re feeling slightly hackish tonight, feel free to add the required piece of JavaScript goodness. Any suitable patches are welcome!

Borked Behaviour for the Back-button in Firefox

I investigated a strange problem yesterday, where the back button in Firefox returned the user to the top of the previous page instead of to the position they had already scrolled to. The issue has caused its fair share of trouble for developers all over, and a thread detailing the problem in Drupal provided the information needed to solve it. The problem is actually so widespread that there is a dedicated Firefox extension to solve the issue (Restore Scroll Position).

Anyways, the issue stems from the Cache-Control header that PHP, among others, sends by default:

Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0

The problem is that the “no-store” directive tells the browser NOT to store a version of the page anywhere, not temporarily, not .. ever. Internet Explorer and Opera still remember the scroll position, but Firefox takes things a step further and does not keep any information about the page available. The extension mentioned above saves the scroll position in another location and then restores it after navigating back to the page.

The problem is solved by changing the Cache-Control header:

Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0

A very helpful tip here is that you probably need to restart Firefox to make it respect the new header, as it will keep its old behavior until you restart the browser (at least for Firefox 3.0.6).
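
If the header is produced somewhere you can't configure directly, the same change can be applied to the header value itself. The helper below is a minimal sketch of that idea; `withoutNoStore` is a made-up name, and where you hook it in (a proxy, a middleware, a response filter) depends on your setup.

```javascript
// Remove the "no-store" directive from a Cache-Control header value and
// leave every other directive as it was.
function withoutNoStore(cacheControl) {
  return cacheControl
    .split(",")
    .map((directive) => directive.trim())
    .filter(
      (directive) =>
        directive !== "" && directive.toLowerCase() !== "no-store"
    )
    .join(", ");
}
```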

New Adventures in Reverse Engineering

Before I go into the gory details of this post, I’ll start by saying that this method is probably not the right solution for you. This is not something you want to do if you have ready access to any source code or if you have an existing relationship with the 3rd party that provided the library you’re using. Do not do this. This is not for you.

With that out of the way, this is the part for those who actually are interested in getting down and dirty with Java, and maybe solving a problem that’s hard to solve otherwise.

The setting: We have a library for interfacing with another internal web service, where the library was provided in binary form by a 3rd party as part of the agreement when the service was delivered to us. The problem is that, for some unknown reason, this library is perfectly capable of understanding UTF-8, both as input from us and as input from the web service, but all web related methods in the result class return data encoded as ISO-8859-1. The original solution to this was to keep two different parts in the query string: the original query in one particular key, and the key for the library in ISO-8859-1. This needs loads of special casing, manual handling of that single parameter, etc. It works to a certain degree as long as the library is the only component in the mix. The troubles really began to surface when we started querying other services based on the same query. We’d then have to special case all methods that were used in URLs, as they returned ISO-8859-1 while all the other libraries use UTF-8.

The library has since been made into a separate product with a hefty price tag, so upgrading the library was not an acceptable solution for us. Another solution had to be found, and this is where things start to get interesting.

Writing a proxy class to handle the encoding issue transparently

This was the solution we attempted first, but it requires us to implement quite a few methods, to add additional code to the method that provides access to the library, and to extend and embrace parts of the object. It could have been done quite easily by simply changing one method of the class to reference super.methodName() and then return that result, but we would have to change several classes (these objects live 3-4 levels down into the result object from the library), which adds both developer and runtime overhead. Not good.

Decompiling the library

The next step was to decompile the library to see how the code of the library actually worked. This proved to be a good way to find out how we could possibly solve the issue. We could try to fix the issue in the code and then recompile the library, but some of the class files were too new for jad to decompile them completely. The decompilation did however show the problem with the code:

    if (encoding != null)
    {
        return encoding.toString();
    }

    return "ISO-8859-1";

This was neatly located in a helper method that ran on every property used when generating a query string. The encoding variable is retrieved from a global settings object, only accessible within the same library. This object is empty in our version of the library, so not much help there. But here’s the little detail that leads into the next part, and actually made this hack possible: “ISO-8859-1” is a constant. This means that it gets neatly tucked away as a UTF-8 string when the class file is generated. Let’s get down and dirty.

Binary patching the encoding in the class file

We’ll start by taking a look at the hexdump in our class file, after searching for the string “ISO” in the ASCII representation (“ISO” in UTF-8 is identical to the ASCII representation):

[Image: “Binary Patching a Java Class”, a hex dump of the class file with the bytes of “ISO-8859-1” highlighted]

I’ve highlighted the interesting part where “ISO-8859-1” is stored in the file. This is where we want to do our surgical incision and make the method return the string “UTF-8” instead. There is one important thing you should be aware of if you’ve never done any hex editing of files before, and that is the fact that the byte offset of parts of the file may be very important. Sadly, the strings “UTF-8” and “ISO-8859-1” have different lengths, and as such, would require us to either delete the bytes following “UTF-8” or put spaces there instead (“UTF-8 “). The first method might leave the rest of the file skewed; the latter might not work if the method used for encoding the value doesn’t trim the string first.

To solve this issue, we turn to our good friend VM Spec: The class File Format, which contains all the details of how the class file format is designed. Interesting parts:

In the ClassFile structure:

cp_info constant_pool[constant_pool_count-1];

As we’re looking at a constant, this is where it should be stored. The cp_info is defined as:

cp_info {
    u1 tag;
    u1 info[];
}

The tag contains the type of constant, while the info[] array varies depending on the type of the constant. If we take a look at the table in section 4.4, we see that the identifier for a Unicode (UTF-8) string is:

CONSTANT_Utf8 	1

So we should have the value of 1 in the byte (as the actual value, not the ASCII character) describing this constant. If the value is one, the complete structure is:

    CONSTANT_Utf8_info {
    	u1 tag;
    	u2 length;
    	u1 bytes[length];
    }

The tag should be 1 as the byte value, and the length should be two bytes describing the length of the actual string saved (since we’re storing the length in two bytes (u2), it can be a maximum of 2^16 - 1 = 65535 bytes in total). After the length, we should have length number of bytes with UTF-8 data.
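
To make the layout concrete, here is a small sketch that decodes one such entry from a byte buffer using Node's Buffer. The function name `readUtf8Constant` is made up for this example, and it treats the bytes as plain UTF-8; the class file format actually uses a slightly modified UTF-8, which is identical for ASCII strings like ours.

```javascript
// Decode one CONSTANT_Utf8_info entry starting at `offset`.
// Multi-byte values in a class file are stored big-endian.
function readUtf8Constant(buffer, offset) {
  const tag = buffer.readUInt8(offset); // u1 tag
  if (tag !== 1) {
    throw new Error("not a CONSTANT_Utf8 entry (tag " + tag + ")");
  }
  const length = buffer.readUInt16BE(offset + 1); // u2 length
  const bytes = buffer.subarray(offset + 3, offset + 3 + length); // u1 bytes[length]
  return {
    length: length,
    text: bytes.toString("utf8"),
    nextOffset: offset + 3 + length, // where the next constant pool entry starts
  };
}
```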

If we go back to our hex dump, we can now make more sense of the data we’re seeing:

The byte shown as 0x01 in hex is the value 1 for the tag of the structure. The 0x00 0x0A bytes are the two bytes making up the length of the string:

    0000 0000 0000 1010 binary = 10 decimal

    ISO-8859-1
    1234567890

This shows that the length of our string “ISO-8859-1” is 10 bytes in UTF-8, which is the same value that is stored in the two bytes showing the length of the string in the structure.

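Heading back to our original goal: replacing the stored string. We overwrite the string bytes with “UTF-8”, which is five bytes, and delete the five bytes left over from the old value. Constant pool entries are referenced by index, not by byte offset, so the shorter entry does not skew the rest of the file as long as the stored length matches. We therefore also change the stored length of the string: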

    00 0A becomes
    00 05

We save our changes and re-create the JAR file, with all the previous classes and our changed one.
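
The same edit can be scripted instead of done by hand in a hex editor. The following is a rough sketch using Node's Buffer, not the procedure used above; `patchUtf8Constant` is a hypothetical helper. It searches for the raw bytes of the whole entry (tag, length, string), so it assumes that this byte sequence only occurs once in the file, as the constant itself.

```javascript
// Replace one CONSTANT_Utf8 entry with another, updating the u2 length.
// Returns a new Buffer; the input buffer is left untouched.
function patchUtf8Constant(classBytes, oldText, newText) {
  // Build the raw entry: tag (1), big-endian u2 length, then the bytes.
  const encodeEntry = (text) => {
    const body = Buffer.from(text, "utf8");
    const header = Buffer.alloc(3);
    header.writeUInt8(1, 0);
    header.writeUInt16BE(body.length, 1);
    return Buffer.concat([header, body]);
  };
  const oldEntry = encodeEntry(oldText);
  const start = classBytes.indexOf(oldEntry);
  if (start === -1) {
    throw new Error("constant not found: " + oldText);
  }
  return Buffer.concat([
    classBytes.subarray(0, start),
    encodeEntry(newText),
    classBytes.subarray(start + oldEntry.length),
  ]);
}
```

The class file would be read and written with fs.readFileSync and fs.writeFileSync around this call; keep a copy of the original file.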

After inserting our new JAR file into our Maven repository as a new build and updating our local repository, we now have complete UTF-8 support from start to finish. Yay!

ImportError: No module named trac.web.modpython_frontend

One of the reasons why you might get the error:

ImportError: No module named trac.web.modpython_frontend

after installing Trac is that Apache may not be able to create the Python egg cache, as detailed in the Trac wiki. Create a directory for the cache files, change the owner to www-data.www-data (or something else, depending on which user you run Trac under) and rejoice.

The settings needed in the vhost configuration (.. or wherever you have your configuration ..):

    
        SetHandler mod_python
        PythonInterpreter main_interpreter
        PythonHandler trac.web.modpython_frontend
        PythonOption TracEnvParentDir /path/to/trac
        PythonOption TracUriRoot /
        PythonOption PYTHON_EGG_CACHE /path/to/directory/you/created

You can easily do a quick test by setting the path to /tmp and checking if that solved your problem. If it did, create a dedicated directory and live happily ever after. If it didn’t, continue your quest. Check for genshi and other dependencies. Do a search on Google ™.

Hopefully everything works again.

BTW: Another reason for this error might be that your Trac installation is no longer available. If your installation uses a version number in the library path and you upgraded Python, this path will change and your old libraries may not have been copied over. In that case it might help to reinstall Trac in your new environment:

easy_install -U Trac

.. and then try again (thanks to Christer for reporting on this after he had the same problem).

Communicating The Right Thing Through Code

While trying to fix a larger bug in a module I had never touched before, I came across the following code. While technically correct (it does what it’s supposed to do, and ~~there is no way to currently get it to do something wrong~~ (.. an update on just that)), it does have a serious flaw:

$result = get_entry($id);
if(is_bool($result))
{
    die('bailing out');
}

Hopefully you can see the error in what the code communicates; namely that the return type from the function is used to detect what should be considered an error.

While this works, as the only way the function can return a boolean value is if it returns false, the person reading the code at a later date will wonder what the code is supposed to do. He or she might not have any knowledge about how the method works. Maybe the method just sets up some resource, a global variable (.. no, don’t do that. DON’T.), etc, but the code does not communicate what we really expect.

As PHP is dynamically typed, checking for type before comparing is perfectly OK, as long as you’re not counting “no returned elements” as an error. The following code communicates its intent more clearly (=== in PHP is comparison based on both type and content, which means that both the type of the variable and the content of it have to match. 0 === "0" will be considered false.):

$result = get_entry($id);
if($result === false)
{
    die('bailing out');
}

Or if you’re interested in getting to know if the element returned is actually considered false (such as an empty array, an empty string, etc), just drop one of the equal signs:

$result = get_entry($id);
if($result == false)
{
    die('bailing out');
}

I’m also not fond of using die() as a method for stopping a faulty request, as that should be properly logged and dealt with in the best manner possible, but I’ll leave that for a later post.

Releasing Keepitclose Alpha

I’ve just created a small project site over at Google Code for Keepitclose. Keepitclose is a local HTTP caching web service written in PHP, meant to cache resources retrieved from external and internal web services. It currently supports both an APC backend (using shared memory in the web server itself) and a Memcache backend (which allows you to have several frontends using one or several backend memcache servers).

Missing features are currently:

Keeping HTTP headers
Adding cache information to HTTP headers
A file based backend
Everything else you can think of

The server does however support:

Time to live (TTL) for a cached resource when stored
TTL Window, i.e.:

A new version should be fetched at regular intervals, such as every sixth hour. New data are available at 00:00, 06:00, 12:00 and 18:00. Use ttlWindowSize=21600 to get a window size of six hours. Use ttlWindowOffset to add an offset; ttlWindowOffset=900 adds fifteen minutes to the value and retrieves new content at 00:15, 06:15, 12:15 and 18:15.
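
The arithmetic behind such a window can be sketched as follows. This is my reading of the behaviour described above, not code taken from Keepitclose: the function name `secondsUntilNextWindow` is made up, and it assumes that windows are counted from second 0 of the clock you pass in (for a Unix timestamp that means UTC midnight boundaries for a six hour window).

```javascript
// How many seconds a resource stored at `nowSeconds` may live before the
// next window boundary (boundaries are at offset, offset + size, ...).
function secondsUntilNextWindow(nowSeconds, ttlWindowSize, ttlWindowOffset = 0) {
  // Seconds elapsed inside the current window; the double modulo keeps the
  // value non-negative for times that fall before the first offset.
  const elapsed =
    (((nowSeconds - ttlWindowOffset) % ttlWindowSize) + ttlWindowSize) %
    ttlWindowSize;
  return ttlWindowSize - elapsed;
}
```

With ttlWindowSize=21600 and ttlWindowOffset=900, a resource stored at 00:10 gets five minutes to live (until 00:15), while one stored at 01:00 lives until 06:15.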

If you want to implement your own backend, implement the Keepitclose_Storage_Interface and add an entry into the config.php file for your module, e.g. 'file' => array('path' => '/tmp').

The current version also supports simple access control using the allowedHosts entry in the config file. This entry contains one row for each allowed host as a regular expression:

    'allowedHosts' => array(
        '^127\\.0\\.0\\.1$',
    ),

.. will only allow requests from localhost. To allow a subnet, you could write '^127\\.0\\.0\\.[0-9]+$'.
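
To see how such a list of patterns behaves, here is a small sketch of the check. It illustrates the matching logic only; `isAllowedHost` is a hypothetical name and this is not how Keepitclose itself is implemented.

```javascript
// Returns true when the remote address matches at least one of the
// regular expressions in the allowedHosts list.
function isAllowedHost(remoteAddress, allowedHosts) {
  return allowedHosts.some((pattern) =>
    new RegExp(pattern).test(remoteAddress)
  );
}
```

Note the anchors (^ and $) and the escaped dots in the patterns: without them, '127.0.0.1' would also match addresses such as 1127.0.0.10.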

You can also add several memcache servers; a request will then poll a random server to retrieve the resource. If not found, the content will be retrieved from the web and stored to all memcache servers. This is useful for environments with a very heavy read load.

'storage' => array(
    'memcache' => array(
        array(
            'host' => 'tcp://127.0.0.1',
        ),
    ),
),

To add more memcache servers, simply add another array() of memcache entries:

'memcache' => array(
    array(
        'host' => 'tcp://127.0.0.1',
    ),
    array(
        'host' => 'tcp://127.0.0.2',
    ),
),

You can also set 'port' and 'persistent' for the memcache connections.
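
The read-from-one, write-to-all behaviour described above can be sketched like this. Plain Map objects stand in for the memcache servers, and `ReplicatedCache`, `pickIndex` and `fetchFromWeb` are names invented for this illustration; the real service is written in PHP and may differ in the details.

```javascript
// Reads poll one randomly chosen server; on a miss the content is fetched
// and written to every server, so later reads hit whichever one is picked.
class ReplicatedCache {
  constructor(servers, pickIndex) {
    this.servers = servers; // array of Map instances standing in for servers
    // pickIndex(count) returns an index in [0, count); random by default.
    this.pickIndex =
      pickIndex || ((count) => Math.floor(Math.random() * count));
  }

  get(url, fetchFromWeb) {
    const server = this.servers[this.pickIndex(this.servers.length)];
    if (server.has(url)) {
      return server.get(url); // cache hit on the polled server
    }
    const content = fetchFromWeb(url); // cache miss: go to the source
    for (const each of this.servers) {
      each.set(url, content);
    }
    return content;
  }
}
```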

Please leave a comment here or create an issue ticket on the Google Code page if you need any help or have any suggestions. All patches are welcome.

Adventures in OpenID and Zend Framework

I’ve been toying around with OpenID and the Zend Framework for a night or two now, and I’ve learned a few things I thought I should share with the intarwebs (now, that’s probably the point where you should make the decision to stop reading for most blog posts). Quite some time has passed since I last had anything to do with OpenID, so just getting up and running was a challenge.

An OpenID identifier is usually represented by a URL (such as https://me.yahoo.com/<login>), which the OpenID consumer then contacts to get information about how to communicate with the OpenID identity provider (Yahoo! in this case). The consumer contacts the provider, gets a URL to redirect the client to, and receives notice after the client has authenticated with the provider.

First I’d like to say that OpenID seems to be too hard to use for anyone other than those who have a particular interest in it. I have a Yahoo! account and a Google Account, both of which can be used for OpenID authentication. I have no idea how to use my Google Account for this without having to provide endpoints manually. Ugly.

I did at least get the Yahoo! authentication working, but I’m still undecided on whether I’m going to implement OpenID support in any current project. Possibly. We’ll see.

Anyways, my implementation in Zend Framework is mostly a copy of the tutorial in the ZF manual, but there is one important point that they do not mention: In the standard installation, you have to use Zend_Session to handle your sessions. That means calling Zend_Session::start() instead of session_start(), as Zend_Session cannot be used after a session has been started. This dependency kind of killed my enthusiasm, as we just pull the parts of Zend Framework that we need into our project as things progress. Changing how we use sessions is a bit too much to ask. Luckily you can still use $_SESSION as usual after starting Zend_Session, but still. Not too fond of that. I hope that it will be decoupled some time in the future.

Testing code:

require_once 'Zend/OpenId/Consumer.php';

$consumer = new Zend_OpenId_Consumer();

if (!empty($_POST['openid_identity']))
{
    if(!$consumer->login($_POST['openid_identity']))
    {
        die("OpenID login failed.");
    }
    else
    {
        print('We logged in!');
    }
}

if (isset($_GET['openid_mode']))
{
    switch($_GET['openid_mode'])
    {
        case 'id_res':
            $consumer = new Zend_OpenId_Consumer();
            $id = false;
            
            if ($consumer->verify($_GET, $id))
            {
                $status = "VALID " . htmlspecialchars($id);
            }
            else
            {
                $status = "INVALID " . htmlspecialchars($id);
            }

            print($status);
            break;

        case 'cancel':
            print("someone pressed cancel!");
            break;
    }
}

Switch out $_POST['openid_identity'] with your OpenID identifier (the whole URL), and you should be all set.

If you keep getting failed logins without a redirect, check that you have HTTPS support in PHP through OpenSSL (the module is named php_openssl). Zend Framework provides no hint that this can be a problem, but after stepping through the source (I’m test driving NetBeans 6.5) the cause became apparent.

PHP and Annotations

After hacking together some code to solve an issue that came up today on an IRC channel I’m on, about how to provide a URL mapping for individual methods while keeping the responsibility in the method itself, I stumbled across addendum. Addendum implements annotation parsing for PHP and works by using the reflection API in PHP 5.1+. This allows you to add annotations which indicate to your framework which methods should be exposed to the web and which should be kept private. There are loads of other ways to do this (both dynamically and statically), but this is one way that may appeal to someone.

PHP Namespaces and the What The Fuck problem

As many people have discovered during the last few days, some of the developers behind PHP have finally reached a decision regarding how to introduce namespaces into PHP. This is an issue which has been discussed on and off for the last three years (or even longer, I can’t really keep track), with probably hundreds of threads and thousands of mailing list posts (read through the php.internals mailing list archive if you’re up for it). The current decision is to go with \ as the namespace separator. An acceptable decision by itself, and while I personally favored a [namespace] notation, I have no reason to believe that this is not a good solution.

There does however seem to be quite a stir on the internet regarding the decision, which I will now call the “What The Fuck Problem”. Most (if not all) public reactions on blogs, reddit, slashdot and other technology oriented sites can be summed up as “What The Fuck?”. While I’m not going to delve into the psychological reasons for such a reaction, my guess is that the separator is unusual, that it lacks familiarity for developers used to other languages that already support namespaces, and that it might be hard to understand the reasoning behind such a decision.

The problem: to retrofit namespaces into a language which was not designed to support such a construct, without breaking backwards compatibility. The part about not breaking backwards compatibility is very important here, as it rules out everything which could break existing code simply by upgrading to a new PHP version.

The discussions have been long and the attempts have been many (thanks to Greg Beaver’s persistence in writing different implementations along the way), and every single time an issue has crept in which either breaks existing functionality or results in an ambiguity when it comes to resolving the namespace accessed. Most issues, and the explanations of why they are issues, are documented at Namespace Issues in the PHP Wiki. This provides some insight into why a separate character was chosen as the separator.

This seemed to get through in the ~~flamewars~~ discussions after a while, and people instead started to point out the “gigantic flaw”:

If you were to load a class or a namespace dynamically by referencing it in a string, you’d have to take care to escape your backslash:

spl_autoload_register(array("Foo\tBar", "loader"));

.. would mean Foo<tab>Bar. Yep. It actually would. And if that’s the _biggest_ problem with the implementation of using \ as a namespace separator, then I feared its introduction earlier for no apparent reason.

The other proposals have plagued us with ambiguity issues, autoloading issues, no-apparent-way-of-putting-functions-and-constants-in-the-namespace issues and other problems. With this one we are left with the task that we, as usual, have to escape \ in a string when we want to reference a namespace in a dynamic name. And that’s the biggest problem?

I just hope that people keep it sane and don’t implement any special behaviour for how strings are parsed when used with new or $className::. This _will_ cause problems and magic issues down the road if it ever gets into the engine.

PHP is free, it’s open and it’s yours for the taking. Fork it if you want to, or provide a patch which solves the problem in a better way. The issue has been discussed to death, and so far there are no apparent better solutions on the table. If you have one, you’ve already had three years to suggest it. You’d better hurry if you’re going to make people realize it now.

Additional observation: most people who have an issue with this seem to be the same people who would rather be caught dead than write something in PHP. Yes, Python and Java do it prettier. That’s not a solution to the problem discussed in regards to PHP.