svn: Can’t convert string from native encoding to ‘UTF-8’

The error “svn: Can’t convert string from native encoding to ‘UTF-8′” suddenly made it impossible to update one of the projects on our staging servers. The project contains loads of file under SVN control, and several data directories which up to this time wasn’t svn:ignore’d. One of the files in one of these directories had norwegian letters in ISO-8859-1 in its filename (which didn’t work in the project anyhow, as it was something left around from earlier).

This single file borked svn from actually being able to update or do anything useful with the actual files under SVN control. When Subversion analyzed the directory structure to check which files it should attempt to update, it would just barf before seeing any files with the error message about the file name not being in UTF-8. You’d think it would be better to ignore errors for filenames that aren’t a part of svn and that you’re not trying to add, but there’s probably a good reason for this behaviour.

Anyways: The solution: delete the file. We didn’t use it anyway. There’s also a good chapter in the SVN Book about localization issues which contain information about how you can solve the issue by changing your active character set.

gnome-web-photo segfaults (segment fault)! OH NOES!

We capture images from beautiful web pages all over the world by exposing the gnome-web-photo package through a simple web service. After moving the service to a new server today gnome-web-photo suddenly started segfaulting (aka segment fault).

Running the application as the same user as the web server worked (after fixing the home directory so that gconf etc was able to create its files), but when running in the web server process itself things segfaulted.

The next attempt was to run both the working and non-working version through strace and see what the difference were, and apparently things segfaulted when the working process accessed <home directory>.mozilla/. This was the first access to anything inside the home directory of the user, which provided the solution:

When the process was running under the web server, the HOME environment variable was not set, but while running under the user from the shell (through su -), it was present. gnome-web-photo (or Firefox?) apparently does not feature any sort of fallback if the HOME environment variable is missing and segfaults instead.

Maybe that could be a patch for the weekend, but hey, the Olympic Games are on!

Fixing Issue With PHPs SoapClient Overwriting Duplicate Attribute and Tag Names

The setting:

An SOAP request contains an Id attribute – and an element with the exact name in the response (directly beneath the element containing the attribute – an immediate child):


  foobar

The problem is that the generated result object from the SoapClient (at least of PHP 5.2.12) contains the attribute value, and not the element value. In our case we could ignore the z:Id attribute, as it was simply an Id to identify the element in the response (this might be something that ASP.NET or some other .NET component does).

Our solution is to subclass the internal SoapClient and handle the __doRequest method, stripping out the part of the request that gives the wrong value for the Id field:

class Provider_SoapClient extends SoapClient
{
    public function __doRequest($request, $location, $action, $version)
    {
        $result = parent::__doRequest($request, $location, $action, $version);
        $result = preg_replace('/ z:Id="i[0-9]+"/', '', $result);
        return $result;
    }
}

This removes the attribute from all the values (there is no danger that the string will be present in any other of the elements. If there is – be sure to adjust the regular expression). And voilá, it works!

A Simple Smarty Modifier to Generate a Chart Through Google Chart API

After the longest title of my blog so far follows one of the shortest posts.

The function has two required parameters – the first one is provided automagically for you by smarty (it’s the value of the variable you’re applying the modifier to). This should be an array of objects containing the value you want to graph. The only required argument you have to provide to the modifier is the method to use for fetching the values for graphing.

Usage:
{$objects|googlechart:”getValue”}

This will dynamically load your plugin from the file modifier.googlechart.php in your Smarty plugins directory, or you can register the plugin manually by calling register_modifier on the template object after you’ve created it.

function smarty_modifier_googlechart($points, $method, $size = "600x200", $low = 0, $high = 0)
{
    $pointStr = '';
    $maxValue = 0;
    $minValue = INT_MAX;
    
    foreach($points as $point)
    {
        if ($point->$method() > $maxValue)
        {
            $maxValue = $point->$method();
        }

        if ($point->$method() < $minValue)
        {
            $minValue = $point->$method();
        }
    }

    if (!empty($high))
    {
        $maxValue = $high;
    }

    $scale = 100 / $maxValue;

    foreach($points as $point)
    {
        $pointStr .= (int) ($point->$method() * $scale) . ',';
    }

    $pointStr = substr($pointStr, 0, -1);

    // labels (5)
    $labels = array();

    $steps = 4;
    $interval = $maxValue / $steps;

    for($i = 0; $i < $steps; $i++)
    {
        $labels[] = (int) ($i * $interval);
    }

    $labels[] = (int) $maxValue;

    return 'http://chart.apis.google.com/chart?cht=lc&chd=t:' . $pointStr . '&chs=' . $size . '&chxt=y&chxl=0:|' . join('|', $labels);
}

The function does not support the short version of the Google Chart API Just Yet (tm) as it is an simple proof of concept hack made a few months ago.

How To Dismantle An Atomic HTTP Query .. String.

Following up on yesterday’s gripe about PHPs (old and now useless) automagic translation of dots in GET and POST parameters to underscores, today’s edition manipulates the query string in place instead of returning it as an array.

This is useful if you have a query string you want to pass on to another service, and for some reason the default behaviour in PHP will barf barf and barf. That might happen because of the dot translation issue or that some services (such as Solr) rely on a parameter name being repeatable (in PHP the second parameter value will overwrite the first).

function http_dismantle_query($queryString, $remove)
{
    $removeKeys = array();

    if (is_array($remove))
    {
        foreach($remove as $removeKey)
        {
            $removeKeys[$removeKey] = true;
        }
    }
    else
    {
        $removeKeys[$remove] = true;
    }

    $resultEntries = array();
    $segments = explode("&", $queryString);

    foreach($segments as $segment)
    {
        $parts = explode('=', $segment);

        $key = urldecode(array_shift($parts));

        if (!isset($removeKeys[$key]))
        {
            $resultEntries[] = $segment;
        }
    }

    return join('&', $resultEntries);
}

I’m not really sure what I’ll call the next function in this series, but there sure are loads of candidates out there.

Getting Dots to Work in PHP and GET / POST / COOKIE Variable Names

One of the oldest and ugliest relics of the register_globals era of PHP are the fact that all dots in request variable names gets replaced with “_”. If your variable was named “foo.bar”, PHP will serve it to you as “foo_bar”. You cannot turn this off, you cannot use extract() or parse_str() to avoid it and you’re mostly left out in the dark. Luckily the QUERY_STRING enviornment (in _SERVER if you’re running mod_php, etc) contains the raw string, and this string contains the dots.

The following “”parser”” is a work in progress and does currently not support the array syntax for keys that PHP allow, but it solves the issue for regular vars. I will try to extend this later on to do actually replicate the functionality of the regular parser.

Here’s the code. No warranties. Ugly hack. You’re warned. Leave a comment if you have any good suggestions regarding this (.. or know of an existing library doing the same..).

function http_demolish_query($queryString)
{
    $result = array();
    $segments = explode("&", $queryString);

    foreach($segments as $segment)
    {
        $parts = explode('=', $segment);

        $key = urldecode(array_shift($parts));
        $value = null;

        if ($parts)
        {
            $value = urldecode(join('=', $parts));
        }

        $result[$key] = $value;
    }

    return $result;
}

(OK, that’s not the real function name, but it’s aptly named to be the nemesis of http_build_query)

Relevant Meta Tags for Facebook Share

I spent the evening making sure the different pages on Gamer.no gave relevant titles and descriptions when shared on Facebook. The implementation was quite straight forward, but finding the finding the actual documentation for which elements Facebook supports ate a bit of development time. After navigating through four or five wiki pages at the development wiki describing various parts of the Facebook Share system, I finally found the page getting down to the metal about which elements you should include.

The page can be found at Facebook Share – Specifying Meta Tags.

We’ve currently implemented title, description and medium. We also had image_src, but decided against it at the moment – the first feedback we reserved made it clear people preferred to select their own image. This may however be because of the first batch of people being a bit too technical competent, so we’ll probably use image_src later (.. does Facebook support providing several images through image_src?).

A Quick Introduction to chmod and Octal Numbers

Someone asked what the difference between doing a chmod 777 and chmod 755 is today, and hopefully this short, informal post will provide you with the answer (if you want to jump straight through to the conclusion, man chmod).

Octal Numbers

The number you provide as an argument to chmod is an octal number telling chmod what access you want to provide to a file (or a directory, device, etc – an entry on the file system). The number are in fact three discreet values, 7, 5 and 5. Each of the values correspond to a set of three bits, either one being zero or one. Three bits makes up a value from 0 – 7, hence an octal number (a decimal number has the digits 0 – 9 for each digit, an octal number has 0 – 7, a binary number has 0 – 1, a hexadecimal number has 0 – F (15)).

If you tried to count from 0 to 10 (decimal) in octal, it’d be: 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12. 12 in octal is the same value as 10 in decimal. The big difference is that both octal and decimal maps very neatly on top of binary numbers, being exactly three or four bits.

The usual way to write an octal number in a programming language is by appending a zero in front of it, such as 0755. This tells the compiler that the number is written in octal notation, and the value is then parsed as such. chmod parses all numbers as octal, and does actually handle four digits. Since missing digits are considered to be zero, the first digit is usually not included (or simply as a zero – which will look the same as the representation used in certain programming languages). The first, usually unused digit, have a special meaning, setting the “set user id” (suid), “set group id” (guid) or the “restricted deletion” or “sticky” attributes (you can read more about these options in the manual page).

File permissions

Now that we know what an octal number is, it’s time to look at how the file permissions work. Each file has three sets of permissions, one set for the user owning the file, one set for the group owning the file and one set for anyone else. If you want to take a look at these values on a unix based system, simply type ls -l to list files in a verbose way. Your result will look something like:

-rw-r--r--  1 mats mats        35 2008-08-23 20:24 IMPORTANTFILE

The permissions are listed in the first column, containng “-rw-r–r–“. The first character “-” indicates if the file is a directory (d), if the suid or guid bits are set etc.

This leaves us with “rw-r–r–” – the three sets of permissions. “rw-” is for the user owning the file, “r–” is for the group owning the file and the last “r–” are for anyone else (or ‘other’ as it’s called). The “r” means read, the “w” means write and the currently missing letter is “x”, which means execute (for files) or search (for directories). The “execute” setting is used to let bash (or another shell) attempt to run the file as a script, attempting to parse the first line as a path to the interpreter for the file (i.e. #!/usr/bin/python).

We have three flags (read, write, execute) that can be either on or off. This should remind us of three bits, either being 0 (not set) or 1 (set). And an octal digit is exactly three bits. This means that an octal digit maps exactly to the bit sequence needed to set permissions for a file. A 7 is “111”, a 5 is “101”, a 4 is “100” and so on. Mapping this to permissions:

7 = 111 = rwx
6 = 110 = rw-
5 = 101 = r-x
4 = 100 = r--
3 = 011 = -wx
2 = 010 = -w-
1 = 001 = --x
0 = 000 = ---

When calling chmod 755 on a directory we’re telling chmod to “set the read, write and search bits for me, the read and search bits for the group and the read and search bits for other users” (‘search’ for directories, ‘execute’ for files).

Another example is 644 that maps to 110 100 100, which again maps to “rw-r–r–” which usually is the standard access mode for files (and 755 for directories).

Handling Permissions With Symbols

I’m now going to eliminate the need for remembering everything I’ve written so far in the post, but at least you’ll know what people are talking about when they’re telling you to chmod something this-or-that.

You can also use the symbols directly with chmod, either adding, removing or setting the permissions for one of the three groups.

Examples:

To remove all access for other users (but leaving group and user intact)
chmod o-rwx file

To give everyone read access
chmod a+r file

To give everyone read – and search – access
chmod a+rx directory

To set particular user modes for each group
chmod u=rw,g=w,o=w file (a file that the user can read, but anyone can write to)

And with that I chmod this post a+r.

Table 100% Width and Margins

While hacking aboooooot today I found the need for making a table behave like a regular block element. I have a section floated to the right of a table, and the table should occupy the rest of the available spot. Making the table’s width 100% would make it adjust it size according to its parent instead of the available place (.. with margin-right set to the width of the other element + whitespace).

The best solution I’ve found so far is to create a wrapper div around the table, and then setting the table width to 100%. This makes the table adjust its size according to the parent – which now is a block element.




...

Fixing Stuttering Audio With ffmpeg and Quicktime

We’re using ffmpeg to encode videos for flash on several of the sites I’m involved with, and this usually works flawless. From time to time there’s however certain video files that makes something go wrong, usually small issues like stuttering at random places. Today I decided to go bug hunting and try to find out exactly what triggered this behaviour in one of our recent videos.

After quite a bit of debugging and re-encoding the offending video segment (which limits the rate of debug attempts, as you have to wait a couple of minutes ++ for each encode), I decided to try to simply use -acodec copy for the audio codec. The Quicktime file already used AAC as its codec, and the file we were encoding also used AAC. The stuttering disappeared! This indicates that the sound encoding of the process were to blame for the stuttering, so if you’re having a sound related problem, try to copy the input codec to the output source.

As libfaac and libfaad were the two only involved libraries of the encoding and decoding process, the first thing to try were to check if there were any new versions of these libraries available. And lo’ and behold, both libfaac and libfaad had been updated since the versions included in our ubuntu version (no real shocker there, as things in the audio and video codec world moves with a rather high velocity).

I removed the packages (sudo apt-get remove libfaac-dev libfaad-dev) from Ubuntu, downloaded the new libfaac and libfaad versions, compiled (./configure && make) and installed them (sudo make install), and then recompiled ffmpeg with the new libraries in place. ffmpeg then complained about libfaac.so.2 missing, but a quick run of ldconfig (sudo ldconfig) fixed that issue.

Re-encoding the file yet again – and wooho! The offending file now works as it should. This probably also solve the issue we’ve been seeing for several other files too. The new versions of libfaac and libfaad solved the issues we were having.

BTW: There’s a small fix needed to make libfaac compile with gcc4.