January 29th, 2010
After the longest title of my blog so far follows one of the shortest posts.
The function has two required parameters – the first one is provided automagically for you by smarty (it’s the value of the variable you’re applying the modifier to). This should be an array of objects containing the value you want to graph. The only required argument you have to provide to the modifier is the method to use for fetching the values for graphing.
Usage:
{$objects|googlechart:"getValue"}
This will dynamically load your plugin from the file modifier.googlechart.php in your Smarty plugins directory, or you can register the plugin manually by calling register_modifier on the template object after you’ve created it.
function smarty_modifier_googlechart($points, $method, $size = "600x200", $low = 0, $high = 0)
{
    $pointStr = '';
    $maxValue = 0;
    $minValue = PHP_INT_MAX; // tracked, but not used for scaling yet (neither is $low)

    // find the largest and smallest value in the data set
    foreach($points as $point)
    {
        if ($point->$method() > $maxValue)
        {
            $maxValue = $point->$method();
        }

        if ($point->$method() < $minValue)
        {
            $minValue = $point->$method();
        }
    }

    // an explicit $high overrides the largest value found
    if (!empty($high))
    {
        $maxValue = $high;
    }

    // the text encoding (chd=t:) expects values from 0 to 100; avoid dividing by zero
    $scale = ($maxValue > 0) ? 100 / $maxValue : 0;

    foreach($points as $point)
    {
        $pointStr .= (int) ($point->$method() * $scale) . ',';
    }

    // drop the trailing comma
    $pointStr = substr($pointStr, 0, -1);

    // labels (5)
    $labels = array();

    $steps = 4;
    $interval = $maxValue / $steps;

    for($i = 0; $i < $steps; $i++)
    {
        $labels[] = (int) ($i * $interval);
    }

    $labels[] = (int) $maxValue;

    return 'http://chart.apis.google.com/chart?cht=lc&chd=t:' . $pointStr . '&chs=' . $size . '&chxt=y&chxl=0:|' . join('|', $labels);
}
The function does not support the short version of the Google Chart API Just Yet ™ as it is a simple proof of concept hack made a few months ago.
January 28th, 2010
Following up on yesterday’s gripe about PHP’s (old and now useless) automagic translation of dots in GET and POST parameters to underscores, today’s edition manipulates the query string in place instead of returning it as an array.
This is useful if you have a query string you want to pass on to another service, and for some reason the default behaviour in PHP will barf barf and barf. That might happen because of the dot translation issue, or because some services (such as Solr) rely on a parameter name being repeatable (in PHP the second parameter value will overwrite the first).
function http_dismantle_query($queryString, $remove)
{
    // build a lookup table of the keys to remove; $remove is a single key or an array of keys
    $removeKeys = array();

    if (is_array($remove))
    {
        foreach($remove as $removeKey)
        {
            $removeKeys[$removeKey] = true;
        }
    }
    else
    {
        $removeKeys[$remove] = true;
    }

    $resultEntries = array();
    $segments = explode("&", $queryString);

    foreach($segments as $segment)
    {
        $parts = explode('=', $segment);

        // only the key is decoded; the segment itself is passed on untouched
        $key = urldecode(array_shift($parts));

        if (!isset($removeKeys[$key]))
        {
            $resultEntries[] = $segment;
        }
    }

    return join('&', $resultEntries);
}
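As a small illustration of the overwriting behaviour mentioned above, here is a sketch that compares PHP's own parser with a manual split of the raw string. The query string and the `fq` parameter name are made up for the example; nothing here is taken from a real Solr setup.

```php
<?php
// Hypothetical query string with a repeated "fq" parameter.
$queryString = 'q=munin&fq=type:post&fq=year:2010';

// PHP's own parser keeps only the last value for a repeated name.
parse_str($queryString, $parsed);

// Splitting the raw string by hand keeps every occurrence.
$rawValues = array();

foreach (explode('&', $queryString) as $segment)
{
    $parts = explode('=', $segment, 2);

    if (urldecode($parts[0]) === 'fq')
    {
        $rawValues[] = isset($parts[1]) ? urldecode($parts[1]) : null;
    }
}
```

After this runs, `$parsed['fq']` holds only the second value, while `$rawValues` holds both, which is why working on the raw string is the safer route when the receiving service expects repeated names.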
I’m not really sure what I’ll call the next function in this series, but there sure are loads of candidates out there.
January 27th, 2010
One of the oldest and ugliest relics of the register_globals era of PHP is the fact that all dots in request variable names get replaced with “_”. If your variable was named “foo.bar”, PHP will serve it to you as “foo_bar”. You cannot turn this off, you cannot use extract() or parse_str() to avoid it and you’re mostly left out in the dark. Luckily the QUERY_STRING environment variable (in $_SERVER if you’re running mod_php, etc.) contains the raw string, and this string contains the dots.
The following “parser” is a work in progress and does not currently support the array syntax for keys that PHP allows, but it solves the issue for regular vars. I will try to extend this later on to actually replicate the functionality of the regular parser.
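To make the translation concrete, here is a minimal sketch that feeds a made-up query string through parse_str() and then reads the keys straight from the raw string. The string itself is an assumption for the example; in a real request it would come from the QUERY_STRING value.

```php
<?php
// Hypothetical raw query string, standing in for the QUERY_STRING value.
$queryString = 'foo.bar=1&plain=2';

// parse_str() applies the same key mangling as the request parser.
parse_str($queryString, $parsed);
$parsedKeys = array_keys($parsed);

// Reading the keys from the raw string leaves the dots alone.
$rawKeys = array();

foreach (explode('&', $queryString) as $segment)
{
    $parts = explode('=', $segment, 2);
    $rawKeys[] = urldecode($parts[0]);
}
```

Here `$parsedKeys` contains `foo_bar`, while `$rawKeys` still contains `foo.bar`.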
Here’s the code. No warranties. Ugly hack. You’re warned. Leave a comment if you have any good suggestions regarding this (.. or know of an existing library doing the same..).
function http_demolish_query($queryString)
{
    $result = array();
    $segments = explode("&", $queryString);

    foreach($segments as $segment)
    {
        $parts = explode('=', $segment);

        // the first part is the key, with any dots left intact
        $key = urldecode(array_shift($parts));
        $value = null;

        // anything after the first "=" is the value (which may itself contain "=")
        if ($parts)
        {
            $value = urldecode(join('=', $parts));
        }

        // a repeated key still overwrites the earlier value here
        $result[$key] = $value;
    }

    return $result;
}
(OK, that’s not the real function name, but it’s aptly named to be the nemesis of http_build_query)
January 24th, 2010
As we’ve recently added support for querying Solr servers in parallel, one of the things we added was a simple class to allow us to query several servers at the same time. The cURL library (which has a PHP extension) even provides an abstraction layer for doing the nitty gritty work for you, as long as you keep track of the resources. The code beneath is based on examples in the documentation and a few tweaks of my own.
The code beneath is licensed under an MIT license. You can also download the file (gzipped).
class Footo_Content_Retrieve_HTTP_CURLParallel
{
    /**
     * Fetch a collection of URLs in parallel using cURL. The results are
     * returned as an associative array, with the URLs as the key and the
     * content of the URLs as the value.
     *
     * @param array<string> $addresses An array of URLs to fetch.
     * @return array<string> The content of each URL that we've been asked to fetch.
     */
    public function retrieve($addresses)
    {
        $multiHandle = curl_multi_init();
        $handles = array();
        $results = array();

        foreach($addresses as $url)
        {
            $handle = curl_init($url);
            $handles[$url] = $handle;

            curl_setopt_array($handle, array(
                CURLOPT_HEADER => false,
                CURLOPT_RETURNTRANSFER => true,
            ));

            curl_multi_add_handle($multiHandle, $handle);
        }

        // execute the handles
        $result = CURLM_CALL_MULTI_PERFORM;
        $running = false;

        // set up and make any requests..
        while ($result == CURLM_CALL_MULTI_PERFORM)
        {
            $result = curl_multi_exec($multiHandle, $running);
        }

        // wait until data arrives on all sockets
        while($running && ($result == CURLM_OK))
        {
            if (curl_multi_select($multiHandle) > -1)
            {
                $result = CURLM_CALL_MULTI_PERFORM;

                // while we need to process sockets
                while ($result == CURLM_CALL_MULTI_PERFORM)
                {
                    $result = curl_multi_exec($multiHandle, $running);
                }
            }
        }

        // clean up
        foreach($handles as $url => $handle)
        {
            $results[$url] = curl_multi_getcontent($handle);

            curl_multi_remove_handle($multiHandle, $handle);
            curl_close($handle);
        }

        curl_multi_close($multiHandle);

        return $results;
    }
}
Download the file.
January 23rd, 2010
When defining strings in programming languages, they’re usually delimited by a pair of " characters, such as "This is a string" and "Hello World". The immediate question is what you do when the string itself should contain a ". "Hello "World"" is hard to read and practically impossible for the compiler (which tries to make sense out of everything you’ve written) to parse. To solve this (and similar issues) people started using escape characters: special characters that tell the parser that it should pay attention to the following character(s) (some escape sequences contain more than one character after the escape character).
Usually the escape character is \, and rewriting our example above we end up with "Hello \"World\"". The parser sees the \, which tells it to parse the next character(s) in a special mode, and then inserts the " into the string itself instead of using it as a delimiter. In Java, C, PHP, Python and several other languages there are also escape sequences that do something other than just insert the character following the escape character:
\n – Inserts a new line.
\t – Inserts a tab character.
\xNN – Inserts a byte with the byte value provided (\x13, \xFF, etc).
A list of the different escape sequences that PHP supports can be found in the PHP manual.
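As a quick sketch of the sequences listed above, here they are in PHP's double-quoted strings (the variable names are only for illustration). The same sequences also exist in Java, apart from \xNN, which is not a Java escape.

```php
<?php
$quoted  = "Hello \"World\""; // the escaped quotes become part of the string
$newline = "a\nb";             // three characters: a, a new line, b
$tab     = "a\tb";             // three characters: a, a tab, b
$byte    = "\x41";             // the byte with value 0x41, which is the letter A
$slash   = "\\";               // a single backslash
```

Note that this only applies to double-quoted strings in PHP; single-quoted strings leave sequences such as \n untouched.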
Anyways, the issue is that Java found an escape sequence that it doesn’t know how to handle. Attempting to define a string such as "! # \ % &" will trigger this message, as the parser sees the escape character \ and then attempts to parse the following character, which is a space (" "). The sequence "\ " is not a valid escape sequence in the Java language specification, and the compiler (or NetBeans or Eclipse) is trying to tell you that this is probably not what you want.
The correct way to define the string above would be to escape the escape character (now we’re getting meta): "! # \\ % &". This would define a string with just a single backslash in it.
January 21st, 2010
When creating a simple mash-up with data from external sources, you usually want to read the data in a suitable format, such as JSON. The tool for the job tends to be JavaScript, running in your favourite browser. The only problem is that requests made with XHR (XMLHttpRequest) have to follow the same origin policy, meaning that the request cannot be made for a resource living on another host than the host serving the original request.
To get around this, clients usually use JSONP, a simple modification of the usual JSON output. The data is still JSON, but the response wraps it in a call to a callback function named by the client in the request, so it runs as JavaScript in the local browser. This way the creator of the data actually tells the browser (in so many hacky ways) that it’s OK, I’ve actually thought this through. Help yourself.
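To show what the server side of this might look like, here is a minimal PHP sketch. The function name jsonp_wrap and the rule for what counts as a valid callback name are my own assumptions, not something a particular service is known to use.

```php
<?php
/**
 * Wrap JSON-encoded data in a call to the callback named by the client.
 * Hypothetical helper; returns null if the callback name looks unsafe.
 */
function jsonp_wrap($callback, $data)
{
    // only allow plain identifiers, so the callback can't smuggle in other script
    if (!preg_match('/^[A-Za-z_$][A-Za-z0-9_$]*\z/', $callback))
    {
        return null;
    }

    return $callback . '(' . json_encode($data) . ');';
}
```

A request with callback=handleData would then get handleData({...}); back instead of the bare JSON, and the browser runs that as a regular script.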
In jQuery you can trigger the usual handling of events by using “?” as the name of your callback function. jQuery will handle this transparently and then trigger the function you provided to .getJSON in the first place.
Example
var url = "http://feeds.delicious.com/v2/json/recent?callback=?";

$.getJSON(url, function(data) { alert(data); });
There’s an article up at IBM’s developerWorks giving quite a few more examples and information about the issue.
January 11th, 2010
We’re currently expanding our munin reporting cluster at Derdubor, but after installing munin-node on one of our servers we never got any graphs. The only section available on the munin server was “Other”, and that didn’t contain any information at all (which indicates that you’re not getting any response from the server).
The first step I take when trying to debug a munin connection is to telnet into the munin port, as this confirms that the two servers are able to talk to each other and that the munin daemon listens to the correct interface and port.
# telnet localhost 4949
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
#
The connection was established, but munin closed it as soon as it was created. This usually means one of two things: the host you’re connecting from hasn’t been added to the cidr_allow list or the allow list, or it is on the denied hosts list. This time it was neither: the host had been added, and we didn’t have any denied hosts list.
The next step was to take a look at the munin-node.log in /var/log/munin (at least under Debian).
The last message was:
User "ejabberd" in configuration file "/etc/munin/plugin-conf.d/munin-node" nonexistant. Skipping plugin. at /usr/sbin/munin-node line 615, line 83.
Something wicked happened while reading "/etc/munin/plugins/munin-node". Check the previous log lines for spesifics. at /usr/sbin/munin-node line 261, line 83.
We don’t have ejabberd installed, but a reference to an ejabberd user had apparently been added to the configuration file in /etc/munin/plugin-conf.d/munin-node. This made our version of munin-node barf, as the user it referenced wasn’t available.
The next step was to remove the section from the file and restart munin-node:
/etc/init.d/munin-node restart
After restarting munin, I did the telnet check again:
# telnet localhost 4949
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
# munin node at example.com
.
fetch load
load.value 0.02
.
quit
Connection closed by foreign host.
#
Wait 10 to 15 minutes and you should start seeing graphs again, if this actually was your problem. Probably not (and then you should probably read Debugging Munin Plugins and other documentation on the Wiki). But if it was, you’ll be happy happy joy joy now.
January 10th, 2010
While playing around with one of my development SOLR installations (this time under Windows), I suddenly got a weird error message when feeding data to one of the fresh cores.
SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.SimpleFSDirectory@C:\temp\solr\*\data\index: files:
Taking a look at the contents of the index\ directory, it was in fact empty. Seems weird, but my initial guess was that Lucene / SOLR would treat this as a new installation and create the files.
Turns out the issue is that it won’t – as long as the index directory exists, Lucene / SOLR goes looking for the segment files.
Thanks to an old post to the solr-dev list by Yonik, the easiest fix is to simply delete the index directory and restart your servlet container (Tomcat in this case).
January 9th, 2010
I have to admit something. I’ve become addicted.
One of the things I finally got around to doing while living the quiet life over the Christmas holiday was to dive a bit further into Munin, a simple framework for collecting information from your computers and servers and making nice graphs that you can watch while you’re bored.
I’m not going to write a lot about how you can create your own Munin plugin to create your own graphs, as the Munin project has a very simple tutorial giving you all the basics about writing plugins. The only things you need to remember are these two tidbits:
- When Munin first registers your plugin, it runs your script with config as the only argument. This provides Munin with the name of the graph, the labels and names (keys) of the graphs you’re providing values for, information about the axis, etc.
- When Munin runs your script without the config argument, it expects you to give it values for the keys you provided it in the configuration.
You enable and disable plugins by creating symlinks in /etc/munin/plugins (at least under Debian / Ubuntu), and plugins are usually stored in /usr/share/munin/plugins.
I keep my plugins archived together with the rest of the repository for my web projects, and then either symlink the content into the plugins-directory or create a simple wrapper script that changes the current directory to the location of the script and then invokes it (to make the current working directory be correct).
A very simple bash script that does this – and passes through any parameters given to the script:
#!/bin/bash
cd <absolute path> && php ./<script name> "$@"
An example of a simple PHP script to provide information to Munin:
<?php
if ((count($argv) > 1) && ($argv[1] == 'config'))
{
    // the configuration lines are part of the string literal, so they start at column 0
    print("graph_title THE TITLE OF YOUR GRAPH
graph_category THE CATEGORY / GROUP OF YOUR GRAPH
graph_vlabel Count
total.label Total
other.label Other
");
    exit();
}

// get_total_value() and get_other_value() are placeholders for your own functions
print('total.value ' . get_total_value() . "\n");
print('other.value ' . get_other_value() . "\n");
Symlink everything, check that it runs properly when you execute the script from the plugins directory:
mats@xx:/usr/share/munin/plugins$ ./scriptname
total.value 37
other.value 13
mats@xx:/usr/share/munin/plugins$
Symlink it into the /etc/munin/plugins directory and reload or restart Munin.
To check that Munin runs your script properly, telnet into the Munin server from an approved host and type “fetch <name of your plugin>”. You should now see the same output as you got when you simply typed ./scriptname in the plugins directory.
If stuff doesn’t work and you’re having a hard time finding out why, be sure to check out the munin-node logfile: /var/log/munin/munin-node.log.
As soon as you have the basics down, you’re free to start graphing whatever numeric value you can think of. The most interesting uses are probably something that integrates with your web applications, such as the number of searches, the number of signed up users, the language selection of users, the popularity of certain categories, etc. The possibilities are endless, use your imagination!
And about the addiction: NEED MORE GRAPHS.
January 6th, 2010
There are a few easy changes you can make to your website setup to speed up content delivery and eat up less bandwidth: configure proper expire values and, if possible, keep your static resources on a separate domain.
The HTTP Expires Header
Expires tells the client how long it can keep the current version of a resource as the most recent one. If you set the Expires header a while into the future, the browser will not make a new request for the file until the resource, well, expires. (The browser’s cache settings or a forced reload, such as shift-reloading, can expire the resource earlier.) The potential problem is the case where a resource actually changes, such as when you deploy a change to your stylesheet or external JavaScript files.
The fix for this is to include something about the file which changes when the file is physically updated on the disk. This can be the last modified time (please keep this cached in your web application; you do not want to hit the disk to retrieve the value for each page view), the current revision number from your revision control system (such as SVN: you can get the current revision of a file by using svn info, and please, cache that value too; you do not want to call svn for each page view :-)) or something else, such as the md5 or crc32 hash of the file. The important part is that you include this value as part of the request, making the URL to the resource unique depending on the version of the resource. You can safely ignore this part of the URL in your rewrite / controller routing magic / handling application, as its only function is to tell the browser that it has to request a new file and not use the old one anymore.
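For reference, a minimal sketch of building such a header in PHP. The helper name expires_header and the 30 day lifetime in the usage note are assumptions for the example; pick a lifetime that fits how often your files change.

```php
<?php
/**
 * Build an Expires header line for a resource that may be cached for
 * $lifetime seconds, counted from the timestamp $now. Hypothetical helper.
 */
function expires_header($now, $lifetime)
{
    // HTTP dates are always given in GMT
    return 'Expires: ' . gmdate('D, d M Y H:i:s', $now + $lifetime) . ' GMT';
}
```

In a script that serves the resource you would send it with header(expires_header(time(), 30 * 24 * 3600)); before any output.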
Examples of URL Schemes to Get Around Expires Headers
- Flickr uses a simple .v in their URLs to indicate the version of the file: http://l.yimg.com/g/css/c_sets.css.v74709.14
- On Gamer.no we use the current SVN revision: /css/main.css?v=1120M
- vg.no uses the current date, followed with an identifier that probably indicates the current revision for that day: css/frontpage.css?20091203-1
It’s important to remember that the identifier is not used to deliver an older version of the file depending on the parameter, just to make the browser see the new resource. The old URL can still serve the new resource – and if you need to keep old versions around, you’ve probably solved this issue already.
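A minimal sketch of how such a versioned URL could be built in PHP. The function name versioned_url and the v parameter are assumptions for the example; where the version value comes from (cached modification time, revision number, hash) is up to you.

```php
<?php
/**
 * Append a version identifier to the URL of a static resource.
 * Hypothetical helper; $version should come from a cached value,
 * not from a disk or svn lookup on every page view.
 */
function versioned_url($path, $version)
{
    // use "&" if the URL already has a query string
    $separator = (strpos($path, '?') === false) ? '?' : '&';

    return $path . $separator . 'v=' . rawurlencode($version);
}
```

Calling versioned_url('/css/main.css', '1120M') gives a URL of the same form as the Gamer.no example above.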
Use a Separate Domain for Static Resources
By using another, separate domain for your static resources, you’re letting browsers fetch the static resources while they’re still processing your HTML. The HTTP/1.1 specification recommends that a client keeps no more than two simultaneous connections to the same server. When you host your static resources on another domain, you tell the browser that it can go ahead and fetch those resources while it is busy downloading other items from your main site.
After you’ve moved your static resources to a separate domain, you’ll usually also end up using less bandwidth. Since you’re now delivering the most requested content from another host, cookies will not be included in the request from the browser. When a browser makes a request for a resource on a certain host, it includes all the cookies that have been set for that domain. This happens independent of which files it’s requesting, and if you have a large number of separate files (which you probably could include into one larger file – resulting in fewer HTTP requests), these Cookie-headers can add up to a significant amount of bandwidth. The HTTP server will also have less work to do, making everyone happier!
If you use www. as a prefix for all your regular HTTP requests and take care of setting your cookies in the www.example.com domain, you should be able to simply use something like static.example.com for your static content and avoid leaking cookies into the other subdomain. If you have loads of static content, you can also use several separate subdomains for your files, but be sure to let the request for a certain file point to the same subdomain each time – otherwise you’ll end up with the browser requesting four copies of the same, identical file and actually breaking the regular cache in the browser (which uses If-Modified-Since to tell the server when it last downloaded the file. We want to avoid the browser making the request again at all). At pwned.no I calculate the crc32 of the filename and use that value to determine which static host the request should use. We also redirect any requests directly to pwned.no to www.pwned.no to make the cookie structure consistent. We do however not set the Expires-header yet, but that might be a part of the next update to the site.
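The crc32 approach can be sketched roughly like this. The function name and the host names are made up for the example and this is not the actual pwned.no code.

```php
<?php
/**
 * Pick a static host for a file so that the same file name always
 * maps to the same host. Hypothetical helper.
 */
function static_host_for($filename, array $hosts)
{
    // mask to a non-negative value, as crc32() can be negative on 32-bit builds
    $index = (crc32($filename) & 0x7fffffff) % count($hosts);

    return $hosts[$index];
}

// hypothetical static subdomains
$hosts = array('static1.example.com', 'static2.example.com');
$host = static_host_for('/css/main.css', $hosts);
```

Since the host depends only on the file name, every page that references the file ends up with the same URL, and the browser cache keeps working as intended.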
Do you have a particular caching strategy you use for client side content? What kind of URL format works best for you? Leave a comment!
Read all the articles in the Ready for 2010-series