Ilya Grigorik has posted a very good summary of a talk that Brian Aker and Alan Kasindorf gave about memcached at the MySQL User Conference last week. The article gets straight to the point on several key attributes of memcached, and serves up almost 30 direct tips and tidbits on how to use memcached more effectively. Awesome reading, and well worth checking out together with the slides from the memcached talk.
PHP Vikinger Registration Up and Running
This year’s version of the unconference PHP Vikinger takes place on the 21st of June in Skien, Norway. Derick has just opened up the registration, which involves using high-tech methods such as an e-mail client and writing your name and other relevant information. One thing’s for sure: Eirik, Christer and I are heading out, and hopefully we’ll get a few more friends to join in... and that includes YOU!
PHP and Google’s Summer of Code
This year’s PHP-related projects for Google’s Summer of Code were announced today. Interesting projects include the Unicode implementation for PHP6, improvements to PECL, optimizations, Xdebug and several others. In short: yay!
I find the project for creating an LLVM extension for Zend especially interesting.
Canon EOS 5D Mark II Coming?
Wired’s Gadget Lab has noted a neat scoop regarding the long-awaited upgraded version of the Canon EOS 5D! According to information that was supposedly put online by the German division of Canon, the upgraded full-frame camera gets a Digic III processor, a total of 16 megapixels and a 6.5 fps shooting speed. I’m already drooling enough to make a small puddle.
Earlier rumors have indicated a price somewhere around $3299 ($3000 – $3500). An announcement from Canon is due on Friday, but it’s not known whether it will have anything to do with a Canon EOS 5D Mark II.
New Week, New Book: Software Estimation: Demystifying the Black Art
It’s a new week, and as I finished my previous “to read while taking the train” book last week, I’ve now started on another well-received book, this time about software estimates. The book is published by Microsoft Press and is named Software Estimation: Demystifying the Black Art. The author, Steve McConnell, also has a webpage online, in addition to keeping the blog 10x Software Development. It’s been a good read so far, and considering the current exchange rate between the Norwegian krone and the US dollar, it’s a steal at $26.39 at Amazon.
Stuart Herbert Takes a Look at apache2-mpm-itk
Stuart Herbert has taken a closer look at apache2-mpm-itk, a patch for the apache2 prefork handler that enables Apache to switch which user it runs as based on which VirtualHost serves the request. The author of the module, Steinar H. Gunderson, is a good friend of mine, and it’s always good to see familiar names getting attention for things they’re writing.
The post from Stuart is pretty straightforward, but he fails to mention that apache2-mpm-itk is available as a regular package in all current Debian versions. Simply apt-get away, and you’re all set.
Solr: Deleting Multiple Documents with One Request
One of the final steps in my current Solr adventure was to make it possible to remove a large number of documents from the index at the same time. As we’re currently using Solr to store phone information, we may have to remove several thousand records in one large update. The examples on the Solr Wiki show how to remove a single document by posting a simple XML document, or how to remove documents by query. I would rather avoid beating our Solr server with 300k single delete requests, so I tried the obvious tactics: submitting several IDs in one document, adding several <delete> elements to one document and so on, but nothing worked as I wanted it to.
After a bit of searching and stumbling around with Google, I finally found this very useful tip from Erik Hatcher. The trick is to simply rewrite the delete request as a delete by query, and then submit all the IDs to be removed as a simple OR query. On our development machine, Solr removed 1000 documents in somewhere around 900 ms. Needless to say, that’s more than fast enough and solved our problem.
To sum up: write the delete-by-query statement as:
id:(123123 OR 13371337 OR 42424242 .. )
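As a rough illustration of the tip above, here is a minimal sketch of how the OR query and the surrounding delete message could be assembled in Java. The class and method names are made up for this example, and it assumes the unique key field is called id and that the IDs are plain values that need no query escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr or Solrj.
final class DeleteByIdsSketch {

    /** Builds a query of the form id:(a OR b OR c). Assumes the IDs need no escaping. */
    static String buildIdQuery(List<String> ids) {
        if (ids.isEmpty()) {
            throw new IllegalArgumentException("at least one id is required");
        }
        StringBuilder query = new StringBuilder("id:(");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                query.append(" OR ");
            }
            query.append(ids.get(i));
        }
        return query.append(')').toString();
    }

    /** Wraps the query in a delete-by-query XML message to post to the update handler. */
    static String buildDeleteXml(List<String> ids) {
        return "<delete><query>" + buildIdQuery(ids) + "</query></delete>";
    }

    public static void main(String[] args) {
        System.out.println(buildDeleteXml(Arrays.asList("123123", "13371337", "42424242")));
    }
}
```

If the list of IDs grows beyond the maxBooleanClauses setting in solrconfig.xml, it would probably have to be split into several such requests.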
Thanks intarwebs!
Using Solrj – A short guide to getting started with Solrj
As Solrj, the Java interface for Solr, is slated for release together with Solr 1.3, it’s time to take a closer look! Solrj is the preferred and easiest way of talking to a Solr server from Java (unless you’re using Embedded Solr). You get everything in a neat little package and avoid parsing and working with XML directly. Everything is tucked neatly away under a few classes, and since the web generally lacks a good example of how to use Solrj, I’m going to share a small class I wrote for testing the data we were indexing at work. As Solr 1.2 is the most recent version currently available at apache.org, you’ll have to take a look at the Apache Solr Nightly Builds website and download the latest version. The documentation is also contained in the archive, so if you’re going to do any serious Solrj development, that is the place to look.
Oh well, enough of that, let’s cut to the chase. We start by creating a CommonsHttpSolrServer instance, which we provide with the URL of our Solr server as the only argument to the constructor. You may also provide your own parsers, but I’ll leave that for those who need it. I don’t. In this example Solr is running on port 8080 under the solr directory, but you’ll have to adjust this to match your own setup. I’ve included the complete source file for download.
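import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
class SolrjTest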
{
public void query(String q)
{
CommonsHttpSolrServer server = null;
try
{
server = new CommonsHttpSolrServer("http://localhost:8080/solr/");
}
catch(Exception e)
{
e.printStackTrace();
}
The next thing we’re going to do is to actually create the query we’re about to send to the Solr server, and this means building a SolrQuery object. We simply instantiate the object and then set the query values to what we’re looking for. The setQueryType call can be dropped to use the default query type handler, but as we currently use dismax, this is what I’ve used here. You can then also turn on faceting (to create navigators/facets) and add the fields you want facets for.
SolrQuery query = new SolrQuery();
query.setQuery(q);
query.setQueryType("dismax");
query.setFacet(true);
query.addFacetField("firstname");
query.addFacetField("lastname");
query.setFacetMinCount(2);
query.setIncludeScore(true);
Then we simply query the server by calling server.query, which takes our parameters, builds the query URL, sends it to the server and parses the response for us.
try
{
QueryResponse qr = server.query(query);
The result can then be fetched by calling getResults() on the QueryResponse object, qr.
SolrDocumentList sdl = qr.getResults();
We then output the information fetched in the query. You can change this to print all fields or other stuff, but as this is a simple application for searching a database of names, we just collect the first and last name of each entry and print them out. Before we do that, we print a small header containing information about the query, such as the number of elements found and which element we started on.
System.out.println("Found: " + sdl.getNumFound());
System.out.println("Start: " + sdl.getStart());
System.out.println("Max Score: " + sdl.getMaxScore());
System.out.println("--------------------------------");
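// Copy each document's fields into a plain map so they are easy to work with.
ArrayList<HashMap<String, Object>> hitsOnPage = new ArrayList<HashMap<String, Object>>();
for(SolrDocument d : sdl)
{
HashMap<String, Object> values = new HashMap<String, Object>();
for(Iterator<Map.Entry<String, Object>> i = d.iterator(); i.hasNext(); )
{
Map.Entry<String, Object> e2 = i.next();
values.put(e2.getKey(), e2.getValue());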
}
hitsOnPage.add(values);
System.out.println(values.get("displayname") + " (" + values.get("displayphone") + ")");
}
After this we output the facets and their information, just so you can see how you’d go about fetching this information from Solr too:
List<FacetField> facets = qr.getFacetFields();
for(FacetField facet : facets)
{
List<FacetField.Count> facetEntries = facet.getValues();
for(FacetField.Count fcount : facetEntries)
{
System.out.println(fcount.getName() + ": " + fcount.getCount());
}
}
}
catch (SolrServerException e)
{
e.printStackTrace();
}
}
public static void main(String[] args)
{
SolrjTest solrj = new SolrjTest();
solrj.query(args[0]);
}
}
And there you have it, a very simple application to just test the interface against Solr. You’ll need to add the jar-files from the lib/-directory in the solrj archive (and from the solr library itself) to compile and run the example.
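To give a feel for what server.query ends up sending, here is a small standard-library-only sketch of the request URL that the SolrQuery settings above roughly correspond to. It is an approximation under assumptions, not Solrj’s actual implementation: the class and method names are made up, the /select path is assumed, and requesting the score through fl=*,score is my reading of what setIncludeScore does.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; Solrj builds and sends the real request for you.
final class SolrQueryParamsSketch {

    /** URL-encodes a single parameter value as UTF-8. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds a select URL mirroring the SolrQuery setters used in the example class. */
    static String buildSelectUrl(String baseUrl, String q) {
        List<String[]> params = new ArrayList<String[]>();
        params.add(new String[] {"q", q});                     // setQuery(q)
        params.add(new String[] {"qt", "dismax"});             // setQueryType("dismax")
        params.add(new String[] {"facet", "true"});            // setFacet(true)
        params.add(new String[] {"facet.field", "firstname"}); // addFacetField("firstname")
        params.add(new String[] {"facet.field", "lastname"});  // addFacetField("lastname")
        params.add(new String[] {"facet.mincount", "2"});      // setFacetMinCount(2)
        params.add(new String[] {"fl", "*,score"});            // setIncludeScore(true), assumed mapping

        StringBuilder url = new StringBuilder(baseUrl).append("select?");
        for (int i = 0; i < params.size(); i++) {
            if (i > 0) {
                url.append('&');
            }
            url.append(params.get(i)[0]).append('=').append(encode(params.get(i)[1]));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildSelectUrl("http://localhost:8080/solr/", "john smith"));
    }
}
```

Pasting a URL like this into a browser can be a handy way to compare the raw response from Solr with what the Solrj classes hand back.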
Download: SolrTest.java
Solrj and JSTL EL: java.lang.NumberFormatException
While working with a view of a collection of documents returned from Solr using Solrj earlier today, I was attempting to write out the number of documents found in the search. In pure Java code you’d request this by calling getNumFound() on the SolrDocumentList containing your documents, which would also mean that it should be available through EL in JSTL as ${solrDocumentList.numFound} (which in turn calls getNumFound() on the SolrDocumentList object). The code in question was as simple as that one expression.
It resulted in this error message, which came as a bit of a surprise:
java.lang.NumberFormatException: For input string: "numFound"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:447)
    at java.lang.Integer.parseInt(Integer.java:497)
After digging around a bit and reading the error message yet again, it suddenly hit me: solrDocumentList was being interpreted as, and cast to, a List, and as such EL expected an index into the List instead of the name of a property to call a getter for. I’ve not been working with JSTL for too long, so I thought a bit about how to solve this. One solution would be to do the calls in the Action and then just map them to separate variables in the template, but this wasn’t really as pretty as it could be. Instead I wrote a simple wrapper around the SolrDocumentList, which is not a list in itself, but exposes all the elements through its getDocumentList method. That way we can access it in the template by calling ${solrDocumentList.documentList…}.
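The behaviour described above can be imitated in plain Java. The sketch below is a deliberately simplified model of how a property lookup on a List differs from one on an ordinary bean; it is not the real EL implementation, and the class names (CountedList, Holder) are hypothetical stand-ins rather than Solrj classes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of EL property resolution, for illustration only.
final class ElListLookupSketch {

    /** Hypothetical stand-in for SolrDocumentList: a List that also has a getter. */
    public static class CountedList extends ArrayList<String> {
        public long getNumFound() {
            return 42L;
        }
    }

    /** Hypothetical holder that is not a List itself, so its getters stay reachable. */
    public static class Holder {
        private final CountedList documents;

        public Holder(CountedList documents) {
            this.documents = documents;
        }

        public long getNumFound() {
            return documents.getNumFound();
        }
    }

    /** Resolves base.property roughly the way the error message suggests EL does. */
    static Object resolve(Object base, String property) {
        if (base instanceof List) {
            // Lists are indexed, so the property name is parsed as an integer.
            int index = Integer.parseInt(property); // NumberFormatException for "numFound"
            return ((List<?>) base).get(index);
        }
        // Anything else is treated as a bean: find and call the matching public getter.
        String getter = "get" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        try {
            return base.getClass().getMethod(getter).invoke(base);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No readable property: " + property, e);
        }
    }

    public static void main(String[] args) {
        CountedList docs = new CountedList();
        System.out.println(resolve(new Holder(docs), "numFound"));
        try {
            resolve(docs, "numFound");
        } catch (NumberFormatException e) {
            System.out.println("List lookup failed: " + e.getMessage());
        }
    }
}
```

The getter on the list itself is never considered once the object is recognised as a List, which is why moving it to a non-List holder makes it reachable again.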
I’ve included the simple, simple wrapper here. It should be expanded with access to Facet fields etc, but this should be a simple indicator of my suggested solution.
import org.apache.solr.common.SolrDocumentList;
public class SolrSearchResult
{
SolrDocumentList resultDocuments = null;
public SolrSearchResult(SolrDocumentList results)
{
this.resultDocuments = results;
}
public long getNumFound()
{
return this.resultDocuments.getNumFound();
}
public long getStart()
{
return this.resultDocuments.getStart();
}
public float getMaxScore()
{
return this.resultDocuments.getMaxScore();
}
public SolrDocumentList getDocumentList()
{
return this.resultDocuments;
}
public void setDocumentList(SolrDocumentList results)
{
this.resultDocuments = results;
}
}
Any comments and updates are of course as always welcome.
Writing a Custom Validator for Zend_Form_Element
My good friend Christer has written a simple tutorial on how to write a custom validator for a Zend_Form_Element. If you’ve ever laid your hands on Zend_Form, you’ll want to have a look at this for a short and concise introduction to the topic. He’ll show you how to create a “repeat the password”-field by creating a custom validator and hooking it onto the original password field. Neat stuff.