cache – Mats Lindh

I’ve spent some time setting up varnish for a service at work where we deliver JSON-fragments from different resources as a unified document. We build the resulting JSON document by including the different JSON fragments through an ESI (Edge Side Includes) statement in the response. Varnish then fetches these fragments from the different services, allowing for independent caching duration for each of the services, combining then into a complete document.

The main “problem” is that of Varnish 2.0.4, varnish does a check to see if the content returned from the server actually is HTML-based content. It does this by check if the first character of the returned document is ‘<'. The goal is to avoid including binary data directly into a text document. Since we’re working with JSON instead of regular HTML/XML, the first character in our responses is not a ‘<', and the ESI statement did not get evaluated. You can however disable this check in varnishd by changing the esi_syntax configuration setting - just add the preferred value to the arguments when starting varnish. To disable the content check, add:

varnishd … -p esi_syntax=0x1

I’m not sure if you can set this for just one handler / match in varnish as this solved our issue, but feel free to leave a comment if you have any experience with this.

If you’re still looking for a good reason to spend a few minutes tuning your SOLR caches (documentCache, filterCache and queryResultCache), I’ll give you two numbers:

avgTimePerRequest : 126.148822
avgTimePerRequest : 70.026436

The first is with the default cache settings, the latter is with a very small change. Yep. That’s a 45% speed increase. So, the interesting question is what Iactually changed in the cache configuration – although I should warn you, the answer is very, very, very complicated:

The cache size. The default size (at least for our current 1.3 installation) is to keep 512 elements in the cache. When someone on the solr-user list asked for an introduction to what the different cache statistics meant, I remembered that I hadn’t actually tweaked the settings at all. The SOLR server has been running for a year now, so we now have a quite good idea of how it will perform and what kind of queries we are seeing. The stats indicated that a lot more cached entries got evicted than what I were hoping to see, and this gave us a lower cache hit rate (about 50%).

The simple change was to increase the size of the cache (from 512 to 16384), so that we’re able to keep more documents in memory before evicting them. After running 24 hours with the new setup we’re now seeing cache hits as 99%, 68% and 67%. The relevant sections of the solrconfig.xml file are:

The document cache fills about 4 times as fast as the filter cache, so we might have to tweak the settings further by suiting it even better to our load pattern.

So what now?

The next step would be to try to change to the FastLRUCache which is included with Solr 1.4 (currently in SVN and nightlies). If my memory serves me right the changes are mostly related to locking, so I’m not sure if we’ll see any significant better performance.

We’ll also make further adjustments to the size of each of the caches to better match our usage.

Tag: cache

Varnish: No ESI processing, first char not ‘<‘

How To Make Solr Go 45% Faster

So what now?