svn: OPTIONS of ‘‘: SSL handshake failed: SSL error: A TLS warning alert has been received. ()

November 9th, 2012

If your svn client suddenly starts complaining about something similar to

svn: OPTIONS of '...': SSL handshake failed: SSL error: A TLS warning alert has been received. (...)

The reason might be that the host in the URL (https://example.com/ => example.com) doesn’t match the ServerName setting in the SSL host for your web server. You might not have configured this, so for Apache add:

ServerName example.com

.. and restart the server. It might just work again!

Solr: Replication not starting?

July 20th, 2011

After upgrading our Solr-servers from 1.4.1 to 4.0-trunk (to be sure we were ready for the next version), I had trouble with getting replication to start again. It worked perfectly back with 1.4.1, but after upgrading to 4.0-trunk, it simply wouldn’t start.

I had to upgrade the machines individually (to allow the current index to continue serve requests), I removed the replication and then directed all the traffic to the slave. After updating the master (which worked after actually remembering to clean out the old webapps from Tomcat and adding a few new settings) and reindexing, most of the traffic were directed to it, and the slave were upgraded to the new Solr-version. I turned on replication again, updated the configuration file with the needed settings and started the slave. Nothing happened. Weird.

Time to debug!

On any slaves there’s a “replication.properties” file in the data directory ($SOLRHOME/data) which contain information about the current replication status. This file were created, indicating that at least the replication was attempting to run. If you open the file in a text editor (or just cat it), you should be able to read a bit of meta information about the replication state.

replicationFailedAtList=1311072270004,1311072240006..
timesFailed=11

Seems like it’s trying, but for some reason it doesn’t work. First thing to check would be to grep for replication in the log on both the master and the slave, and see if there’s any requests being made at all. There might be, but the replication still doesn’t start.

Try fetching the current state yourself to see what response the master is serving. You can do this by using “GET” or “wget” or “curl” to make an HTTP request to the master Solr-server from the slave together with the URL from “masterUrl” in the requestHandler for /replication from solrconfig.xml:

  1. GET http://example.com/solr/replication?command=indexversion

This should respond with something close to:

  1. <?xml version="1.0" encoding="UTF-8"?>
  2. <response>
  3.   <lst name="responseHeader">
  4.     <int name="status">0</int>
  5.     <int name="QTime">0</int>
  6.   </lst>
  7.   <long name="indexversion">1310994445934</long>
  8.   <long name="generation">2</long>
  9. </response>

If “indexversion” is 0, this means that the master hasn’t triggered a replication yet, which may seem weird if you’ve just started the server and the slave doesn’t have any data at all.

The reason might be that the master has not been instructed to actually trigger a replication event (and unless a replication event has been triggered, the indexversion will be 0):

  1. <requestHandler name="/replication" class="solr.ReplicationHandler">
  2.   <lst name="master">
  3.     <str name="replicateAfter">commit</str>
  4.     <str name="replicateAfter">startup</str>
  5.     <str name="replicateAfter">optimize</str>

If you only have “commit” in the above list, a replication event will not be triggered unless you’ve actually performed a commit after the slave has connected for the first time. If you add “startup”, the replication will also be triggered when the master starts up (so that any connecting slaves will start replicating right away).

To fix the issue without restarting any nodes, issue a single commit to the master and watch as the slaves start replicating. To issue a commit through curl:

  1. curl http://example.com/solr/update -H "Content-Type: text/xml" –data-binary '<commit />'

nginx and rewriting based on GET-parameter (URL-parameters/arguments)

July 12th, 2011

When rewriting URLs in Apache through mod_rewrite, you have the possibility of using RewriteCond to only apply rewrites if the original resource has been called with a particular argument in the URL (such as “/file?oid=..”).

The solution in nginx was however a bit different, but thanks to Rewriting URL-params in nginx I got on the right track from the start.

In nginx this information is available through the $args variable, which will contain the complete query string. In Will’s example above he’ll replace the query string, but I were interested in inserting a specific parameter instead (and include the previous query string, so I couldn’t just do the “set $args ..” that he does in the example).

My first try was to simply use $1 in the rewrite destination, but this didn’t work – as rewrite will reset the captured patterns from the previous regular expression (since the rewrite source also is a regular expression). But by introducing my own, temporary variable I were able to save the value from the matching regular expression (for the GET parameter) and use it in my rewrite destination.

The following example shows how I ended up solving the issue. This will rewrite the URL only if the “oid” parameter is found at the beginning of the query string when the URL is requested, and the location = /oldURL limits the rewrite to requests for the old resource.

location = /oldURL {
    if ($args ~ "^oid=(\d+)") {
        set $key1 $1;
        rewrite ^.*$  /newURL?param1=foo&param2=bar&key1=$key1 last;
    }
}

This will rewrite a request for /oldURL?oid=123&what=cheese to /newURL?param1=foo&param2=bar&key1=123&oid=123&what=cheese — if you want to exclude the previous arguments, you can either just set $args directly to key1=$1 and just use param1=foo and param2=bar in the rewrite destination:

        set $args key1=$1;
        rewrite ^.*$  /newURL?param1=foo&param2=bar last;

This might be cleaner, depending on what you’re trying to do.

mod_jk and Internal Server Error (HTTP 500)

January 4th, 2011

We’ve extended our previously single Solr-node to a few nodes in a cluster. This allows us to run queries against one node while updating or configuring another, distributing the load across several servers (although we’re not there yet load wise) and being able to handle any out of memory or other critical errors.

While Solr supports querying several cores or distributing the queries internally, we decided to move the load balancing and handling of failed nodes higher up in the hierarchy. We’re now doing simple load balancing and handling of failed nodes by using mod_jk in our existing Apache-based environment. mod_jk also handles failed servers without any administrator interaction. We were already using mod_jk for our main web frontend, and since we use Tomcat as our application container for Solr, things should be a breeze!

Well, no. After copying our existing mod_jk setup, configuring our new workers and restarting Apache, all I got was the well known 500 INTERNAL SERVER ERROR. Here’s the worker configuration file:

worker.list=loadbalancer,status

worker.solr1.port=8009
worker.solr1.host=10.0.0.4
worker.solr1.type=ajp13
worker.solr1.lbfactor=1
worker.solr1.cachesize=10

worker.solr2.port=8009
worker.solr2.host=10.0.0.5
worker.solr2.type=ajp13
worker.solr2.lbfactor=4
worker.solr2.cachesize=10

worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=solr1,solr2
worker.loadbalancer.sticky_session=0

worker.status.type=status

This provides us with two solr servers and one status worker (the status worker is responsible for providing a simple web interface for enabling/disabling/seeing the status of the other workers), configured with a 1:4 load balancing (the second server has quite a bit more memory available for Solr).

I provided the configuration of the workers through the JkWorkersFile configuration setting (in a VirtualHost block… don’t do that):

JkWorkersFile conf/workers.properties

I’d also enable debug logging to attempt to find the problem (still in a VirtualHost block):

JkLogFile logs/mod_jk.log
JkLogLevel debug
JkLogStampFormat "[%a %b %d %H:%M:%S %Y]"

Other mod_jk settings (in the VirtualHost block) were:

JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
JkRequestLogFormat "%w %V %T"
JkShmFile logs/jk.shm
JkMount /* loadbalancer

<Location /jkstatus>
	JkMount status
	Order deny,allow
        Deny from all
        Allow from 127.0.0.1
</Location>

Still no solution. Peeking at the log files mod_jk provided, I were able to deduce the following:

[debug] map_uri_to_worker::jk_uri_worker_map.c (525): Attempting to map context URI '/jkstatus'
[debug] map_uri_to_worker::jk_uri_worker_map.c (550): Found an exact match status -> /jkstatus
[debug] jk_handler::mod_jk.c (1920): Into handler jakarta-servlet worker=status r->proxyreq=0
[debug] wc_get_worker_for_name::jk_worker.c (111): did not find a worker status
[info]  jk_handler::mod_jk.c (2071): Could not find a worker for worker name=status

This indicates that mod_jk was unable to find a worker matching the name I provided in the JkMount statement above; status. Weird. I added some garbage characters to the “JkWorkersFile” setting, and Apache complained that it were unable to find the workers file. Changed it back, reloaded, and still nothing. It was apparently unable to find the worker. The map did however work, as it tried to launch a worker.

Looking back at the start up sequence of mod_jk, the following were found in the log:

[debug] build_worker_map::jk_worker.c (236): creating worker ajp13
[debug] wc_create_worker::jk_worker.c (141): about to create instance ajp13 of ajp13
[debug] wc_create_worker::jk_worker.c (154): about to validate and init ajp13
[debug] ajp_validate::jk_ajp_common.c (1922): worker ajp13 contact is 'localhost:8009'
[debug] ajp_init::jk_ajp_common.c (2047): setting endpoint options:
[debug] ajp_init::jk_ajp_common.c (2050): keepalive:        0
[debug] ajp_init::jk_ajp_common.c (2054): timeout:          -1
[debug] ajp_init::jk_ajp_common.c (2058): buffer size:      0
ajp_init::jk_ajp_common.c (2062): pool timeout:     0
[debug] ajp_init::jk_ajp_common.c (2066): connect timeout:  0
[debug] ajp_init::jk_ajp_common.c (2070): reply timeout:    0
[debug] ajp_init::jk_ajp_common.c (2074): prepost timeout:  0
[debug] ajp_init::jk_ajp_common.c (2078): recovery options: 0
[debug] ajp_init::jk_ajp_common.c (2082): retries:          2
[debug] ajp_init::jk_ajp_common.c (2086): max packet size:  8192
[debug] ajp_create_endpoint_cache::jk_ajp_common.c (1959): setting connection pool size to 1 with min 0

It took a bit of time, but I realized that this tells me that mod_jk created _a default_ worker named ajp13. Apparently it was not reading my worker file at all, but it still complained if I changed the file name. You’d think that the setting which loads the configuration file would work when it complains when it doesn’t. But .. well. After an hour of attempting to find out why the workers didn’t load, revising the workers file to a minimal example, trying with just a single status worker, I concluded that my workers file was correct, and obviously mod_jk found it when it attempted to load it.

Then I suddenly discovered the small notice in the mod_jk configuration manual:

JkWorkersFile: This directive is only allowed once. It must be put into the global part of the configuration.

JkWorkersFile can not be defined in a <VirtualHost> section. It will NOT complain if you do it, it’ll just never define any workers. It will complain if the file doesn’t exist, even if it never tries to actually load it.

Confusing.

Moving the JkWorkersFile statement out from the <VirtualHost> block and to the LoadModule statement instead solved the issue. This is also the case for JkWorkerProperty.

Solr, Tomcat and HTTP/1.1 505 HTTP Version Not Supported

May 19th, 2010

During today’s hacking aboot I came across the above error from our Solr query library. The error indicates that some part of Tomcat was unable to parse the “GET / HTTP/1.1″ string – where it is unable to determine the “HTTP/1.1″ part. A problem like this could be introduced by having a space in the query string (and it not being escaped properly), so that the request would have been for “GET /?a=b c HTTP/1.1″. After running through both the working and non-working query through ngrep and wireshark, this did however not seem to be the problem. My spaces were properly escaped using plus signs (GET /?a=b+c HTTP/1.1).

There does however seem to be a problem (at least with our version of Tomcat – 6.0.20) which results in the +-s being resolved before the request is handed off to the code that attempts to parse the header, so even though it is properly escaped using “+”, it still barfs.

The solution:

Use %20 to escape spaces instead of + signs; simply adding str_replace(” “, “%20″, ..); in our query layer solved the problem.

SOLR: java.io.FileNotFoundException: no segments* file found

January 10th, 2010

While playing around with one of my development SOLR installations (this time under Windows), I suddenly got a weird error message when feeding data to one of the fresh cores.


SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.SimpleFSDirectory@C:\temp\solr\*\data\index: files:

Taking a look at the contents of the index\ directory, it was in fact empty. Seems weird, but my initial guess was that Lucene / SOLR would treat this as a new installation and create the files.

Turns out the issue is that it won’t – as long as the index directory exists, Lucene / SOLR goes looking for the segment files.

Thanks to an old post to the solr-dev list by Yonik, the easiest fix is to simply delete the index directory and restart your applet container (Tomcat in this case).

Porting SOLR Token Filter from Lucene 2.4.x to Lucene 2.9.x

September 25th, 2009

I had trouble getting our current token filter to work after recompiling with the nightly builds of SOLR, which seemed to stem from the recently adopted upgrade to 2.9.0 of Lucene (not released yet, but SOLR nightly is bleeding edge!). There’s functionality added for backwards compability, and while that might have worked, things didn’t really come together as it should (somewhere or the other). So I decided to port our filter over to the new model, where incrementToken() is the New Way ™ of doing stuff. Helped by the current lowercase filter in the SVN trunk of Lucene, I made it all the way through.

Our old code:

  1.     public NorwegianNameFilter(TokenStream input)
  2.     {
  3.         super(input);
  4.     }
  5.  
  6.     public Token next() throws IOException
  7.     {
  8.         return parseToken(this.input.next());
  9.     }
  10.  
  11.     public Token next(Token result) throws IOException
  12.     {
  13.         return parseToken(this.input.next());
  14.     }

Compiling this with Lucene 2.9.0 gave me a new warning:

Note: .. uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

To the internet mobile!

Turns out next() and next(Token) has been deprecated in the new TokenStream implementation, and the New True Way is to use the incrementToken() method instead.

Our new code:

  1.     private TermAttribute termAtt;
  2.  
  3.     public NorwegianNameFilter(TokenStream input)
  4.     {
  5.         super(input);
  6.         termAtt = (TermAttribute) addAttribute(TermAttribute.class);
  7.     }
  8.  
  9.     public boolean incrementToken() throws IOException
  10.     {
  11.         if (this.input.incrementToken())
  12.         {
  13.             termAtt.setTermLength(this.parseBuffer(termAtt.termBuffer(), termAtt.termLength()));
  14.             return true;
  15.         }
  16.        
  17.         return false;
  18.     }

A few gotcha’s along the way: incrementToken needs to be called on the input token string, not on the filter (super.incrementToken() will give you a stack overflow). This moves the token stream one step forward. We also decided to move the buffer handling into the parse token function to handle this, and remember to include the length of the “live” part of the buffer (the buffer will be larger, but only the content up to termLength will be valid).

The return value from our parseBuffer function is the actual amount of usable data in the buffer after we’ve had our way with it. The concept is to modify the buffer in place, so that we avoid allocating or deallocating memory.

Hopefully this will help other people with the same problem!

Transparent Remapping of / For Struts Actions

October 17th, 2008

I’ve been trying to find a solution to this issue for a couple of hours: We have several struts actions in our Java-based webapp, all neatly mapped through different .do action handlers. I wanted to switch our handling of the root / index URL (http://www.example.com/) from being a static redirect to the actual action to instead present the content right there, without the useless redirect. This apparently proved to be harder than I thought; after searching a lot through Google, reading through mailing lists and the official documentation, it seems that there is no way to specify an action to handle these requests by default in struts. There may be one, but I could not for the life of ${deity} find it.

As simply doing it in Struts were out of the question, I turned to an old friend of mine, the ever so helpful mod_rewrite. mod_rewrite is capable of rewriting URLs internally in Apache before they get handled at other levels. The problem was that mod_jk seemed to grab the request before the replacements were made, but a few resources pointed me in the correct direction:

After a bit of debugging with the rewritelog, everything came together. This is how it ended up:

       	RewriteEngine On
       	RewriteLog /tmp/rewrite.log
       	RewriteLogLevel 3
       	RewriteRule ^/?$ /destinationfile [PT,NE]

The RewriteLog say that we should log the rewrite progress to /tmp/rewrite.log, and RewriteLogLevel say that we should log at the most detailed format. Use 0, 1, 2 for less debugging information. Remember to comment out these lines when things work. DO NOT use the RewriteLog when not actually debugging the rewrites.

Using Apache httpd as Your Caching Solution

June 26th, 2008

In this article I’m going to describe a novel solution for making cached versions of dynamic content available, while attempting to strike a balance between flexibility, performance and the origin of dynamic content. This solution may not be suited for very dynamic content (where the updates are better triggered by rewriting the cached version when the content changes), but in those situations where the dynamic content may be built from a very large dataset on request from the users. I have two use cases detailing applications I’ve been involved in building where I have applied this strategy. This could also be implemented with a caching service in front of the main service, but will require the installation of a custom service and hardware etc. for that service.

The WMS Cache

WMS (Web Map Service) is an OGC (Open Geospatial Consortium) specification which details a common set of parameters for how to query a web service which returns a raster map image (a regular png/jpg/bmp file) for an area. The parameters include the bounding box (left,bottom,right,upper) and the layers (roads,rivers,etc) and the size of the resulting image. The usual approach is to add a caching layer in the WMS itself, so any generated image is simply stored to disk, and then checked if the disk exists before retrieve the data and rendering the image (and if it exists, just return the image data from disk instead). This will increase the rate of requests the WMS can answer and will take load off the server for the most common requests. We are still left with the overhead of parsing the request, checking for the cached file and most notably, loading our dynamic language of choice and responding to the request. An example of such a small and naive PHP application is included:

  1. <?php
  2. $parameters = $_GET;
  3. ksort($parameters);
  4. $cacheIdentifier = md5(serialize($parameters));
  5.  
  6. if (file_exists('cache/' . $cacheIdentifier))
  7. {
  8.     print(file_get_contents('cache/' . $cacheIdentifier));
  9.     exit();
  10. }
  11.  
  12. $data = '';
  13. /* Do stuff to generate data */
  14. file_put_contents('cache/' . $cacheIdentifier, $data);
  15. print($data);
  16. exit();

The next request which arrives with the identical set of GET-parameters, will be served with the overhead of loading PHP, parsing the PHP-script (which is less if you have APC or a similar cache installed), sorting the GET-parameters (so that bbox=..&x=.. is the same as x=..&bbox=..), serializing the response, checking that the file exists on disk (you could simplify this to just doing a read and checking if the read succeeded), copying the data from disk to memory and then outputting the data to the client (you could also use fpassthru() and friends which may be more optimized for simple reading and output of data, but that’s not the main point here).

To relate this to our use case of the WMS, we need to take a closer look at how map services are used today. Before Google showed the world what a good map solution could look like with modern web technology, a map application presented an image to the user, allowed the user to click or drag the image to zoom or move, and then reloaded the entire page to generate the new image. If it took 0.5s to generate the image, that were not really a problem, as the data set is not updated very often and it is very easy to do these operations in parallel across a cluster. When Google introduced Google Maps, they loaded 9 visible images (tiles) in the first image, and then started loading other tiles in the background (so that when you scroll the map, it looks like the images are already in place). If you run an interface similar to Google Maps against a regular WMS, most WMS servers would explode and take the whole 42U rack with them. Not a very desirable situation. The easy solution if you have an unlimited set of resources, disk space and money is to simply generate all the available tiles up front, in the same way as Google has done it. This will require disk space for all the tiles, and will not allow your users to choose which layers then want included in the map (this will change as map services are starting to build each layer as a separate tile and then superimposing them in the user interface).

The problem is that most of us (actually, n – 1) are not Google, but most of us do not build map services either. For those of us who do, we needed a way of living somewhere in between of having to render our complete dataset to image tiles up front or running everything through the WMS. While working with Gunnar Misund at Østfold University College, I designed a simple scheme to allow compatible clients to fetch cached tiles automagically, while those tiles which did not exist yet, were generated on the fly from the background WMS. The idea was to let Apache httpd handle the delivery of already generated and cached content, while our WMS could serve those areas which were viewed for the very first time (or where the layer selection were new). It would not be as fast as Google Maps for non-cached content, but it wouldn’t require us to run through our complete service to generate images either.

The solution was to let the javascript client request images through a custom URL:

http://example.com/300/400/10/59.205278/10.95/rivers,roads/image.jpg

(This is just an example, and does only contain the center point of the image). This is decomposed into:

http://example.com/x_width/y_height/zoomlevel/centerlat/centerlon/layers/image.fileformat

This is all good as long as image.jpg exists in the local path provided, so that Apache can just serve the image as it is from the location. Apache httpd (or lighttpd and other “serve files fast!”-httpds) are able to serve these static files in large numbers (it’s what they were written for, you know..) with a minimum overhead. The problem is what to do when the file actually does not exist, which will happen each time a resource is requested for the first time, and we do not have a cache yet. The solution lies in assigning a PHP-file as the handler for any 404 error (file not found). This is a well known trick used all over the field (such as handling www.php.net/functionname direct lookup). In PHP you can use $_SERVER['REQUEST_URI'] to get the complete path of the request that ended in the 404.

The .htaccess file of the application is as simple as cake:

ErrorDocument 404 /wms/handler.php

I’ve enclosed a simple specification which were written as a description of the implementation when the project was done in 2005.

Thumbnail generation

Generating thumbnails can also be transformed into the same problem set. In the case where you need several different sizes of thumbnails (and different rescales are needed for different applications), you can apply the same strategy. Instead of handing all the information to a resize script with the file name etc. as the argument, simply have the xsize and the ysize as part of the URL. If the file exists in the path, it’s served directly with no overhead, otherwise the 404 handler is invoked as in the previous example. The thumbnail can then generated, saved in the proper location and the world can continue to rotate at it’s regular pace.

This application can then be extended by adding new parameters in the url, such as the resize method, if the image should be stretched, zoomed and other options.

Conclusions

This is a very simple scheme that does not require any custom hardware or server software installed, and places itself neatly in between having a caching front end server between the client and the application and the hassle of generating the same file each and every time. It allows you to remove the overhead of invoking the script (PHP in this case) for each request, which means that you can serve files at a much greater rate and let your hardware do other, more interesting things instead.