Supporting 2-pass Parallel Encoding with x264 and ffmpeg

If you’re doing several encodes of a single input file in parallel with x264 (to produce several different size / bitrate combinations), you’re going to have a problem. The first pass creates three files with information about the input for the second pass, and you’re unable to change these file names into something better, so parallel encodes started from the same directory end up writing to the same files. This seems to be a problem for quite a lot of people according to a Google search for the issue, and no one seems to have a proper solution.

I have one. Well, probably not a proper solution, but at least it works! The trick is to realize that ffmpeg/x264 creates these files in the current working directory. To run several encodes in parallel, you simply give each encoding process its own directory, and then use absolute paths to the source and destination files (and any other paths). Let it create the files there, then clean up and delete the directories afterwards.

I’ve included some example PHP code showing how you could solve something like this. I simply use the output file name as the directory name here, and create the directory in the system temp directory.

$tempDir = sys_get_temp_dir() . '/' . $outputFilename;
mkdir($tempDir, 0700, true);
chdir($tempDir);

After doing the encode, we’ll have to clean up. The three files that ffmpeg/x264 creates are ffmpeg2pass-0.log, x264_2pass.log and x264_2pass.log.mbtree.

unlink($tempDir . '/ffmpeg2pass-0.log');
unlink($tempDir . '/x264_2pass.log');
unlink($tempDir . '/x264_2pass.log.mbtree');
rmdir($tempDir);

And that should hopefully solve it!

Patching The PHP Gearman Extension


Update: it seems that this behaviour in libgearman changed from 0.8 to 0.10, and according to Eric Day (eday), the behaviour will change back to the old one with 0.11.

After upgrading to the most recent versions of the Gearman extension for PHP and libgearman, Suhosin started complaining about a heap overwrite problem. The error only popped up on certain response sizes, which made me guess that it could be a buffer overrun or something strange going on in the code handling the response.

Seeing this as an excellent opportunity to get more familiar with the Gearman code, I dug into the whole shebang yesterday and continued my quest for cleansing today. After quite a few hours of getting to know the code and attempting to understand the general flow, I was finally able to find – and fix – the problem.

The first symptom of the issue was that the Gearman extension at certain times failed to return the complete response from the gearman server. I created a small application that returned responses of different sizes, showing that the problem was all over the place. While n worked, n+1 returned only n bytes, and n+2 resulted in a heap overflow.

The issue was caused by an invalid efree, where the code in question was:

void _php_task_free(gearman_task_st *task, void *context) {
	gearman_task_obj *obj= (gearman_task_obj *)context;
	TSRMLS_FETCH();

	if (obj->flags & GEARMAN_TASK_OBJ_DEAD) {
		GEARMAN_ZVAL_DONE(obj->zdata)
		GEARMAN_ZVAL_DONE(obj->zworkload)
		efree(obj);
	}
	else
		obj->flags&= ~GEARMAN_TASK_OBJ_CREATED;
}

This seems innocent enough, and I really had trouble seeing how this could lead to the observed behaviour. This meant going on a wild goose chase around the Gearman code, trying to piece together how things worked. And after a few proper debug rounds, I finally discovered the issue: under certain conditions, the context variable was not a gearman_task_obj struct. The gearman_task_obj struct is allocated by php_gearman and then assigned to the task in question. This makes it possible for the extension to attach an internal structure to the task in libgearman. Under certain conditions this struct is not created, and by default, libgearman assigns the client struct to the context instead (this is also available as task->client). So instead of the gearman_task_obj that was assumed to be present, we actually got a gearman_client struct.

That provides a reason why things went sour, but why exactly did I see the behaviour I saw? Well, to answer that, we’ll have to take a look at the actual contents of the structs. The client struct contains a value keeping the number of bytes in the response, while the task_obj struct keeps the flags (which is what the code above checks and updates). Coincidentally these two int values are aligned similarly in the two structs, resulting in the number of bytes in the response being used as the flags value. Depending on which bits are set, this value is then either modified or leads to a free using other offsets into the struct. The call to efree() will then use some random values (or, more specifically, the values that line up with the corresponding locations in task_obj) when attempting to do the free, resulting in a corruption. Suhosin caught it, while it would probably have generated a few weird bugs (where the last byte would go missing) under an unprotected PHP installation. +1 for Suhosin!
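To make the coincidence concrete, here is a minimal sketch of what happens when the free callback is handed a client struct. The two structs are simplified stand-ins I made up for illustration (the real ones have many more fields), and the flag values of 1 and 2 are an assumption, not copied from the extension's headers. The callback logic is applied to whatever int sits at the offset of flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed flag values, for illustration only. */
enum {
    TASK_OBJ_CREATED = 1 << 0,
    TASK_OBJ_DEAD    = 1 << 1
};

/* Hypothetical stand-in for gearman_task_obj. */
struct demo_task_obj {
    void *zdata;
    void *zworkload;
    int flags;
};

/* Hypothetical stand-in for the client struct. */
struct demo_client {
    void *server;
    void *buffer;
    int response_size; /* lands at the same offset as demo_task_obj.flags */
};

enum free_outcome { LEFT_ALONE, VALUE_CHANGED, WOULD_FREE };

/* Applies the logic of the free callback to the int stored at the
 * offset of flags inside whatever object context points to. */
enum free_outcome demo_task_free(void *context)
{
    unsigned char *slot =
        (unsigned char *)context + offsetof(struct demo_task_obj, flags);
    int flags;
    memcpy(&flags, slot, sizeof flags);

    if (flags & TASK_OBJ_DEAD)
        return WOULD_FREE; /* the real code would call efree() here */

    int cleared = flags & ~TASK_OBJ_CREATED;
    memcpy(slot, &cleared, sizeof cleared);
    return cleared == flags ? LEFT_ALONE : VALUE_CHANGED;
}

/* Passes a client struct with the given response size to the callback
 * and reports the response size afterwards. */
enum free_outcome demo_with_client(int response_size, int *size_after)
{
    struct demo_client client = { NULL, NULL, response_size };

    /* The whole demonstration relies on the two ints lining up. */
    assert(offsetof(struct demo_client, response_size) ==
           offsetof(struct demo_task_obj, flags));

    enum free_outcome outcome = demo_task_free(&client);
    *size_after = client.response_size;
    return outcome;
}
```

Under these assumed flag values, a response size of 8 is left alone, 9 has its lowest bit cleared and becomes 8, and 10 takes the free branch. That would line up with the n, n+1 and n+2 pattern described above.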

The patch for php_gearman.c is available, and should be applied against 0.6.0. Although I’ve had a few looks around, it might introduce a memory leak. People who know the code way better than I do will probably commit a better patch, and the issue will be fixed in 0.7.0 of the extension.
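For illustration, one possible way to guard against this class of bug is to check whether the context is actually the client before treating it as the extension's own struct. This is only a sketch of the idea and not necessarily what the patch does; the guard_* types and the function name are hypothetical stand-ins, not libgearman or php_gearman definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real structs. */
struct guard_client   { int response_size; };
struct guard_task     { struct guard_client *client; };
struct guard_task_obj { int flags; };

/* Returns the extension's object, or NULL when the context is missing
 * or is the default one (the client) rather than a task object. */
struct guard_task_obj *guard_context_as_task_obj(const struct guard_task *task,
                                                 void *context)
{
    assert(task != NULL);

    if (context == NULL || context == (void *)task->client)
        return NULL; /* nothing of ours to free or update */

    return (struct guard_task_obj *)context;
}
```

A free callback built on this would return early when it gets NULL back, and only touch the flags or free the object otherwise.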