SQLAlchemy, MySQL and UTF-8

While SQLAlchemy uses UTF-8 by default, the charset used when communicating with MySQL will affect the encoding of the returned data. To be sure that everything is handled properly as UTF-8 (which you might use SET NAMES 'utf8' in the console (don’t do that here..)), add ?charset=utf8 to your connection url:

mysql://user:password@localhost/database?charset=utf8

Thanks to RustyFluff at StackOverflow.

ERROR 1045 (28000): Access denied for user ‘debian-sys-maint’@’localhost’ (using password: YES)

Another error message from the attempt to move an existing MySQL installation to a new server, this time caused by the import of all the old users from the old server. debian (and Ubuntu) adds their own user to the server to be able to perform upgrades and other maintenance through automagical scripts. The password that debian (or Ubuntu) uses to connect to this user is stored in a file on the local file system: /etc/mysql/debian.cnf.

This file contains the current password for the debian-sys-maint user, and if you’re moving all your users to a new server, be sure to also get this password from the file on the old server and copy it into the file on the new server. I strongly suggest copying just the password and not the whole file, as there may be new settings added to the file since you last upgraded the old server.

ERROR 1577 (HY000) at line 1: Cannot proceed because system tables used by Event Scheduler were found damaged at server start

After moving a complete SQL dump from a MySQL 5.0 server to a MySQL 5.1 server I forgot about any issues stemming from the different versions of MySQL. Importing the old information_schema tables and any possible changes for the mysql table itself made it barf out the error message “ERROR 1577 (HY000) at line 1: Cannot proceed because system tables used by Event Scheduler were found damaged at server start” when trying to connect to the server.

The connection were still made, but obviously something were afoot. A few pages hinted in the direction of using mysql_upgrade to upgrade the now 5.0-compatible tables to the 5.1 format, but mysql_upgrade didn’t seem to want to run (maybe because of a new root password / debian-sys-maint password). The solution was to tell mysql_upgrade to ask for a password, which user it should connect as and which server it should connect to. You should run this command as root (preferrably through sudo) so that it can run mysql_fix_privilege_tables with the correct permissions:

sudo mysql_upgrade -u root -h localhost -p --verbose --force

The --force parameter will make the command run even if it thinks your server already has been upgraded to 5.1 (or the current version if you’re reading this at a later time). It may think this if you’ve already performed parts of the upgrade, but failed when trying to do the mysql_fix_privilege_tables-bit.
And away we went!

dpkg –configure Error for mysql-server-5.1

After starting the year by upgrading my Ubuntu desktop at work to 5.1 (and spending a couple of hours after borking my new kernel), my mysql-server had become lost in the transition.

Reinstalling the mysql-server package gave a cryptic error message:

Setting up mysql-server-5.1 (5.1.37-1ubuntu5) ...
Installing new version of config file /etc/init.d/mysql ...
 * Stopping MySQL database server mysqld                                 [ OK ] 
Warning: found usr.sbin.mysqld in /etc/apparmor.d/force-complain, forcing complain mode
 * Starting MySQL database server mysqld                                 [fail] 
invoke-rc.d: initscript mysql, action "start" failed.
dpkg: error processing mysql-server-5.1 (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of mysql-server:
 mysql-server depends on mysql-server-5.1; however:
  Package mysql-server-5.1 is not configured yet.
dpkg: error processing mysql-server (--configure):
 dependency problems - leaving unconfigured
Processing triggers for menu ...
No apport report written because the error message indicates its a followup error from a previous failure.
                          Errors were encountered while processing:
 mysql-server-5.1
 mysql-server
E: Sub-process /usr/bin/dpkg returned an error code (1)

Quite weird, yes. Trying to find anything in the mysql-logs didn’t yield anything either (both were empty), but luckily the syslog provided an answer this time. Among loads of other messages regarding mysql the answer hid from general view:

Jan  7 14:30:48 ubuntu mysqld_safe[20249]: 100107 14:30:48 [ERROR] /usr/sbin/mysqld: unknown option '--skip-bdb'
Jan  7 14:30:48 ubuntu mysqld_safe[20249]: 100107 14:30:48 [ERROR] Aborting

Taking a look at the mysql config file in /etc/mysql/my.cnf confirmed this:

#
# * BerkeleyDB
#
# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.
skip-bdb

Apparently not just the support ceased, but also the skip-bdb parameter. That seems a bit odd – I’d suggest at least keeping the parameter around for another release cycle, but anyhow:

#
# * BerkeleyDB
#
# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.
# skip-bdb

And suddenly everything went in the right general direction!

Ready for 2010: Check Your Indexes

One of the many things you should try to keep a continuous watch for during the life of any of your applications are the performance of your SQL queries. You might be caching the hell out of your database layer, but some time you’ll have to hit the database server to retrieve data. And if that starts to happen often enough while you’re growing, you will see your SQL daemon taking up the largest part of your disk io and your CPU time. This might not be a problem for the load you’re seeing now, but could you handle a 10 fold increase in traffic? .. or how about 100x? (which, if I remember correctly, is what Google uses as the scale factor when developing applications)

Indexes Are Your Friend

During the christmas holiday I got around to taking a look at some of the queries running at one of my longest living, most active sites: pwned.no. Pwned is a tournament engine running on top of PHP and MySQL, containing about 40.000 tournaments, 450.000 matches and several other database structures. The site has performed well over the years and there hasn’t been any performance issues other than a few attempts at DoS-ing the site with TCP open requests (the last one during the holiday, actually).

Two weeks ago the server suddenly showed loads well above 30 – while it usually hovers around 0.3 – 0.4 at the busiest hours of the day. The reason? One of the previously less used functions of the tournament engine, using a group stage in your tournament, had suddenly become popular in at least one high traffic tournament. This part of the code had never been used much before, but when the traffic spike happened everything went bananas (B-A-N-A-N-A-S. Now that’s stuck in your head. No problem.) The reason: the query used a couple of columns in a WHERE-statement that wasn’t indexed, and the query ran against the table containing the matches for the tournament. This meant that over 400.000 rows were scanned each time the query ran, meaning that mysqld started hogging every resource it could. The Apache childs then had to wait, making the load a bit too much for my liking. Two CREATE INDEX-calls later the load went back down and everything chugged along nicely again.

My strategy for discovering queries that might need a better index scheme (or if “impossible”, a proper caching layer in front of it):

  1. Run your development server with slow-query-log=1, log-queries-not-using-indexes=1 and long-query-time=<an appropriately low value, such as 0.05 – depends on your setup>. You can also provide a log file name with the log-slow-queries=/var/log/mysql/… in your my.cnf-file for MySQL. This will log all potential queries for optimizing to the log file (this will not necessarily provide you with a complete list of good queries to optimize, but it might provide a few good hints). Be sure to use actual data from your site when working on your development version, as you might start seeing issues when the size of the data set reaches a certain size – such as 400.000 rows in the example mentioned above)
  2. Connect to your MySQL server and issue
    SHOW PROCESSLIST

    and

    SHOW FULL PROCESSLIST

    statements every now and then. This will let you see any queries that run often and way too long (but they’ll have to run when you issue the command). You might not catch the real culprit, but if you’re seing MySQL chugging along with 100% CPU and are wondering what’s happening, try to check out what the threads are doing. You’ll hopefully see just which query is wreaking havoc with your server.

  3. Add a statistics layer in front of your MySQL calls in your application. If you’re using PDO you can subclass it to keep a bit of statistics about your queries around. The number of times each query is run, the time it took in total running the query and other interesting values. We’re using a version of this in the development version of Gamer.no and I’ll probably upload the class to my github repository as soon as I get a bit of free time in the new year.

Not sure what I’ll take a closer look at tomorrow, but hopefully I’ll decide before everything collapses!

What are your strategy for indexes? What methods do you use for finding queries that need a bit more love? Leave a comment below!

Read all the articles in the Ready for 2010-series

Fatal error: Undefined class constant ‘ATTR_DEFAULT_FETCH_MODE’

This is one of the common error messages that seems to appear after installing PHP – in particular under Ubuntu or Debian (where I experienced it). The reason for this is that the PDO version you’ve just installed is too old for the constant to exist, usually because you where naive enough to install the extension from PECL instead of using the default supplied by ubuntu. If you did the same as me:

apt-get install 
pecl install pdo
pecl install pdo_mysql

/etc/init.d/apache2 restart

And everything seems to work, except for that missing constant. What the fsck?!

The reason is that the PECL version of PDO is no longer maintained (I’d suggest to automagically push the newest version to PECL too, just so pecl install and pecl upgrade works as expected). The fact is that when you did pecl install pdo, you destroyed the more recent version provided by the default php5-common package (under Ubuntu, at least).

To solve the problem (and if you also did pecl install pdo_mysql, you’ll have to replace that one too…)

apt-get install --reinstall php5-common php5-mysql php5-mysqli

Restart. Relive. Re.. eh .. yeah. Rerere.

If you’re building from source, you’ll need to add:

--enable-pdo --with-pdo-mysql --with-pdo-pgsql (and any other driver you need)

PDO and PDO::PARAM_INT

Hi there Mr. PDO!

We’ve come to know each other, and yes, while you have your troubles (.. which I don’t, of course), I’ve accepted your short comings. Today you threw another one of your fits, but I’ll be sure to document it for the world to see.

$statement = $pdo->prepare("
    ...
    LIMIT
        :offset, :hits
");

Yep. This will of course fail if you’re binding strings. ’10’, ’10’ is not very helpful now, is it. Good point. So let’s tell PDO that we’re really binding ints:

$statement->bindValue(':offset', $offset, PDO::PARAM_INT);
$statement->bindValue(':hits', $hits, PDO::PARAM_INT);

But wait. You’re still complaining?! I told you they were ints?! What’s the problem now?!?!

Well. Mr. PDO requires you to also convert the values for him. So first you have to convert the values of a loosely typed language to a strong type, then you have to tell the library that yes, this is in fact another type than what the library obviously assumes that it is. This works:

$statement->bindValue(':offset', (int) $offset, PDO::PARAM_INT);
$statement->bindValue(':hits', (int) $hits, PDO::PARAM_INT);

Which means the following:

If the type of your variable internally is a string, it’ll be escaped as a string, even if you tell PDO that it should be handled as an INT in your database layer.

If the type of your variable is an int, it’ll be handled as a string, unless you tell PDO it is an int.

Something is backwards here.

A Quick INSERT INTO Trick

From time to time you’re going to need to move some data from one table into another, in particular to generalize or specialize code. I’ve seen amazingly large code blocks written to handle simple cases as this, and here’s a simple trick you’re going to need some day:

INSERT INTO 
    
    (field_1, field_2) 
SELECT
    some_field, some_other_field 
FROM 
    

You can also add conditionals, i.e. to retrieve rows that might have been inserted yesterday, etc:

INSERT INTO 
    
(field_1, field_2) SELECT some_field, some_other_field FROM WHERE condition = 1

The good thing about this is that all data movement is kept within the database server, so that the data doesn’t have to travel from the server, to the client and then back from the client to the server again. It’s blazingly fast compared to other methods. You can also do transformations with regular SQL functions in your SELECT statement, which should help you do very simple operations at the speed of light.

Rounding Up The Remaining Database Posts

To finally be able to close my now-ready-to-be-archived Firefox-window, I’m rounding up the three other posts I were going to post about in one single batch here:

Ulf Wendel has a post up about PDO_MYSQLND: The new features of PDO_MYSQL in PHP 5.3. Besides being yet another introduction to how MYSQLND differs from the regular libmysqlclient, Ulf writes in detail about how mysqlnd brings other speedups to PDO in general, by allowing the drivers to return zvals directly. This allows the driver to return data without requiring an explicit copy by the overlying architecture. Interesting stuff and well worth a read for anyone, regardless if you actually know what a zval is.

Nicklas Westerlund has a post about MySQL Back to Basics: Analyze, Check, Optimize, and Repair on the pythian blog, featuring a overview of the useful – and abused – methods of rescuing and keeping your data intact. Do regular and good backups. It’s as easy as that. This might however help when you’re in a hurry or needs to fix a corruption that has occured. Read it.

The last item on today’s list is Sphinx – a free open-source full-text search engine. Sphinx uses it’s own indexing and retrieval system, while Solr is built on top of Lucene. Haven’t had much time to play with it yet, but it’s worth checking out. A native PHP module has also popped up (and that’s where I read about it just now), so if you need a fast and native PHP interface to a full search engine without blowing the big bucks, this may be what you’re looking for.

Drizzle – Making MySQL Leaner and Meaner

As noted I recently went for a total of five days of vacation. In the meantime, the intarwebs of blogospheres and the seas of web 2.0 exploded with posts and discussion over Drizzle. Drizzle is a fork of MySQL with the intention of making things more suitable for cloud computing and the regular web use cases. I’m actually quite intrigued by this, and I really look forward to getting some more time to read up on the issues at hand. It will be interesting to see how things compare to CouchDB, Amazon S3, Solr and other service that take a different road than regular relationial databases. Interesting. Anyways, here’s the posts I’d suggest checking out to get better and more usable information about Drizzle:

I’ll take it for a testdrive as soon as the first public version becomes available, and I’m looking forward to it. Might be fun!