SQLAlchemy, MySQL and UTF-8

While SQLAlchemy uses UTF-8 by default, the charset used when communicating with MySQL will affect the encoding of the returned data. To be sure that everything is handled properly as UTF-8 (which you might use SET NAMES 'utf8' in the console (don’t do that here..)), add ?charset=utf8 to your connection url:

mysql://user:password@localhost/database?charset=utf8

Thanks to RustyFluff at StackOverflow.

ERROR 1045 (28000): Access denied for user ‘debian-sys-maint’@’localhost’ (using password: YES)

Another error message from the attempt to move an existing MySQL installation to a new server, this time caused by the import of all the old users from the old server. debian (and Ubuntu) adds their own user to the server to be able to perform upgrades and other maintenance through automagical scripts. The password that debian (or Ubuntu) uses to connect to this user is stored in a file on the local file system: /etc/mysql/debian.cnf.

This file contains the current password for the debian-sys-maint user, and if you’re moving all your users to a new server, be sure to also get this password from the file on the old server and copy it into the file on the new server. I strongly suggest copying just the password and not the whole file, as there may be new settings added to the file since you last upgraded the old server.

ERROR 1577 (HY000) at line 1: Cannot proceed because system tables used by Event Scheduler were found damaged at server start

After moving a complete SQL dump from a MySQL 5.0 server to a MySQL 5.1 server I forgot about any issues stemming from the different versions of MySQL. Importing the old information_schema tables and any possible changes for the mysql table itself made it barf out the error message “ERROR 1577 (HY000) at line 1: Cannot proceed because system tables used by Event Scheduler were found damaged at server start” when trying to connect to the server.

The connection were still made, but obviously something were afoot. A few pages hinted in the direction of using mysql_upgrade to upgrade the now 5.0-compatible tables to the 5.1 format, but mysql_upgrade didn’t seem to want to run (maybe because of a new root password / debian-sys-maint password). The solution was to tell mysql_upgrade to ask for a password, which user it should connect as and which server it should connect to. You should run this command as root (preferrably through sudo) so that it can run mysql_fix_privilege_tables with the correct permissions:

sudo mysql_upgrade -u root -h localhost -p --verbose --force

The --force parameter will make the command run even if it thinks your server already has been upgraded to 5.1 (or the current version if you’re reading this at a later time). It may think this if you’ve already performed parts of the upgrade, but failed when trying to do the mysql_fix_privilege_tables-bit.
And away we went!

dpkg –configure Error for mysql-server-5.1

After starting the year by upgrading my Ubuntu desktop at work to 5.1 (and spending a couple of hours after borking my new kernel), my mysql-server had become lost in the transition.

Reinstalling the mysql-server package gave a cryptic error message:

Setting up mysql-server-5.1 (5.1.37-1ubuntu5) ...
Installing new version of config file /etc/init.d/mysql ...
 * Stopping MySQL database server mysqld                                 [ OK ] 
Warning: found usr.sbin.mysqld in /etc/apparmor.d/force-complain, forcing complain mode
 * Starting MySQL database server mysqld                                 [fail] 
invoke-rc.d: initscript mysql, action "start" failed.
dpkg: error processing mysql-server-5.1 (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of mysql-server:
 mysql-server depends on mysql-server-5.1; however:
  Package mysql-server-5.1 is not configured yet.
dpkg: error processing mysql-server (--configure):
 dependency problems - leaving unconfigured
Processing triggers for menu ...
No apport report written because the error message indicates its a followup error from a previous failure.
                          Errors were encountered while processing:
 mysql-server-5.1
 mysql-server
E: Sub-process /usr/bin/dpkg returned an error code (1)

Quite weird, yes. Trying to find anything in the mysql-logs didn’t yield anything either (both were empty), but luckily the syslog provided an answer this time. Among loads of other messages regarding mysql the answer hid from general view:

Jan  7 14:30:48 ubuntu mysqld_safe[20249]: 100107 14:30:48 [ERROR] /usr/sbin/mysqld: unknown option '--skip-bdb'
Jan  7 14:30:48 ubuntu mysqld_safe[20249]: 100107 14:30:48 [ERROR] Aborting

Taking a look at the mysql config file in /etc/mysql/my.cnf confirmed this:

#
# * BerkeleyDB
#
# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.
skip-bdb

Apparently not just the support ceased, but also the skip-bdb parameter. That seems a bit odd – I’d suggest at least keeping the parameter around for another release cycle, but anyhow:

#
# * BerkeleyDB
#
# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.
# skip-bdb

And suddenly everything went in the right general direction!

Ready for 2010: Check Your Indexes

One of the many things you should try to keep a continuous watch for during the life of any of your applications are the performance of your SQL queries. You might be caching the hell out of your database layer, but some time you’ll have to hit the database server to retrieve data. And if that starts to happen often enough while you’re growing, you will see your SQL daemon taking up the largest part of your disk io and your CPU time. This might not be a problem for the load you’re seeing now, but could you handle a 10 fold increase in traffic? .. or how about 100x? (which, if I remember correctly, is what Google uses as the scale factor when developing applications)

Indexes Are Your Friend

During the christmas holiday I got around to taking a look at some of the queries running at one of my longest living, most active sites: pwned.no. Pwned is a tournament engine running on top of PHP and MySQL, containing about 40.000 tournaments, 450.000 matches and several other database structures. The site has performed well over the years and there hasn’t been any performance issues other than a few attempts at DoS-ing the site with TCP open requests (the last one during the holiday, actually).

Two weeks ago the server suddenly showed loads well above 30 – while it usually hovers around 0.3 – 0.4 at the busiest hours of the day. The reason? One of the previously less used functions of the tournament engine, using a group stage in your tournament, had suddenly become popular in at least one high traffic tournament. This part of the code had never been used much before, but when the traffic spike happened everything went bananas (B-A-N-A-N-A-S. Now that’s stuck in your head. No problem.) The reason: the query used a couple of columns in a WHERE-statement that wasn’t indexed, and the query ran against the table containing the matches for the tournament. This meant that over 400.000 rows were scanned each time the query ran, meaning that mysqld started hogging every resource it could. The Apache childs then had to wait, making the load a bit too much for my liking. Two CREATE INDEX-calls later the load went back down and everything chugged along nicely again.

My strategy for discovering queries that might need a better index scheme (or if “impossible”, a proper caching layer in front of it):

  1. Run your development server with slow-query-log=1, log-queries-not-using-indexes=1 and long-query-time=<an appropriately low value, such as 0.05 – depends on your setup>. You can also provide a log file name with the log-slow-queries=/var/log/mysql/… in your my.cnf-file for MySQL. This will log all potential queries for optimizing to the log file (this will not necessarily provide you with a complete list of good queries to optimize, but it might provide a few good hints). Be sure to use actual data from your site when working on your development version, as you might start seeing issues when the size of the data set reaches a certain size – such as 400.000 rows in the example mentioned above)
  2. Connect to your MySQL server and issue
    SHOW PROCESSLIST

    and

    SHOW FULL PROCESSLIST

    statements every now and then. This will let you see any queries that run often and way too long (but they’ll have to run when you issue the command). You might not catch the real culprit, but if you’re seing MySQL chugging along with 100% CPU and are wondering what’s happening, try to check out what the threads are doing. You’ll hopefully see just which query is wreaking havoc with your server.

  3. Add a statistics layer in front of your MySQL calls in your application. If you’re using PDO you can subclass it to keep a bit of statistics about your queries around. The number of times each query is run, the time it took in total running the query and other interesting values. We’re using a version of this in the development version of Gamer.no and I’ll probably upload the class to my github repository as soon as I get a bit of free time in the new year.

Not sure what I’ll take a closer look at tomorrow, but hopefully I’ll decide before everything collapses!

What are your strategy for indexes? What methods do you use for finding queries that need a bit more love? Leave a comment below!

Read all the articles in the Ready for 2010-series

Fatal error: Undefined class constant ‘ATTR_DEFAULT_FETCH_MODE’

This is one of the common error messages that seems to appear after installing PHP – in particular under Ubuntu or Debian (where I experienced it). The reason for this is that the PDO version you’ve just installed is too old for the constant to exist, usually because you where naive enough to install the extension from PECL instead of using the default supplied by ubuntu. If you did the same as me:

apt-get install 
pecl install pdo
pecl install pdo_mysql

/etc/init.d/apache2 restart

And everything seems to work, except for that missing constant. What the fsck?!

The reason is that the PECL version of PDO is no longer maintained (I’d suggest to automagically push the newest version to PECL too, just so pecl install and pecl upgrade works as expected). The fact is that when you did pecl install pdo, you destroyed the more recent version provided by the default php5-common package (under Ubuntu, at least).

To solve the problem (and if you also did pecl install pdo_mysql, you’ll have to replace that one too…)

apt-get install --reinstall php5-common php5-mysql php5-mysqli

Restart. Relive. Re.. eh .. yeah. Rerere.

If you’re building from source, you’ll need to add:

--enable-pdo --with-pdo-mysql --with-pdo-pgsql (and any other driver you need)

PDO and PDO::PARAM_INT

Hi there Mr. PDO!

We’ve come to know each other, and yes, while you have your troubles (.. which I don’t, of course), I’ve accepted your short comings. Today you threw another one of your fits, but I’ll be sure to document it for the world to see.

$statement = $pdo->prepare("
    ...
    LIMIT
        :offset, :hits
");

Yep. This will of course fail if you’re binding strings. ’10’, ’10’ is not very helpful now, is it. Good point. So let’s tell PDO that we’re really binding ints:

$statement->bindValue(':offset', $offset, PDO::PARAM_INT);
$statement->bindValue(':hits', $hits, PDO::PARAM_INT);

But wait. You’re still complaining?! I told you they were ints?! What’s the problem now?!?!

Well. Mr. PDO requires you to also convert the values for him. So first you have to convert the values of a loosely typed language to a strong type, then you have to tell the library that yes, this is in fact another type than what the library obviously assumes that it is. This works:

$statement->bindValue(':offset', (int) $offset, PDO::PARAM_INT);
$statement->bindValue(':hits', (int) $hits, PDO::PARAM_INT);

Which means the following:

If the type of your variable internally is a string, it’ll be escaped as a string, even if you tell PDO that it should be handled as an INT in your database layer.

If the type of your variable is an int, it’ll be handled as a string, unless you tell PDO it is an int.

Something is backwards here.

A Quick INSERT INTO Trick

From time to time you’re going to need to move some data from one table into another, in particular to generalize or specialize code. I’ve seen amazingly large code blocks written to handle simple cases as this, and here’s a simple trick you’re going to need some day:

INSERT INTO 
    
    (field_1, field_2) 
SELECT
    some_field, some_other_field 
FROM 
    

You can also add conditionals, i.e. to retrieve rows that might have been inserted yesterday, etc:

INSERT INTO 
    
(field_1, field_2) SELECT some_field, some_other_field FROM WHERE condition = 1

The good thing about this is that all data movement is kept within the database server, so that the data doesn’t have to travel from the server, to the client and then back from the client to the server again. It’s blazingly fast compared to other methods. You can also do transformations with regular SQL functions in your SELECT statement, which should help you do very simple operations at the speed of light.

Unit-Testing Code Which Uses a Database

How to unit-test code that interacts with a database appeared on the blog of Baron Schwartz, and to be really boring, I agree with what he’s writing. Unit-testing database connectivity and storage is not hard. If it is, it might be a good time to redo that architecture you’ve been talking about.

An important point that Baron mentions is that you _NEVER_ _EVER_ run your tests on your production servers. That will of course be disastrous, as your tests needs a predefined state of the database to be valid for testing. The solution I’ve been using to handle this, is to always set up my environment to use another database when doing the tests. This way, you’ll never end up with running the tests on a live database by accident. I handle this in my AllTests.php file, where the test suites and shared fixtures are set up. We dump the contents of the developer database (databasename), create a new database (databasename_test) and insert all the current table structures and indexes. This way we get an accurate copy of the table definitions currently defined by the developer (so that we don’t run the tests against an old set of tables), and we test that the code works as it should with the active definitions.

The simplest way to do this, is to use mysqldump and mysql through a call to exec. If you’re not in a trusted environment, please, please, please add the appropriate shell argument escape commands. It can however be argued that if you’re allowing random people to change your database login information, you probably have bigger problems than doing unit testing..

exec('mysqldump -u ' . $username . ' -p' . $password . ' ' . $dbname . ' | mysql -u ' . $username . ' -p' . $password . ' ' . $database . '_test');

It would be very interesting to get more information about which measures Baron advocates for detecting a production system. We have configuration settings for our applications which also defines if this is a development or production system, in addition to the fact that our testing code only touches databases which end in _test.

Translating Drizzle to Norwegian

As Monty asked for help with translations of the current strings available in Drizzle on his blog yesterday, I sat down a couple of hours yesterday and a couple of hours today to at least attempt to contribute something to the project. As my primary language is Norwegian and I have some experience writing, I decided to tackle the Norwegian (Bokmål, not Nynorsk) translation of Drizzle. I’ve currently finished the 358 available messages, but I’d really appreciate it if someone spent a couple of minutes / hours to read through them and confirm that my assumptions are sane.

The most troubling part when it comes to definitions are the issue of MySQL/Drizzle’s ‘relay log’ which I translated into ‘replikasjonslogg’ – which mainly means “replication log”. This sounds much better in Norwegian, but suddenly the code mentioned both a “replication log” and a “relay log”. I tried finding out what the semantic difference in MySQL were, but were unable to grok anything from the MySQL manual or through a Google search. If anyone has any advice here, it’d be very appreciated. I also made a few notes of where there are obvious errors in the original english strings:

Error on close of '%'s (Errcode: %d)
 - Located in mysys/errors.c:28 

Errcode:
Can't read value for symlink '%s' (Error %d)
 - Located in mysys/errors.c:47 
Can't create symlink '%s' pointing at '%s' (Error %d)
 - Located in mysys/errors.c:48
Copy text Error on realpath() on '%s' (Error %d)
 - Located in mysys/errors.c:49 

%*s(Defaults to on; use --skip-%s to disable.)
 - Missing space
 - Located in mysys/my_getopt.c:1170 

The event could not be processed no other hanlder error happened
 - hanlder
 -  Located in mysys/my_handler_errors.h:118 

SSL information in the master info file ('%s') are ignored because this MySQL slave was compiled without SSL support.
 - MySQL
 - Located in drizzled/rpl_mi.cc:276 

Slave I/O thread killed while waitnig to reconnect after a failed registration on master
 - waitnig
 - Located in drizzled/slave.cc:90 

Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog'..
 - mysqlbinlog
 - Located in drizzled/slave.cc:1864 

Found wrong key definition in %s; Please do "ALTER TABLE '%s' FORCE " to fix it!
 - feil spaceplass
 - Located in drizzled/table.cc:1162 

Table '%-.64s' was created with a different version of MySQL and cannot be read
 - MySQL
 - Located in drizzled/table.cc:1818 

I could probably submit a patch for this, but seeing as the source is very much in flux these days, I think I’ll wait until it settles down a bit — unless someone is interesting in reviewing and committing an “unimportant” patch at this stage.

BTW: Launchpad worked great for doing translations, so I’m going to look into using gettext and Launchpad for doing translations for pwned.no and my other services in the future.