Revisiting ZFS and MySQL

While at Percona Live this year I was reminded about ZFS and running MySQL on top of a ZFS-based storage platform.

Now I’m a big fan of ZFS (although sadly I don’t get to use it as much as I used to after I shutdown my home server farm), and I did a lot of different testing back while at MySQL to ensure that MySQL, InnoDB and ZFS worked correctly together.

Of course today we have a completely new range of ZFS compatible environments, not least of which are FreeBSD and ZFS on Linux, I think it’s time to revisit some of my original advice on using this combination.

Unfortunately the presentations and MySQL University sessions back then have all been taken down. But that doesn’t mean the advice is any less valid.

Some of the core advice for using InnoDB on ZFS:

  • Configure a single InnoDB tablespace, rather than configuring multiple tablespaces across different disks, and then let ZFS manage the underlying disk using stripes or mirrors or whatever configuration you want. This avoids you having to restart or reconfigure your tablespaces as your data grows, and moves that out to ZFS which can do it much more easily and while the filesystem and database remain online. That means we can do:
innodb_data_file_path = /zpool/data/ibdatafile:10G:autoextend
  • While we’re taking about the InnoDB data files, the best optimisation you can do is to set the ZFS block size to match the InnoDB block size. You should do this *before* you start writing data. That means creating the filesystem and then setting the block size:
zfs set recordsize=8K zpool/data
  • What you can also do is configure a separate filesystem for the InnoDB logs that has a ZPool record size of 128K. That’s less relevant in later versions of ZFS, but actually it does no harm.
  • Switch on I/O compression. Within ZFS this improves I/O time (because less data is read/written physically from/to disk), and therefore improves overall I/O times. The compression is good enough and passive to be able to handle the load while still reducing the overall time.
  • Disable the double-write buffer. The transactional nature of ZFS helps to ensure the validity of data written down to disk, so we don’t need two copies of the data to be written to ensure valid recovery in the case of failure that are normally caused by partial writes of the record data. The performance gain is small, but worth it.
  • Using direct IO (O_DIRECT in your my.cnf) also improves performance for similar reasons. We can be sure with direct writes in ZFS that the information is written down to the right place. EDIT: Thanks to Yves, this is not currently supported on Linux/ZFS right now.
  • Limit the Adjustable Replacement Cache (ARC); without doing this you can end up with ZFS using a lot of cache memory that will be better used at the database level for caching record information. We don’t need the block data cache as well.
  • Configure a separate ZFS Intent Log (ZIL), really a Separate Intent Log (SLOG) – if you are not using SSD throughout, this is a great place to use SSD to speed up your overall disk I/O performance. Using SLOG stores immediate writes out to SSD, enabling ZFS to manage the more effective block writes of information out to slower spinning disks. The real difference is that this lowers disk writes, lowers latency, and lowers overall spinning disk activity, meaning they will probably last longer, not to mention making your system quieter in the process. For the sake of $200 of SSD, you could double your performance and get an extra year or so out the disks.

Surprisingly not much has changed in these key rules, perhaps the biggest different is the change in price of SSD between when I wrote these original rules and today. SSD is cheap(er) today so that many people can afford SSD as their main disk, rather than their storage format, especially if you are building serious machines.

Compiling MySQL Workbench on Gentoo

The Workbench team have just announced the release of Workbench for Linux, including binary packages and source packages with instructions on how to build. I’m a Gentoo Linux user, so I prefer building from source, and you’ll need to emerge the following packages (and note the USE) requirement as part of the source build process:

# USE="svg" emerge libzip libxml2 libsigc++     libglade libgtksourceviewmm media-libs/glut mysql lua     ossp-uuid libpcre libgnome gtk+ pango cairo

Depending on your config and platform, you may need to bypass some package masking by adding the packages to your /etc/portage/package.keywords file. Then download and install the ctemplate library from google code page. The current Gentoo version is 0.90, and you really should install the 0.91 version. With the required packages and libraries in place, download the Workbench sources and then build:

# cd mysql-workbench-5.1.4alpha# ./ make# make install

That should build and install MySQL Workbench for you. Just to confirm, here’s a screenshot of the built Workbench running on Gentoo Linux and displaying to my Mac OS X-based desktop.


MySQL Documentation and Debian/Ubuntu

We’ve got a lot of queries recently on the MySQL docs team address about the documentation (particularly man pages) for MySQL on Debian/Ubuntu. The source of the original problem was reported as a Debian bug. The assumption from the reading of the license in this instance is that you are not allowed to distribute MySQL documentation unless you’ve asked first, and that the documentation is not released under the GPL license. The original license was misunderstood in this respect. In fact, the license as originally quoted in that bug does allow you to provide the documentation if you are providing the MySQL software. In addition, regardless of how you interpret the license, all of our documentation, including installable man pages, has been available on You can find online HTML, offline HTML, PDF, CHM and the man pages for all of our reference manuals (on a version by version basis), along with the main HTML/PDF formats for all of the remaining documentation. We have never tried to limit the availability of the documentation (that’s why we provide it in so many formats).However, as soon as this issue was reported on to us by the folks at Debian we agreed with our legal department to put the man pages under a GPL license. This affects only the man pages, but gets round the misunderstanding above by allowing the man pages to be distributed under the same GPL license as the software. Why did we only change our man page license?MySQL documentation is updated and released very often, in fact as oftenas ten times per day. Allowing anyone to create static copies of anarbitrary documentation release would lead to many outdated copies onthe ‘Net. This is bad for users doing Google searches for MySQLdocumentation, and bad for us (we’ve seen complaints about “our” 5.0.0Manual being badly outdated when MySQL 5.1.20 was out). We appreciateanyone mirroring the MySQL Dev Zone which contains all MySQLdocumentation.So where does that leave the man pages in Debian/Ubuntu? I’m pleased to say that the new 5.0.51-1 package for MySQL that has gone into the latest Debian release (actually, in December). That means that MySQL and the corresponding man pages should appear already in the latest Debian “unstable” branch, and the next major Debian release should include everything you need. Thanks to Christian Hammers (Debian) and Norbert Tretkowski (Debian) for their help on getting this all sorted!

Mysterious crashes? – check your temporary directory settings

Just recently I seem to have noticed an increased number of mysterious crashes and terminations of applications. This is generally on brand new systems that I’m setting up, or on existing systems where I’m setting up a new or duplicate account. Initially everything is fine, but then all of a sudden as I start syncing over my files, shell profile and so on applications will stop working. I’ve experienced it in MySQL, and more recently when starting up Gnome on Solaris 10 9/07. Sometimes the problem is obvious, other times it takes me a while to realize what is happening and causing the problem. But in all cases it’s the same problem – my TMPDIR environment variable points to a directory that doesn't exist. That's because for historical reasons (mostly related to HP-UX, bad permissions and global tmp directories) I've always set TMPDIR to a directory within my home directory. It's just a one of those things I've had in my bash profile for as long as I can remember. Probably 12 years or more at least. This can be counterproductive on some systems - on Solaris for example the main /tmp directory is actually mounted on the swap space, which means that RAM will be used if it’s available, which can make a big difference during compilation. But any setting is counterproductive if you point to a directory that doesn’t exist and then have an application that tries to create a temporary file, fails, and then never prints out a useful trace of why it had a problem (yes, I mean you Gnome!). I’ve just reset my TMPDIR in .bash_vars to read:

case $OSTYPE in (solaris*) export set TMPDIR=/tmp/mc;mkdir -m 0700 -p $TMPDIR ;; (*) export set TMPDIR=~/tmp;mkdir -m 0700 -p $TMPDIR ;;esac

Now I explicitly create a directory in a suitable location during startup, so I shouldn’t experience those crashes anymore.

Setting up the developer stack issues

There’s a great post on Coding Horror about Configuring the Stack.Basically the gripe is with the complexity of installing the typical developer stack, in this case on Windows, using Visual Studio. My VS setup isn’t vastly different to the one Jeff mentions, and I have similar issues with the other stacks I use. I’ve just set up the Ultra3 mobile workstation again for building MySQL and other stuff on, and it took about 30 packages (from Sun Freeware) just to get the basics like gcc, binutils, gdb, flex, bison and the rest set up. It took the best part of a day to get everything downloaded, installed, and configured. I haven’t even started on modules for Perl yet. The Eclipse stack is no better. On Windows you’ll need the JDK of your choice, plus Eclipse. Then you’ll have to update Eclipse. Then add in the plugins and modules you want. Even though some of that is automated (and, annoyingly some of it is not although it could be), it generally takes me a few hours to get stuff installed. Admittedly on my Linux boxes it’s easier – I use Gentoo and copy around a suitable make.conf with everything I need in it, so I need only run emerge, but that can still take a day or so to get everything compiled.Although I’m sure we can all think of easier ways to create the base systems – I use Parallels for example and copy VM folders to create new environments for development – even the updating can take a considerable amount of time. I suggest the new killer app is one that makes the whole process easier.

Setting a remote key through ssh

One of the steps I find myself doing a lot is distributing round an ssh key so that I can login and use different machines automatically. To help in that process I created a small function in my bash profile script (acutally for me it’s in .bash_aliases):

function setremotekey{ OLDDIR=`pwd` if [ -z "$1" ] then echo Need user@host info fi cd $HOME if [ -e "./.ssh/" ] then cat ./.ssh/ |ssh $1 'mkdir -p -m 0700 .ssh && cat >> .ssh/authorized_keys' else ssh-keygen -t rsa cat ./.ssh/ |ssh $1 'mkdir -p -m 0700 .ssh && cat >> .ssh/authorized_keys' fi cd $OLDDIR}

To use, whenever I want to copy my public key to a remote machine I just have to specify the login and machine:

$ setremotekey mc@narcissus

Then type in my password once, and the the function does the rest. How? Well it checks to make sure I’ve entered a user/host (or actually just a string of some kind). Then, if I haven’t created a public key before (which I might not have on a new machine), I run the ssh-keygen to create it. Once the key is in place, I output the key text and then use ssh to pipe append that to the remote authorized_keys file, creating the directory along the way if it doesn’t exist. Short and sweet, but saves me a lot of time.

Extra bash improvements

If you’ve read my Getting the most out of bash article at IBM developerWorks then you be interested in some further bash goodness and improvements. Juliet Kemp covers some additional tricks on Improving bash to make working with bash easier. Some of the stuff there I have already covered, but the completion extensions might be useful if you like to optimize your typing. Even better, one of the comments provides the hooks to change your prompt to include your current CVS branch, another to include your current platform, and a really cool way of simplifying your history searching.

Copying multiple files with scp

I keep my .bash init scripts on one machine and copy them over to each machine on which I have a login. There’s various bits of logic in there to ensure that the right PATH and other values are set according to the host and/or platform. I then have a simple line that updates the .ocal .bash scripts from the main box that holds the main copies, so that I can just run:


To update everything. I use scp and, depending on the system, use a preset key or require a password.For copying multiple files there are many solutions; I could just use .bash*, but I’d also get the history and backup files. The typical advice is separate entries:

scp mc@narcissus:.bashrc mc@narcissus:.bash_aliases

This is less than optimal for a number of reasons – the first is that each location is treated individually, and that requires multiple connections and multiple password requirements. You can, though, use normal shell like expansion, just make sure you use quotes to ensure that it isn’t parsed and expanded by the local shell instead of the remote one:

scp mc@narcissus:".bash{rc,_path,_aliases,_vars}" ~

Stepped execution with cron and at

I had a query from a reader today as a follow up to my System Administrators Toolkit: Time and event management article at developerWorks:

How do I execute a script at a specific interval, for example 28 days, rather than on a specific day or date?

It is the one limitation of cron that it doesn’t support such an interval, although there are some systems (including many Linux installations) that provide an alternative method. There are some solutions to the problem that will work on any platform that uses the cron/at system.One way is to run the script every 7 days, and have it record how many times it’s been called in a file. All you have to do is, in the script, load the current count, work out if this is the fourth time, and run the script accordingly. For example: count=`cat counter`count=`expr $count + 1`if [ $count -eq 4 ]then echo 0 >counter echo 4th time called, going for it # Do everything elseelse echo $count >counterfiI suggest you put the counter file into a usable location, but you get the idea.The other alternative is to use at, rather than cron, and then add a line in the script to execute the script again in 28 days time. For example, using this line at the end of your script: at 9pm + 28 days <myscript .shBecause you are specifying the same time, but a different day, this will execute at the same time every 28 days. If your script takes a long time to process and you run it, for example, at 23:59, put the ‘at’ line at the start of the script, rather than the end, so that the request gets registered on the same day.

GNU/Linux – Fad or Favoritism picked up by Linux Today

I’ve picked up bloggin at FSM again, and my first post has been picked up by LinuxToday already. I urge you to read the article, but the basis of my thoughts are whether my current Linux distribution choices are because they are better than my previous ones (and therefore my favourites), or whether I’ve merely chosen them because they happen to popular at the moment (and are therefore just a passing fad).