MySQL on OpenSolaris Presentation/Transcript Now Available

As I mentioned earlier this week, I did a presentation on MySQL in OpenSolaris today. The presentation (audio and slides) is now viewable online (and downloadable), and you can also get hold of the transcript of the questions here (or download it). The original presentation is here. One minor difference from the presentation is that we have upgraded MySQL to 5.0.67 in 2008.11; I had forgotten we’d agreed to do this after the 5.1 pushback. Thanks to Matt Lord for the heads up, and thanks to everybody for attending. Up next week, memcached!

MySQL University: MySQL and OpenSolaris

On Thursday, November 13, 2008 (14:00 UTC / 14:00 GMT / 15:00 CET), I’ll be presenting a MySQL University session on MySQL and OpenSolaris. The presentation will be similar to the one I gave at the London OpenSolaris Users Group in July; you can see that presentation by visiting the LOSUG: July 2008 page. Thursday’s session will be slightly different – I’ll be providing a bit more hands-on information about how to install MySQL, how to configure it and change that configuration, and some more detail on solutions like the Webstack and Coolstack distributions. I’ll also cover our plans for the inclusion of MySQL 5.1 in OpenSolaris, which will happen next year, and provide some examples of the new DTrace probes that we have been adding to MySQL generally. Of course, if there’s anything specific you want me to talk about, comment here and I’ll see if I can squeeze it into the presentation before Thursday.
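If you just want to try the packaged version ahead of the session, installation on OpenSolaris is a one-liner through IPS plus an SMF enable. The package and service names below are from memory, so treat this as a sketch and check them with pkg search and svcs on your build:

# pkg install SUNWmysql5
# svcadm enable mysql
# svcs -a | grep mysql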

ZFS Replication for MySQL data

At the European Customer Conference a couple of weeks back, one of the topics was the use of DRBD. DRBD is a kernel-based block device that replicates the data blocks of a device from one machine to another. The documentation I developed for DRBD and MySQL is available here. Fundamentally, with DRBD, you set up a physical device, configure DRBD on top of that, and write to the DRBD device. In the background, on the primary, the DRBD device writes the data to the physical disk and replicates the changed blocks to the secondary, which in turn writes the data to its own physical device. The result is a block-level copy of the source data, which means that in an HA solution you can switch over from your primary host to your secondary host in the event of system failure and be pretty certain that the data on the primary and secondary are the same. In short, DRBD simplifies one of the more complex aspects of the typical HA solution by copying the data needed during the switch.

Because DRBD is a Linux kernel module, you can’t use it on other platforms, like Mac OS X or Solaris. But there is another solution: ZFS. ZFS supports filesystem snapshots. You can create a snapshot at any time, and you can create as many snapshots as you like.

Let’s take a look at a typical example. Below I have a simple OpenSolaris system running with two pools, the root pool and another pool I’ve mounted at /opt:

Filesystem             size   used  avail capacity  Mounted on
rpool/ROOT/opensolaris-1
                       7.3G   3.6G   508M    88%    /
/devices                 0K     0K     0K     0%    /devices
/dev                     0K     0K     0K     0%    /dev
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   465M   312K   465M     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/usr/lib/libc/libc_hwcap1.so.1
                       4.1G   3.6G   508M    88%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                   466M   744K   465M     1%    /tmp
swap                   465M    40K   465M     1%    /var/run
rpool/export           7.3G    19K   508M     1%    /export
rpool/export/home      7.3G   1.5G   508M    75%    /export/home
rpool                  7.3G    60K   508M     1%    /rpool
rpool/ROOT             7.3G    18K   508M     1%    /rpool/ROOT
opt                    7.8G   1.0G   6.8G    14%    /opt

I’ll store my data in a directory on /opt. To help demonstrate some of the basic replication stuff, I have other things stored in /opt as well:

total 17
drwxr-xr-x  31 root     bin           50 Jul 21 07:32 DTT/
drwxr-xr-x   4 root     bin            5 Jul 21 07:32 SUNWmlib/
drwxr-xr-x  14 root     sys           16 Nov  5 09:56 SUNWspro/
drwxrwxrwx  19 1000     1000          40 Nov  6 19:16 emacs-22.1/
lrwxrwxrwx   1 root     root          48 Nov  5 09:56 uninstall_Sun_Studio_12.class -> SUNWspro/installer/uninstall_Sun_Studio_12.class

To create a snapshot of the filesystem, you use zfs snapshot, and then specify the pool and the snapshot name:

# zfs snapshot opt@snap1

To get a list of snapshots you’ve already taken:

# zfs list -t snapshot
NAME                                         USED  AVAIL  REFER  MOUNTPOINT
opt@snap1                                       0      -  1.03G  -
rpool@install                               19.5K      -    55K  -
rpool/ROOT@install                            15K      -    18K  -
rpool/ROOT/opensolaris-1@install            59.8M      -  2.22G  -
rpool/ROOT/opensolaris-1@opensolaris-1       100M      -  2.29G  -
rpool/ROOT/opensolaris-1/opt@install            0      -  3.61M  -
rpool/ROOT/opensolaris-1/opt@opensolaris-1      0      -  3.61M  -
rpool/export@install                          15K      -    19K  -
rpool/export/home@install                     20K      -    21K  -

The snapshots themselves are stored within the filesystem metadata, and the space required to keep them will vary as time goes on because of the way the snapshots are created. The initial creation of a snapshot is really quick, because instead of taking an entire copy of the data and metadata required to hold the entire snapshot, ZFS merely records the point in time and the metadata of when the snapshot was created.

As you make more changes to the original filesystem, the size of the snapshot increases, because more space is required to keep the record of the old blocks. Furthermore, if you create lots of snapshots, say one per day, and then delete the snapshots from earlier in the week, the size of the newer snapshots may also increase, as the changes that make up the newer state have to be included in the more recent snapshots, rather than being spread over the seven snapshots that make up the week. The result is that creating snapshots is generally very fast, and storing snapshots is very efficient. As an example, creating a snapshot of a 40GB filesystem takes less than 20ms on my machine.

The only issue, from a backup perspective, is that snapshots exist within the confines of the original filesystem. To get a snapshot out into a format that you can copy to another filesystem, tape, etc., you use the zfs send command to create a stream version of the snapshot. For example, to write out the snapshot to a file:

# zfs send opt@snap1 >/backup/opt-snap1

Or to tape, if you are still using it:

# zfs send opt@snap1 >/dev/rmt/0

You can also write out the incremental changes between two snapshots using zfs send:

# zfs send -i opt@snap1 opt@snap2 >/backup/opt-changes

To recover a snapshot, you use zfs recv, which applies the snapshot information either to a new filesystem or to an existing one. I’ll skip the demo of this for the moment, because it will make more sense in the context of what we’ll do next. Both zfs send and zfs recv work on streams of the snapshot information, in the same way as cat or sed do; we’ve already seen examples of that when we used standard redirection to write the information out to a file. Because they are stream based, you can use them to replicate information from one system to another by combining zfs send, ssh, and zfs recv. For example, let’s say I’ve created a snapshot of my opt filesystem and want to copy that data to a new system into a pool called slavepool:

# zfs send opt@snap1 |ssh mc@slave pfexec zfs recv -F slavepool

The first part, zfs send opt@snap1, streams the snapshot; the second, ssh mc@slave, pipes that stream over to the slave machine; and the third, pfexec zfs recv -F slavepool, receives the streamed snapshot data and writes it to slavepool. In this instance, I’ve specified the -F option, which forces the snapshot data to be applied and is therefore destructive. This is fine, as I’m creating the first version of my replicated filesystem. On the slave machine, if I look at the replicated filesystem:

# ls -al /slavepool/
total 23
drwxr-xr-x   6 root     root           7 Nov  8 09:13 ./
drwxr-xr-x  29 root     root          34 Nov  9 07:06 ../
drwxr-xr-x  31 root     bin           50 Jul 21 07:32 DTT/
drwxr-xr-x   4 root     bin            5 Jul 21 07:32 SUNWmlib/
drwxr-xr-x  14 root     sys           16 Nov  5 09:56 SUNWspro/
drwxrwxrwx  19 1000     1000          40 Nov  6 19:16 emacs-22.1/
lrwxrwxrwx   1 root     root          48 Nov  5 09:56 uninstall_Sun_Studio_12.class -> SUNWspro/installer/uninstall_Sun_Studio_12.class

Wow – that looks familiar! Once that initial snapshot is in place, to synchronize the filesystem again I just need to create a new snapshot and then use the incremental feature of zfs send to send the changes over to the slave machine:

# zfs send -i opt@snap1 opt@snap2 |ssh mc@192.168.0.93 pfexec zfs recv slavepool

Actually, this operation will fail. The reason is that the filesystem on the slave machine can currently be modified, and you can’t apply the incremental changes to a destination filesystem that has changed. What’s changed? The metadata about the filesystem, like the last time it was accessed – in this case, it will have been our ls that caused the problem. To fix that, set the filesystem on the slave to be read-only:

# zfs set readonly=on slavepool

Setting readonly means that we can’t change the filesystem on the slave by normal means – that is, I can’t change the files or metadata (modification times and so on). It also means that operations that would normally update metadata (like our ls) will silently perform their function without attempting to update the filesystem state. In essence, our slave filesystem is nothing but a static copy of our original filesystem. However, even when set to readonly, a filesystem can have snapshots applied to it. Now that it’s read-only, re-run the initial copy:

# zfs send opt@snap1 |ssh mc@slave pfexec zfs recv -F slavepool

Now we can make changes to the original and replicate them over. Since we’re dealing with MySQL, let’s initialize a database on the original pool. I’ve updated the configuration file to use /opt/mysql-data as the data directory, and now I can initialize the tables:

# mysql_install_db --defaults-file=/etc/mysql/5.0/my.cnf --user=mysql
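The data directory change itself is only a couple of lines in that my.cnf – a minimal sketch (the exact section layout of the packaged file may differ):

[mysqld]
datadir = /opt/mysql-data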

Now, we can synchronize the information to our slave machine and filesystem by creating another snapshot and then doing an incremental zfs send:

# zfs snapshot opt@snap2

Just to demonstrate the efficiency of the snapshots, the size of the data created during initialization is 39K:

# du -sh /opt/mysql-data/
  39K   /opt/mysql-data

If I check the size used by the snapshots:

# zfs list -t snapshot
NAME                                         USED  AVAIL  REFER  MOUNTPOINT
opt@snap1                                     47K      -  1.03G  -
opt@snap2                                       0      -  1.05G  -

The space used by the snapshots is 47K. Note, by the way, that the 47K is shown against snap1, because snap2 is currently more or less identical to the live filesystem state and so uses essentially no space yet. Now, let’s synchronize this over:

# zfs send -i opt@snap1 opt@snap2 |ssh mc@192.168.0.93 pfexec zfs recv slavepool

Note we don’t have to force the operation this time – we’re synchronizing the incremental changes between what are otherwise identical filesystems, just on different systems. We can then double-check that the slave has it:

# ls -al /slavepool/mysql-data/

Now we can start up MySQL, create some data, and then synchronize the information over again, replicating the changes. To do that, you have to create a new snapshot, then do the send/recv to the slave to synchronize the changes. The rate at which you do this is entirely up to you, but keep in mind that if you have a lot of changes, then doing it as frequently as once a minute may lead to the slave lagging behind because of the time taken to transfer the filesystem changes over the network – taking the snapshot itself, even with MySQL busy in the background, still takes comparatively little time. To demonstrate that, here’s the time taken to create a snapshot mid-way through a 4-million-row insert into an InnoDB table:

# time zfs snapshot opt@snap3
real    0m0.142s
user    0m0.006s
sys     0m0.027s

I told you it was quick :) However, the send/recv operation took a few minutes to complete, with about 212MB of data transferred over a very slow network connection, while the machine was busy writing those additional records. Ideally you want to set up a simple script that handles that sort of snapshot/replication for you and run it from cron to do the work (a minimal sketch of such a script is at the end of this post). You might also want to try ready-made tools like Tim Foster’s zfs replication tool, which you can find out about here. Tim’s system works through SMF to handle the replication and is very configurable; it even handles automatic deletion of old, synchronized snapshots. Of course, all of this is useless unless, once the data has been replicated from one machine to another, we can actually use the databases. Let’s assume that there was a failure and we needed to fail over to the slave machine. To do that:

  1. Stop the script on the master, if it’s still up and running.
  2. Set the slave filesystem to be read/write:
    # zfs set readonly=off slavepool
  3. Start up mysqld on the slave. If you are using InnoDB, Falcon or Maria you should get auto-recovery, if it’s needed, to make sure the table data is correct, as shown here when I started up from our mid-INSERT snapshot:
    InnoDB: The log sequence number in ibdata files does not match
    InnoDB: the log sequence number in the ib_logfiles!
    081109 15:59:59  InnoDB: Database was not shut down normally!
    InnoDB: Starting crash recovery.
    InnoDB: Reading tablespace information from the .ibd files...
    InnoDB: Restoring possible half-written data pages from the doublewrite
    InnoDB: buffer...
    081109 16:00:03  InnoDB: Started; log sequence number 0 1142807951
    081109 16:00:03 [Note] /slavepool/mysql-5.0.67-solaris10-i386/bin/mysqld: ready for connections.
    Version: '5.0.67'  socket: '/tmp/mysql.sock'  port: 3306  MySQL Community Server (GPL)

Yay – we’re back up and running. With MyISAM or other tables you need to run REPAIR TABLE, and you might even have lost some information, but it should be minor. The point is, a mid-INSERT ZFS snapshot, combined with replication, could be a good way of supporting a hot backup of your system on Mac OS X or Solaris/OpenSolaris. Probably the most critical part is finding the sweet spot between the snapshot/replication time and how up to date you want to be in a failure situation. It’s also worth pointing out that you can replicate to as many different hosts as you like, so if you wanted to replicate your ZFS data to two or three hosts, you could.
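As promised above, here’s a rough sketch of the kind of script you could run from cron to automate the snapshot/send cycle. Everything in it is an assumption to adapt – the pool and host names, the state file, and the complete lack of error handling – and Tim’s SMF-based tool is a far more robust option:

#!/usr/bin/bash
# Minimal ZFS snapshot/replication sketch (assumed names throughout:
# local pool "opt", slave host "slave", target pool "slavepool").
POOL=opt
SLAVE=mc@slave
STATE=/var/tmp/lastsnap
LAST=`cat $STATE`
NOW=snap-`date '+%Y%m%d%H%M%S'`

# Take the new snapshot, then send only the changes since the last one.
zfs snapshot $POOL@$NOW
zfs send -i $POOL@$LAST $POOL@$NOW | ssh $SLAVE pfexec zfs recv slavepool

# Remember the snapshot we just sent, ready for the next incremental run.
echo $NOW > $STATE

You’d seed the state file with the name of the initial full snapshot (snap1 in the examples above) and then run the script every few minutes from crontab.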

Resources for Running Solaris OS on a Laptop

As Solaris gets more and more popular, I’m seeing more and more people running Solaris on a laptop as their primary operating system. I’ve even got friends who have migrated over completely to Solaris from Linux. I’ve been using it for years and managed to tolerate some of the problems we had in the early days, but today it works brilliantly on many machines. I came across this article on BigAdmin; it’s old, but a lot of the information is still perfectly valid. Read Resources for Running Solaris OS on a Laptop.

Solaris 08/07 impressions

New versions of Solaris keep coming out apace, and with each new version we see improvements and enhancements, both from the core Solaris group and from the cooperation with the OpenSolaris teams. Solaris 10 8/07 came out last year, but it takes a while to go through and look at all of the new features, and in the time it’s taken me to fully understand all the new stuff, they’ve come out with another new version. Still, the functionality in the 8/07 release warrants some closer inspection. You can get the full details in the official documentation.

First up, this is one of the first releases where I feel comfortable using it pretty much out of the box without modification. Most of the tools and functionality are available with a standard install. You get Firefox 2.0 and Thunderbird 2.0 as standard, and combined with the improvements to the X Window System, we have an OS that can be used as a desktop without any additional installs. There are still some things I prefer to re-install (perl, for example), but the point is you no longer have to install many things. With Sun Studio being available, there’s little reason not to use Solaris 10 as a good replacement for Linux – in fact, one of the other elements I want to talk about might help there.

Of course, it’s not all plain sailing. One of the problems I encountered is that I installed onto a machine with three identical drives – but when prompting you to partition disks, the installer shows you all of the disks without any indication of which one is which. You just get the size of the volume to choose from (a little difficult when your machine has three identical drives). Luckily this was an empty machine that I could trash, but I wouldn’t want to have to identify the right drive during installation based only on the partitions each device had. Actually, it turned out worse than this – after installation, Solaris had actually ended up on my second HD (the first attached to the second IDE controller), not my first, and I had to swap the disks around to get it started properly. Really it’s a minor issue, but one to be wary of.

One of the big improvements throughout is the set of changes and enhancements to the ZFS file system. One of my favourites is that you can now create a ZFS pool and then share it as an iSCSI device to other systems. This means that if I had a big server with loads of space, I could create a ZFS pool, get the benefits of ZFS on the server side, and share portions of the disk for storage over to other machines. Enabling iSCSI sharing is just a case of setting the shareiscsi property on the filesystem:

# zfs set shareiscsi=on mainpool/ishare
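On a potential client, you add the server sharing the pool to the list of iSCSI discovery addresses and rescan for devices – roughly the following, where the server address is just an assumption for the example:

# iscsiadm add discovery-address 192.168.0.10:3260
# iscsiadm modify discovery --sendtargets enable
# devfsadm -i iscsi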

Run format on the client and the shared pool shows up as a new device. With iSCSI, you are effectively sharing the physical device, so it just appears like another disk, which you partition and organize as you want. I can see a real benefit here for providing users with a dedicated disk that they can organize how they like, while still giving you control of the disk space and allocation at the server end. ZFS also has a huge range of other improvements. There’s no ZFS booting in this release (it’s in b84 of the regular OpenSolaris releases), but the ZFS support and facilities just keep going up and up.

Using a single OS is often no longer a reasonable choice. Different companies release their products on different OSes, and sometimes you just have to install another OS. Virtualization is one solution (and there’s plenty of news about that in later builds of Solaris), but in 8/07 we have another alternative: BrandZ. BrandZ builds on Solaris containers – logical divisions in the running environment that allow you to divide, allocate and isolate resources on a system into discrete, well, containers. Each container is effectively running another instance of Solaris (although it’s not as resource hungry as I make that sound). BrandZ extends that container functionality with an additional layer that provides binary compatibility for other operating systems, without going to the same level of complexity or resource requirements as the virtualization options. There are limits, of course – you are not running a copy of the corresponding OS kernel; instead, an interface layer provides the compatibility between the Linux kernel and library calls and the corresponding interfaces on the Solaris side. You can’t run device drivers, and you also can’t run applications that require access to the graphics card; you have to run any X applications by using an X server in the global zone. That means you can’t run X.org in Linux, but since you can attach to the main X service, it still means you can run Firefox and other applications as Linux binaries while displaying on your main host (or indeed, anywhere with a running X server). For the most part, that means you can run Linux binaries in a BrandZ zone without resorting to full virtualization (a sketch of setting up a Linux-branded zone is at the end of this post). It works fine – I was able to run a whole range of apps without any problems. The downside? Currently it’s x86 only. I’d love to be able to run SPARC Linux apps on my SPARC servers; even just for testing purposes, it would fill a hole I otherwise have to fill with another machine running another OS.

BrandZ is a convenient way to run Linux applications without running a full Linux distro, either on another machine or under virtualization. This means it will appeal either to the user with a Linux binary legacy application, or to the casual Linux application user, without the complexities of a VM layer or dual booting. A good example here is something like Skype, available on Linux, but not Solaris. There are many examples of people using BrandZ on Solaris (see here for an example).

All in all, there’s a lot to like in this release. ZFS is maturing and extending nicely, and I certainly want to play with the iSCSI feature a bit more to see where it can be used effectively (I’ve got a new server on its way soon to help with that test). BrandZ, on the other hand, will solve the issue of running Linux applications to the point where I don’t need VMware or Parallels for some jobs. It won’t completely eliminate the need, but for one of my desktops it will make life much easier.
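As mentioned above, here’s roughly what setting up a Linux-branded zone looks like. This is only a sketch; the zone name, zonepath and the location of the CentOS filesystem image are all assumptions:

# zonecfg -z lx-zone
zonecfg:lx-zone> create -t SUNWlx
zonecfg:lx-zone> set zonepath=/zones/lx-zone
zonecfg:lx-zone> exit
# zoneadm -z lx-zone install -d /export/centos_fs_image.tar.gz
# zoneadm -z lx-zone boot
# zlogin lx-zone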

Comparing 32-bit/64-bit MySQL on OpenSolaris

I’ve been working with the folks working on OpenSolaris for a few months now, providing advice and input on getting MySQL and the connectors (C/ODBC and C/J) installed as a standard component. Having got the basics in, the team are now looking at adding both 32-bit and 64-bit packages. The question raised at the end of last week was whether OpenSolaris should enable 64-bit builds by default in 64-bit installations, and whether there was a noticeable performance difference that would make this worthwhile. I did some initial tests on Friday which showed a small improvement (10-15%) for the packaged 64-bit installations over 32-bit under x86 using snv_81. Tests were executed using the included sql-bench tool, and this was a single execution run of each package for 5.0.56. The transactions row is empty because I hadn’t enabled transactions in the tests.

Test (x86, binary packages) 32-bit 64-bit +/-
ATIS 20 17 17.65%
alter-table 18 15 20.00%
big-tables 14 11 27.27%
connect 134 121 10.74%
create 348 348 0.00%
insert 1038 885 17.29%
select 399 257 55.25%
transactions
wisconsin 10 8 25.00%

There are some significant differences there (like the 55% increase on SELECT speeds, for example), but a single execution is never a good test. Also, it’s unclear whether the differences are down to the compilations, the platform, or just pure coincidence. This requires further investigation. Coincidentally, Krish Shankar posted some notes on using Sun Studio 11 and Sun Studio 12 and the right compiler flags to get the best optimization. I decided to do 10-pass iterations of sql-bench and compare both 32-bit and 64-bit standard builds, the 32-bit standard builds against Krish’s optimizations, and finally 32-bit and 64-bit optimized builds.
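For reference, the 10-pass runs are just sql-bench driven from a small shell loop – roughly the following, where the sql-bench path, user and options are assumptions to adapt to your own install (with --log, run-all-tests writes its results into its output directory for later comparison):

# cd /opt/mysql/mysql/sql-bench
# for i in 1 2 3 4 5 6 7 8 9 10; do perl run-all-tests --server=mysql --user=root --log; done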

Some notes on all the tests:

  • All builds are 5.0.56
  • All tests are run on SunOS 5.11, snv_81
  • Tests are executed on the same OS and machine running in 64-bit. The SPARC tests are on an UltraSPARC IIIi @ 1.28GHz workstation with 1GB RAM; the x86 tests are on a Dell T105, Opteron 1212, with 4GB RAM. Of course we’re not comparing machine speed, just 32-bit binaries against 64-bit.
  • All results are in seconds; lower values mean faster performance.
  • In all tests I’m using the built-in defaults (i.e. no my.cnf anywhere) so as to simulate a standardized installation.

Let’s first look at x86 and the 32-bit standard and 32-bit optimized builds:

Test (x86, 32-bit) 32-bit (standard) 32-bit (optimized) +/-
ATIS 15.4 21 -26.67%
alter-table 15 16.3 -7.98%
big-tables 13.7 12.5 9.60%
connect 77.6 133 -41.65%
create 343.7 350.6 -1.97%
insert 760.3 1043.8 -27.16%
select 394.8 384.2 2.76%
transactions 10.8 18.6 -41.94%
wisconsin 6.6 10.1 -34.65%

The standard build uses gcc instead of Sun Studio, but I don’t get the same performance increases that Krish saw – in fact, I see reductions in performance, not improvements at all. I’m going to rebuild and retest, because I’m convinced there’s a problem here with the builds that I’m not otherwise seeing. I certainly don’t expect to get results that show a 27% reduction in insert speed. That said, a 10% big-tables increase is interesting. I’ll redo these builds and find out if the slowdown is as marked as it appears here. Here’s the comparison between the 32-bit and 64-bit standard builds on x86:

Test (x86, standard) 32-bit 64-bit +/-
ATIS 15.4 13.5 14.07%
alter-table 15 10.6 41.51%
big-tables 13.7 10.6 29.25%
connect 77.6 76.4 1.57%
create 343.7 346 -0.66%
insert 760.3 681.6 11.55%
select 394.8 254.8 54.95%
transactions 10.8 10.7 0.00%
wisconsin 6.6 5.8 13.79%

There are some incredible differences here – more than 50% increase in SELECT, and 30% for the big-tables test – showing that there is some advantage to having the 64-bit builds on x86 enabled. Unfortunately, I’ve had problems with the 64-bit optimized builds on my machine, so I haven’t completed optimized test comparisons. On SPARC, Sun Studio is used as the default compiler, and the standard 32-bit and 64-bit builds show little difference:

Test (SPARC, standard) 32-bit 64-bit +/-
ATIS 28.6 27.5 4.00%
alter-table 27 26.7 1.12%
big-tables 26.9 29.4 -8.50%
connect 166.3 173.6 -4.21%
create 155 143.1 8.32%
insert 1577.3 1572.3 0.32%
select 807.4 761.6 6.01%
transactions 19.5 18.75 4.00%
wisconsin 11.1 11.4 -2.63%

Overall, a pretty insignificant difference here. Now let’s compare the standard and optimized builds using Krish’s flags on SPARC:

Test (SPARC) 32-bit (standard) 32-bit (optimized) +/-
ATIS 28.6 27.75 3.06%
alter-table 27 26.25 2.86%
big-tables 26.9 25 7.60%
connect 166.3 162.5 2.34%
create 155 145.25 6.71%
insert 1577.3 1551.5 1.66%
select 807.4 769.625 4.91%
transactions 19.5 16.875 15.56%
wisconsin 11.1 10.875 2.07%

The tests here show little significant difference between the standard and the optimized builds, although 6-7% would probably be enough to prefer an optimized build if you wanted to build your own. Now let’s compare the optimized, Sun Studio 12 builds running in 32-bit and 64-bit:

Test (SPARC, optimized) 32-bit 64-bit +/-
ATIS 27.75 27.3 1.65%
alter-table 26.25 26.6 -1.32%
big-tables 25 25 0.00%
connect 162.5 162 0.31%
create 145.25 154.3 -5.87%
insert 1551.5 1535.1 1.07%
select 769.625 771.2 -0.20%
transactions 16.875 19.1 -11.65%
wisconsin 10.875 10.7 1.64%

The differences are virtually non-existent, and taking the reductions and increases in performance overall, there’s probably little difference. The overall impression is that on x86 the improvement of 64-bit over 32-bit is significant enough that it’s probably a good idea to make 64-bit the default. On SPARC, the difference in the optimized builds is so slight that for compatibility reasons alone, 32-bit would probably make a better default. I’ll probably be re-running these tests over the next week or so (particularly the x86 ones, so I can get a true comparison of the 64-bit optimized improvements), and I’ll try the T1000 which I upgraded to snv_81 over the weekend, but I think the indications are good enough to make a reasonable recommendation of 64-bit over 32-bit.

Getting Best out of MySQL on Solaris

I’m still working up some good tips and advice on MySQL on Solaris (particularly the T1000, the new x86-based servers like the X4150, and ZFS, amongst other things), but until then I found Getting Best out of MySQL on Solaris while doing some research. With the latest OpenSolaris builds (b79, from memory) we now have MySQL built in, and I worked with the folks on the OpenSolaris Database team to get some reasonable configurations and defaults into the system. MySQL 5.1 and 64-bit support is currently going through the process and will be in a future build. I’ve also been working with the DTrace people to improve the DTrace support we have in MySQL (documentation will go live this week, I hope). MySQL 6.0.4 will have some basic DTrace probes built in, but I’ve proposed a patch to extend and improve on that significantly. We’re in the process of updating the documentation and advice on Solaris (and OpenSolaris) installations and layout too, which is itself part of a much larger overhaul of the installation and setup instructions for all platforms.
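Until those static probes are available everywhere, the pid provider is a handy way to poke at a running mysqld in the meantime. For example, this one-liner counts query parses as they happen – a sketch only, and the wildcard match on mysql_parse is an assumption based on the 5.0 source:

# dtrace -n 'pid$target::*mysql_parse*:entry { @queries = count(); }' -p `pgrep -x mysqld`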

Mysterious crashes? – check your temporary directory settings

Just recently I seem to have noticed an increased number of mysterious crashes and terminations of applications. This is generally on brand new systems that I’m setting up, or on existing systems where I’m setting up a new or duplicate account. Initially everything is fine, but then all of a sudden, as I start syncing over my files, shell profile and so on, applications will stop working. I’ve experienced it in MySQL, and more recently when starting up Gnome on Solaris 10 9/07. Sometimes the problem is obvious; other times it takes me a while to realize what is happening and causing the problem. But in all cases it’s the same problem – my TMPDIR environment variable points to a directory that doesn’t exist.

That’s because, for historical reasons (mostly related to HP-UX, bad permissions and global tmp directories), I’ve always set TMPDIR to a directory within my home directory. It’s just one of those things I’ve had in my bash profile for as long as I can remember – probably 12 years or more at least. This can be counterproductive on some systems: on Solaris, for example, the main /tmp directory is actually mounted on the swap space, which means that RAM will be used if it’s available, which can make a big difference during compilation. But any setting is counterproductive if you point to a directory that doesn’t exist and then have an application that tries to create a temporary file, fails, and then never prints out a useful trace of why it had a problem (yes, I mean you, Gnome!). I’ve just reset my TMPDIR in .bash_vars to read:

case $OSTYPE in
    (solaris*) export TMPDIR=/tmp/mc; mkdir -m 0700 -p $TMPDIR ;;
    (*)        export TMPDIR=~/tmp;   mkdir -m 0700 -p $TMPDIR ;;
esac

Now I explicitly create a directory in a suitable location during startup, so I shouldn’t experience those crashes anymore.

Brian is having the same issues

I mentioned the problem with setting up the stack on a new Solaris box yesterday, and then realized this morning that I’d already added Brian Aker’s blog posting on the same issues to my queue (Solaris, HOW-TO, It works… Really…). Brian mentions pkg-get, the download solution from Blastwave, which I neglected to mention yesterday. It certainly makes the downloading and installation easier, but it’s far from comprehensive and some of the stuff is out of date. To be honest, I find that I install the stuff from Sun Freeware to get me going, then spend time recompiling everything myself by hand, for the plain and simple reason that I then know it is up to date, working, or both. This is particularly the case for Perl, which often needs an update of the entire perl binary to get the updated versions of some CPAN modules. Ultimately, though, it sucks.