My Sessions at UC2009

I’m speaking at the User Conference this year, with a half-day tutorial and three further sessions. The running theme is performance, both in terms of the performance of your queries, and in terms of scaling up. Scale Up, Scale Out, and High Availability: Solutions and CombinationsThis is the big tutorial. It’s difficult to resolve what I’ll be talking about into a few sentences, but think about all of the different technologies available here – replication, partitions, sharding, DRBD, memcached – I’ll be talking about all of them, and more importantly combinations of the different solutions and where the potential performance gains and pitfalls are. I’ll also be using the opportunity to demonstrate some of the more obscure combinations that you can use to provide the environment you need. How I used Query Analysis to Speed Up my ApplicationsFor query analysis, I’ll start with some of the basic methods available to us for performance monitoring, including EXPLAIN and DTrace, before I look at the query analysis provided by MySQL Enterprise Monitor. As an advisor to the group I’ve been looking at it for a while and used it on my own sites to identify a range of different query problems. Improving performance by running MySQL multiple timesIt isn’t talked about much, but there are times when running a single instance of MySQL doesn’t get you either the performance or environment that you need to support your applications. In this presentation I’m going to look at some of the benefits, from simply running multiple instances, to using solutions like VMware, Xen, LDOMs, BSD Jails, and Solaris Containers. Using MySQL with the Dojo ToolkitThe final presentation is something a little more fun. The Dojo Toolkit is a JavaScript kit for developing AJAX applications. There are some really fun things you can do with Dojo, but getting the best combination and cool and efficient with MySQL is an art. We’ll look at two quick examples; the first is a browsable interface to large quantities of data. The other is dynamic graphing using MySQL as the backend. If, within the bounds of any of these presentations there is something you would like covered, please let me know.

Compiling MySQL Workbench on Gentoo

The Workbench team have just announced the release of Workbench for Linux, including binary packages and source packages with instructions on how to build. I’m a Gentoo Linux user, so I prefer building from source, and you’ll need to emerge the following packages (and note the USE) requirement as part of the source build process:

# USE="svg" emerge libzip libxml2 libsigc++     libglade libgtksourceviewmm media-libs/glut mysql lua     ossp-uuid libpcre libgnome gtk+ pango cairo

Depending on your config and platform, you may need to bypass some package masking by adding the packages to your /etc/portage/package.keywords file. Then download and install the ctemplate library from google code page. The current Gentoo version is 0.90, and you really should install the 0.91 version. With the required packages and libraries in place, download the Workbench sources and then build:

# cd mysql-workbench-5.1.4alpha# ./autogen.sh# make# make install

That should build and install MySQL Workbench for you. Just to confirm, here’s a screenshot of the built Workbench running on Gentoo Linux and displaying to my Mac OS X-based desktop.

Media_httpcoalfacemcs_njfld

How to analyze memory leaks on Windows

We use valgrind to find memory leaks in MySQL on Linux. The tool is a convenient, and often enlightening way of finding out where the real and potential problems are location. On Windows, you dont have valgrind, but Microsoft do provide a free native debugging tool, called the user-mode dump heap (UMDH) tool. This performs a similar function to valgrind to determine memory leaks. Vladislav Vaintroub, who works on the Falcon team and is one of our resident Windows experts provides the following how-to for using UMDH:

  1. Download and install debugging tools for Windows from hereMS Debugging ToolsInstall 64 bit version if you’re on 64 bit Windows and 32 bit versionotherwise.

  2. Change the PATH environment variable to include bin directory of Debugging tools.On my system, I addedC:Program FilesDebugging Tools for Windows 64-bit to the PATH.

  3. Instruct OS to collect allocation stack for mysqld with gflags -imysqld.exe +ust.On Vista and later, this should be done in “elevated” command prompt,it requires admin privileges.

    Now collect the leak information. The mode of operation is that: take theheap snapshot once, and after some load take it once again. Comparesnapshots and output leak info.

  4. Preparation : setup debug symbol path.In the command prompt window, do

    set _NT_SYMBOL_PATH= srv*C:websymbols*http://msdl.microsoft.com/download/symbols;G:bzrmysql-6.0sqlDebug

    Adjust second path component for your needs, it should include directorywhere mysqld.exe is.

  5. Start mysqld and run it for some minutes
  6. Take first heap snapshot

    umdh -p:6768 -f:dump1

    Where -p: actually, PID of my mysqld was 6768.

  7. Let mysqld run for another some minutes
  8. Take second heap snapshot

    umdh -p:6768 -f:dump2

  9. Compare snapshots

    umdh -v dump1 dump2 > dump.compare.txt

  10. Examine the result output file. It is human readable, but all numbers arein hex, to scare everyone except geeks.
  11. gflags -i mysqld.exe -ust

    Instruct OS not to collect mysqld user mode stacks for allocationsanymore.

These are 10 steps and it sounds like much work, but in reality it takes 15minutes first time you do it and 5 minutes next time.Additional information is given in Microsoft KB article about UMDHKB 268343.

New VoiceXML/XQuery Demo

I’ve got a new VoiceXML/XQuery article coming out, and IBM have asked that a demo of the service is live. The service is an interface RSS reader – you get to choose the topic and the feed (currently only four static feeds are provided), then it will read out the feed content. You can try out the demo by calling:

  • Skype: +99000936 9991260725
  • US (freephone): (800) 289-5570, then using PIN 9991260725

Occasionally the hosting times out, in which case, please contact me and I’ll check it out and restart or reboot the service.

An introduction to Eclipse for Visual Studio users

I’m seeing more and more people moving to Eclipse as a development platform, even those Windows users who have traditionally used Visual Studio. As an Eclipse user for quite a while now I’m often asked how good it is, or how to use it. Of course, telling people to simply try it out isn’t enough. Many people just don’t get Eclipse and cannot understand or translate the skills and experience they already have to the Eclipse environment. That’s where An introduction to Eclipse for Visual Studio users can help. It’s a quick overview of the fundamentals of Eclipse from the perspective of a Visual Studio user. For a more in depth examination, there’s a tutorial Eclipse for Visual Studio developers, and another on migrating your applications from VS to Eclipse: Migrate Visual Studio C and C++ projects to Eclipse CDT.I can recommend any (or indeed all) of these.

Brian is having the same issues

I mentioned the problem with setting up the stack on a new Solaris box yesterday and then realized this morning that I’d already added Brian Aker’s blog posting on the same issues to my queue (Solaris, HOW-TO, It works… Really…). Brian mentions pkg-get, the download solution from Blastwave which I neglected to mention yesterday. It certainly makes the downloading and installation easier, but its’s far from comprehensive and some of the stuff is out of date. To be honest I find that I install the stuff from Sun Freeware to get me going, then spend time recompiling everything myself by hand, for the plain and simple reason that I then know it is up to date and/or working or both. This is particularly the case for Perl, which often needs an update of the entire perl binary to get the updated versions of some CPAN modules. Ultimately, though, it sucks.

Setting up the developer stack issues

There’s a great post on Coding Horror about Configuring the Stack.Basically the gripe is with the complexity of installing the typical developer stack, in this case on Windows, using Visual Studio. My VS setup isn’t vastly different to the one Jeff mentions, and I have similar issues with the other stacks I use. I’ve just set up the Ultra3 mobile workstation again for building MySQL and other stuff on, and it took about 30 packages (from Sun Freeware) just to get the basics like gcc, binutils, gdb, flex, bison and the rest set up. It took the best part of a day to get everything downloaded, installed, and configured. I haven’t even started on modules for Perl yet. The Eclipse stack is no better. On Windows you’ll need the JDK of your choice, plus Eclipse. Then you’ll have to update Eclipse. Then add in the plugins and modules you want. Even though some of that is automated (and, annoyingly some of it is not although it could be), it generally takes me a few hours to get stuff installed. Admittedly on my Linux boxes it’s easier – I use Gentoo and copy around a suitable make.conf with everything I need in it, so I need only run emerge, but that can still take a day or so to get everything compiled.Although I’m sure we can all think of easier ways to create the base systems – I use Parallels for example and copy VM folders to create new environments for development – even the updating can take a considerable amount of time. I suggest the new killer app is one that makes the whole process easier.

Building an RPN to Equation Parser

In the final part of the examination of lex and yacc, here are the rules for building a parser that translates RPN into equation input (the reverse of the Equation to RPN parser.Translating RPN into standard equation format is a lot more difficult. Although the fundamentals are similar to the RPN parser (we still use a stack for values that are popped off when we see an operand), it is the recording of that process is much more difficult. In the RPN calculator, we can place the result of the calculation back onto the stack so that the value can be used. To resolve something into the equation format we need to record the equivalent expression, not the value. For that, we use a temporary string, and then check if the temporary string has a value and append further expressions to that string. Also, to help precedence in the final calculation (a process handled automatically by the sequence of numbers an operands in RPN) we also enclose each stage of the calculation in parentheses. The resulting rules are shown below. Note that for the example, only the basic operands (+ – * /) are supported, but the principles are valid for any combination.

%%list:   /* nothing */        | list EOLN        | list expr EOLN        { printf( "%sn",exprstring); }        ;expr:   primary        | expr primary MUL          {            if (strlen(exprstring) > 0)              {                sprintf(tmpstring,"(%s * %g)",exprstring, pop());              }            else              {                sprintf(tmpstring,"( %g * %g )",pop(),pop());              }            strcpy(exprstring,tmpstring);          }        | expr primary DIV          {            temp=pop();            if (strlen(exprstring) > 0)              {                sprintf(tmpstring,"(%s / %g)",exprstring, temp);              }            else              {                sprintf(tmpstring,"( %g / %g )",pop(),temp);              }            strcpy(exprstring,tmpstring);          }        | expr primary PLUS          {            if (strlen(exprstring) > 0)              {                sprintf(tmpstring,"(%s + %g)",exprstring, pop());              }            else              {                sprintf(tmpstring,"( %g + %g )",pop(),pop());              }            strcpy(exprstring,tmpstring);          }        | expr primary MINUS          {            temp=pop();            if (strlen(exprstring) > 0)              {                sprintf(tmpstring,"(%s - %g)",exprstring, temp);              }            else              {                sprintf(tmpstring,"( %g - %g )",pop(),temp);              }            strcpy(exprstring,tmpstring);          }        ;primary: NUMBER { push($1); }        ;%%

You can see the resulting output below:

4 5 + 6 *(( 4 + 5 ) * 6)

As mentioned in the original IBM article, we can pipe sequences together to show the parsing and calculation of an expression from different formats. For example:

$ rpntoequ|calc    4 5 + 6 *54

And even rpntoequ and equtorpn:

$ rpntoequ|equtorpn  4 5 + 6 *4 5 + 6 *

The current RPN translator as shown here is not as advanced as the main RPN system, and so it doesn’t support all the options, or expression formats, but you can get the general idea. You can download the code for this example: rpntoequ.tar.gz (Unix).

Building an Equation to RPN Parser

As part of the continuing examination of lex and yacc, here are the rules for building a parser that translates equations into RPN format.The process is actually very simple. Because of the way the parser works, all you have to do is print out whatever component we see at each stage. For example, when you see a number, print it out, and when you see a operand, also print it out. The basic ruleset is shown below:

%%list:   /* nothing */        | list EOLN        | list expr EOLN        { printf( "n" ); }        ;expr:   shift_expr        ;shift_expr: pow_expr        | shift_expr LEFTSHIFT pow_expr { printf("> "); }        ;pow_expr: add_expr        | pow_expr POW add_expr { printf("^ "); }        ;add_expr: mul_expr        | add_expr PLUS mul_expr  { printf("+ "); }        | add_expr MINUS mul_expr { printf("- "); }        ;mul_expr: unary_expr        | mul_expr MUL unary_expr { printf("* "); }        | mul_expr DIV unary_expr { printf("/ "); }        | mul_expr MOD unary_expr { printf("% "); }        ;unary_expr: postfix_expr        | MINUS primary %prec UNARYMINUS { printf("-"); }        | INC unary_expr { printf("++ "); }        | DEC unary_expr { printf("-- "); }        ;postfix_expr: primary        | postfix_expr INC { printf("++ "); }        | postfix_expr DEC { printf("-- "); }        | postfix_expr FACT { printf("! "); }        ; primary: NUMBER { printf("%g ",$1); }        | PI { printf("%g ", M_PI); }        | OPENBRACKET expr CLOSEBRACKET { }        | function_call        ;function_call: SIN OPENBRACKET expr CLOSEBRACKET { printf("sin "); }        | COS OPENBRACKET expr CLOSEBRACKET { printf("cos "); }        | TAN OPENBRACKET expr CLOSEBRACKET { printf("tan "); }        | ASIN OPENBRACKET expr CLOSEBRACKET { printf("asin "); }        | ACOS OPENBRACKET expr CLOSEBRACKET { printf("acos "); }        | ATAN OPENBRACKET expr CLOSEBRACKET { printf("atan "); }        ;%%

Why does it work? It has to do with the parser evaluates the different components. When, for example, the parser identifies an addition with this rule:

add_expr: mul_expr        | add_expr PLUS mul_expr  { printf("+ "); }

The code that the parser generates evaluates the sub-rules first, and in both cases the rules will ultimately lead to the numerical value. Each time the number is seen, the value is printed. Once both rules have been resolved, it then matches the full expression and outputs the plus sign. In use, the parser generates all of the necessary RPN:

4+5*64 5 6 * + (4+5)*64 5 + 6 *

You can download the source for the equation to RPN parser: equtorpn.tar.gz (Unix)