End of an Era: Neither MC nor Continuent Attending Percona Live

Continuent have been a long term sponsor of the Percona Live conference, and the MySQL conference as it was before that, for many years. We have attended the conference both as a Diamond sponsor, and members of our staff attending and presenting our products and experience at the conference.

The nature of these conferences always changes over time, and we have seen over the last few years how the Percona Live conference has moved from being a pure MySQL conference to an open source database conference. Although Continuent continue to provide open source software and integrate with many open source databases, our core operation still revolves around MySQL clustering and replication for MySQL and Oracle.

Continuent is also evolving and changing and we are increasingly deploying and moving towards pure cloud-based environments, building and developing products that are used on the cloud or explicitly leverage cloud computing technology. We have a number of new products and initiatives specifically targeting these areas.

Over the course of the next year we will be releasing cloud editions of our clustering, replication and new backup and proxy services both directly and through our partners.

As such, this year we have made the difficult decision not to sponsor or attend the Percona Live conference, directing our energies to other conferences, webinars and meetups. We will be attending the AWS conference, for example, and we fully intend to be at some other select conferences this year dealing with analytics, in-memory computing, and cloud-based deployments.

To stay up-to-date with what Continuent are doing, keep reading the Continuent blog and follow us on Twitter and Facebook.

Keynote and Session at Percona Live Dublin 2017

On Sunday I will travel over to Dublin for Percona Live 2017.

I have two sessions, a keynote on the Wednesday morning where I’ll be talking about all the fun new stuff we have planned at Continuent and some new directions we’re working on.

I also have a more detailed session on our new appliers for Kafka, Elasticsearch and Cassandra, that’s Tuesday morning.

MCBrown-Keynote.jpg

If you haven’t already booked to come along, feel free to use the discount code SeeMeSpeakPLE17 which will get you 15% off!

Upcoming Webinar, 19th July, What is New in Tungsten Replicator 5.2 and Tungsten Clustering 5.2?

Continuent Tungsten 5.2 is just around the corner. This is one of our most exciting Tungsten product releases for some time!

In this webinar we’re going to have a look at a host of new features in the new release, including
Three new Replication Applier Targets (Kafka, Cassandra and Elasticsearch)
New improvements to our core command-line tools trepctl and thl
New foundations for our filtering services, and
Improvements to the compatibility between replication and clustering

This webinar is going be a packed session and we’ll show all the exciting stuff with more in-depth follow-up sessions in the coming weeks.

 

You’ll also learn about some more exciting changes coming in the upcoming Tungsten releases (5.2.1 and 5.3), and our major Tungsten 6.0 release due out by the end of the year.

So come and join us to get the low down on everything related to Tungsten Replicator 5.2 and Tungsten Clustering 5.2. on Wednesday, July 19, 2017 9:00 AM – 9:30 AM PDT at https://attendee.gotowebinar. com/register/ 4108437731342545667

New Continuent Webinar Wednesdays and Training Tuesdays

We are just starting to get into the swing of setting up our new training and webinar schedule.
Initially, there will be one Webinar session (typically on a Wednesday) and one training session (on a Tuesday) every week from now. We’ll be covering a variety of different topics at each.
Typically our webinars will be about products and features, comparisons to other products, mixed in with product news (new releases, new features) and individual sessions based on what is going on at Continuent and the market in general.
Our training, by comparison, is going to be a hands-on, step-by-step sequence covering all of the different areas of our product. So we’ll cover everything from the basics of how the products work, how to deploy them, typical functionality (switching, start/stop, etc), and troubleshooting.
All of the sessions are going to be recorded and we’ll produce a suitable archive page so that you can go and view the past sessions. Need a refresher on re-provisioning a node in your cluster? There’s going to be a video for it and documentation to back it up.
Our first webinar is actually next Thursday (the Wednesday rule wouldn’t be a good one without an exception) and is all about MySQL Multi-Site/Multi-Master Done Right:
In this webinar, we discuss what makes Tungsten Clustering better than other alternatives (AWS RDS, Galera, MySQL InnoDB Cluster, and XtraDBCluster), especially for geographically distributed multi-site deployments, both for disaster recovery and multi-site/multi-master needs.
If you want to attend, please go ahead and register using this link: http://go.continuent.com/n0JI04Q03EAV0RD0i000Vo4
Keep your eyes peeled for the other upcoming sessions. More details soon.

Kafka Replication from MySQL and Oracle

Hello again everybody.

Well, I promised it a couple of weeks ago, and I’m sorry it has been so long (I’ve been working on other fun stuff in addition to this). But I’m pleased to say that we now have a fully working applier that takes data from an incoming THL stream, whether that is Oracle or MySQL, and converts that into a JSON document and message for distribution over a Kafka topic.

Currently, the configuration is organised with the following parameters:

  • The topic name is set according to the incoming schema and table. You can optionally add a prefix. So, for example, if you have a table ‘invoices’ in the schema ‘sales’, your Kafka topic will be sales_invoices, or if you’ve added a prefix, ‘myprefix_schema_table’.
  • Data is marshalled into a JSON document as part of the message, and the structure is to have a bunch of metadata and then an embedded record. You’ll see an example of this below. You can choose what metadata is included here. You can also choose to send everything on a single topic. I’m open to suggestions on whether it would be useful for this to be configured on a more granular level.
  • The msgkey is composed of the primary key information (if we can determine it), or the sequence number otherwise.
  • Messages are generated one row of source data at a time. There were lots of ways we could have done this, especially with larger single dumps/imports/multi-million-row transactions. There is no more sensible way. It may mean we get duplicate messages into Kafka, but these are potentially easier to handle than trying to send a massive 10GB Kafka message.
  • Since Zookeeper is a requirement for Kafka, we use Zookeeper to record the replicator status information.

Side note: One way I might consider mitigating that last item (and which may also apply to some of our other upcoming appliers, such as the ElasticSearch applier) is to actually change the incoming THL stream so that it is split into individual rows. This sounds entirely crazy, since it would separate the incoming THL sequence number from the source (MySQL binlog, Oracle, er, other upcoming extractors), but it would mean that we have THL on the applier side which is a single row of data. That means we would have a THL seqno per row of data, but would also mean that in the event of a problem, the replicator could restart from that one row of data, rather than restarting from the beginning of a multi-million-row transaction.

Anyway, what does it all look like in practice?

Well, here’s a simple MySQL instance and I’m going to insert a row into this table:

mysql> insert into sbtest.sbtest values (0,100,"Base Msg","Some other submsg");

OK, this looks like this:

mysql> select * from sbtest.sbtest where k = 100;
+--------+-----+----------+-------------------+
| id     | k   | c        | pad               |
+--------+-----+----------+-------------------+
| 255759 | 100 | Base Msg | Some other submsg |
+--------+-----+----------+-------------------+

Over in Kafka, let’s have a look what the message looks like. I’m just using the console consumer here:

{"_meta_optype":"INSERT","_meta_committime":"2017-05-27 14:27:18.0","record":{"pad":"Some other submsg","c":"Base Msg","id":"255759","k":"100"},"_meta_source_table":"sbtest","_meta_source_schema":"sbtest","_meta_seqno":"10130"}

And let’s reformat that into something more useful:

{
   "_meta_committime" : "2017-05-27 14:27:18.0",
   "_meta_source_schema" : "sbtest",
   "_meta_seqno" : "10130",
   "_meta_source_table" : "sbtest",
   "_meta_optype" : "INSERT",
   "record" : {
      "c" : "Base Msg",
      "k" : "100",
      "id" : "255759",
      "pad" : "Some other submsg"
   }
}

 

Woohoo! Kafka JSON message. We’ve got the metadata (and those field names/prefixes are likely to change), but we’ve also got the full record details. I’m still testing other data types and ensuring we get the data through correctly, but I don’t foresee any problems.

There are a couple of niggles still to be resolved:

  • The Zookeeper interface which is used to store state data needs addressing; although it works fine there are some occasional issues with key/path collisions.
  • Zookeeper and Kafka connection are not fully checked, so it’s possible to appear to be up and running when no connection is available.
  • Some further tweaking of the configuration would be helpful – for example, setting or implying specific formats for msg key and the embedded data.

I may add further configurations for other items, especially since longer term we might have a Kafka extractor and maybe we want to use that to distribute records, in which case we might want to track other information like the additional metadata and configuration (SQL mode etc) currently held within the THL. I’ll keep thinking about that though.

Anything else people would like to see here? Please email me at mc.brown@continuent.com and we’ll sort something out.