Kafka Replication from MySQL and Oracle

Hello again everybody.

Well, I promised it a couple of weeks ago, and I’m sorry it has been so long (I’ve been working on other fun stuff in addition to this). But I’m pleased to say that we now have a fully working applier that takes data from an incoming THL stream, whether that is Oracle or MySQL, and converts that into a JSON document and message for distribution over a Kafka topic.

Currently, the configuration is organised with the following parameters:

  • The topic name is set according to the incoming schema and table. You can optionally add a prefix. So, for example, if you have a table ‘invoices’ in the schema ‘sales’, your Kafka topic will be sales_invoices, or if you’ve added a prefix, ‘myprefix_schema_table’.
  • Data is marshalled into a JSON document as part of the message, and the structure is to have a bunch of metadata and then an embedded record. You’ll see an example of this below. You can choose what metadata is included here. You can also choose to send everything on a single topic. I’m open to suggestions on whether it would be useful for this to be configured on a more granular level.
  • The msgkey is composed of the primary key information (if we can determine it), or the sequence number otherwise.
  • Messages are generated one row of source data at a time. There were lots of ways we could have done this, especially with larger single dumps/imports/multi-million-row transactions. There is no more sensible way. It may mean we get duplicate messages into Kafka, but these are potentially easier to handle than trying to send a massive 10GB Kafka message.
  • Since Zookeeper is a requirement for Kafka, we use Zookeeper to record the replicator status information.

Side note: One way I might consider mitigating that last item (and which may also apply to some of our other upcoming appliers, such as the ElasticSearch applier) is to actually change the incoming THL stream so that it is split into individual rows. This sounds entirely crazy, since it would separate the incoming THL sequence number from the source (MySQL binlog, Oracle, er, other upcoming extractors), but it would mean that we have THL on the applier side which is a single row of data. That means we would have a THL seqno per row of data, but would also mean that in the event of a problem, the replicator could restart from that one row of data, rather than restarting from the beginning of a multi-million-row transaction.

Anyway, what does it all look like in practice?

Well, here’s a simple MySQL instance and I’m going to insert a row into this table:

mysql> insert into sbtest.sbtest values (0,100,"Base Msg","Some other submsg");

OK, this looks like this:

mysql> select * from sbtest.sbtest where k = 100;
+--------+-----+----------+-------------------+
| id     | k   | c        | pad               |
+--------+-----+----------+-------------------+
| 255759 | 100 | Base Msg | Some other submsg |
+--------+-----+----------+-------------------+

Over in Kafka, let’s have a look what the message looks like. I’m just using the console consumer here:

{"_meta_optype":"INSERT","_meta_committime":"2017-05-27 14:27:18.0","record":{"pad":"Some other submsg","c":"Base Msg","id":"255759","k":"100"},"_meta_source_table":"sbtest","_meta_source_schema":"sbtest","_meta_seqno":"10130"}

And let’s reformat that into something more useful:

{
   "_meta_committime" : "2017-05-27 14:27:18.0",
   "_meta_source_schema" : "sbtest",
   "_meta_seqno" : "10130",
   "_meta_source_table" : "sbtest",
   "_meta_optype" : "INSERT",
   "record" : {
      "c" : "Base Msg",
      "k" : "100",
      "id" : "255759",
      "pad" : "Some other submsg"
   }
}

 

Woohoo! Kafka JSON message. We’ve got the metadata (and those field names/prefixes are likely to change), but we’ve also got the full record details. I’m still testing other data types and ensuring we get the data through correctly, but I don’t foresee any problems.

There are a couple of niggles still to be resolved:

  • The Zookeeper interface which is used to store state data needs addressing; although it works fine there are some occasional issues with key/path collisions.
  • Zookeeper and Kafka connection are not fully checked, so it’s possible to appear to be up and running when no connection is available.
  • Some further tweaking of the configuration would be helpful – for example, setting or implying specific formats for msg key and the embedded data.

I may add further configurations for other items, especially since longer term we might have a Kafka extractor and maybe we want to use that to distribute records, in which case we might want to track other information like the additional metadata and configuration (SQL mode etc) currently held within the THL. I’ll keep thinking about that though.

Anything else people would like to see here? Please email me at mc.brown@continuent.com and we’ll sort something out.

 

Percona Live 2017

So glad to have had a successful Percona Live last week. Continuent were Diamond Sponsors and now that we are back into a company and not part of VMware we have a little more freedom to get back into the MySQL community.

I had two primary sessions, both on the replicator/ But one was looking specifically at the replicator and how we get data into Big Data targets, the other on general problems of replicating between heterogeneous sources. After the first of those, David from Percona interviewed me to understand a bit more about what I was talking about

I was also on the keynote panel where we discussed a variety of different topics and you can see the full video of that through the link.

 

Replicating Oracle Webinar Question Follow-up

We had really great webinar on Replicating to/from Oracle earliest this month, and you can view the recording of that Webinar here.

A good sign of how great a Webinar was is the questions that come afterwards, and we didn’t get through them all. so here are all the questions and answers for the entire webinar.

Q: What is the overhead of Replicator on source database with asynchronous CDC?

A: With asynchronous operation there is no substantial CPU overhead (as with synchronous), but the amount of generated redo logs becomes bigger requiring more disk space and better log management to ensure that the space is used effectively.

Q: Do you support migration from Solaris/Oracle to Linux/Oracle?

A: The replication is not certified for use on Solaris, however, it is possible to configure a replicator to operate remotely and extract from a remote Oracle instance. This is achieved by installing Tungsten Replicator on Linux and then extracting from the remote Oracle instance.

Q: Are there issues in supporting tables without Primary Keys on Oracle to Oracle replication?

A: Non-primary key tables will work, but it is not recommended for production as it implies significant overhead when applying to a target database.

Q: On Oracle->Oracle replication, if there are triggers on source tables, how is this handled?

A: Tungsten Replicator does not automatically disable triggers. The best solution is to remove triggers on slaves, or rewrite triggers to identify whether a trigger is being executed on the master or slave and skip it accordingly, although this requires rewriting the triggers in question.

Q: How is your offering different/better than Oracle Streams replication?

A: We like to think of ourselves as GoldenGate without the price tag. The main difference is the way we extract the information from Oracle, otherwise, the products offer similar functionality. For Tungsten Replicator in particular, one advantage is the open and flexible nature, since Tungsten Replicator is open source, released under a GPL V2 license, and available at https://code.google.com/p/tungsten-replicator/.

Q: How is the integrity of the replica maintained/verified?

A: Replicator has built-in real-time consistency checks: if an UPDATE or DELETE doesn’t update any rows, Replicator will go OFFLINE:ERROR, as this indicates an inconsistent dataset.

Q: Can configuration file based passwords be specified using some form of encrypted value for security purposes to keep them out of the clear?

A: We support an INI file format so that you do not have to use the command-line installation process. There is currently no supported option for an encrypted version of these values, but the INI file can be secured so it is only readable by the Tungsten user.

Q: Our source DB is Oracle RAC with ~10 instances. Is coherency maintained in the replication from activity in the various instances?

A: We do not monitor the information that has been replicated; but CDC replicates row-based data, not statements, so typical sequence insertion issues that might occur with statement based replication should not apply.

Q: Is there any maintenance of Oracle sequence values between Oracle and replicas?

A: Sequence values are recorded into the row data as extracted by Tungsten Replicator. Because the inserted values, not the sequence itself, is replicated, there is no need to maintain sequences between hosts.

Q: How timely is the replication? Particularly for hot source tables receiving millions of rows per day?

A: CDC is based on extracting the data at an interval, but the interval can be configured. In practice, assuming there are regular inserts and updates on the Oracle side, the data is replicated in real-time. See https://docs.continuent.com/tungsten-replicator-3.0/deployment-oracle-cdctuning.html for more information on how this figure can be tuned.

Q: Can parallel extractor instances be spread across servers rather than through threads on the same server (which would be constrained by network or HBA)?

A: Yes. We can install multiple replicators and tune the extraction of the parallel extractor accordingly. However, that selection would need to be manual, but certainly that is possible.

Q: Do you need the CSV file (to select individual tables with the setupCDC.sh configuration) on the master setup if you want all tables?

A: No.

Q: If you lose your slave down the road, do you need to re-provision from the initial SCN number or is there a way to start from a later point?

A: This is the reason for the THL Sequence Number introduced in the extractor. If you lose your slave, you can install a new slave and have it start at the transaction number where the failed slave stopped if you know it, since the information will be in the THL. If not, you can usually determine this by examining the THL directly. There should be no need to re-provision – just to restart from the transaction in the THL on the master.

Q: Regarding a failed slave, what if it failed such that we don’t have a backup or wanted to provision a second slave such that it had no initial data.

A: If you had no backups or data, yes, you would need to re-provision with the parallel extractor in order to seed the target database.

Q: Would you do that with the original SCN? If it had been a month or two, is there a way to start at a more recent SCN (e.g. you have to re-run the setupCDC process)?

A: The best case is to have two MySQL slaves and when one fails, you re-provision it from the healthy one. This avoids setupCDC stage.

However, the replication can always be started from a specific event (SCN) provided that SCN is available in the Oracle undo log space.

Q: How does Tungsten handle Oracle’s CLOB and BLOB data types

A: Providing you are using asynchronous CDC these types are supported; for synchronous CDC these types are not supported by Oracle.

Q: Can different schemas in Oracle be replicated at different times?

A: Each schema is extracted by a separate service in Replicator, so they are independent.

Q: What is the size limit for BLOB or CLOB column data types?

A: This depends on the CDC capabilities in Oracle, and is not limited within Tungsten Replicator. You may want to refer to the Oracle Docs for more information on CDC: http://docs.oracle.com/cd/B28359_01/server.111/b28313/cdc.htm

Q: With different versions of Oracle e.g. enterprise edition and standard edition one be considered heterogeneous environments?

A: Essentially yes, although the nomenclature is really only a categorization, it does not affect the operation, deployment or functionality of the replicator. All these features are part of the open source product.

Q: Can a 10g database (master) send the data to a 11g database (slave) for use in an upgrade?

A: Yes.

Q: Does the Oracle replicator require the Oracle database to be in archive mode?

A: Yes. This is a requirement for Oracle’s CDC implementation.

Q: How will be able to revisit this recorded webinar?

A: Slides and a recording from today’s webinar will be available at http://www.slideshare.net/Continuent_Tungsten

 

Continuent at Hadoop Summit

I’m pleased to say that Continuent will be at the Hadoop Summit in San Jose next week (3-5 June). Sadly I will not be attending as I’m taking an exam next week, but my colleagues Robert Hodges, Eero Teerikorpi and Petri Versunen will be there to answer any questions you have about Continuent products, and, of course, Hadoop replication support built into Tungsten Replicator 3.0.

If you are at the conference, please go along and say hi to the team. And, as always, if there are any questions please let them or me know.

Real-Time Data Movement: The Key to Enabling Live Analytics With Hadoop

An article about moving data into Hadoop in real-time has just been published over at DBTA, written by me and my CEO Robert Hodges.

In the article I talk about one of the major issues for all people deploying databases in the modern heterogenous world – how do we move and migrate data effectively between entirely different database systems in a way that is efficient and usable. How do you get the data you need to the database you need it in. If your source is a transactional database, how does that data get moved into Hadoop in a way that makes the data usable to be queried by Hive, Impala or HBase?

You can read the full article here: Real-Time Data Movement: The Key to Enabling Live Analytics With Hadoop

 

Cross your Fingers for Tech14, see you at OSCON

i

So I’ve submitted my talks for the Tech14 UK Oracle User Group conference which is in Liverpool this year. I’m not going to give away the topics, but you can imagine they are going to be about data translation and movement and how to get your various databases talking together.

I can also say, after having seen other submissions for talks this year (as I’m helping to judge), that the conference is shaping up to be very interesting. There’s a good spread of different topics this year, but I know from having talked to the organisers that they are looking for more submissions in the areas of Operating Systems, Engineered Systems and Development (mobile and cloud).

If you’ve got a paper, presentation, or idea for one that you think would be useful, please go ahead and submit your idea.

I’m also pleased to say that I’ll be at OSCON in Oregon in July, handling a Birds of a Feather (BOF) session on the topic of exchanging data between MySQL, Oracle and Hadoop. I’ll be there with my good friend Eric Herman from Booking.com where we’ll be providing advice, guidance, experiences, and hoping to exchange more ideas, wishes and requirements for heterogeneous environments.

It’d be great to meet you if you want to come along to either conference.