Debezium MySQL Snapshot From Read Replica And Resume From Master
In my previous post, I have shown you how to take the snapshot from Read Replica with GTID for Debezium MySQL connector. GTID concept is awesome, but still many of us using the replication without GTID. For these cases, we can take a snapshot from Read replica and then manually push the Master binlog information to the offsets topic. Injecting manual entry for offsets topic is already documented in Debezium. I’m just guiding you the way to take snapshot from Read replica without GTID.
Setup master slave replication.
The slave must have log-slave-updates=ON else connector will fail to read from beginning onwards.
Debezium connector should be able to access the Read replica with a user that is having necessary permissions.
Install Debezium connector.
Use a different name for Slave binlog:
Note: If you are already having a Master slave setup then ignore this step.
By default, MySQL use mysql-bin as a prefix for all the mysql binlog files. We should not have the same binlog name for both the master and the slave. If you are setting up a new master-slave replication then make this change in my.cnf file.
Create a new database to test this sync and insert some values.
Create the MySQL Connector Config:
File Name: mysql.json
Run the below command to register the mysql connector.
Once the snapshot has been done, then it’ll push the binlog information of the Slave while taking the snapshot. And then it’ll start to continue to do CDC for the upcoming data. You will see the first record in your connect-offsets topic as like below.
Then for continuous replication, it’ll start adding the record to this topic along with some more addition metadata like, server id, timestamp.
You can monitor the snapshot progress from JMX.
Sometimes the metrics take a few more minutes to update. So once you are able to see the last binlog information from the connet-offsets and from JMX the lag <10, then the snapshot is done.
Switch to Master:
Before switching to the master, we need to stop the slave instance to get the consistent binlog information of Master from the Read replica. And then stop the Debezium connector to update binlog information manually in the connect-offsets topic.
To simulate the real-time scenario, we can add 1 new row in our MySQL table. So this will never replicate to your slave. But once you switch the node, it should start reading from this row.
Also create a new table and insert one new row to this new table.
Once the switchover has been done, then it should read the 6'th row that we inserted and a new topic should be created for the testtbl
Get the last binlog info from offsets:
Install kafkacat in you broker node. (it’s available from confluent repo)
Run the below command get the last read binlog info.
-b - Broker
-f lag takes arguments specifying both the format of the output and the fields to include.
You will get something like this.
Partition(0) - The Partition where the information is location.
mysql-connector-db01 Connector Name
"server":"mysql-db01" Server name that the connect has.
"ts-sec":1577764293,"file":"ip-172-31-25-99-bin.000002","pos":7305,"row":1,"server_id":1,"event":2 - Binlog information
Now we’ll manually push a new record inside this topic with the same information but just replace the binlog file name and its position. We need to continue the CDC where it stopped, so the get the exact starting binlog information we’ll use slave status from the Read replica.
Make a note of Master-log-file and Exec-Master-Log-Pos from the slave status. Now inject a new record to the offets topic.
If you read the data from this topic, you’ll see the manually injected record.
Once you start the Debezium MySQL connector, then it’ll start reading from the slave but it’ll start looking for the binlog file mysql-bin.000003 If you use the same binlog file name for both master and slave, then it’ll be a problem. So we can do any one of the following method to solve this.
Use different naming conversion for both Master and Slave binlog files.
Delete all the binlog files from the Slave using Reset master command.
If the binlog file in slave is having a file named as mysql-bin.000003 then delete this file alone.
If the binlog file in slave is having a file names as mysql-bin.000003 then rename this file as mysql-bin.000003.old
Disclaimer: Please consider with your DBA before performing any of the above steps. I recommend using step 1 or 4.
Start the debezium connector:
You in your connector log file, you can see there is an error indicating that the Debezium is not able to find the binlog file called mysql-bin.000003.
Now we need to update the existing MySQL connector’s config and just change the "database.hostname" parameter.
Note: this JSON file format is different from the one which we used to register the connector. So make sure the syntax.
File Name: mysql-update.json
Run the below command to update the config file.
Once the update is done, immediately it’ll start connecting to the master and start reading the binlog file mysql-bin.000003 from position 7759.
We inserted a new record to the rohi table. If you read this topic then you can see the row has been read. Also start inserting few more rows to this table with id 7 and 8.
Also, it should added the testtbl to the kafka topic.
Once your switchover is done, resume the replication on your slave.