This is my next blog series. I'm going to write about how I automated many complex MySQL tasks with Rundeck. In my last series, I explained Rundeck basics; you can find those articles here. In this blog I'm writing about how I automated MySQL archiving for multiple tables in one Rundeck job.
Challenge with replication:
My MySQL setup has 1 Master and 4 read replicas, and the 3rd replica is an intermediate master for Replica 4. I don't want to archive (that is, delete) this data on Replicas 3 and 4, because those replicas are used for generating historical reports and by some internal applications.
To keep the archive deletes off Replicas 3 and 4, I decided to disable the binlog in my archive session. But then the deletes won't replicate to Replicas 1 and 2 either. So my final solution is: archive the data on the Master, then execute the same archive scripts directly on Replica 1 and Replica 2.
I have 50+ tables from which records older than 30 days need to be archived. Each table has a timestamp column, but the column name differs from table to table. For a few tables, I need to select rows older than 30 days based on multiple columns.
To pass the MySQL password to the shell scripts securely, we are going to use Rundeck key storage. Read here how to save the password in Rundeck.
The job should follow the steps below, and whenever a step fails, it must stop executing further steps.
- Dump each table with a WHERE clause (we are removing data older than 30 days, so dump exactly those rows).
- Sync the dump files to cloud storage (my complete infra is in GCP).
- Restore the dump on Archive DB Server.
- Delete the records on Master.
- Delete the records on Replica 1.
- Delete the records on Replica 2.
Let's create the automation job.
Table name and column names:
Create a file /opt/rundeckfiles/archive_tables.txt containing the table and column names. Each line starts with the table name, followed by comma-separated column names. The file structure would be:
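As an illustration (these table and column names are hypothetical, not from the original setup), each line holds one table name followed by its timestamp column(s):

```
orders,created_at
payments,created_at,settled_at
audit_log,logged_at
```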
Stored procedure to delete in chunks:
I have written a blog on how to perform archive operations the right way; you can read it here. We are going to delete this data in chunks of 1,000 rows. I maintain DBA-related functions and procedures in a separate, dedicated database.
My database server has only one master DB, and I am going to archive this particular database, so the procedure hard-codes my database name. I also used SET sql_log_bin = OFF; because I don't want the deletes to replicate to Replicas 3 and 4. If your use case is simply to archive on all the servers, you can remove this line.
The above procedure receives the WHERE clause from the shell script and prepares the complete DELETE statement.
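The original procedure isn't reproduced here, but a minimal sketch of the idea looks like the following. The database name dba_utils, procedure name archive_in_chunks, and target schema mydb are placeholders I made up; MYSQL_CMD defaults to cat, so the script only prints the SQL until you point it at a real mysql client:

```shell
#!/usr/bin/env bash
# Dry-run deployment of a hypothetical chunked-delete procedure.
set -euo pipefail
MYSQL_CMD="${MYSQL_CMD:-cat}"   # e.g. MYSQL_CMD="mysql -h 10.0.0.10 -u sqladmin"

SQL_BODY=$(cat <<'SQL'
USE dba_utils;  -- assumed name of the separate DBA database
DELIMITER $$
CREATE PROCEDURE archive_in_chunks(IN p_table VARCHAR(64), IN p_where TEXT)
BEGIN
  -- keep the deletes out of the binlog so Replicas 3 and 4 are untouched
  SET SESSION sql_log_bin = OFF;
  SET @q = CONCAT('DELETE FROM mydb.', p_table,
                  ' WHERE ', p_where, ' LIMIT 1000');
  PREPARE stmt FROM @q;
  REPEAT
    EXECUTE stmt;             -- delete one 1000-row chunk
    SET @rows = ROW_COUNT();  -- rows affected by the last EXECUTE
  UNTIL @rows < 1 END REPEAT;
  DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;
SQL
)
printf '%s\n' "$SQL_BODY" | $MYSQL_CMD
```

Deleting with LIMIT in a loop keeps each transaction small, which is the point of chunking; pause between chunks if your replicas lag.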
Grab the table name & column names from the file:
The shell script should take the first value on each line as the table name and the rest of the comma-separated values as the column names, so we need to split each line on commas.
To explain this process in more detail, here is an example. To execute the archive stored procedure, the script should generate the delete_query with the table name and WHERE clause. For the dump, we generate the WHERE clause alone.
See the output:
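A sketch of that parsing logic (the sample lines and the choice of AND between multiple columns are my assumptions; adjust the operator to your rules):

```shell
#!/usr/bin/env bash
# Split "table,col1[,col2...]" lines into a table name and a 30-day
# WHERE clause, printed as "table|where" for the dump/delete steps.
set -euo pipefail

build_where() {
  local line="$1" table cols col where=""
  table="${line%%,*}"   # first comma-separated field = table name
  cols="${line#*,}"     # remaining fields = timestamp columns
  local IFS=','
  for col in $cols; do
    if [ -n "$where" ]; then where="$where AND "; fi
    where="${where}${col} < DATE_SUB(NOW(), INTERVAL 30 DAY)"
  done
  printf '%s|%s\n' "$table" "$where"
}

build_where "orders,created_at"
build_where "payments,created_at,settled_at"
```

Running it prints one `table|where` pair per input line, which the later steps can split on `|`.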
Setup the Rundeck job:
- Go to Jobs –> Create Job.
- Give a name for this job.
- Under Options, add an option.
- Option Name: sqladmin-secret
- Option Label: SQLadmin-Secret
- Input Type: Secure –> Select the mysql password from key storage.
- Go to Workflow –> Add step.
The complete process will run from the Rundeck server itself; it will use mysql with the -h flag to talk to the remote DB servers.
My archive flow is to archive the Master first, then Replicas 1 and 2. If your use case is simply to archive on all the servers, then you must remove SET sql_log_bin = OFF; from the archive procedure, point step 3 at the Master IP address only (the deletes will then replicate everywhere), and step 4 becomes deleting old dumps (see step 6).
Step 1: Dump the Data & Upload to GCS Bucket
Here my backup files are uploaded to a GCS bucket under a folder named with the current date. You can use the AWS CLI instead if your infra is in AWS. Change the IP address to your MySQL Master, or if you want to take the dump from a replica, use the replica's IP instead.
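A dry-run sketch of this step (host, bucket, database, and paths are placeholder assumptions; the echo prefixes print the commands instead of running them, and only the first column is used in the WHERE clause for brevity):

```shell
#!/usr/bin/env bash
# Step 1 sketch: dump 30-day-old rows per table, then sync the dated
# folder to GCS. Remove the 'echo' prefixes to actually execute.
set -euo pipefail

MASTER_HOST="10.0.0.10"            # assumed Master IP
DB="mydb"                          # assumed database name
BUCKET="gs://my-archive-bucket"    # assumed GCS bucket
TODAY="$(date +%F)"
DUMP_DIR="/tmp/mysql-archive/$TODAY"
mkdir -p "$DUMP_DIR"

while IFS=',' read -r table cols; do
  where="${cols%%,*} < DATE_SUB(NOW(), INTERVAL 30 DAY)"
  echo mysqldump -h "$MASTER_HOST" -u sqladmin --no-create-info \
       --where="$where" "$DB" "$table" ">" "$DUMP_DIR/$table.sql"
done <<'EOF'
orders,created_at
payments,created_at
EOF

# only after every dump succeeded, sync the whole date folder
echo gsutil -m rsync -r "$DUMP_DIR" "$BUCKET/$TODAY/"
```

--no-create-info dumps data only, since the archive server already has the schema; with set -e, a failed mysqldump aborts before anything is synced.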
Step 2: Restore the dump files to Archive DB:
Please use a separate DB server for the archived data. Restore the schema on that server beforehand, and change the IP address to your archive DB.
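A matching dry-run sketch for the restore (the archive host and paths are assumptions; remove the echo prefix to execute):

```shell
#!/usr/bin/env bash
# Step 2 sketch: load every dump from today's folder into the archive
# DB server.
set -euo pipefail

ARCHIVE_HOST="10.0.0.20"                 # assumed archive server IP
DB="mydb"                                # assumed database name
DUMP_DIR="/tmp/mysql-archive/$(date +%F)"
mkdir -p "$DUMP_DIR"

for f in "$DUMP_DIR"/*.sql; do
  [ -e "$f" ] || continue                # no dumps yet: nothing to do
  echo mysql -h "$ARCHIVE_HOST" -u sqladmin "$DB" "<" "$f"
done
```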
Step 3,4,5 : Archive it on Master/Replica1/Replica2:
Copy the above script and use the same for steps 4 and 5, changing only the IP address of the target MySQL server.
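The three delete steps can be sketched as one loop over the three hosts (the IPs and the procedure name archive_in_chunks are placeholders; the echo prefix keeps it a dry run):

```shell
#!/usr/bin/env bash
# Steps 3-5 sketch: call the chunked-delete procedure on Master,
# Replica 1, and Replica 2 in order. Remove 'echo' to execute.
set -euo pipefail

TABLE="orders"                                          # from the parsed file
WHERE="created_at < DATE_SUB(NOW(), INTERVAL 30 DAY)"   # from the parsed file

# 10.0.0.10 = Master, 10.0.0.11 = Replica 1, 10.0.0.12 = Replica 2
for host in 10.0.0.10 10.0.0.11 10.0.0.12; do
  echo mysql -h "$host" -u sqladmin \
       -e "CALL dba_utils.archive_in_chunks('$TABLE', '$WHERE')"
done
```

In Rundeck these stay as three separate job steps so a failure on one server halts the job; the loop just shows that only the host changes between them.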
Step 6: Delete dumps older than 1 day:
We don't want to keep the dump files around, so the last step removes dump files older than one day.
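A sketch of the cleanup step (the dump root path is an assumption; point it at wherever step 1 writes its dumps):

```shell
#!/usr/bin/env bash
# Step 6 sketch: remove local dump files older than one day, then prune
# the now-empty dated folders.
set -euo pipefail

DUMP_ROOT="${DUMP_ROOT:-/tmp/mysql-archive}"
mkdir -p "$DUMP_ROOT"

find "$DUMP_ROOT" -name '*.sql' -type f -mtime +1 -delete
find "$DUMP_ROOT" -mindepth 1 -type d -empty -delete
```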
Add tables in future:
If you want to add more tables in the future, just add the table name and the column name(s) for the archive WHERE clause to the txt file; no changes to the job are needed.