Versioned MySQL Replica
Last updated
Last updated
Dolt can be configured as a MySQL Replica.
In this mode, you set up Dolt to replicate a primary MySQL. Set up can take as as little as three commands. After set up, Dolt replicates every write to your primary to Dolt and creates a Dolt commit, giving you time travel, lineage, rollback, and other features on your Dolt replica.
This document will walk you through step-by-step on how to get Dolt running as a MySQL replica on your host. It will show off some unique version control features you get in this set up, including finding and fixing a bad change on primary.
First, we need a running MySQL instance. We'll consider this our "primary" database.
I use homebrew on my Mac. To get MySQL and start it, I open a terminal and run:
This starts a MySQL on port 3306
. I can connect to it with:
Simple enough. Keep that open. You're going to need it.
Now, you have to prepare MySQL to have a replica. This requires the following configuration which are on by default. So, don't touch these if you're starting fresh like me.
BINLOG_FORMAT
must be set to ROW
, which is the default in MySQL 8.0.
LOG_BIN
must be set to ON
, which is the default in MySQL 8.0.
SERVER_ID
must be set to any positive integer (so long as there is not a replica server using that same ID). The default here is 1.
Now, for the things you have to change. First you need to turn on ENFORCE_GTID_CONSISTENCY
. Go to the you mysql
client we started in step one and run the following query.
To make sure you have everything set up right, run the following and make sure the table looks the same.
You're now ready to configure a Dolt replica.
Dolt is a single ~68 megabyte program.
It's really easy to install. Download it and put it on your PATH
.
Here is a convenience script that does that for *NIX
platforms. Open a terminal and run it.
Dolt needs a place to put your databases. I put mine in ~/dolt_replica
.
Start a dolt sql-server
. This is a MySQL compatible database server similar to the one you started in the first section. You need to run it on a different port than 3306
because your MySQL is running there. So, we'll start it on port 1234
using the -P
option. I'm also going to start the server with --loglevel
at debug
so I can show you the queries replication is running.
The shell will just hang there. That means Dolt is running. Any errors you encounter running Dolt will be printed here. Because we're in debug mode, you can also see the queries run against the server in this log.
Open a new terminal and connect a MySQL client to your running Dolt just like you did to MySQL above, but this time specify port 1243
and host 127.0.0.1
to force MySQL through the TCP interface. Without the host specified it will connect using the socket interface to your running MySQL, not Dolt like you want.
Now you need to run three commands. First, make sure the replica server has a unique server_id
not used by any source or other replicas in our system. We'll use `2'.
Then, set the replication source to the primary, localhost:3306
with user root
, no password.
Finally, start the replica.
If you look at the logs in the Dolt terminal, you should see something like this:
You now have a Dolt replica of a running MySQL! If you have any data in the binlog
of the MySQL (ie. you didn't start from scratch), that data will replicate over to Dolt right now.
Now it's time to test out what you created. We'll start by creating a databases and a table.
On the primary, run the following queries.
You can see the queries replicating in the Dolt log:
All seems to be working.
Let's hop over to the mysql
client connected to Dolt and inspect it just to make sure.
The new database foo
and the table t
along with it's single row have replicated.
Now to show off the first new feature, the Dolt Commit log. The Dolt replica makes a Dolt commit after every transaction sent from the primary so you can see what changed and when.
As you can see, we have a full audit history of the database going back to inception. That should be useful.
It looks like c1
and c2
both went from NULL
to 0
in that commit, just as we'd expect. With a Dolt replica, you get a queryable audit log of every cell in your database.
The export data is 167 MB, which is not huge, but heavy enough to be non-trivial for testing.
It takes a bit longer than 25 seconds for Dolt to replicate that all over, especially with the log level at debug
, but it should be done in under a minute.
Over on my Dolt replica, I can see the employees database and its tables.
Now, let's do something crazy. We're going to mix a bad change in with a couple good changes and use the Dolt replica to find and revert the change.
Let's get a feel for this database first.
Now, I'll make some changes.
I'm going to add myself as an employee and pay myself $1,000,000!
Let's mix in a bad change. Think about how disastrous this query would be without a Dolt replica.
Finally, I'll add my co-founders Aaron and Brian to the database and give them salaries.
Let's say in this case, people are reporting their historical salaries have changed. We have a clue that something is wrong in the database. Let's head over to the Dolt replica and see what's up.
It looks like the bottom seven are the import because they are happening back to back. Let's look at the diff between the third and first commit, 4te2i1qheceek434m3uoqsuejfv6f0nu
and 649fgvojpthn1e21bqrdlv3r3bht4rqb
.
We're getting closer. We have a couple good changes in there, the last two, but we see the badly modified rows pretty clearly. Let's compare the last two commits.
That looks like the good changes so we've narrowed down the bad commit to 123d9jc85evssjcrv6u5mlt5dg4lk6ss
.
Take a minute to marvel at what we just did. We were able to identify what changed from a update salaries set salary=salary-1 order by rand() limit 5;
query all using queryable system tables on a replica. What a time to be alive! Now let's revert the change.
We now have the SQL we need to run on the primary. Let's run it and make sure there is no diff between main
and the revert_bad_change
branch we're working on after it replicates. This is now in my primary mysql shell.
And back over on Dolt we should now see no difference between main
and our branch revert_bad_change
.
Phew, that was pretty awesome. I was able to find a bad change using my Dolt replica, generate a patch to revert it on my primary, apply the patch, and make sure everything was right with the world again. With a Dolt replica, you never have to worry about bad administrator queries again!
Finally, you have to change GTID_MODE
to ON
. It is OFF
by default and . So, step through the options up to ON
like so.
Dolt has a number of , , and to expose the version control features. These tables, functions, and procedures are inspired by their Git equivalents, so git log
becomes the dolt_log
system table.
Let's see what happened in the last transaction. I'm going to see what changed in table in the last commit using the .
Now, let's examine a more interesting database. This time, I'm going to use recommended by . As the :
First, I clone the GitHub repo and import the data to my running MySQL using the in the test_db repository.
First, we want to find the changes in the last 10 transactions that touched the salaries table. To do this we use the unscoped to see what tables changes in each commit.
If you were running Dolt as the primary database, reverting a bad change is as simple as calling . But since we're running Dolt as a replica, we need Dolt to produce a SQL patch to revert the bad changes. To do this, we're going to make a branch on the replica, revert the change, and then use the to get the SQL we need to apply to our primary database.
First, we use to create a branch. Our revert changes will now be isolated from the replicating branch, main
.
Then we revert the bad commit using .
We use the function to generate the sql we want to run on our primary.
Let's concatentate all those SQL statements together. Dolt just yet so we need to use replace()
.