This document will walk you through step-by-step on how to get Dolt running as a MySQL compatible server on your host. You will set up a schema, insert data, and compose read queries using SQL. The document will also cover a number of unique Git-like Dolt features like commits, logs, as of queries, rollback, branches, and merges.
Navigate to the directory where you would like your data stored
Dolt needs a place to store your databases. I'm going to put my databases in ~/dolt.
%cd~%mkdirdolt%cddolt
Any databases you create will be stored in this directory. So, for this example, a directory named getting_started will be created here later in this walkthrough, after you run create database getting_started; in a SQL shell (see section Create a schema). Navigating to ~/dolt/getting_started will then allow you to access this database using the Dolt command line.
Start a MySQL-compatible database server
Dolt ships with a MySQL compatible database server built in. To start it you use the command dolt sql-server. Running this command starts the server on port 3306.
Your terminal will just hang there. This means the server is running. Any errors will be printed in this terminal. Just leave it there and open a new terminal.
Connect with any MySQL client
In the new terminal, we will now connect to the running database server using a client.
Let's grab a copy of MySQL so we can connect with that client. Head over to the MySQL Getting Started documentation and install MySQL on your machine. I used Homebrew to install MySQL on my Mac.
MySQL comes with a MySQL server called mysqld and a MySQL client called mysql. You're only interested in the client. After following the instructions from MySQL's documentation, make sure you have a copy of the mysql client on your path:
Now, to connect the mysql client to Dolt, you are going to force the MySQL client through the TCP interface by passing in a host and port. The default is the socket interface which Dolt supports, but is only available on localhost. So, it's better to show off the TCP interface. The MySQL client also requires you specify a user, in this case root.
%mysql--host127.0.0.1--port3306-urootWelcometotheMySQLmonitor.Commandsendwith ; or \g.YourMySQLconnectionidis2Serverversion:5.7.9-VitessCopyright (c) 2000, 2022, Oracle and/or its affiliates.OracleisaregisteredtrademarkofOracleCorporationand/oritsaffiliates.Othernamesmaybetrademarksoftheirrespectiveowners.Type'help;'or'\h'forhelp.Type'\c'toclearthecurrentinputstatement.mysql>
To ensure the client actually connected, you should see the following in the dolt sql-server terminal
2022-06-06T13:26:55-07:00 INFO [conn 2] NewConnection {DisableClientMultiStatements=false}
As you can see, Dolt supports any MySQL-compatible client.
Create a schema
Now we're actually ready to do something interesting. I'll stay in the mysql client and execute the following SQL statements to create a database called getting_started. The getting_started database will have three tables: employees, teams, and employees_teams.
Dolt supports foreign keys, secondary indexes, triggers, check constraints, and stored procedures. It's a modern, feature-rich SQL database.
Make a Dolt commit
It's time to use your first Dolt feature. We're going to make a Dolt commit. A Dolt commit allows you to time travel and see lineage. Make a Dolt commit whenever you want to restore or compare to this point in time.
Dolt exposes version control functionality through a Git-style interface. On the command line, Dolt commands map exactly to their Git equivalent with the targets being tables instead of files. In SQL, Dolt exposes version control read operations as system tables and version control write operations as stored procedures.
The naming of the system tables and stored procedures follows the dolt_<command> pattern. So dolt add on the CLI becomes dolt_add as a stored procedure. Passing options also follows the command line model. For instance, to specify tables to add, send the table names in as options to the dolt_add procedure. For named arguments like sending a message into the dolt_commit command use two arguments in sequence like ('-m', 'This is a message'). If you know Git, the version control procedures and system tables should feel familiar.
There you have it. Your schema is created and you have a Dolt commit tracking the creation, as seen in the dolt_log system table.
Note, a Dolt commit is different than a standard SQL transaction COMMIT. In this case, I am running the database with AUTOCOMMIT on, so each SQL statement is automatically generating a transaction COMMIT. If you want a system to generate a Dolt commit for every transaction use the system variable,@@dolt_transaction_commit.
Insert some data
Now, I'm going to populate the database with a few employees here at DoltHub. Then, I'll assign the employees to two teams: engineering and sales. The CEO wears many hats at a start up so he'll be assigned to multiple teams.
mysql>insert into employees values (0, 'Sehn', 'Tim'), (1, 'Hendriks', 'Brian'), (2, 'Son','Aaron'), (3, 'Fitzgerald', 'Brian');Query OK, 4rows affected (0.01 sec)mysql>select*from employees where first_name='Brian';+------+------------+------------+| id | last_name | first_name |+------+------------+------------+| 1 | Hendriks | Brian || 3 | Fitzgerald | Brian |+------+------------+------------+2rowsinset (0.00 sec)mysql>insert into teams values (0, 'Engineering'), (1, 'Sales');Query OK, 2rows affected (0.00 sec)mysql>insert into employees_teams values (0,0), (1,0), (2,0), (0,1), (3,1);ERROR 1452 (HY000): cannot add or update a child row - Foreign key violation on fk: `rv9ek7ft`, table: `employees_teams`, referenced table: `teams`, key: `[2]`
Oops, I violated a constraint. It looks like I created the table with teams before employees. You should always specify your columns when you insert, not rely on natural ordering. Serves me right! Dolt comes with the full power of a modern SQL relational database to ensure data integrity.
mysql>insert into employees_teams(employee_id, team_id) values (0,0), (1,0), (2,0), (0,1), (3,1);Query OK, 5rows affected (0.01 sec)mysql>select first_name, last_name, team_name from employeesjoin employees_teams on (employees.id=employees_teams.employee_id)join teams on (teams.id=employees_teams.team_id)where team_name='Engineering';+------------+-----------+-------------+| first_name | last_name | team_name |+------------+-----------+-------------+| Tim | Sehn | Engineering || Brian | Hendriks | Engineering || Aaron | Son | Engineering |+------------+-----------+-------------+3rowsinset (0.00 sec)
Looks like everything is inserted and correct. I was able to list the members of the engineering team using that three table JOIN. Dolt supports up to twelve table JOINs. Again, Dolt is a modern SQL relational database paired with Git-style version control.
Examine the diff
Now, what if you want to see what changed in your working set before you make a commit? You use the dolt_status and dolt_diff_<tablename> system tables.
You can inspect the log using dolt_log and see which tables changed in each commit using an unscoped dolt_diff. Unscoped dolt_diff tells you whether schema, data, or both changed in that particular commit for the table.
mysql>select*from dolt_log;+----------------------------------+-----------+-----------------+-------------------------+----------------------------+
| commit_hash | committer | email | date | message |
+----------------------------------+-----------+-----------------+-------------------------+----------------------------+
| 13qfqa5rojq18j84d1n2htjkm6fletg4 | Tim Sehn | tim@dolthub.com | 2022-06-07 16:39:32.066 | Populated tables with data |
| ne182jemgrlm8jnjmoubfqsstlfi1s98 | Tim Sehn | tim@dolthub.com | 2022-06-07 16:35:49.277 | Created initial schema |
| vluuhvd0bn59598utedt77ed9q5okbcb | Tim Sehn | tim@dolthub.com | 2022-06-07 16:33:59.531 | Initialize data repository |
+----------------------------------+-----------+-----------------+-------------------------+----------------------------+
3rowsinset (0.00 sec)mysql>select*from dolt_diff;+----------------------------------+-----------------+-----------+-----------------+-------------------------+----------------------------+-------------+---------------+
| commit_hash | table_name | committer | email | date | message | data_change | schema_change |
+----------------------------------+-----------------+-----------+-----------------+-------------------------+----------------------------+-------------+---------------+
| 13qfqa5rojq18j84d1n2htjkm6fletg4 | teams | Tim Sehn | tim@dolthub.com | 2022-06-07 16:39:32.066 | Populated tables with data | 1 | 0 |
| 13qfqa5rojq18j84d1n2htjkm6fletg4 | employees | Tim Sehn | tim@dolthub.com | 2022-06-07 16:39:32.066 | Populated tables with data | 1 | 0 |
| 13qfqa5rojq18j84d1n2htjkm6fletg4 | employees_teams | Tim Sehn | tim@dolthub.com | 2022-06-07 16:39:32.066 | Populated tables with data | 1 | 0 |
| ne182jemgrlm8jnjmoubfqsstlfi1s98 | employees | Tim Sehn | tim@dolthub.com | 2022-06-07 16:35:49.277 | Created initial schema | 0 | 1 |
| ne182jemgrlm8jnjmoubfqsstlfi1s98 | employees_teams | Tim Sehn | tim@dolthub.com | 2022-06-07 16:35:49.277 | Created initial schema | 0 | 1 |
| ne182jemgrlm8jnjmoubfqsstlfi1s98 | teams | Tim Sehn | tim@dolthub.com | 2022-06-07 16:35:49.277 | Created initial schema | 0 | 1 |
+----------------------------------+-----------------+-----------+-----------------+-------------------------+----------------------------+-------------+---------------+
6rowsinset (0.00 sec)
Oh no! I made a mistake.
Dolt supports undoing changes via call dolt_reset(). Let's imagine I accidentally drop a table.
mysql>droptableemployees_teams;Query OK, 0rows affected (0.01 sec)mysql> show tables;+---------------------------+| Tables_in_getting_started |+---------------------------+| employees || teams |+---------------------------+2rowsinset (0.00 sec)
In a traditional database, this could be disastrous. In Dolt, you're one command away from getting your table back.
mysql>call dolt_reset('--hard');+--------+| status |+--------+| 0 |+--------+1rowinset (0.01 sec)mysql> show tables;+---------------------------+| Tables_in_getting_started |+---------------------------+| employees || employees_teams || teams |+---------------------------+3rowsinset (0.01 sec)
Dolt makes operating databases less error prone. You can always back out changes you have in progress or rewind to a known good state. You also have the ability to undo specific commits using dolt_revert().
Note, the only unrecoverable SQL statement in Dolt is drop database. This deletes the database and all of it's history on disk. drop database works this way for SQL tool compatibility as it is common for import tools to issue a drop database to clear all database state before an import. Dolt implements remotes like in Git so you can maintain an offline copy for backup using clone, fetch, push, and pull. Maintaining a remote copy allows you to restore in the case of an errant drop database query.
See the data in a SQL Workbench
Hate the command line? Let's use Tableplus to make some modifications. Tableplus is a free SQL Workbench. Follow the installation instructions from their website.
Now, to connect you must select MySQL as the connection type. Then enter a name for your connection, getting_started as your database, and root as your user.
Click connect and you'll be presented with a familiar database workbench GUI.
Make changes on a branch
To make changes on a branch, I use the dolt_checkout() stored procedure. Using the -b option creates a branch, just like in Git.
Tableplus gives me the ability to enter a multiple line SQL script on the SQL tab. I entered the following SQL to checkout a branch, update, insert, delete, and finally Dolt commit my changes.
call dolt_checkout('-b','modifications');update employees SET first_name='Timothy'where first_name='Tim';insert INTO employees (id, first_name, last_name) values (4,'Daylon', 'Wilkins');insert into employees_teams(team_id, employee_id) values (0,4);deletefrom employees_teams where employee_id=0and team_id=1;call dolt_commit('-am', 'Modifications on a branch')
Here's the result in Tableplus.
Back in my terminal, I cannot see the table modifications made in Tableplus because they happened on a different branch than the one I have checked out in my session.
mysql>select*from dolt_branches;+---------------+----------------------------------+------------------+------------------------+-------------------------+----------------------------+
| name | hash | latest_committer | latest_committer_email | latest_commit_date | latest_commit_message |
+---------------+----------------------------------+------------------+------------------------+-------------------------+----------------------------+
| main | 13qfqa5rojq18j84d1n2htjkm6fletg4 | Tim Sehn | tim@dolthub.com | 2022-06-07 16:39:32.066 | Populated tables with data |
| modifications | uhkv57j4bp2v16vcnmev9lshgkqq8ppb | Tim Sehn | tim@dolthub.com | 2022-06-07 16:41:49.847 | Modifications on a branch |
+---------------+----------------------------------+------------------+------------------------+-------------------------+----------------------------+
2rowsinset (0.00 sec)mysql>select active_branch();+-----------------+| active_branch() |+-----------------+| main |+-----------------+1rowinset (0.00 sec)mysql>select*from employees;+------+------------+------------+| id | last_name | first_name |+------+------------+------------+| 0 | Sehn | Tim || 1 | Hendriks | Brian || 2 | Son | Aaron || 3 | Fitzgerald | Brian |+------+------------+------------+4rowsinset (0.00 sec)
I can query the branch no matter what I have checked out using SQL as of syntax.
mysql>select*from employees as of 'modifications';+------+------------+------------+| id | last_name | first_name |+------+------------+------------+| 0 | Sehn | Timothy || 1 | Hendriks | Brian || 2 | Son | Aaron || 3 | Fitzgerald | Brian || 4 | Wilkins | Daylon |+------+------------+------------+5rowsinset (0.01 sec)
If I'd like to see the diff between the two branches, I can use the dolt_diff() table function. It takes two branches and the table name as arguments.
Changing schema on a branch gives you a new method for doing isolated integration testing of new schema changes.
Merge it all together
Let's assume all the testing of the new schema on the schema_changes branch and data on the modifications branch completed flawlessly. It's time to merge all our edits together onto main. This is done using the dolt_merge stored procedure.
Data changes successful as well. As you can see, I am now "Timothy" instead of "Tim", Daylon is added, and we all have start dates except for Daylon who was added on a different branch.
mysql>select first_name, last_name, team_name from employeesjoin employees_teams on (employees.id=employees_teams.employee_id)join teams on (teams.id=employees_teams.team_id)where team_name='Sales';+------------+------------+-----------+| first_name | last_name | team_name |+------------+------------+-----------+| Brian | Fitzgerald | Sales |+------------+------------+-----------+1rowinset (0.01 sec)
I'm also gone from the Sales Team. Engineering is life.
Now, we have a database with all the schema and data changes merged and ready for use.
mysql>select*from dolt_log;+----------------------------------+-----------+-----------------+-------------------------+----------------------------------------+
| commit_hash | committer | email | date | message |
+----------------------------------+-----------+-----------------+-------------------------+----------------------------------------+
| vn9b0qcematsj2f6ka0hfoflhr5s6p0b | Tim Sehn | tim@dolthub.com | 2022-06-07 17:10:02.07 | Merge branch 'modifications' into main |
| pg3nfi0j1dpc5pf1rfgckpmlteaufdrt | Tim Sehn | tim@dolthub.com | 2022-06-07 16:44:37.513 | Added start_date column to employees |
| uhkv57j4bp2v16vcnmev9lshgkqq8ppb | Tim Sehn | tim@dolthub.com | 2022-06-07 16:41:49.847 | Modifications on a branch |
| 13qfqa5rojq18j84d1n2htjkm6fletg4 | Tim Sehn | tim@dolthub.com | 2022-06-07 16:39:32.066 | Populated tables with data |
| ne182jemgrlm8jnjmoubfqsstlfi1s98 | Tim Sehn | tim@dolthub.com | 2022-06-07 16:35:49.277 | Created initial schema |
| vluuhvd0bn59598utedt77ed9q5okbcb | Tim Sehn | tim@dolthub.com | 2022-06-07 16:33:59.531 | Initialize data repository |
+----------------------------------+-----------+-----------------+-------------------------+----------------------------------------+
6rowsinset (0.00 sec)
Audit Cell Lineage
Which commit changed my first name? With Dolt you have lineage for every cell in your database. Let's use the dolt_history_<tablename> and dolt_diff_<tablename> to explore the lineage features in Dolt.
dolt_history_<tablename> shows you the state of the row at every commit.
mysql>select*from dolt_history_employees where id=0order by commit_date;+------+-----------+------------+------------+----------------------------------+-----------+-------------------------+| id | last_name | first_name | start_date | commit_hash | committer | commit_date |+------+-----------+------------+------------+----------------------------------+-----------+-------------------------+| 0 | Sehn | Tim | NULL | 13qfqa5rojq18j84d1n2htjkm6fletg4 | Tim Sehn | 2022-06-0716:39:32.066 || 0 | Sehn | Timothy | NULL | uhkv57j4bp2v16vcnmev9lshgkqq8ppb | Tim Sehn | 2022-06-0716:41:49.847 || 0 | Sehn | Tim | 2018-09-08 | pg3nfi0j1dpc5pf1rfgckpmlteaufdrt | Tim Sehn | 2022-06-0716:44:37.513 || 0 | Sehn | Timothy | 2018-09-08 | vn9b0qcematsj2f6ka0hfoflhr5s6p0b | Tim Sehn | 2022-06-0717:10:02.07 |+------+-----------+------------+------------+----------------------------------+-----------+-------------------------+4rowsinset (0.00 sec)
dolt_diff_<tablename> allows you to filter the history down to only commits when the cell in question changed. In this case, I'm interested in the commits that are changing my first name. Note, there are two commits that changed my name because one is the original change and the second is the merge commit.
mysql>select to_commit,from_first_name,to_first_name from dolt_diff_employeeswhere (from_id=0or to_id=0) and (from_first_name <> to_first_name or from_first_name isNULL)order by to_commit_date;+----------------------------------+-----------------+---------------+| to_commit | from_first_name | to_first_name |+----------------------------------+-----------------+---------------+| 13qfqa5rojq18j84d1n2htjkm6fletg4 | NULL | Tim || uhkv57j4bp2v16vcnmev9lshgkqq8ppb | Tim | Timothy || vn9b0qcematsj2f6ka0hfoflhr5s6p0b | Tim | Timothy |+----------------------------------+-----------------+---------------+3rowsinset (0.01 sec)
Dolt provides powerful data audit capabilities down to individual cells. When, how, and why has each cell in your database changed over time?
Conclusion
That should be enough to get you started. We covered installation, starting a SQL server, connecting with various clients, creating a database and schema, inserting and updating data on main, using branches for change isolation, rollback, diffs and logs, merge, and cell lineage. You had the grand tour. Hopefully you are starting to imagine the possibilities for your Dolt-backed applications.
Want to dive even deeper? Here are some links to advanced topics: