Backups
There are several ways to safely backup Dolt databases and Dolt SQL servers. If you are using Hosted Dolt, then you get automatic backups without having to configure anything. If you are running your own Dolt SQL server, then you need to handle your own backups using one of the approaches below.
Backing up through point-in-time snapshots at a block device level is often the easiest approach and what we recommended if this works for your setup. Backing up by copying files at a file system level can also work in some cases, but requires that no Dolt processes are reading or writing any data while the file copy operation is in progress. You can also roll your own custom solutions by pushing to remotes or using the Dolt backup command. Make sure you include all additional configuration files needed to fully restore your Dolt SQL server environment. As with any backup solution, it is important that you regularly test your backup and restore processes.
Point-in-Time Snapshots on Block Devices
If you are running a Dolt SQL server with the data directory (i.e. the directory where your databases are stored) on a block device that supports point-in-time snapshots, such as AWS Elastic Block Store (EBS), then you can take advantage of these snapshots for backing up your databases and configuration. In most cases, this will handle the versioned content of your databases as well as all additional configuration files. Because of how point-in-time snapshots work, this operation is consistent and safe to perform, even when the Dolt SQL server is running. Because of its simplicity and safety, this is our recommended approach for backing up your Dolt SQL server.
Copying Files on File Systems
Unlike block devices that support point-in-time snapshots, you cannot rely on the same safety and consistency guarantees at the file system layer. This is true whether you are using a local file system or a file system service such as AWS Elastic File System. You cannot rely on copying the contents of a file system for a safe backup. It is possible that the file system copy operation will not capture a consistent set of files and you risk being unable to restore the backup. However, if you are able to fully stop the Dolt SQL server and ensure no Dolt processes are using the file system data, then you can safely copy the files at the file system level for a backup. This is only safe if no Dolt processes are using the data, so you should use extra caution to ensure that Dolt is not running if you want to pursue this approach. Otherwise, if you are not able to stop the Dolt SQL Server to perform a backup, you should rely on one of the backup approaches described below.
Pushing to Remotes
Using remotes for backups is suitable for some use cases, but be aware that using remotes for backups only backs up to the current commit of a branch, not the working set or other branches. Pushing to a remote creates an off server copy of the branch being pushed. Frequently pushing to a remote can serve as a backup for some use cases.
Configure a remote
This example uses DoltHub as a remote, but you can use Dolt with other remotes like filesystem, AWS S3, and GCS. I created an empty database on DoltHub and configured the appropriate read and write credentials on this host.
Backup by Pushing a Branch
Using DoltHub or DoltLab as a remote provides a web UI to your backups.
dolt backup
Command
dolt backup
CommandDolt also has backups, accessed with the dolt backup
command via the CLI or the dolt_backup()
stored procedure via SQL. These backups look more like traditional database backups. The entire state of the database, including uncommitted changes on all branches, are copied to another location.
Create a backup
To create a backup you first configure a backup using syntax similar to the remote syntax.
You can use any valid Dolt remote, including AWS S3, GCS, and DoltHub remotes. For example, to backup to a DoltHub remote:
The repository username/my-database-backup
will need to have already been created on DoltHub and your Dolt CLI will need to be authenticated to write to it.
It's important to note that syncing a backup to a remote will overwrite every existing branch and working set. The state of the remote repository becomes exactly the state of the database which is synced to it. In particular, remote branches will be deleted if they do not exist in the local database and all existing branches are pushed without regard to merged state, whether the push is a fast-forward of an existing remote branch, etc. If you want to keep more than one version of a backup, you should use different remote URLs.
Syncing a backup to an existing backup remote is incremental. It only copies the novel data which has changed since the last sync.
Sync a backup from SQL
You can use the dolt_backup
SQL procedure to sync backups from within SQL.
Restore from a backup
The backup command called with the restore option looks a lot like the dolt clone command.
Note the working set changes were also restored.
Backups are currently configured on a per-database basis. In a sql-server context with multiple databases, unique backup remotes should be configured for each database and each database should be synced individually.
Additional SQL Server Configuration
Using the Dolt backup
command or pushing to a remote will copy the contents and versioned history of a single database. To ensure you have everything needed to recreate your complete Dolt SQL server environment, you should also capture the following configuration and include it with your database backups.
SQL Server Configuration File – You should make sure you have a backed up copy of the YAML configuration file you use to configure your Dolt SQL server. If you keep your configuration in source control (e.g. to use it to automate your Dolt SQL server environment with Terraform) and never make any manual edits to your configuration file outside of source control, then you may choose to skip this file.
Global Dolt Configuration Directory – Dolt reads instance global configuration from the
.dolt
directory in the current user'sHOME
directory, or optionally, from the directory set with theDOLT_ROOT_PATH
environment variable. Global configuration can contain values such as SQL variables that persist between SQL server restarts, and credentials.Users and Grants Database – Access control information, such as users and grants, lives in the system
mysql
database (the same as a traditional MySQL database server). To backup this information, we recommend using themysqldump
tool to dump the contents of themysql
database (e.g.mysqldump mysql --flush-privileges --insert-ignore -uroot > dump.sql
). That dump file can then be piped into a running Dolt SQL server to execute each SQL statement and recreate all the users and grants from the previous environment.Per-Database Configuration File – Each Dolt database can have local configuration settings applied through the
dolt config
command. These settings are stored in the.dolt/config.json
file inside each database directory and should be included in backups if any configuration values are applied.Per-Database State File – Each Dolt database has a file stored at
.dolt/repo_state.json
inside each database directory that contains additional metadata for a database, such as the configured remotes.Per-Database Branch Permission Tables – The
dolt_branch_control
anddolt_branch_namespace
tables enable SQL server administrators to control how users can interact with branches in a running SQL server. These system tables will not be automatically included in backups when pushing to a remote or usingdolt backup
and need to be backed up if you are using these features. We recommend usingmysqldump
to dump the contents of these two tables (e.g.mysqldump -uroot <database name> dolt_branch_control dolt_branch_namespace
).
Last updated