Dolt

Last updated 1 year ago

Dolt brings the features of Git-style distributed version control to the SQL database.

Git-style Distributed Version Control allowed the world to collaborate on open source software in a beautiful way. Dolt aspires to bring that distributed collaboration model to data.

SQL is the worldwide standard for data description and querying, and it has been popular for 50 years. By combining schema and data, SQL gives data practitioners a powerful language to communicate with.

Before Dolt, to share a SQL database with a fellow data practitioner, you both needed to share the same view of the data. Only one write could happen at a time. Making a copy implied creating a point-in-time backup and restoring it on a separate running server. Once that copy was made, the two databases could change independently. There was no tractable way to compare the two copies of the database to see what changed. Moreover, there was no easy way to merge the two copies back together. In source code parlance, the copy was a hard fork of the database.

The inability to copy and merge forced databases into a specific model of usage. Data was hard to move and share. As an industry, we built complicated pipelines to move and transform data between databases. We built APIs to allow programmatic, controlled access to data.

Here at DoltHub, we looked at all these systems and thought there must be a better way. What if you could copy a database, make changes, compare the database to any other copy, and merge the changes whenever you wanted? What if thousands of people could do this at the same time? What if you could use Git workflows on databases?

A database with these properties would allow thousands of users to read and write at the same time. If someone made a mistake, no big deal, just roll back the change. Need a copy of the data to run a metrics job on? No problem, just make a clone. Bug in production? Create a copy of the database on your laptop, start your services, change the production data to speed debugging. Want to open your data up to the world? Push it up to a remote that's accessible via the internet.

Concepts

In order to achieve the above mission, Dolt needed to implement Git concepts in a SQL database. As best we could, we tried to keep things as similar as possible.

We built Dolt using the following axioms:

  1. Git versions files. Dolt versions table schema and table data.

  2. Dolt will copy the Git command line exactly.

  3. Dolt will be MySQL compatible.

  4. Git features in SQL will extend MySQL SQL. Write operations will be procedures. Read operations will be system tables.
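
To make axioms 2 and 4 concrete, here is a minimal sketch of how a Git-style workflow might map onto SQL. The helper names `sql_quote` and `dolt_call` are hypothetical and exist only for illustration; the sketch assumes the SQL it prints would be sent to a running Dolt SQL server. Write operations become `CALL DOLT_<COMMAND>(...)` procedure calls that take Git-style arguments, and reading history is an ordinary `SELECT` against a system table such as `dolt_log`.

```python
# Illustrative sketch only: `sql_quote` and `dolt_call` are hypothetical helpers,
# not part of Dolt. They render Git-style commands as the SQL text a client
# could send to a Dolt SQL server.

def sql_quote(arg: str) -> str:
    """Quote a string as a MySQL-style literal (backslashes and quotes escaped)."""
    escaped = arg.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def dolt_call(command: str, *args: str) -> str:
    """Render a Git-style command as a CALL to the matching DOLT_* procedure."""
    if not command.isidentifier():
        raise ValueError(f"not a valid command name: {command!r}")
    quoted = ", ".join(sql_quote(a) for a in args)
    return f"CALL DOLT_{command.upper()}({quoted});"


# Write operations are procedures that take the same arguments as the CLI.
statements = [
    dolt_call("checkout", "-b", "feature"),  # like: dolt checkout -b feature
    dolt_call("add", "-A"),                  # like: dolt add -A
    dolt_call("commit", "-m", "Add a row"),  # like: dolt commit -m "Add a row"
    dolt_call("checkout", "main"),           # like: dolt checkout main
    dolt_call("merge", "feature"),           # like: dolt merge feature
]

# Read operations are plain queries against system tables.
statements.append("SELECT * FROM dolt_log;")  # like: dolt log

for statement in statements:
    print(statement)
```

In real client code, a parameterized query is a safer choice than building SQL strings by hand; the string rendering here is only meant to show the shape of the statements.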

In order to achieve the above at scale, we needed to start at the bottom: the storage engine of the database. Dolt is built from the storage engine up to offer you the Git experience in a SQL database.

In this section of the documentation, we will explain Git, SQL, and Relational Database Management System (RDBMS) concepts and how we applied them in Dolt using the above axioms.