# Commit Graph

Dolt's unique [storage engine](https://docs.dolthub.com/architecture/storage-engine) implements a Git-style commit graph of [Prolly Trees](https://docs.dolthub.com/architecture/storage-engine/prolly-tree). Dolt's commit graph facilitates common version control operations like log, diff, branch and merge on database tables instead of files.

![Dolt commit graph](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-34b0499170045fd39a779c0463b038da3c227ec7%2Fcommit-graph-featured.png?alt=media)

### Git vs Dolt

Git and Dolt share the same [version control conceptual underpinnings](https://docs.dolthub.com/concepts/dolt/git). The commit graph concepts are the same, and so are the commands that modify the commit graph. The only difference is what each one versions: Git versions files, Dolt versions tables. Thus, if you know how the Git commit graph works, you know how the Dolt commit graph works. If not, read on.

### What is a Commit?

A commit is a marker in your version history that stores all the relevant information for recreating that version. A commit is identified by a unique commit hash that looks something like `9shmcqu3q4o6ke8807pedlad2cfakvl7`.

A commit contains two sets of information: the content and the metadata.

In Git, the content is the set of files as they existed at that point in time, identified by a content address. In Dolt, the content is the set of tables in the database at that point in time, also identified by a content address. In Dolt, content addresses are created using a novel data structure called a [Prolly Tree](https://docs.dolthub.com/architecture/storage-engine/prolly-tree), which allows for structural sharing, efficient diff, and fast querying of table data.

Additionally, commit metadata like author, date, and message is stored so it is easier to identify the commit you are looking for in the version history. This metadata is considered when creating the content address that you see in the commit log. So, even if two commits have the exact same content, they will have different commit hashes if they are committed at different times or by different authors.
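To make the effect of metadata on the commit hash concrete, here is a minimal sketch. It is not Dolt's or Git's actual hashing scheme: the JSON encoding, the use of SHA-256, and the function names `content_address` and `commit_hash` are all assumptions made for illustration.

```python
import hashlib
import json


def content_address(tables: dict) -> str:
    """Hash only the table contents (hypothetical encoding, not Dolt's)."""
    payload = json.dumps(tables, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def commit_hash(tables: dict, author: str, date: str, message: str) -> str:
    """Hash the content address together with the commit metadata."""
    record = {
        "content": content_address(tables),
        "author": author,
        "date": date,
        "message": message,
    }
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


tables = {"employees": [[1, "Ada"], [2, "Grace"]]}

# Same content, different author and date.
a = commit_hash(tables, "alice", "2024-01-01T00:00:00Z", "add employees")
b = commit_hash(tables, "bob", "2024-01-02T00:00:00Z", "add employees")

same_content = content_address(tables) == content_address(dict(tables))
different_commits = a != b
```

In this sketch the two commits share one content address but get different commit hashes, because the author and date feed into the second hash.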

### Why put Commits in a Graph?

Putting commits in a graph allows for a representation of history, branches, and merges, the core concepts of version control. A branch allows for multiple evolving histories. A merge allows two disparate histories to be combined.

#### How to Build a Commit Graph

The easiest way to understand the commit graph is to build one. Let's build a simple commit graph from scratch.

#### The Init Commit

To create a commit graph you must "initialize" one. Initialization can be done with the `init` command via the command line. This creates an "init commit". In Dolt, `create database` also creates an init commit if you are in the SQL context.

The `init` command creates a commit with metadata taken from the environment and empty contents.

![init commit](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-3bdc60cb8f9649700ce95d7ce51fd5eeeef81ab6%2Fcommit-graph-init.png?alt=media)

The init commit is made on the default branch, usually named `main`. A branch is a pointer to a commit. The tip of a branch has a special name or reference called `HEAD`.

![init commit on main](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-37598135525058381dfbf7f83605c4e9d260e07b%2Fcommit-graph-main-branch.png?alt=media)

#### `WORKING` and `STAGED`

In Git and Dolt, at the `HEAD` of a branch there are two additional special references, called `STAGED` and `WORKING`. These references point to active changes you are making to the `HEAD` of the branch. If there are no changes, the contents of `HEAD`, `STAGED`, and `WORKING` are the same.

![HEAD, STAGED, and WORKING](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-bc411ff93e173df55eeb2a728a792822c2daf615%2Fcommit-graph-staged-working.png?alt=media)

When you make changes to the content of a branch, changes are made in `WORKING`. Changes to the content of `WORKING` change its content address.

![WORKING changes](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-b594b9aacebc386b2b90c47023364b542fcf8412%2Fcommit-graph-working-changes.png?alt=media)

When you are ready to make a commit, you stage the changes using the `add` command. If you stage all your changes, `STAGED` and `WORKING` point to the same content and thus share the same content address.

![STAGED changes](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-8a76bdbda7d47d747537f20fdd9f4a4cb7533a48%2Fcommit-graph-staged-changes.png?alt=media)

`STAGED` and `WORKING` allow for changes to content to be tested and verified before being stored permanently in the commit graph.
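The relationship between `HEAD`, `STAGED`, and `WORKING` can be sketched with three plain dictionaries and a toy content address. The `address` helper, the table name `t`, and the row values are hypothetical. Real content addresses come from Prolly Trees, not from hashing JSON.

```python
import hashlib
import json


def address(tables: dict) -> str:
    """Toy content address: a short hash of the serialized tables."""
    payload = json.dumps(tables, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:8]


# With no active changes, all three references have the same content.
head = {"t": [[1, "a"]]}
staged = dict(head)
working = dict(head)
clean = address(head) == address(staged) == address(working)

# An edit lands in WORKING only, which changes its content address.
working = {"t": [[1, "a"], [2, "b"]]}
working_differs = address(working) != address(staged)

# `add` stages everything: STAGED now has the same content as WORKING.
staged = dict(working)
staged_matches_working = address(staged) == address(working)

# HEAD is untouched until a commit is made.
head_unchanged = address(head) != address(staged)
```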

`WORKING` is often called the working set. One way to think about the working set is that a traditional file system that doesn't use Git has only a working set. Likewise, traditional databases like MySQL or Postgres have only a working set. If you create a database in Dolt and only run traditional SQL, your working set will look and act exactly like a MySQL database.

#### History

Commits are created using the aptly named `commit` command. When you commit your `STAGED` changes, the content that is staged is moved to the tip of the branch and you have an opportunity to add metadata like message and author. The `HEAD` of the branch `main` becomes this newly created commit.

![A Second Commit](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-4f7dfb5e764d6390c62d6300b55ebcb0d9cb28b9%2Fcommit-graph-second-commit.png?alt=media)

Commits have zero to many parents. The init commit has zero parents. A normal commit has one parent, representing the previous commit metadata and content. A merge commit, which we'll discuss later, has many parents.

Parents allow for the history of a branch to be computed by walking the graph from the branch's `HEAD`. This is commonly called the commit log and is generated using the `log` command. For instance, in our pictured example, using the `log` command on the `main` branch would list commits `h512kl` and `t1ms3n`.
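Walking parents from `HEAD` can be sketched as a simple graph traversal. The commit hashes `c1` and `c2` and the `parents` mapping are hypothetical. The real `log` command also orders and formats its output, which this sketch leaves out.

```python
def log(parents: dict, head: str) -> list:
    """Return every commit reachable from head, visiting each commit once."""
    seen, order, stack = set(), [], [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        # Push parents so that the first parent is visited next.
        stack.extend(reversed(parents[commit]))
    return order


# A linear history: the init commit has no parents, the next commit has one.
parents = {
    "c1": [],      # init commit
    "c2": ["c1"],  # normal commit
}

history = log(parents, "c2")
```

Here `history` starts at `HEAD` (`c2`) and ends at the init commit (`c1`). The `seen` set matters once merge commits exist, because a commit can then be reached through more than one parent.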

#### Branches

Up to this point, we are dealing only with linear history. If there is only one editor making serial changes, the commit graph will look like a long line of commits. A linear commit graph is still a graph, but not a very interesting graph.

Branches allow for non-linear history, a fork in the commit graph. Branches are often used to isolate multiple users' changes. Two users can make changes to content without worrying about what the other is changing. This capability is quite powerful. We've all worked on a shared document where people stomp on each other's changes. Branches prevent stomping.

Branches are created using the `branch` command. When a branch is created, its `HEAD` points at a specified commit, usually the `HEAD` commit of the branch you are currently using.

![New branch](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-065cd6d5be3df3135d136d11d6eac7daca4970ec%2Fcommit-graph-new-branch.png?alt=media)

Now, using the same process as above, we can make a commit on the branch. The `HEAD` of the new branch now points at this new commit.

![Commits on a Branch](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-6c81a5033511c81dbb6aae4f5a37fcad710eff0c%2Fcommit-graph-commit-on-branch.png?alt=media)

In parallel, we can make a commit on `main`.

![Commits on main](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-2a540e91117f967cead19ef00ebec9b67142bf33%2Fcommit-graph-commit-on-main.png?alt=media)

The two branches now contain different contents and share a common ancestor. As you can see, parallel, isolated evolving histories are now possible using branches.
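Because a branch is just a pointer to a commit, the sequence above can be sketched with two dictionaries. The branch name `feature`, the commit hashes, and the `commit` helper are hypothetical and only illustrate how the pointers move.

```python
commits = {}   # commit hash -> list of parent hashes
branches = {}  # branch name -> hash of the commit at its HEAD


def commit(branch: str, new_hash: str) -> None:
    """Add a commit whose parent is the branch's current HEAD, then advance HEAD."""
    parent = branches.get(branch)
    commits[new_hash] = [parent] if parent else []
    branches[branch] = new_hash


commit("main", "c1")  # init commit
commit("main", "c2")  # second commit on main

# Creating a branch adds a new pointer at an existing commit.
branches["feature"] = branches["main"]

commit("feature", "c3")  # moves only the feature branch
commit("main", "c4")     # moves only main
```

After these steps `main` points at `c4` and `feature` points at `c3`, and both commits have `c2` as their parent. That shared parent is the common ancestor of the two branches.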

#### Merges

Merges are performed using the `merge` command. Merges allow you to join separate histories that exist on branches. Merges create a commit with multiple parents.

Merges are performed by finding the common ancestor commit and applying the changes from other branches in the merge to the current branch. Merge functionality requires the ability to quickly find the differences between the contents of two commits.
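The following sketch shows those two steps on a toy graph: find a common ancestor, then apply the other branch's changes since that ancestor to the current branch. It assumes table content is a dictionary from a hypothetical primary key to a row value, and it ignores conflicts. The helper names are illustrative and do not describe how Git or Dolt implement merge.

```python
from collections import deque

DELETED = object()  # marker for a key removed on the other branch


def ancestors(parents: dict, start: str) -> set:
    """All commits reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_base(parents: dict, ours: str, theirs: str):
    """Breadth-first from theirs: the first commit that ours can also reach."""
    reachable = ancestors(parents, ours)
    queue, seen = deque([theirs]), set()
    while queue:
        commit = queue.popleft()
        if commit in reachable:
            return commit
        if commit not in seen:
            seen.add(commit)
            queue.extend(parents[commit])
    return None


def diff(old: dict, new: dict) -> dict:
    """Changes that turn old into new."""
    changes = {k: v for k, v in new.items() if k not in old or old[k] != v}
    changes.update({k: DELETED for k in old if k not in new})
    return changes


def apply(rows: dict, changes: dict) -> dict:
    out = dict(rows)
    for key, value in changes.items():
        if value is DELETED:
            out.pop(key, None)
        else:
            out[key] = value
    return out


parents = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c2"]}
content = {
    "c2": {1: "Ada"},
    "c3": {1: "Ada", 2: "Grace"},  # HEAD of the other branch
    "c4": {1: "Ada", 3: "Linus"},  # HEAD of the current branch
}

base = merge_base(parents, "c4", "c3")
merged = apply(content["c4"], diff(content[base], content["c3"]))
parents["c5"] = ["c4", "c3"]  # the merge commit has two parents
```

The `diff` function is the part that has to be fast in practice, since it runs against full table contents.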

![Merge Commit](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-00cac062bf38e1ac5a3313239586a29c6684c231%2Fcommit-graph-merge-commit.png?alt=media)

After merging, it is common to delete the branch that was merged, signaling that the change intended on the branch is complete.

![Merge branch deleted](https://1372377717-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MO3iUAEaFtCYE0joxfA%2Fuploads%2Fgit-blob-3aef788ab8fbdd91b1959cc31eb65043d72fa9a8%2Fcommit-graph-merge-branch-deleted.png?alt=media)

Merges can generate [conflicts](https://docs.dolthub.com/concepts/dolt/git/conflicts). If two branches modify the same value, Git and Dolt notify the user. The user has the opportunity to resolve the conflicts as part of the merge.
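A conflict can be sketched as a key that both branches changed, relative to the common ancestor, to different values. This example works on whole rows keyed by a hypothetical primary key. The actual conflict rules in Git and Dolt are more detailed, so treat `find_conflicts` as an illustration only.

```python
def find_conflicts(base: dict, ours: dict, theirs: dict) -> dict:
    """Map each conflicting key to its (base, ours, theirs) values."""
    conflicts = {}
    for key in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        # Both sides changed the value, and they disagree on the result.
        if o != b and t != b and o != t:
            conflicts[key] = (b, o, t)
    return conflicts


base = {1: "Ada", 2: "Grace"}
ours = {1: "Ada L.", 2: "Grace"}           # changed row 1
theirs = {1: "Ada Lovelace", 2: "Grace H."}  # changed rows 1 and 2

conflicts = find_conflicts(base, ours, theirs)
```

Row 1 is a conflict because both branches changed it differently. Row 2 is not, because only one branch changed it, so that change can be applied automatically.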

Merges allow for collaboration among multiple users. Usually, prior to a merge, changes are reviewed by observing the computed differences between the branch you are merging from and the branch you are merging to. If the changes pass review, the merge is executed.

