Diff

What is a Diff?

Diff, short for difference, is used to display the differences between two references, usually commits. Dolt produces diffs of schema and data.
Dolt produces diffs of table schema. Schema diffs appear as raw textual differences between the CREATE TABLE statements used to define the table schema at each commit.
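To make "raw textual differences between the CREATE TABLE statements" concrete, here is a minimal sketch that uses Python's standard difflib module on two CREATE TABLE strings. It is only an illustration of a line-oriented text diff, not Dolt's implementation; the function name schema_diff and the a/docs and b/docs labels are chosen for this example.

```python
import difflib

# CREATE TABLE text before and after `alter table docs add column c1 int`.
BEFORE = """CREATE TABLE `docs` (
  `pk` int NOT NULL,
  PRIMARY KEY (`pk`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;"""

AFTER = """CREATE TABLE `docs` (
  `pk` int NOT NULL,
  `c1` int,
  PRIMARY KEY (`pk`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;"""


def schema_diff(old, new):
    """Return a line-oriented textual diff of two CREATE TABLE statements."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="a/docs",   # hypothetical labels for this example
            tofile="b/docs",
            lineterm="",
        )
    )


if __name__ == "__main__":
    print("\n".join(schema_diff(BEFORE, AFTER)))
```

The only line marked with a leading + is the new column definition, which is the same information the schema diff conveys.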
Dolt produces cell-wise diffs between table data. If a primary key exists, rows are identified across commits via primary key. Non-primary key changes will appear as updates. Primary key changes will appear as inserts and corresponding deletes.
If no primary key exists, all changes look like inserts and deletes. Effectively, for diff purposes, the key of a table with no primary key is the entire row.
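The two matching rules can be sketched in a few lines of Python. This is an illustrative model under simple assumptions (rows are tuples, snapshots are lists), not Dolt's code; the names diff_keyed and diff_keyless are made up for this example.

```python
from collections import Counter


def diff_keyed(old_rows, new_rows, pk_index=0):
    """Diff two row snapshots, matching rows by the primary key column."""
    old = {row[pk_index]: row for row in old_rows}
    new = {row[pk_index]: row for row in new_rows}
    changes = []
    for pk in sorted(old.keys() | new.keys()):
        if pk not in old:
            changes.append(("added", None, new[pk]))
        elif pk not in new:
            changes.append(("removed", old[pk], None))
        elif old[pk] != new[pk]:
            # Same key, different non-key cells: reported as an update.
            changes.append(("modified", old[pk], new[pk]))
    return changes


def diff_keyless(old_rows, new_rows):
    """Diff two snapshots where the whole row acts as the key."""
    old, new = Counter(old_rows), Counter(new_rows)
    removed = [("removed", row, None) for row in sorted((old - new).elements())]
    added = [("added", None, row) for row in sorted((new - old).elements())]
    return removed + added


if __name__ == "__main__":
    # Non-key change with a primary key: one "modified" record.
    print(diff_keyed([(1, 0), (2, 1)], [(1, 1), (2, 1)]))
    # Primary key change: a delete plus an insert.
    print(diff_keyed([(1, 0)], [(3, 0)]))
    # Same kind of edit without a primary key: a delete plus an insert.
    print(diff_keyless([(1, 1, 1), (2, 2, 2)], [(0, 1, 1), (2, 2, 2)]))
```

With a key, editing a non-key cell yields a single modified record. Without one, the same edit can only be expressed as removing the old row and adding the new one.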
Dolt can produce diffs in command line readable form, table form, or as a SQL patch.
Dolt can produce diffs at scale because the Dolt storage engine breaks the rows in the database down into chunks. Each chunk is content-addressed and stored in a tree called a Prolly Tree. Thus, to calculate data diffs, Dolt walks the trees at both commits, exposing the chunks that are different. For instance, if nothing has changed, the content address of the root of the table is unchanged.
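The idea of skipping unchanged subtrees by comparing content addresses can be sketched with a toy hash tree. This is a simplified model, not a Prolly Tree: it assumes fixed-size chunks and two trees of identical shape, whereas a real Prolly Tree chooses chunk boundaries from the content itself. The names Node, build, and changed_chunks are hypothetical.

```python
import hashlib


class Node:
    """A tree node addressed by the hash of its content."""

    def __init__(self, rows=None, children=None):
        self.rows = rows or []          # leaf payload: one chunk of rows
        self.children = children or []  # internal payload: child nodes
        if self.children:
            payload = "".join(child.address for child in self.children)
        else:
            payload = repr(self.rows)
        self.address = hashlib.sha256(payload.encode()).hexdigest()


def build(rows, chunk_size=2, fanout=2):
    """Build a tree bottom-up from sorted rows using fixed-size chunks."""
    level = [
        Node(rows=rows[i:i + chunk_size])
        for i in range(0, len(rows), chunk_size)
    ] or [Node()]
    while len(level) > 1:
        level = [
            Node(children=level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
    return level[0]


def changed_chunks(a, b, visited=None):
    """Return (old_rows, new_rows) pairs for leaf chunks that differ.

    Subtrees whose addresses match are skipped without descending.
    Assumes both trees have the same shape (a simplification).
    """
    if visited is not None:
        visited.append(a.address)
    if a.address == b.address:
        return []
    if not a.children:
        return [(a.rows, b.rows)]
    out = []
    for child_a, child_b in zip(a.children, b.children):
        out.extend(changed_chunks(child_a, child_b, visited))
    return out


if __name__ == "__main__":
    v1 = [(pk, 0) for pk in range(8)]
    v2 = [(pk, 1 if pk == 5 else 0) for pk in range(8)]
    print(changed_chunks(build(v1), build(v2)))
```

When the two versions are identical, the walk stops at the root after a single address comparison. When one row changes, only the path from the root to the affected chunk is visited.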

How to use diffs

Diffs are an invaluable tool for data debugging.
In human-readable form, seeing which cells in your database changed can help you quickly spot problems in the data that might otherwise be overlooked. You can view diffs in human-readable form via the Dolt CLI or by querying the dolt_diff_<tablename> system table with SQL.
For instance, are you expecting no NULL cells but have some? This indicates a bug in your data creation process. Simply looking at a summary of how many rows were added, modified, and deleted in a specific change can be fruitful. Expecting only row additions in a change but got some modifications? A deeper dive into that import job may be required.
Programmatically, you can use SQL to explore very large diffs using the dolt_diff_<tablename> system tables.
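As a rough sketch of that kind of exploration, the snippet below runs two queries: a per-type change summary and a check for unexpected NULL cells. To stay self-contained it uses an in-memory SQLite table as a stand-in, shaped like a subset of the dolt_diff_docs columns that appear in this page's SQL example. Against a real Dolt database the SELECT statements would be run directly. The helper names summarize and unexpected_nulls are made up for this example.

```python
import sqlite3

# Stand-in for Dolt: an in-memory SQLite table holding a subset of the
# dolt_diff_docs columns (the commit hash and date columns are omitted).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dolt_diff_docs ("
    "to_pk INT, to_c1 INT, from_pk INT, from_c1 INT, diff_type TEXT)"
)
conn.executemany(
    "INSERT INTO dolt_diff_docs VALUES (?, ?, ?, ?, ?)",
    [
        (0, 0, None, None, "added"),
        (1, 1, 1, 0, "modified"),
        (None, None, 0, 0, "removed"),
        (1, 0, None, None, "added"),
        (2, 1, None, None, "added"),
    ],
)


def summarize(db):
    """Count how many rows were added, modified, and removed."""
    cursor = db.execute(
        "SELECT diff_type, COUNT(*) FROM dolt_diff_docs "
        "GROUP BY diff_type ORDER BY diff_type"
    )
    return dict(cursor.fetchall())


def unexpected_nulls(db):
    """List keys of surviving rows whose new c1 value is NULL."""
    cursor = db.execute(
        "SELECT to_pk FROM dolt_diff_docs "
        "WHERE diff_type <> 'removed' AND to_c1 IS NULL"
    )
    return [pk for (pk,) in cursor]


if __name__ == "__main__":
    print(summarize(conn))
    print(unexpected_nulls(conn))
```

A summary containing modified rows when only additions were expected, or a non-empty NULL list, is the kind of signal that points back to a faulty import job.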

Difference between Git diffs and Dolt diffs

Git and Dolt diffs are conceptually the same: both display the differences between two versions, of files in Git's case and of tables in Dolt's case.
The Git diff command supports many more file-specific options. Dolt diffs, unlike Git diffs, can be queried using SQL. Dolt produces diffs for both schema and data; there is no schema diff equivalent in Git.

Example

Schema

docs $ dolt sql -q "alter table docs add column c1 int"
docs $ dolt diff
diff --dolt a/docs b/docs
--- a/docs @ 90tss7r2gfraa2cjugganbbtg5j6kjfc
+++ b/docs @ nt808mhhienne2dss4mjdcj8jrdig6ml
  CREATE TABLE `docs` (
    `pk` int NOT NULL,
+   `c1` int,
    PRIMARY KEY (`pk`)
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
+-----+----+----+
|  <  | pk |    |
|  >  | pk | c1 |
+-----+----+----+
+-----+----+----+

Data with Primary Key

Addition/Deletion

docs $ dolt sql -q "insert into docs values (1,0),(2,1)"
Query OK, 2 rows affected
docs $ dolt diff
diff --dolt a/docs b/docs
--- a/docs @ cj9ln2kg7u6aiprr7i24rf48es4tk1eg
+++ b/docs @ qfiv5iankh0gltpov71h0mcvaqvftvlh
+-----+----+----+
|     | pk | c1 |
+-----+----+----+
|  +  | 1  | 0  |
|  +  | 2  | 1  |
+-----+----+----+
docs $ dolt sql -q "delete from docs where pk=0"
Query OK, 1 row affected
docs $ dolt diff
diff --dolt a/docs b/docs
--- a/docs @ cj9ln2kg7u6aiprr7i24rf48es4tk1eg
+++ b/docs @ 8aca41vfss9kcqkrdhos25be87nlu3b9
+-----+----+----+
|     | pk | c1 |
+-----+----+----+
|  -  | 0  | 0  |
|  +  | 1  | 0  |
|  +  | 2  | 1  |
+-----+----+----+

Update

docs $ dolt sql -q "update docs set c1=1 where pk=1"
Query OK, 1 row affected
Rows matched: 1 Changed: 1 Warnings: 0
docs $ dolt diff
diff --dolt a/docs b/docs
--- a/docs @ 8aca41vfss9kcqkrdhos25be87nlu3b9
+++ b/docs @ 2lcu9e49ia08icjonmt3l0s7ph2cdb5s
+-----+----+----+
|     | pk | c1 |
+-----+----+----+
|  <  | 1  | 0  |
|  >  | 1  | 1  |
+-----+----+----+

Data without Primary Key

Addition/Deletion

docs $ dolt sql -q "insert into no_pk values (0,0,0),(1,1,1),(2,2,2)"
Query OK, 3 rows affected
docs $ dolt diff
diff --dolt a/no_pk b/no_pk
--- a/no_pk @ df9bd3mf77t2gicphep87nvuobqjood7
+++ b/no_pk @ 7s8jhc9nlnouhai8kdtsssrm1hpegpf0
+-----+----+----+----+
|     | c1 | c2 | c3 |
+-----+----+----+----+
|  +  | 1  | 1  | 1  |
|  +  | 2  | 2  | 2  |
|  +  | 0  | 0  | 0  |
+-----+----+----+----+
docs $ dolt commit -am "Added data to no_pk table"
commit mjbtf27jidi86jrm32lvop7mmlpgplbg
Author: Tim Sehn <[email protected]>
Date: Mon Dec 06 13:38:19 -0800 2021

    Added data to no_pk table

docs $ dolt sql -q "delete from no_pk where c1=0"
Query OK, 1 row affected
docs $ dolt diff
diff --dolt a/no_pk b/no_pk
--- a/no_pk @ 7s8jhc9nlnouhai8kdtsssrm1hpegpf0
+++ b/no_pk @ s23c851fomfcjaiufm25mi9mlnhurh2c
+-----+----+----+----+
|     | c1 | c2 | c3 |
+-----+----+----+----+
|  -  | 0  | 0  | 0  |
+-----+----+----+----+

Update

docs $ dolt sql -q "update no_pk set c1=0 where c1=1"
Query OK, 1 row affected
Rows matched: 1 Changed: 1 Warnings: 0
docs $ dolt diff
diff --dolt a/no_pk b/no_pk
--- a/no_pk @ s23c851fomfcjaiufm25mi9mlnhurh2c
+++ b/no_pk @ 18k2q3pav9a2v8mkk26nhhhss4eda86k
+-----+----+----+----+
|     | c1 | c2 | c3 |
+-----+----+----+----+
|  -  | 1  | 1  | 1  |
|  +  | 0  | 1  | 1  |
+-----+----+----+----+

SQL

docs $ dolt sql -q "select * from dolt_diff_docs"
+-------+-------+----------------------------------+-----------------------------------+---------+---------+----------------------------------+-----------------------------------+-----------+
| to_c1 | to_pk | to_commit                        | to_commit_date                    | from_c1 | from_pk | from_commit                      | from_commit_date                  | diff_type |
+-------+-------+----------------------------------+-----------------------------------+---------+---------+----------------------------------+-----------------------------------+-----------+
| 0     | 0     | dmmkbaiq6g6mm0vruc07utpns47sjkv7 | 2021-12-06 21:31:54.041 +0000 UTC | NULL    | NULL    | v42og53ru3k3hak3decm23crp5p6kd2f | 2021-12-06 21:27:53.886 +0000 UTC | added     |
| 1     | 1     | 5emu36fgedeurr6qk5uq6mj96k5j53j9 | 2021-12-06 21:36:02.076 +0000 UTC | 0       | 1       | ne14m8g2trlunju5a2mu735kjioocmll | 2021-12-06 21:34:12.585 +0000 UTC | modified  |
| NULL  | NULL  | ne14m8g2trlunju5a2mu735kjioocmll | 2021-12-06 21:34:12.585 +0000 UTC | 0       | 0       | dmmkbaiq6g6mm0vruc07utpns47sjkv7 | 2021-12-06 21:31:54.041 +0000 UTC | removed   |
| 0     | 1     | ne14m8g2trlunju5a2mu735kjioocmll | 2021-12-06 21:34:12.585 +0000 UTC | NULL    | NULL    | dmmkbaiq6g6mm0vruc07utpns47sjkv7 | 2021-12-06 21:31:54.041 +0000 UTC | added     |
| 1     | 2     | ne14m8g2trlunju5a2mu735kjioocmll | 2021-12-06 21:34:12.585 +0000 UTC | NULL    | NULL    | dmmkbaiq6g6mm0vruc07utpns47sjkv7 | 2021-12-06 21:31:54.041 +0000 UTC | added     |
+-------+-------+----------------------------------+-----------------------------------+---------+---------+----------------------------------+-----------------------------------+-----------+
docs $ dolt sql -q "select * from dolt_diff_docs where from_pk=0"
+-------+-------+----------------------------------+-----------------------------------+---------+---------+----------------------------------+-----------------------------------+-----------+
| to_c1 | to_pk | to_commit                        | to_commit_date                    | from_c1 | from_pk | from_commit                      | from_commit_date                  | diff_type |
+-------+-------+----------------------------------+-----------------------------------+---------+---------+----------------------------------+-----------------------------------+-----------+
| NULL  | NULL  | ne14m8g2trlunju5a2mu735kjioocmll | 2021-12-06 21:34:12.585 +0000 UTC | 0       | 0       | dmmkbaiq6g6mm0vruc07utpns47sjkv7 | 2021-12-06 21:31:54.041 +0000 UTC | removed   |
+-------+-------+----------------------------------+-----------------------------------+---------+---------+----------------------------------+-----------------------------------+-----------+