Dolt's storage engine is based on code and ideas in Noms. Noms is a "decentralized database philosophically descendant from Git."
Dolt includes its fork of Noms directly in its own codebase because this was easier to manage, and the fork has been heavily modified since. We thank the Noms team for laying the groundwork for Dolt and helping us envision what was possible.
Dolt implements tables on top of Noms as a map of primary keys to values.
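As a rough illustration of the "map of primary keys to values" idea, the sketch below splits each row into a key tuple (the primary key columns) and a value tuple (the remaining columns), then keeps the map ordered by key. The function name `build_table` and the sample schema are hypothetical, and this is not Dolt's actual row encoding.

```python
def build_table(rows, pk_columns, columns):
    """Return a key-ordered map of primary-key tuple -> tuple of the other columns."""
    pk_idx = [columns.index(c) for c in pk_columns]
    val_idx = [i for i in range(len(columns)) if i not in pk_idx]
    table = {}
    for row in rows:
        key = tuple(row[i] for i in pk_idx)
        if key in table:
            raise ValueError(f"duplicate primary key {key}")
        table[key] = tuple(row[i] for i in val_idx)
    # Sorting by key mirrors the ordered map that a search tree provides.
    return dict(sorted(table.items()))


# Hypothetical schema: (id, name, age) with primary key (id).
columns = ["id", "name", "age"]
rows = [(2, "Bob", 31), (1, "Alice", 29)]
table = build_table(rows, ["id"], columns)
print(table)
```

A composite primary key works the same way: passing `["id", "name"]` as `pk_columns` would make each key a two-element tuple.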
The top level abstraction in Noms is a database. Noms databases can have zero or more datasets.
A dataset is a named pointer to the root value of one or more Prolly Trees. A Prolly Tree is a content-addressed search tree, similar in shape to a B-tree. Two Prolly Trees can be compared in time proportional to the size of the difference between them, a necessary performance characteristic for version control operations.
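The following one-level sketch suggests why such a comparison can be cheap. Rows are grouped into chunks whose boundaries depend only on row content, and each chunk is named by a hash of its contents, so two versions that differ in one row share almost all chunk addresses and only the unshared chunks need to be read. All names here are hypothetical, and the SHA-256 modulus rule is a simple stand-in, not the chunking function Noms or Dolt actually use. A real tree would apply the same idea recursively to internal nodes so that whole subtrees with equal addresses are skipped.

```python
import hashlib


def address(obj) -> str:
    """Content address: hash of a canonical text encoding of the object."""
    return hashlib.sha256(repr(obj).encode()).hexdigest()


def is_boundary(row, avg_chunk_size=16) -> bool:
    """Decide from the row's own content whether a chunk ends after it."""
    return int(address(row)[:8], 16) % avg_chunk_size == 0


def chunk_rows(table):
    """Split rows (sorted by primary key) into chunks, keyed by content address."""
    chunks, current = {}, []
    for row in sorted(table.items()):
        current.append(row)
        if is_boundary(row):
            chunks[address(tuple(current))] = current
            current = []
    if current:
        chunks[address(tuple(current))] = current
    return chunks


def diff(old_chunks, new_chunks):
    """Compare two versions by reading only chunks whose addresses differ."""
    old_only = old_chunks.keys() - new_chunks.keys()
    new_only = new_chunks.keys() - old_chunks.keys()
    before = {row for a in old_only for row in old_chunks[a]}
    after = {row for a in new_only for row in new_chunks[a]}
    return sorted(before - after), sorted(after - before)


# Two versions of a hypothetical 1000-row table that differ in one row.
v1 = {i: f"row-{i}" for i in range(1000)}
v2 = dict(v1)
v2[500] = "edited"

old_chunks, new_chunks = chunk_rows(v1), chunk_rows(v2)
removed, added = diff(old_chunks, new_chunks)
print(removed, added)
print(len(old_chunks.keys() - new_chunks.keys()), "old chunks were not reused")
```

Because a boundary depends only on the row before it, editing one row can change at most the chunk containing that row and its immediate neighbor; every other chunk keeps its address.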
The roots of the Prolly Trees are stored in a Merkle Directed Acyclic Graph (DAG). This structure allows Noms to share the storage of unchanged data across versions, which is key to scalability.
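A minimal sketch of that sharing, assuming a toy content-addressed store: every object is saved under the hash of its bytes, a tree root lists the addresses of its leaves, and a commit lists its root and its parent commits. The `ChunkStore` class, the JSON encoding, and the `datasets` dictionary are all hypothetical and only illustrate the shape of the graph, not the Noms or Dolt storage format.

```python
import hashlib
import json


class ChunkStore:
    """Toy content-addressed store: address = SHA-256 of the encoded value."""

    def __init__(self):
        self.chunks = {}

    def put(self, value) -> str:
        data = json.dumps(value, sort_keys=True).encode()
        addr = hashlib.sha256(data).hexdigest()
        self.chunks[addr] = data  # identical content lands on the same address
        return addr

    def get(self, addr):
        return json.loads(self.chunks[addr])


def write_version(store, leaves, parent=None):
    """Store leaves, a root that points at them, and a commit that points at the root."""
    leaf_addrs = [store.put(leaf) for leaf in leaves]
    root = store.put({"children": leaf_addrs})
    return store.put({"root": root, "parents": [parent] if parent else []})


store = ChunkStore()
v1_leaves = [[[1, "a"], [2, "b"]], [[3, "c"], [4, "d"]], [[5, "e"], [6, "f"]]]
v2_leaves = [v1_leaves[0], v1_leaves[1], [[5, "e"], [6, "CHANGED"]]]

c1 = write_version(store, v1_leaves)
c2 = write_version(store, v2_leaves, parent=c1)

# A dataset is a named pointer into the graph (hypothetical name "main").
datasets = {"main": c2}

# Ten objects were written across the two versions, but the two unchanged
# leaves are stored once, so the store holds eight.
print(len(store.chunks))
```

Only the edited leaf, the new root, and the new commit are added for the second version; the size of the unchanged data does not affect the cost of storing it.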
How Dolt Works Blog Series
The best deep dive into how the Dolt storage engine works is a series of blog posts by Aaron Son.