Data Sharing

Last updated 1 year ago

Problem

  • Do you share data with customers?

  • Do they ask you what changed between versions you share?

  • Do they want to actively switch versions instead of having data change out from under them?

  • Or, are customers or vendors sharing data with you?

  • Are you having trouble maintaining quality of scraped data?

  • When new data is shared or scraped, do downstream systems break?

  • Would you like to see exactly what changed between data versions?

  • Do you want to add automated testing to data shared with you?

  • Would you like to instantly roll back to the previous version if tests fail?

Dolt solves this by…

Dolt was built for sharing. The Git model of code sharing has scaled to thousands of contributors for open source software. We believe the same model can work for data.

Dolt is the world's first version controlled SQL database. Git-style version control allows for decentralized, asynchronous collaboration. Every person gets their own copy of the database to read and write. DoltHub allows you to coordinate collaboration over the internet with permissions, human review, forks, and all the other distributed collaboration tools you are used to from GitHub.

Dolt and DoltHub are the best way to share data with customers. Use versions to satisfy both slow-upgrading and fast-upgrading consumers. Let your customers help make your data better. Versions also offer better debugging information: a customer can report that version X works but version Y doesn't. Your customers can even make changes and submit data patches for your review, much like open source.

Dolt and DoltHub are also great if vendors share data with you. When you receive data from a vendor, import the data into Dolt on a branch. Examine the diff, either with the human eye or programmatically, before putting the data into production. You can now build integration tests for vendor data. If there's a problem, never merge the import branch into main, or roll the change back if a bug is discovered in production. Use the problematic diff to debug with your vendor. The same tools you have for software dependencies, you now have for data dependencies.
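The vendor import workflow above can be sketched as a small Python wrapper around the `dolt` command line. This is a minimal sketch under stated assumptions, not a recommended implementation: it assumes the `dolt` binary is on your PATH, that you run it inside an existing Dolt database directory with `main` checked out, and that the target table already exists. The function name `import_vendor_drop`, the `checks_pass` callback, and the table, file, and branch names in the usage note are hypothetical.

```python
import subprocess
from typing import Callable, Sequence


def run_dolt(args: Sequence[str]) -> None:
    """Run one dolt CLI command and raise if it exits non-zero."""
    subprocess.run(["dolt", *args], check=True)


def import_vendor_drop(
    table: str,
    csv_path: str,
    branch: str,
    checks_pass: Callable[[], bool],
    run: Callable[[Sequence[str]], None] = run_dolt,
) -> bool:
    """Import a vendor file on its own branch and merge it only if checks pass.

    `checks_pass` is a hypothetical hook for your own integration tests.
    It is called while the import branch is checked out.
    Returns True if the import was merged into main, False otherwise.
    """
    # Isolate the vendor data on a new branch so main is untouched.
    run(["checkout", "-b", branch])
    # -u updates an existing table from the file.
    run(["table", "import", "-u", table, csv_path])
    run(["add", table])
    run(["commit", "-m", f"Import {csv_path} into {table}"])
    # Print what changed relative to main for human review.
    run(["diff", "main", branch])
    # Run the caller's tests against the imported data.
    ok = checks_pass()
    run(["checkout", "main"])
    if ok:
        run(["merge", branch])
    # On failure the branch is kept so the diff can be shared with the vendor.
    return ok
```

For example, `import_vendor_drop("prices", "prices.csv", "vendor-import", my_tests)` would leave `main` unchanged whenever `my_tests()` returns `False`, while the unmerged `vendor-import` branch keeps the problematic diff available for debugging. Passing a different `run` callable lets you record the commands instead of executing them.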

Dolt replaces...

Exchanging Files

Dolt replaces exchanging flat data files like CSVs via email, FTP servers, or other file transfer techniques. Dolt allows data to maintain its schema on exchange, including constraints, triggers, and views. This richer format of exchange reduces transfer errors. Dolt also allows you to change the data to fit your needs and still get updates from your source. Dolt will notify you if your changes conflict with the source.

External APIs

Dolt is ideal for sharing data that does not have an API. But even for data with an API, Dolt is often more convenient. With Dolt, you get all the data and its history. With APIs, you often have to assemble the data with multiple API calls. With APIs, the data can change out from under you, whereas with Dolt you can read a version of the data until you are ready to upgrade. DoltHub ships with a SQL API so you can choose the data sharing solution that is right for your use case.

Companies Doing This

  • Bitfinex

  • KAPSARC

Case Studies

Let us know if you would like us to feature your use of Dolt for data sharing here.

Other Related Articles

  • Distribute Data with Dolt, not APIs

  • Data Collaboration on DoltHub

  • DoltHub is the Figma of Databases