LogoLogo
DoltHubBlogDiscordGitHubDolt
  • Introduction
    • What Is Dolt?
    • Installation
      • Linux
      • Windows
      • Mac
      • Build from Source
      • Application Server
      • Docker
      • Upgrading
    • Getting Started
      • Version Controlled Database
      • Git For Data
      • Versioned MySQL Replica
    • Use Cases
      • Data Sharing
      • Data and Model Quality Control
      • Manual Data Curation
      • Version Control for your Application
      • Versioned MySQL Replica
      • Audit
      • Configuration Management
      • Offline First
  • Concepts
    • Dolt
      • Git
        • Commits
        • Log
        • Diff
        • Branch
        • Merge
        • Conflicts
        • Remotes
        • Working Set
      • SQL
        • Databases
        • Schema
        • Tables
        • Primary Keys
        • Types
        • Indexes
        • Views
        • Constraints
        • Triggers
        • Procedures
        • Users/Grants
        • Transactions
        • System Variables
      • RDBMS
        • Server
        • Backups
        • Replication
    • DoltHub/DoltLab
      • Permissions
      • Pull Requests
      • Issues
      • Forks
  • SQL Reference
    • Running the Server
      • Configuration
      • Access Management
      • Branch Permissions
      • Backups
      • Garbage Collection
      • Metrics
      • Replication
      • Troubleshooting
    • Version Control Features
      • Using Branches
      • Merges
      • Querying History
      • Using Remotes
      • Procedures
      • Functions
      • System Tables
      • System Variables
      • Saved Queries
    • SQL Language Support
      • Data Description
      • Expressions, Functions, Operators
      • Supported Statements
      • MySQL Information Schema
      • Collations and Character Sets
      • System Variables
      • Miscellaneous
    • Supported Clients
      • Programmatic
      • SQL Editors
    • Benchmarks and Metrics
      • Correctness
      • Latency
      • Import
  • CLI Reference
    • Commands
    • Git Comparison
  • Architecture
    • Overview
    • Storage Engine
      • Commit Graph
      • Prolly Trees
      • Block Store
    • SQL
      • Go MySQL Server
      • Vitess
  • Guides
    • Cheat Sheet
    • Contributing
      • dolt
      • go-mysql-server
    • MySQL to Dolt Replication
    • Importing Data
    • Integrations
  • Other
    • FAQ
    • Roadmap
    • Versioning
  • Products
    • Hosted Dolt
      • Getting Started
      • Notable Features
      • SQL Workbench
      • Cloning a Hosted Database
      • Using DoltHub as a Remote
      • Infrastructure
    • DoltHub
      • Data Sharing
      • API
        • Authentication
        • SQL
        • CSV
        • Database
        • Hooks
      • Continuous Integration
        • Getting Started
        • Workflow Reference
      • Transform File Uploads
      • Workspaces
    • DoltLab
    • Dolt Workbench
    • DoltgreSQL
Powered by GitBook
On this page
  • Misc features
  • Client Compatibility
  • Join hints
  • Table Statistics
  • ANALYZE table
  • Disable
  • Auto-Refresh
  • Stats Garbage Collection
  • Stats Controller Functions

Was this helpful?

Edit on GitHub
Export as PDF
  1. SQL Reference
  2. SQL Language Support

Miscellaneous

PreviousSystem VariablesNextSupported Clients

Last updated 1 month ago

Was this helpful?

Misc features

Component
Supported
Notes and limitations

Information schema

✅

Views

✅

Window functions

🟠

Common table expressions (CTEs)

✅

Stored procedures

🟠

Cursors

✅

Triggers

✅

Client Compatibility

Some MySQL features are client features, not server features. Dolt ships with a client (ie. ) and a server (). The Dolt client is not as sophisticated as the mysql client. To access these features you can use the mysql client that ships with MySQL.

Feature
Supported
Notes and limitations

SOURCE

❌

Works with Dolt via the mysql client

LOAD DATA LOCAL INFILE

❌

LOAD DATA INFILE works with the Dolt client. The LOCAL option only works with Dolt via the mysql client

Join hints

Dolt supports the following join hints:

name
supported
detail

JOIN_ORDER(,...)

✅

Join tree in scope should use the following join execution order. Must include all table names.

LOOKUP_JOIN(,)

✅

Use LOOKUP strategy joining two tables.

MERGE_JOIN(,)

✅

Use MERGE strategy joining two tables.

HASH_JOIN(,)

✅

Use HASH strategy joining two tables.

INNER_JOIN(,)

✅

Use INNER strategy joining two tables.

SEMI_JOIN(,)

✅

Use SEMI strategy joining two tables (for EXISTS or IN queries).

ANTI_JOIN(,)

✅

Use ANTI strategy joining two tables (for NOT EXISTS or NOT IN queries).

JOIN_FIXED_ORDER

❌

Join tree uses in-place table order for execution.

NO_ICP

❌

Disable indexed range scans on index using filters.

Join hints are indicated immediately after a SELECT token in a special comment format /*+ */. Multiple hints should be separated by spaces:

SELECT /*+ JOIN_ORDER(arg1,arg2) */ 1
SELECT /*+ JOIN_ORDER(arg1,arg2) NO_ICP */ 1

Join hints currently require a full set of valid hints for all to be applied. For example, if we have a three table join we can enforce JOIN_ORDER on its own, join strategies on their own, or both order and strategy:

SELECT /*+ JOIN_ORDER(xy,uv,ab) LOOKUP_JOIN(xy,uv) HASH_JOIN(uv,ab) */ 1
FROM xy
JOIN uv on x = u
JOIN ab on a = u;

Additional notes:

  • If one hint is invalid given the execution options, no hints are applied and the engine falls back to default costing.

  • Join operator hints are order-insensitive

  • Join operator hints apply as long as the indicated tables are subsets of the join left/right.

Table Statistics

ANALYZE table

Dolt currently supports table statistics for index and join costing.

Statistics are auto-collected by default for servers, but cab be manually collected by running ANALYZE TABLE <table, ...>.

Here is an example of how to initialize and observe statistics:

CREATE TABLE xy (x int primary key, y int);
INSERT INTO xy values (1,1), (2,2);
ANALYZE TABLE xy;
SELECT * from information_schema.tables;
+-------------+------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| SCHEMA_NAME | TABLE_NAME | COLUMN_NAME | HISTOGRAM                                                                                                                                                                                                                                                                                                                                                      |
+-------------+------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tmp4        | xy         | x           | {"statistic": {"avg_size": 0, "buckets": [{"bound_count": 1, "distinct_count": 2, "mcv_counts": [1,1], "mcvs": [[1],[2]], "null_count": 0, "row_count": 2, "upper_bound": [2]}], "columns": ["x"], "created_at": "2023-11-14T11:33:32.250178-08:00", "distinct_count": 2, "null_count": 2, "qualifier": "tmp4.xy.PRIMARY", "row_count": 2, "types:": ["int"]}} |
+-------------+------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
create table horses (id int primary key, name varchar(10), key(name));
insert into horses select x, 'Steve' from (with recursive inputs(x) as (select 1 union select x+1 from inputs where x < 1000) select * from inputs) dt;
analyze table horses;
select `index`, `position`, row_count, distinct_count, columns, upper_bound, upper_bound_cnt, mcv1 from dolt_statistics;
+---------+----------+-----------+----------------+----------+-------------+-----------------+-----------+
| index   | position | row_count | distinct_count | columns  | upper_bound | upper_bound_cnt | mcv1      |
+---------+----------+-----------+----------------+----------+-------------+-----------------+-----------+
| primary | 0        | 344       | 344            | ["id"]   | [344]       | 1               | [344]     |
| primary | 1        | 125       | 125            | ["id"]   | [469]       | 1               | [469]     |
| primary | 2        | 249       | 249            | ["id"]   | [718]       | 1               | [718]     |
| primary | 3        | 112       | 112            | ["id"]   | [830]       | 1               | [830]     |
| primary | 4        | 170       | 170            | ["id"]   | [1000]      | 1               | [1000]    |
| name    | 5        | 260       | 1              | ["name"] | ["Steve"]   | 260             | ["Steve"] |
| name    | 6        | 237       | 1              | ["name"] | ["Steve"]   | 237             | ["Steve"] |
| name    | 7        | 137       | 1              | ["name"] | ["Steve"]   | 137             | ["Steve"] |
| name    | 8        | 188       | 1              | ["name"] | ["Steve"]   | 188             | ["Steve"] |
| name    | 9        | 178       | 1              | ["name"] | ["Steve"]   | 178             | ["Steve"] |
+---------+----------+-----------+----------------+----------+-------------+-----------------+-----------+

Disable

Some workloads, like batch imports, perform strictly better without the overhead of statistics collection. In these cases, we can explicitly stop or purge (stop + delete) statistics on a running server:

call dolt_stats_stop();
call dolt_stats_purge();

A stopped-stats server can be restarted, or have a single collection cycle performed by an operator:

call dolt_stats_starts();
call dolt_stats_once();

An environment variable can disable statistics on server reboots:

— on version 1.51.0 or higher
SET @@PERSIST.dolt_stats_enabled = 0;

— up to 1.50.x
SET @@PERSIST.dolt_stats_auto_refresh_enabled = 0;

A rebooted server with stats turned off has no reversal mechanism at the moment. All stats operations are no-ops if a server starts with the above variables set.

Auto-Refresh

Statistics automatically update for servers by default. Stats are stored in a database in .dolt/stats separate from user data. This folder can safely be deleted offline.

Stats throughput can be lowered by raising the the dolt_stats_job_interval variable, which indicates the milliseconds of delay between processing steps. The higher the delay and more branches in a database, the longer it will take for statistic updates to materialize. High delays reduce the fraction of runtime resources diverted to managing background statistics.

Stats can be disabled with the dolt_stats_enabled=0 variable.

Stats persistence can be disabled with the dolt_stats_memory_only=1 variable.

Stats Garbage Collection

The stats in-memory cache accumulates new histograms proportionally to the write rate and stats update rate. Periodically, an update cycle will swap the currently active histogram buckets to a new in-memory map and clear the old set.

Stats garbage collection can be disabled with the dolt_stats_gc_enabled=0 variable.

Garbage collection frequency can be tuned with the dolt_stats_gc_interval variable (default 1 hour).

Stats Controller Functions

Dolt exposes a set of helper procedures for managing statistics collection and use:

  • dolt_stats_stop: clear queue and disable thread

  • dolt_stats_restart: clear queue, refresh queue, start thread

  • dolt_stats_purge: clear queue, refresh queue, clear cache disable thread

  • dolt_stats_once: collect statistics once, ex: in sql-shell

  • dolt_stats_wait: block on a full queue cycle

  • dolt_stats_gc: block waiting for a GC signal

  • dolt_stats_flush: block waiting for a flush signal

  • dolt_stats_info: print the current state of the stats provider (optional '-short' flag)

Some functions not supported, see

Only a few statements are not yet supported, see

compound statements
dolt sql
dolt sql-server
window function docs