# DuckDB Data Accelerator
To use DuckDB as a Data Accelerator, specify `duckdb` as the `engine` for acceleration.
```yaml
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
```
## Configuration
The DuckDB accelerator can be configured by providing the following params:
- `duckdb_file`: The name of the file backing the DuckDB database. If the file does not exist, it will be created. Only applies if `mode` is `file`.
Configuration params are provided in the `acceleration` section for a data store. Other common acceleration fields can be configured for DuckDB; see datasets.
```yaml
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
      mode: file
      params:
        duckdb_file: /my/chosen/location/duckdb.db
```
## Limitations
- The DuckDB accelerator does not support `enum`, `dictionary`, or `map` field types. For example, the following is unsupported:

  `SELECT MAP(['key1', 'key2', 'key3'], [10, 20, 30])`

- The DuckDB accelerator does not support `Decimal256` (76 digits), as it exceeds DuckDB's maximum Decimal width of 38 digits.
- Using the DuckDB accelerator with `on_zero_results: use_source` does not support filters on binary columns when the query results in using the source connection, like `WHERE col_blob <> ''`. Cast the binary column to another data type instead, like `WHERE CAST(col_blob AS TEXT) <> ''`.
- Updating a dataset with DuckDB acceleration while the Spice Runtime is running (hot-reload) disables query federation for the DuckDB accelerator until the Runtime is restarted.
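As a sketch of the binary-column workaround above (the table and column names are hypothetical, chosen only for illustration), a filter that would otherwise hit the binary-filter limitation can be rewritten with a cast:

```sql
-- Hypothetical table: events(id INTEGER, col_blob BLOB)

-- Not supported when the query falls back to the source connection:
SELECT id FROM events WHERE col_blob <> '';

-- Workaround: cast the binary column to another data type before filtering
SELECT id FROM events WHERE CAST(col_blob AS TEXT) <> '';
```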
## Memory Considerations
When accelerating a dataset using mode: memory (the default), some or all of the dataset is loaded into memory. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
In-memory limitations can be mitigated by storing acceleration data on disk, which the `duckdb` and `sqlite` accelerators support via `mode: file`.
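As a sketch, the same dataset could be accelerated to disk with the `sqlite` engine instead; the `sqlite_file` param shown here is an assumption, named by analogy with `duckdb_file` and not confirmed by this page:

```yaml
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: sqlite
      mode: file
      params:
        sqlite_file: /my/chosen/location/sqlite.db  # assumed param, analogous to duckdb_file
```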
## Cookbook
- DuckDB Data Accelerator: a cookbook recipe to configure DuckDB as a data accelerator in Spice.
