Data Compaction in TDengine TSDB

Arun Arulraj

January 29, 2026

TDengine is designed to support a wide range of write patterns, from continuous high-frequency ingestion to irregular batch writes. While this flexibility is essential for real-world time-series workloads, certain write scenarios can introduce storage-side inefficiencies over time, such as out-of-order data and gaps caused by deletions or updates.

Such inefficiencies not only increase storage usage but can also degrade query performance. To address this, TDengine TSDB-Enterprise provides a data compaction capability that restructures on-disk data to improve both storage layout and query efficiency. This is performed through the COMPACT operation.

What COMPACT Does

The COMPACT operation reorganizes and defragments stored data files by removing invalid or obsolete data and optimizing file structure. Specifically, a compaction task performs the following actions:

  • Scans and compresses data files across all vnodes in the specified database or vgroup
  • Removes data that has been deleted, including data belonging to dropped tables
  • Merges multiple STT files to reduce fragmentation and improve locality

By consolidating data and eliminating file gaps, compaction helps reduce storage overhead and improve query performance, especially for range scans and historical queries.

Compaction Scope and Control

TDengine allows fine-grained control over which data is compacted. You can limit compaction to a specific time range using START WITH and END WITH clauses or target specific vgroups when needed. Note that file groups that have not received new data since the last compaction are skipped, unless explicitly forced.

In addition, you can compact metadata only by using the META_ONLY option. Metadata is not compacted by default. Note that metadata compaction may block writes, and writes and queries to the database must be stopped while metadata is being compacted.

Syntax Overview

You can compact data using the following SQL statement:

COMPACT DATABASE db_name [START WITH 'XXXX'] [END WITH 'YYYY'] [META_ONLY] [FORCE];

To compact data for certain vgroups only, use the following statement:

COMPACT [db_name.]VGROUPS IN (vgroup_id1, vgroup_id2, ...) [START WITH 'XXXX'] [END WITH 'YYYY'] [META_ONLY] [FORCE];

Each COMPACT statement returns a compaction task ID, which can be used to monitor or terminate the task.
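
As a rough illustration of the two forms above, the following statements use a hypothetical database named power and hypothetical vgroup IDs 3 and 4. The timestamp literals are written as 'YYYY-MM-DD HH:MM:SS'; the exact formats accepted by START WITH and END WITH are an assumption here and should be checked against your TDengine version.

```sql
-- Compact data between the two timestamps in the (hypothetical) database "power".
COMPACT DATABASE power START WITH '2026-01-01 00:00:00' END WITH '2026-02-01 00:00:00';

-- Same range, but FORCE also processes file groups that have received
-- no new data since the last compaction.
COMPACT DATABASE power START WITH '2026-01-01 00:00:00' END WITH '2026-02-01 00:00:00' FORCE;

-- Limit compaction to two vgroups (IDs 3 and 4 are placeholders).
COMPACT power.VGROUPS IN (3, 4) START WITH '2026-01-01 00:00:00' END WITH '2026-02-01 00:00:00';

-- Compact metadata only. Stop writes and queries to the database first.
COMPACT DATABASE power META_ONLY;
```

Narrowing the time range or the vgroup list keeps each task smaller, which can make it easier to fit compaction into a quiet period.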

Asynchronous Execution Model

Compaction tasks in TDengine are executed asynchronously. When a COMPACT statement is executed, it returns immediately without waiting for the operation to complete. However, if a previous compaction task is still running, the new request will wait until the prior task finishes before returning.

You can manage compaction tasks using the following statements:

  • SHOW COMPACTS;
    Display all compaction tasks.
  • SHOW COMPACT compact_id;
    Display details about a specific compaction task.
  • KILL COMPACT compact_id;
    Forcefully terminate an ongoing compaction task.
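
A typical monitoring sequence might look like the sketch below. The task ID 12 is hypothetical; substitute the ID returned by your own COMPACT statement or listed by SHOW COMPACTS.

```sql
-- List all compaction tasks and note the ID of the task of interest.
SHOW COMPACTS;

-- Inspect a single task (12 is a placeholder ID).
SHOW COMPACT 12;

-- Terminate the task, for example if it is interfering with ingestion.
KILL COMPACT 12;
```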

Operational Considerations

While compaction is designed to minimize disruption, there are important operational considerations:

  • Compaction does not block queries, but it may block writes, particularly in databases where stt_trigger = 1.
  • Metadata compaction requires stricter control and must be scheduled during maintenance windows.

As a result, compaction is best planned during periods of lower write activity, especially for large databases with sustained ingestion workloads.
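
One way to check whether a database is configured with stt_trigger = 1 before scheduling compaction is to inspect its creation options. This sketch assumes a hypothetical database named power, and assumes that the SHOW CREATE DATABASE output in your TDengine version lists the STT_TRIGGER option.

```sql
-- Print the CREATE DATABASE statement, with its current options,
-- for the (hypothetical) database "power"; look for STT_TRIGGER in the output.
SHOW CREATE DATABASE power;
```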

Automatic Data Compaction

In addition to manual compaction, TDengine TSDB-Enterprise supports automatic data compaction through database-level parameters. These parameters allow compaction tasks to be scheduled and executed automatically, reducing operational overhead in long-running deployments.

Automatic compaction is controlled by three key parameters: COMPACT_INTERVAL, COMPACT_TIME_RANGE, and COMPACT_TIME_OFFSET. Together, they define when compaction runs and which time range of data is compacted.

  • COMPACT_INTERVAL specifies how often automatic compaction is triggered. The interval is calculated based on fixed time slices starting from 1970-01-01T00:00:00Z. By default, this value is set to 0, which disables automatic compaction. When set to a non-zero value, TDengine TSDB periodically issues compaction tasks at the configured interval. If a previous compaction task is still running, no new task will be triggered, preventing overlapping operations.
  • COMPACT_TIME_RANGE defines the historical data window that each automatic compaction task processes. This allows compaction to focus on older data rather than recently written time ranges. If this parameter is not specified, TDengine automatically compacts all data.
  • COMPACT_TIME_OFFSET controls the execution time of automatic compaction relative to local time. By setting this offset, operators can ensure that compaction runs during predictable, low-traffic periods, such as late at night or during maintenance windows.

For example, the following statement creates a database that is automatically compacted daily at 2 a.m.:

CREATE DATABASE test KEEP 365 COMPACT_INTERVAL 1d COMPACT_TIME_RANGE 0,0 COMPACT_TIME_OFFSET 2h;

Summary

The COMPACT feature in TDengine TSDB-Enterprise provides a practical and efficient way to reorganize time-series data on disk, addressing fragmentation and storage inefficiencies introduced by real-world write patterns. By reorganizing and defragmenting data files and removing deleted data, compaction improves both storage efficiency and query performance, making it an important maintenance tool for long-running TDengine deployments.

  • Arun Arulraj

    Pursuing a Master’s Degree in Computer Science from the Georgia Institute of Technology and holding dual Bachelor’s degrees in Computer Science and Chemistry, Arun brings expertise in artificial intelligence, machine learning, and industrial data solutions to drive TDengine’s solution engineering efforts. Prior to joining TDengine, he worked as a Software Engineer at C3 AI and Meta, and served as Head of AI at Soundromeda, where he led the development of advanced AI-driven applications. He is currently based in California, USA.