TDengine vs. InfluxDB 3.0: Compression Performance

Joel Brass

April 24, 2025

1. Executive Summary

This benchmark study demonstrates that TDengine achieves compression ratios 2.3 to 25.8 times higher than those of InfluxDB across a range of real-world time-series datasets. These findings are supported by open-source test code available for independent verification.

2. Methodology

To ensure complete transparency and reproducibility, all test code and procedures are publicly available on GitHub. The testing scenarios are taken from the independent and open-source TSBS framework.

2.1. Time Series Benchmark Suite (TSBS)

Time Series Benchmark Suite (TSBS) is an open-source performance testing platform for time-series data. Originally developed by InfluxData and now maintained by Timescale, the TSBS framework includes data generation and ingestion, query processing, and automated result aggregation for IoT and DevOps use cases. It has been used by a number of database providers, including InfluxData, Timescale, QuestDB, ClickHouse, VictoriaMetrics, and Redis as a benchmarking tool for performance testing.

TSBS currently includes two use cases, one simulating CPU monitoring in a data center (referred to as the DevOps use case) and another simulating fleet management for a logistics enterprise (referred to as the IoT use case). These use cases are described in detail in the following sections. In this report, both use cases in the TSBS framework were used to assess the performance of TDengine and InfluxDB in an objective, accurate, and verifiable manner.

2.2. Test Scenarios

TSBS does not define standard test scenarios but allows the user to generate desired scenarios by inputting the use case, pseudo-random number generator (PRNG) seed, number of devices, time range of test data, interval between data points, and database system. TSBS generates test data randomly but in a deterministic manner such that inputting the same seed will generate the same set of data each time. The scenarios used in this report follow those published by Timescale, with the exception that the time ranges have been adjusted.

                         Scenario 1  Scenario 2   Scenario 3   Scenario 4
Devices                  100         4,000        100,000      1 million
Duration                 2 days      2 days       3 hours      3 minutes
Interval                 10 seconds  10 seconds   10 seconds   10 seconds
Rows per device (IoT)    15,549      15,558       972          16
Total rows (IoT)         3,109,944   124,466,978  194,487,997  32,414,619
Rows per device (DevOps) 17,280      17,280       1,080        18
Total rows (DevOps)      1,728,000   69,120,000   108,000,000  18,000,000

2.3. IoT Use Case

The IoT use case simulates the data generated by a group of trucks operated by a logistics company. The diagnostics data for these trucks includes one nanosecond-level timestamp, three metrics, and eight tags. The readings data for the trucks includes one nanosecond-level timestamp, seven metrics, and eight tags. The generated datasets may include out-of-order or missing data, intended to simulate scenarios in which trucks may be offline for some time.

A sample data record is described in the following figures.

Sample diagnostics data point in the IoT use case
Sample readings data point in the IoT use case

The metrics in these tables are randomly generated within the following ranges:

  • fuel_state: floating-point number between 0 and 1.0
  • current_load: floating-point number between 0 and 5000.0
  • status: integer 0 or 1
  • latitude: floating-point number between –90.0 and 90.0
  • longitude: floating-point number between –180.0 and 180.0
  • elevation: floating-point number between 0 and 5000.0
  • velocity: floating-point number between 0 and 100
  • heading: floating-point number between 0 and 360.0
  • grade: floating-point number between 0 and 100.0
  • fuel_consumption: floating-point number between 0 and 50

2.4. DevOps Use Case

This use case simulates the data generated by CPU monitoring, recording 10 metrics and 10 tags per CPU with a nanosecond-precision timestamp. The generated datasets do not include null or out-of-order data.

A sample data record is described in the following figure.

Sample data point in the DevOps use case

The metrics in this table are all randomly generated integers ranging from 0 to 100.

3. Test Environment

All tests described in this report were run on servers with the following specifications located in Amazon Web Services (AWS):

  • CPU: Intel® Xeon® CPU E5-2650 v3 @ 2.30GHz (40 cores)
  • Memory: 251 GB of DDR4 synchronous registered (buffered) RAM at 2133 MT/s
  • Operating system: Ubuntu 22.04 LTS

The following versions of TDengine and InfluxDB were tested:

  • TDengine OSS 3.3.6.3, gitinfo b6a63a76f552b4afb467eb970043471ffa8acfda
  • InfluxDB Core 3.0.0, revision 3b602eead2bb27aee74fb3cfc45f6be806d3b836

3.1. Configuring TDengine

The TDengine server was configured with six vgroups. The default values were retained for all other parameters.

For the TSBS IoT dataset used in this evaluation, one supertable was created for readings and another for diagnostics. Then one subtable was created in each supertable for each vehicle. The value of the name tag for each truck is also used as the name of the subtable, with the prefix d for the diagnostics supertable and r for the readings supertable.

For the DevOps CPU-only dataset used in this evaluation, one supertable was created for all CPUs. A subtable was then created for each CPU. The value of the hostname tag for each CPU is also used as the name of the subtable.
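
As an illustration of this data model, the DevOps schema can be sketched in TDengine SQL as follows. This is an abridged, hypothetical sketch: the metric and tag lists are shortened, and the names shown stand in for the full schemas presented in section 4.

```sql
-- One supertable for all CPUs (metric and tag lists abridged).
CREATE STABLE cpu (
    ts           TIMESTAMP,
    usage_user   DOUBLE,
    usage_system DOUBLE
) TAGS (
    hostname VARCHAR(32),
    region   VARCHAR(32)
);

-- One subtable per CPU, named after the value of its hostname tag.
CREATE TABLE host_0 USING cpu TAGS ('host_0', 'us-west-1');
```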

3.2. Configuring InfluxDB

The InfluxDB server was started as follows:

influxdb3 serve --node-id=local01 --object-store=file --data-dir /data/influx --http-bind=0.0.0.0:8081

This specifies that Parquet files are stored on the filesystem instead of in memory. The default values were retained for all other parameters.

Data was then generated using the TSBS framework, as in the following example:

tsbs_generate_data --use-case="iot" --seed=123 --scale=4000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T01:00:00Z" --log-interval="10s" --format="influx" > /data/influx/influxdb_iot.out

3.3. Assessing Compression Performance

Considering that database management systems store data in different ways, the same raw data may occupy different amounts of disk space in different databases even without compression. Therefore the raw size of the datasets in this report was calculated by taking the sum of the maximum sizes of each item in a row and then multiplying this size per row by the number of rows in the dataset.

  • In the IoT dataset, each row in the diagnostics table occupies 206 bytes and each row in the readings table occupies 238 bytes for a total of 444 bytes.
  • In the DevOps dataset, each row occupies 388 bytes.
  • The number of rows in each scenario is described in section 2.2 above.
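
These per-row sizes reproduce the raw-data figures used in section 5.2. A quick check for Scenario 1, using the row counts from section 2.2:

```shell
# Raw size = bytes per row x total rows (Scenario 1 row counts; MB = 10^6 bytes).
awk 'BEGIN {
  printf "IoT:    %.0f MB\n", 444 * 3109944 / 1e6   # matches 1,381 MB in 5.2
  printf "DevOps: %.0f MB\n", 388 * 1728000 / 1e6   # matches 670 MB in 5.2
}'
```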

The compressed size of a dataset was determined by using command-line tools to obtain the size on disk of the data directory for each database management system after the ingestion and compression processes had finished. The write-ahead log (WAL) was excluded from this calculation for both TDengine and InfluxDB.
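
In practice this measurement can be made with `du`, excluding the WAL from the total. The sketch below uses a mock directory layout; the real data-directory paths differ per installation and are not shown here.

```shell
# Illustrative only: measure a data directory's size on disk, excluding
# a WAL subdirectory. The layout is a stand-in, not the actual TDengine
# or InfluxDB directory structure.
mkdir -p demo_data/vnode demo_data/wal
head -c 4096 /dev/zero > demo_data/vnode/data.file   # "compressed data"
head -c 1024 /dev/zero > demo_data/wal/wal.log       # WAL, to be excluded
du -sb --exclude='wal' demo_data                     # data only, WAL omitted
```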

The compression ratio was then determined by dividing the calculated raw data size by the compressed size. For example, a raw data size of 100 MB and corresponding compressed data size of 1 MB would result in a compression ratio of 10:1.
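
Applying this formula to the IoT Scenario 1 figures reported in section 5 (1,381 MB raw; 349 MB on disk for InfluxDB and 47 MB for TDengine):

```shell
# Compression ratio = raw size / compressed size on disk (sizes in MB).
awk 'BEGIN { printf "%.2f:1\n", 1381 / 349 }'   # InfluxDB
awk 'BEGIN { printf "%.2f:1\n", 1381 / 47 }'    # TDengine
```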

4. Data Model

4.1. IoT Diagnostics Table Schema in TDengine

4.2. IoT Diagnostics Table Schema in InfluxDB

4.3. IoT Readings Table Schema in TDengine

4.4. IoT Readings Table Schema in InfluxDB

4.5. DevOps Table Schema in TDengine

4.6. DevOps Table Schema in InfluxDB

5. Test Results

5.1. Disk Space Usage

Disk space occupied for IoT use case scenarios (lower is better)

                   InfluxDB Core  TDengine OSS  InfluxDB vs. TDengine
100 devices        349 MB         47 MB         742.55%
4,000 devices      7424 MB        1846 MB       402.17%
100,000 devices    15929 MB       3146 MB       506.33%
1 million devices  3318 MB        1423 MB       233.17%

Disk space occupied for DevOps use case scenarios (lower is better)

                   InfluxDB Core  TDengine OSS  InfluxDB vs. TDengine
100 devices        194 MB         8 MB          2425.00%
4,000 devices      7909 MB        306 MB        2584.64%
100,000 devices    7591 MB        720 MB        1054.31%
1 million devices  1858 MB        706 MB        263.17%

TDengine required less disk space to store the TSBS datasets in all scenarios and use cases. Its compression performance ranged from 2.3 to 25.8 times better than InfluxDB's, with the largest advantages in the scenarios with 100,000 devices or fewer.

5.2. Compression Ratio

                   Raw Data   InfluxDB  TDengine
IoT Scenario 1     1,381 MB   3.96:1    29.38:1
IoT Scenario 2     55,263 MB  7.44:1    29.94:1
IoT Scenario 3     86,353 MB  5.42:1    27.45:1
IoT Scenario 4     14,392 MB  4.34:1    10.11:1
DevOps Scenario 1  670 MB     3.46:1    83.81:1
DevOps Scenario 2  26,819 MB  3.39:1    87.64:1
DevOps Scenario 3  41,904 MB  5.52:1    58.20:1
DevOps Scenario 4  6,984 MB   3.76:1    9.89:1

InfluxDB achieved compression ratios from 3.39:1 to 7.44:1 while TDengine’s compression performance ranges from 9.89:1 to 87.64:1. TDengine was especially effective at compressing the integer metrics in the DevOps scenario.

5.3. Resource Consumption

CPU usage during ingestion and compression of the IoT dataset in Scenario 3
Memory usage during ingestion and compression of the IoT dataset in Scenario 3
CPU usage during ingestion and compression of the DevOps dataset in Scenario 3
Memory usage during ingestion and compression of the DevOps dataset in Scenario 3
  • During ingestion and compression, InfluxDB used between 15% and 20% of CPU resources and 12 GB to 23 GB of memory.
  • CPU and memory resources were mostly used at a consistent rate throughout the ingestion and compression period.
  • In both use cases, InfluxDB experienced a spike to over 40% CPU and 29 GB of memory when beginning to process the ingested data.
  • TDengine had higher average usage at 40% CPU and 42 GB of memory in the IoT use case and 28% CPU and 34 GB of memory in the DevOps use case.
  • TDengine’s total resource usage was significantly lower because all ingestion and processing was completed within five minutes, whereas InfluxDB took almost 30 minutes to ingest and process the same dataset.

6. Analysis

With version 3.0, InfluxDB uses the Apache Parquet file format for storing data. Parquet includes a range of built-in encoding and compression options, but these are not configurable through InfluxDB. The specific encoding and compression algorithms used by InfluxDB for each column in this test are therefore not known.

TDengine’s encoding and compression options are configurable on a per-column basis. The default values are determined based on the data type of the column and have been optimized to provide the best compression performance for that data type. In this test, the default values have been used for all columns.
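
For reference, these per-column settings can be overridden at table creation time. The following is a hypothetical sketch of that syntax (TDengine 3.3+); the benchmark itself kept the defaults, and the table and column names are illustrative:

```sql
-- Hypothetical table overriding encoding, compression, and level per column.
CREATE TABLE sensor_sketch (
    ts    TIMESTAMP ENCODE 'delta-i' COMPRESS 'lz4' LEVEL 'medium',
    speed DOUBLE    ENCODE 'delta-d' COMPRESS 'lz4' LEVEL 'medium'
);
```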

It is possible that TDengine's superior compression performance in this test is due to better-chosen default encoding and compression algorithms, or to more efficient implementations of those algorithms, but the algorithms available in TDengine and Parquet are similar.

TDengine’s “one table per device” design likely played a larger role in improving compression performance. In this design, one table is created for each device, ensuring that each block of data contains the records for a single table. This ensures that similar data is stored together and can greatly increase compressibility in many time-series scenarios where adjacent values differ only by a small amount.
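
This effect can be sketched independently of either database, with gzip standing in for the columnar compressors and generated data standing in for device metrics:

```shell
# A smooth series, as stored when one device's records are kept contiguous
# (adjacent values differ only slightly)...
awk 'BEGIN { srand(42); v = 50
  for (i = 0; i < 100000; i++) { v += rand() - 0.5; printf "%.2f\n", v }
}' > per_device.txt
# ...versus the same values with their locality destroyed.
shuf per_device.txt > scattered.txt
gzip -kf per_device.txt scattered.txt
stat -c '%n: %s bytes' per_device.txt.gz scattered.txt.gz
```

The contiguous file compresses to a much smaller size than the scattered one, even though both contain exactly the same values.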

The drawback of this model is that when datasets contain a small amount of data from a large number of devices, table creation causes significant storage overhead. This is reflected in the results for Scenario 4, in which 1 million devices have fewer than 20 records each: its compression performance was lower than that of the scenarios with more records per device.

7. Reproducing These Results

We encourage you to verify these results and have developed a script with which you can run TSBS tests on your own machine. On an Ubuntu 22 machine, clone our TSBS fork to the /usr/local/src directory. Then open the scripts/tsdbComp directory and run the tsbs_test.sh --help command as the root user.

sudo -s
cd /usr/local/src
git clone https://github.com/taosdata/tsbs
cd tsbs
git checkout enh/add-influxdb3.0
cd scripts/tsdbComp
./tsbs_test.sh --help

This describes the scenarios that you can test and the configuration options available to you.

Note that performance testing by nature requires machines with adequate hardware.

  • If you would like to run the full test suite for the DevOps or IoT use case, use a server with at least 24 cores, 128 GB of RAM, and 500 GB of disk space.
  • If you prefer to run the tests on a personal computer or smaller virtual machine, select the cputest or iottest scenarios. These scenarios run a subset of TSBS that can return results within 45 minutes on most computers. For these scenarios, a machine with 4 cores, 8 GB of RAM, and 40 GB of disk space is required.

8. Business Impact Analysis

The compression ratio advantage translates directly to business value:

8.1. Storage Cost Savings

Although storage costs have decreased in recent years, TDengine’s high compression can still significantly reduce expenses, especially at scale. Consider Scenario 4 from the IoT use case described above. Over a month, a dataset of this size would generate over 48 TB of data as compressed by InfluxDB, compared with only 20.8 TB in TDengine. Assuming typical enterprise storage costs of $0.10/GB/month for hot storage, including backup and maintenance, this would result in a yearly reduction of $33,200 in storage costs.

8.2. Beyond Storage: Additional Benefits

While this study focuses solely on compression ratio, TDengine’s storage efficiency delivers additional benefits:

  1. I/O Performance Improvement: Reduced data volumes mean fewer I/O operations for the same logical data.
  2. Backup and Disaster Recovery Efficiency: A smaller data footprint reduces backup time and transfer costs.
  3. Network Transfer Reduction: For distributed systems and edge-cloud synchronization architectures, transferring compressed data can significantly reduce bandwidth requirements.

9. Conclusion

This benchmark study conclusively demonstrates TDengine’s superior compression ratio compared to InfluxDB across multiple real-world scenarios. With better compression ratios in all tested datasets, TDengine offers substantial advantages in storage efficiency, cost savings, and overall database performance.

The methodology employed in this study emphasizes transparency, reproducibility, and real-world applicability. By making all test code, data, and procedures publicly available, we invite independent verification of these results.

For organizations dealing with large volumes of time-series data, particularly those with high-cardinality requirements or hybrid edge-cloud deployments, TDengine’s compression advantage represents a significant technological and economic benefit.


    Joel Brass is a Solutions Architect at TDengine, bringing extensive experience in real-time data processing, time-series analytics, and full-stack development. With a 20 year background in software engineering and a deep focus on scalable applications and solutions, Joel has worked on a range of projects spanning joke databases, IoT, self-driving vehicles, and work management platforms. Prior to joining TDengine, Joel worked in Advisory Services for Enterprise customers of Atlassian and the Systems Engineering team at Waymo. He is currently based in the San Francisco Bay Area.