TDengine vs. InfluxDB 3.0: Compression Performance

Joel Brass

April 24, 2025

1. Executive Summary

This benchmark study demonstrates that TDengine achieves compression ratios 2.3 to 25.8 times higher than those of InfluxDB across a range of real-world time-series datasets. These findings are supported by open-source test code available for independent verification.

2. Methodology

To ensure complete transparency and reproducibility, all test code and procedures are publicly available on GitHub. The testing scenarios are taken from the independent and open-source TSBS framework.

2.1. Time Series Benchmark Suite (TSBS)

Time Series Benchmark Suite (TSBS) is an open-source performance testing platform for time-series data. Originally developed by InfluxData and now maintained by Timescale, the TSBS framework includes data generation and ingestion, query processing, and automated result aggregation for IoT and DevOps use cases. It has been used by a number of database providers, including InfluxData, Timescale, QuestDB, ClickHouse, VictoriaMetrics, and Redis as a benchmarking tool for performance testing.

TSBS currently includes two use cases, one simulating CPU monitoring in a data center (referred to as the DevOps use case) and another simulating fleet management for a logistics enterprise (referred to as the IoT use case). These use cases are described in detail in the following sections. In this report, both use cases in the TSBS framework were used to assess the performance of TDengine and InfluxDB in an objective, accurate, and verifiable manner.

2.2. Test Scenarios

TSBS does not define standard test scenarios but allows the user to generate desired scenarios by inputting the use case, pseudo-random number generator (PRNG) seed, number of devices, time range of test data, interval between data points, and database system. TSBS generates test data randomly but in a deterministic manner such that inputting the same seed will generate the same set of data each time. The scenarios used in this report follow those published by Timescale, with the exception that the time ranges have been adjusted.

                         Scenario 1  Scenario 2   Scenario 3   Scenario 4
Devices                  100         4,000        100,000      1 million
Duration                 2 days      2 days       3 hours      3 minutes
Interval                 10 seconds  10 seconds   10 seconds   10 seconds
Rows per device (IoT)    15,549      15,558       972          16
Total rows (IoT)         3,109,944   124,466,978  194,487,997  32,414,619
Rows per device (DevOps) 17,280      17,280       1,080        18
Total rows (DevOps)      1,728,000   69,120,000   108,000,000  18,000,000

2.3. IoT Use Case

The IoT use case simulates the data generated by a group of trucks operated by a logistics company. The diagnostics data for these trucks includes one nanosecond-level timestamp, three metrics, and eight tags. The readings data for the trucks includes one nanosecond-level timestamp, seven metrics, and eight tags. The generated datasets may include out-of-order or missing data, intended to simulate scenarios in which trucks may be offline for some time.

A sample data record is described in the following figures.

Sample diagnostics data point in the IoT use case
Sample readings data point in the IoT use case

The metrics in these tables are randomly generated within the following ranges:

  • fuel_state: floating-point number between 0 and 1.0
  • current_load: floating-point number between 0 and 5000.0
  • status: integer 0 or 1
  • latitude: floating-point number between –90.0 and 90.0
  • longitude: floating-point number between –180.0 and 180.0
  • elevation: floating-point number between 0 and 5000.0
  • velocity: floating-point number between 0 and 100
  • heading: floating-point number between 0 and 360.0
  • grade: floating-point number between 0 and 100.0
  • fuel_consumption: floating-point number between 0 and 50

2.4. DevOps Use Case

This use case simulates the data generated by CPU monitoring, recording 10 metrics and 10 tags per CPU with a nanosecond-precision timestamp. The generated datasets do not include null or out-of-order data.

A sample data record is described in the following figure.

Sample data point in the DevOps use case

The metrics in this table are all randomly generated integers ranging from 0 to 100.

3. Test Environment

All tests described in this report were run on servers with the following specifications located in Amazon Web Services (AWS):

  • CPU: Intel® Xeon® CPU E5-2650 v3 @ 2.30GHz (40 cores)
  • Memory: 251 GB of DDR4 synchronous registered (buffered) RAM at 2133 MT/s
  • Operating system: Ubuntu 22.04 LTS

The following versions of TDengine and InfluxDB were tested:

  • TDengine OSS 3.3.6.3, gitinfo b6a63a76f552b4afb467eb970043471ffa8acfda
  • InfluxDB Core 3.0.0, revision 3b602eead2bb27aee74fb3cfc45f6be806d3b836

3.1. Configuring TDengine

The TDengine server was configured with six vgroups. The default values were retained for all other parameters.

For the TSBS IoT dataset used in this evaluation, one supertable was created for readings and another for diagnostics. Then one subtable was created in each supertable for each vehicle. The value of the name tag for each truck is also used as the name of the subtable, with the prefix d for the diagnostics supertable and r for the readings supertable.

For the DevOps CPU-only dataset used in this evaluation, one supertable was created for all CPUs. A subtable was then created for each CPU. The value of the hostname tag for each CPU is also used as the name of the subtable.
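
As an illustration of this data model, the DevOps schema can be sketched in TDengine SQL as follows. This is an abridged, hypothetical sketch: the metric and tag lists are shortened, and the names shown stand in for the full schemas presented in section 4.

```sql
-- One supertable for all CPUs (metric and tag lists abridged).
CREATE STABLE cpu (
    ts           TIMESTAMP,
    usage_user   DOUBLE,
    usage_system DOUBLE
) TAGS (
    hostname VARCHAR(32),
    region   VARCHAR(32)
);

-- One subtable per CPU, named after the value of its hostname tag.
CREATE TABLE host_0 USING cpu TAGS ('host_0', 'us-west-1');
```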

3.2. Configuring InfluxDB

The InfluxDB server was started as follows:

influxdb3 serve --node-id=local01 --object-store=file --data-dir /data/influx --http-bind=0.0.0.0:8081

This specifies that Parquet files are stored on the filesystem instead of in memory. The default values were retained for all other parameters.

Data was then generated using the TSBS framework, as in the following example:

tsbs_generate_data --use-case="iot" --seed=123 --scale=4000 --timestamp-start="2016-01-01T00:00:00Z" --timestamp-end="2016-01-01T01:00:00Z" --log-interval="10s" --format="influx" > /data/influx/influxdb_iot.out

3.3. Assessing Compression Performance

Considering that database management systems store data in different ways, the same raw data may occupy different amounts of disk space in different databases even without compression. Therefore the raw size of the datasets in this report was calculated by taking the sum of the maximum sizes of each item in a row and then multiplying this size per row by the number of rows in the dataset.

  • In the IoT dataset, each row in the diagnostics table occupies 206 bytes and each row in the readings table occupies 238 bytes for a total of 444 bytes.
  • In the DevOps dataset, each row occupies 388 bytes.
  • The number of rows in each scenario is described in section 2.2 above.
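
These per-row sizes reproduce the raw-data figures used in section 5.2. A quick check for Scenario 1, using the row counts from section 2.2:

```shell
# Raw size = bytes per row x total rows (Scenario 1 row counts; MB = 10^6 bytes).
awk 'BEGIN {
  printf "IoT:    %.0f MB\n", 444 * 3109944 / 1e6   # matches 1,381 MB in 5.2
  printf "DevOps: %.0f MB\n", 388 * 1728000 / 1e6   # matches 670 MB in 5.2
}'
```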

The compressed size of a dataset was determined by using command-line tools to obtain the size on disk of the data directory for each database management system after the ingestion and compression processes had finished. The write-ahead log (WAL) was excluded from this calculation for both TDengine and InfluxDB.
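
In practice this measurement can be made with `du`, excluding the WAL from the total. The sketch below uses a mock directory layout; the real data-directory paths differ per installation and are not shown here.

```shell
# Illustrative only: measure a data directory's size on disk, excluding
# a WAL subdirectory. The layout is a stand-in, not the actual TDengine
# or InfluxDB directory structure.
mkdir -p demo_data/vnode demo_data/wal
head -c 4096 /dev/zero > demo_data/vnode/data.file   # "compressed data"
head -c 1024 /dev/zero > demo_data/wal/wal.log       # WAL, to be excluded
du -sb --exclude='wal' demo_data                     # data only, WAL omitted
```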

The compression ratio was then determined by dividing the calculated raw data size by the compressed size. For example, a raw data size of 100 MB and corresponding compressed data size of 1 MB would result in a compression ratio of 10:1.
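
Applying this formula to the IoT Scenario 1 figures reported in section 5 (1,381 MB raw; 349 MB on disk for InfluxDB and 47 MB for TDengine):

```shell
# Compression ratio = raw size / compressed size on disk (sizes in MB).
awk 'BEGIN { printf "%.2f:1\n", 1381 / 349 }'   # InfluxDB
awk 'BEGIN { printf "%.2f:1\n", 1381 / 47 }'    # TDengine
```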

4. Data Model

4.1. IoT Diagnostics Table Schema in TDengine

4.2. IoT Diagnostics Table Schema in InfluxDB

4.3. IoT Readings Table Schema in TDengine

4.4. IoT Readings Table Schema in InfluxDB

4.5. DevOps Table Schema in TDengine

4.6. DevOps Table Schema in InfluxDB

5. Test Results

5.1. Disk Space Usage

Disk space occupied for IoT use case scenarios (lower is better)

                   InfluxDB Core  TDengine OSS  InfluxDB vs. TDengine
100 devices        349 MB         47 MB         742.55%
4,000 devices      7424 MB        1846 MB       402.17%
100,000 devices    15929 MB       3146 MB       506.33%
1 million devices  3318 MB        1423 MB       233.17%

Disk space occupied for DevOps use case scenarios (lower is better)

                   InfluxDB Core  TDengine OSS  InfluxDB vs. TDengine
100 devices        194 MB         8 MB          2425.00%
4,000 devices      7909 MB        306 MB        2584.64%
100,000 devices    7591 MB        720 MB        1054.31%
1 million devices  1858 MB        706 MB        263.17%

TDengine required less disk space to store the TSBS datasets in all scenarios and use cases. Its compression performance ranged from 2.3 to 25.8 times better than InfluxDB's, with the largest advantages in the scenarios with 100,000 devices or fewer.

5.2. Compression Ratio

                   Raw Data   InfluxDB  TDengine
IoT Scenario 1     1,381 MB   3.96:1    29.38:1
IoT Scenario 2     55,263 MB  7.44:1    29.94:1
IoT Scenario 3     86,353 MB  5.42:1    27.45:1
IoT Scenario 4     14,392 MB  4.34:1    10.11:1
DevOps Scenario 1  670 MB     3.46:1    83.81:1
DevOps Scenario 2  26,819 MB  3.39:1    87.64:1
DevOps Scenario 3  41,904 MB  5.52:1    58.20:1
DevOps Scenario 4  6,984 MB   3.76:1    9.89:1

InfluxDB achieved compression ratios from 3.39:1 to 7.44:1 while TDengine’s compression performance ranges from 9.89:1 to 87.64:1. TDengine was especially effective at compressing the integer metrics in the DevOps scenario.

5.3. Resource Consumption

CPU usage during ingestion and compression of the IoT dataset in Scenario 3
Memory usage during ingestion and compression of the IoT dataset in Scenario 3
CPU usage during ingestion and compression of the DevOps dataset in Scenario 3
Memory usage during ingestion and compression of the DevOps dataset in Scenario 3
  • During ingestion and compression, InfluxDB used between 15% and 20% of CPU resources and 12 GB to 23 GB of memory.
  • CPU and memory resources were mostly used at a consistent rate throughout the ingestion and compression period.
  • In both use cases, InfluxDB experienced a spike to over 40% CPU and 29 GB of memory when beginning to process the ingested data.
  • TDengine had higher average usage at 40% CPU and 42 GB of memory in the IoT use case and 28% CPU and 34 GB of memory in the DevOps use case.
  • TDengine’s total resource usage was significantly lower because all ingestion and processing was completed within five minutes, whereas InfluxDB took almost 30 minutes to ingest and process the same dataset.

6. Analysis

With version 3.0, InfluxDB uses the Apache Parquet file format for storing data. Parquet includes a range of built-in encoding and compression options, but these are not configurable through InfluxDB. The specific encoding and compression algorithms used by InfluxDB for each column in this test are therefore not known.

TDengine’s encoding and compression options are configurable on a per-column basis. The default values are determined based on the data type of the column and have been optimized to provide the best compression performance for that data type. In this test, the default values have been used for all columns.
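
For reference, these per-column settings can be overridden at table creation time. The following is a hypothetical sketch of that syntax (TDengine 3.3+); the benchmark itself kept the defaults, and the table and column names are illustrative:

```sql
-- Hypothetical table overriding encoding, compression, and level per column.
CREATE TABLE sensor_sketch (
    ts    TIMESTAMP ENCODE 'delta-i' COMPRESS 'lz4' LEVEL 'medium',
    speed DOUBLE    ENCODE 'delta-d' COMPRESS 'lz4' LEVEL 'medium'
);
```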

It is possible that TDengine's superior compression performance in this test is due to better-chosen default encoding and compression algorithms, or to more efficient implementations of those algorithms, but the algorithms available in TDengine and Parquet are similar.

TDengine’s “one table per device” design likely played a larger role in improving compression performance. In this design, one table is created for each device, ensuring that each block of data contains the records for a single table. This ensures that similar data is stored together and can greatly increase compressibility in many time-series scenarios where adjacent values differ only by a small amount.
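
This effect can be sketched independently of either database, with gzip standing in for the columnar compressors and generated data standing in for device metrics:

```shell
# A smooth series, as stored when one device's records are kept contiguous
# (adjacent values differ only slightly)...
awk 'BEGIN { srand(42); v = 50
  for (i = 0; i < 100000; i++) { v += rand() - 0.5; printf "%.2f\n", v }
}' > per_device.txt
# ...versus the same values with their locality destroyed.
shuf per_device.txt > scattered.txt
gzip -kf per_device.txt scattered.txt
stat -c '%n: %s bytes' per_device.txt.gz scattered.txt.gz
```

The contiguous file compresses to a much smaller size than the scattered one, even though both contain exactly the same values.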

The drawback of this model is that when datasets contain a small amount of data from a large number of devices, table creation causes significant storage overhead. This is reflected in the results for Scenario 4, in which 1 million devices have fewer than 20 records each: its compression performance was lower than that of the scenarios with more records per device.

7. Reproducing These Results

We encourage you to verify these results and have developed a script with which you can run TSBS tests on your own machine. On an Ubuntu 22 machine, clone our TSBS fork to the /usr/local/src directory. Then open the scripts/tsdbComp directory and run the tsbs_test.sh --help command as the root user.

sudo -s
cd /usr/local/src
git clone https://github.com/taosdata/tsbs
cd tsbs
git checkout enh/add-influxdb3.0
cd scripts/tsdbComp
./tsbs_test.sh --help

This describes the scenarios that you can test and the configuration options available to you.

Note that performance testing by nature requires machines with adequate hardware.

  • If you would like to run the full test suite for the DevOps or IoT use case, use a server with at least 24 cores, 128 GB of RAM, and 500 GB of disk space.
  • If you prefer to run the tests on a personal computer or smaller virtual machine, select the cputest or iottest scenarios. These scenarios run a subset of TSBS that can return results within 45 minutes on most computers. For these scenarios, a machine with 4 cores, 8 GB of RAM, and 40 GB of disk space is required.

8. Business Impact Analysis

The compression ratio advantage translates directly to business value:

8.1. Storage Cost Savings

Although storage costs have decreased in recent years, TDengine’s high compression can still significantly reduce expenses, especially at scale. Consider Scenario 4 from the IoT use case described above. Over a month, a dataset of this size would generate over 48 TB of data as compressed by InfluxDB, compared with only 20.8 TB in TDengine. Assuming typical enterprise storage costs of $0.10/GB/month for hot storage, including backup and maintenance, this would result in a yearly reduction of $33,200 in storage costs.

8.2. Beyond Storage: Additional Benefits

While this study focuses solely on compression ratio, TDengine’s storage efficiency delivers additional benefits:

  1. I/O Performance Improvement: Reduced data volumes mean fewer I/O operations for the same logical data.
  2. Backup and Disaster Recovery Efficiency: A smaller data footprint reduces backup time and transfer costs.
  3. Network Transfer Reduction: For distributed systems and edge-cloud synchronization architectures, transferring compressed data can significantly reduce bandwidth requirements.

9. Conclusion

This benchmark study conclusively demonstrates TDengine’s superior compression ratio compared to InfluxDB across multiple real-world scenarios. With better compression ratios in all tested datasets, TDengine offers substantial advantages in storage efficiency, cost savings, and overall database performance.

The methodology employed in this study emphasizes transparency, reproducibility, and real-world applicability. By making all test code, data, and procedures publicly available, we invite independent verification of these results.

For organizations dealing with large volumes of time-series data, particularly those with high-cardinality requirements or hybrid edge-cloud deployments, TDengine’s compression advantage represents a significant technological and economic benefit.


    Joel Brass is a Solutions Architect at TDengine, bringing extensive experience in real-time data processing, time-series analytics, and full-stack development. With a 20 year background in software engineering and a deep focus on scalable applications and solutions, Joel has worked on a range of projects spanning joke databases, IoT, self-driving vehicles, and work management platforms. Prior to joining TDengine, Joel worked in Advisory Services for Enterprise customers of Atlassian and the Systems Engineering team at Waymo. He is currently based in the San Francisco Bay Area.