An in-depth look at the performance of key time-series databases
For a quick overview, see our infographic
IoT devices – from the smart devices in your home to the equipment in a modern power plant – are continuously collecting and transmitting information. It’s no surprise, then, that IoT datasets are significantly larger and pose different challenges than traditional datasets. Because IoT data is generated in real time, with devices continuously sending updates, sensor readings, and events at a rapid pace, managing and processing this data requires a high-performance data infrastructure.
The purpose-built time series database (TSDB) is an essential element of such an infrastructure, either as a complement to or a replacement for traditional relational databases and data historians. With increased interest in time-series databases in recent years, a number of TSDB products have entered the market. However, not all databases are created equal, especially when it comes to performance.
The performance of your TSDB doesn’t just impact your ability to ingest, store, and analyze large amounts of data; it directly affects your total cost of ownership (TCO). Better ingestion rates, query response times, and compression ratios mean your system consumes fewer resources to process the same amount of data. To demonstrate TDengine’s performance, we evaluated the platform against two key players in this space – InfluxDB and TimescaleDB – in an IoT scenario.
The findings show that TDengine offers significant advantages over both InfluxDB and TimescaleDB in terms of data ingestion, compression, and query performance. In addition, TDengine uses fewer server-side CPU resources than either competing time-series database to process identical datasets.
Objective Evaluation via the Time Series Benchmark Suite
To ensure an even playing field, we used the open-source Time Series Benchmark Suite (TSBS) framework and ran the tests on identical systems in AWS. TSBS is designed for objective database evaluations and generates datasets for a range of recommended ingestion and query scenarios. TSBS is used by other database providers, including VictoriaMetrics and Timescale, to perform evaluations similar to the one described here.
TSBS is an open and independent framework: its test procedures and generated datasets aren’t designed to benefit any database platform, and anyone can use it to conduct an objective evaluation.
This evaluation applied the TSBS IoT dataset, which includes small- and large-scale scenarios that simulate a connected-car use case, with diagnostics and readings for a fleet of trucks. This dataset is more complex than the one for the DevOps use case because it includes out-of-order and missing data. For detailed information, see the official TSBS repository on GitHub.
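To make "out-of-order and missing data" concrete, the following Python sketch simulates a small truck fleet whose readings are sometimes dropped and sometimes delayed. It is an illustration only and is not the TSBS data generator; the names `Reading`, `simulate_fleet`, and `count_out_of_order`, along with every rate and interval, are assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    truck: str      # hypothetical device name
    ts: int         # time the value was measured, in seconds
    arrival: int    # time the value reached the database, in seconds
    value: float    # hypothetical sensor value


def simulate_fleet(trucks=3, steps=10, interval=10,
                   drop_rate=0.2, late_rate=0.2, delay=25, seed=42):
    """Return readings in arrival order, with gaps and late arrivals."""
    rng = random.Random(seed)
    readings = []
    for step in range(steps):
        for t in range(trucks):
            if rng.random() < drop_rate:
                continue  # missing data: this reading is never sent
            ts = step * interval
            # A late reading is held back, so newer data can overtake it.
            lag = delay if rng.random() < late_rate else 0
            readings.append(Reading(f"truck_{t}", ts, ts + lag,
                                    round(rng.uniform(0.0, 1.0), 3)))
    # Stable sort: readings with equal arrival time keep generation order.
    return sorted(readings, key=lambda r: r.arrival)


def count_out_of_order(readings):
    """Count readings older than the newest one already seen for a truck."""
    newest = {}
    late = 0
    for r in readings:
        if r.ts < newest.get(r.truck, r.ts):
            late += 1
        newest[r.truck] = max(r.ts, newest.get(r.truck, r.ts))
    return late
```

A database ingesting this stream has to cope with both effects: gaps where a truck sent nothing, and rows whose measurement time is older than data it has already stored for the same truck.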
This section gives an overview of the performance metrics for each product tested. Download the full report to see all of the test cases.
Time series databases need to ingest massive amounts of data, and TDengine achieves the fastest ingestion speeds across all TSBS scenarios, ranging from 1.04 to 16 times the speed of the other products.
In addition, TDengine used less processing power than InfluxDB or TimescaleDB to ingest the datasets. At its peak, InfluxDB’s CPU usage reached 100% during ingestion, while TDengine’s remained under 17%. Although TimescaleDB used a similar amount of CPU resources to TDengine, it spent far more time compressing and ordering the data after writing it to the database.
As performance can differ based on a number of factors, the TSBS framework covers a wide range of query types. TDengine provided the fastest query response across all scenarios, indicating that organizations dependent on real-time analytics are well served by this purpose-built platform.
More complex queries allowed TDengine to show off its processing power, reaching 87.1 times the performance of TimescaleDB in the long-daily-sessions scenario and 132 times the performance of InfluxDB in the stationary-trucks scenario. This demonstrates that TDengine is best prepared to handle the most performance-intensive queries without slowing down.
In smaller-scale scenarios, all three database products took up a similar amount of disk space. When the datasets increased to one million or ten million devices, however, the benefits of TDengine’s storage design and architecture came into play; for large-scale datasets, TDengine uses less than half the storage resources that InfluxDB requires.
TimescaleDB had a significantly higher disk footprint in the two largest scenarios. The clearest example was in the ten million device scenario, where data processed by TimescaleDB occupied more than 12 times the disk space used by TDengine.
Across all key test metrics for ingestion, compression, and querying, TDengine clearly emerges as the highest-performing time series database.
- Ingestion: TDengine writes the test data 1.04 to 3.3 times faster than TimescaleDB, and 1.8 to 16 times faster than InfluxDB, with significantly lower CPU overhead.
- Compression: Due to its efficient data storage and compression features, TDengine consumes up to 12 times less disk space than TimescaleDB, and 2.8 times less than InfluxDB.
- Queries: TDengine has the fastest query response time across all scenarios. For this use case, TDengine responds up to 13.6 times faster than TimescaleDB and up to 426 times faster than InfluxDB.
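One common way to read figures such as "N times faster" or "N times less disk space" is as a plain ratio between two measurements of the same workload. The Python sketch below shows that arithmetic. It assumes the ratios are computed this way, which the summary above does not state, and the function names and input numbers are hypothetical placeholders, not results from the report.

```python
def times_faster(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate finished than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_seconds / candidate_seconds


def times_smaller(baseline_bytes: float, candidate_bytes: float) -> float:
    """How many times less disk space the candidate used than the baseline."""
    if baseline_bytes <= 0 or candidate_bytes <= 0:
        raise ValueError("sizes must be positive")
    return baseline_bytes / candidate_bytes


# Hypothetical inputs: a load that takes 120 s on one system and 40 s on
# another, and a dataset that occupies 50 GB on one and 20 GB on the other.
ingest_ratio = times_faster(120.0, 40.0)   # 3.0
disk_ratio = times_smaller(50e9, 20e9)     # 2.5
```

Under this reading, a ratio close to 1 (such as 1.04) means the two systems performed almost identically in that scenario.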
Unlike general-purpose databases such as MySQL or PostgreSQL, TDengine was designed from the ground up to simplify and scale time-series data management. The platform’s storage engine makes full use of the unique characteristics of time-series data through two novel concepts: a single table for each data collection point, which enables better ingestion and data compression, and supertables, which speed up aggregation operations.
Best-in-Class TSDB
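The two concepts can be pictured with a toy in-memory model. The Python sketch below is a conceptual illustration only: it is not TDengine's storage engine or API, and the class names `DeviceTable` and `SuperTable`, the `fleet` tag, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class DeviceTable:
    """One table per data collection point; rows arrive in time order."""
    tags: dict                                  # static metadata for the device
    rows: list = field(default_factory=list)    # (timestamp, value) pairs

    def append(self, ts, value):
        self.rows.append((ts, value))


@dataclass
class SuperTable:
    """Groups device tables that share a schema so they can be queried together."""
    tables: dict = field(default_factory=dict)

    def table_for(self, device, **tags):
        # Create the device's table on first use, then reuse it.
        if device not in self.tables:
            self.tables[device] = DeviceTable(tags=tags)
        return self.tables[device]

    def average(self, **tag_filter):
        # Aggregate across every device table whose tags match the filter.
        values = [value
                  for table in self.tables.values()
                  if all(table.tags.get(k) == v for k, v in tag_filter.items())
                  for _, value in table.rows]
        return mean(values) if values else None


readings = SuperTable()
readings.table_for("truck_1", fleet="west").append(0, 60.0)
readings.table_for("truck_1", fleet="west").append(10, 62.0)
readings.table_for("truck_2", fleet="east").append(0, 80.0)
west_average = readings.average(fleet="west")   # 61.0
```

In this model, each device only ever appends to its own table, and tag values are stored once per device rather than once per row, which hints at why such a layout can help ingestion and compression. The supertable provides a single place to filter by tag and aggregate across devices.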
The performance advantages shown by this evaluation indicate that TDengine excels at time-series data processing, especially with larger datasets and more complex queries. TDengine also requires fewer resources, significantly reducing the TCO of data operations. These advantages, combined with its comprehensive feature set and ease of use, make TDengine the best option for growing enterprises to scale their data pipelines.