Learn TSDB performance benchmarking across write throughput, query latency, compression ratio, out-of-order data handling, TSBS tests, and production-scale interpretation.
In IoT, Industrial IoT (IIoT), and financial monitoring, time-series databases have become core infrastructure for storing high-volume time-series data. With multiple products on the market, teams need a systematic way to evaluate candidates before making a selection. The most important benchmark dimensions are write performance, query performance, and compression ratio.
1. Why performance benchmarking matters
Performance benchmarking matters at three levels.
Technology selection. Without systematic benchmarks, selection decisions are driven by marketing rather than data. A rigorous test cuts through competing claims and reveals how each product behaves under your actual workload profile.
Capacity planning. Benchmark results provide the data points needed to estimate the cluster size, hardware configuration, and network capacity required to handle projected data volumes with adequate headroom.
Cost optimization. Storage costs, compute costs, and operations overhead all depend on the performance characteristics of the chosen database. A database that achieves 10:1 compression uses one-fifth the storage of one that achieves 2:1, and the cost difference compounds every year data is retained.
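The storage arithmetic is easy to check directly. The sketch below compares the on-disk footprint at the two compression ratios mentioned above; the raw volume and retention period are hypothetical figures chosen only for illustration.

```python
def stored_bytes(raw_bytes: float, compression_ratio: float) -> float:
    """Size on disk after compression at a ratio of N:1."""
    return raw_bytes / compression_ratio


# Hypothetical workload (assumption): 1 TB of raw data per year, kept for 3 years.
RAW_TB_PER_YEAR = 1.0
RETENTION_YEARS = 3

raw_total_tb = RAW_TB_PER_YEAR * RETENTION_YEARS
for ratio in (2.0, 10.0):
    on_disk_tb = stored_bytes(raw_total_tb, ratio)
    print(f"{ratio:.0f}:1 compression -> {on_disk_tb:.2f} TB on disk")

# Relative footprint of the 10:1 database compared with the 2:1 database.
relative = stored_bytes(raw_total_tb, 10.0) / stored_bytes(raw_total_tb, 2.0)
print(f"10:1 uses {relative:.0%} of the storage that 2:1 uses")
```

Multiplying the on-disk figure by your own per-terabyte storage price gives a first-order estimate of the yearly cost gap.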
2. Write performance
Points per second (PPS) is the core throughput metric for write performance. Benchmarking should measure both single-thread and multi-thread write throughput, test long-run stability over hours or days rather than seconds, and evaluate the impact of batch size on throughput. Small batches cause excessive network round-trips. Large batches can cause memory pressure. The optimal batch size is product-specific and worth finding through testing.
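A batch-size sweep can be sketched in a few lines. This is a minimal harness, not a full benchmark: `write_batch` is a hypothetical callable standing in for whatever client call your database provides, and the in-memory `fake_write` below exists only so the example runs on its own.

```python
import time
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Point = Tuple[int, float]  # (timestamp_ms, value)


def measure_pps(write_batch: Callable[[Sequence[Point]], None],
                points: Sequence[Point], batch_size: int) -> float:
    """Write all points in batches of batch_size; return points per second."""
    start = time.perf_counter()
    for i in range(0, len(points), batch_size):
        write_batch(points[i:i + batch_size])
    elapsed = time.perf_counter() - start
    return len(points) / elapsed if elapsed > 0 else float("inf")


def sweep_batch_sizes(write_batch: Callable[[Sequence[Point]], None],
                      points: Sequence[Point],
                      batch_sizes: Iterable[int]) -> Dict[int, float]:
    """Measure throughput once per batch size."""
    return {size: measure_pps(write_batch, points, size) for size in batch_sizes}


# Stand-in sink (assumption): replace with a real client write call.
received: List[int] = []


def fake_write(batch: Sequence[Point]) -> None:
    received.append(len(batch))


sample_points = [(i, float(i)) for i in range(100_000)]
for size, pps in sweep_batch_sizes(fake_write, sample_points, [100, 1_000, 10_000]).items():
    print(f"batch_size={size:>6}: {pps:,.0f} points/s")
```

Against a real database, run the sweep long enough to reach steady state and repeat it with several writer threads, since the best batch size for one thread is not necessarily the best under concurrency.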
TDengine can reach tens of millions of data points per second under batch-write workloads, depending on hardware, configuration, and data model design. This performance comes from three storage engine optimizations: intelligent batch merging; pipeline processing that overlaps network I/O, data parsing, and storage writes; and out-of-order data caching, which prevents a single late-arriving data point from stalling the entire write pipeline.
Out-of-order data handling deserves specific attention in benchmarks. Real-world scenarios involve network delays, clock drift across devices, and batch retransmission, all of which produce data that does not arrive in strict timestamp order. Benchmark tests should include scenarios with 5%, 20%, and 50% disorder ratios to measure how gracefully the database handles disorder. A database that performs well only with perfectly ordered data will underperform in production.
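One way to produce test data at a controlled disorder ratio is to take an ordered series and delay a chosen fraction of points in arrival order. The sketch below does this and also measures how many points actually arrive late. The function names and the `max_lag` parameter are illustrative assumptions, not part of any benchmarking tool.

```python
import random
from typing import List, Sequence


def make_disordered(timestamps: Sequence[int], disorder_ratio: float,
                    max_lag: int = 50, seed: int = 0) -> List[int]:
    """Return timestamps in arrival order, delaying a fraction of them.

    A delayed point is pushed back by 1..max_lag positions, so it arrives
    after points that carry later timestamps.
    """
    rng = random.Random(seed)
    n = len(timestamps)
    delayed = set(rng.sample(range(n), round(n * disorder_ratio)))
    keyed = []
    for i, ts in enumerate(timestamps):
        # Delayed points get a fractional key so they sort after the
        # on-time point that originally sat at that position.
        arrival = i + rng.randint(1, max_lag) + 0.5 if i in delayed else float(i)
        keyed.append((arrival, ts))
    keyed.sort()
    return [ts for _, ts in keyed]


def late_fraction(arrivals: Sequence[int]) -> float:
    """Fraction of points older than the newest timestamp already seen."""
    late, newest = 0, float("-inf")
    for ts in arrivals:
        if ts < newest:
            late += 1
        else:
            newest = ts
    return late / len(arrivals) if arrivals else 0.0


ordered = list(range(0, 10_000_000, 1000))  # one point per second, in ms
for ratio in (0.05, 0.20, 0.50):
    stream = make_disordered(ordered, ratio)
    print(f"requested {ratio:.0%}, measured late fraction {late_fraction(stream):.1%}")
```

The measured late fraction can be slightly below the requested ratio, because a delayed point is not late if its neighbors were delayed even further. Report the measured figure alongside the throughput result.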
3. Query performance
Query benchmarking should cover the full spectrum of query patterns.
Raw data queries include point queries (retrieve a single timestamp’s value), range queries (scan all values between T1 and T2), and multi-device parallel queries (retrieve the same metric across thousands of devices simultaneously).
Aggregation queries include COUNT, AVG, SUM, MIN, and MAX with grouping dimensions, group-by aggregation across tag values, and time-window aggregation such as hourly or daily rollups. Leading time-series databases use pre-aggregation to answer many aggregation queries without scanning raw data, and vectorized query execution to speed up the scans that remain.
Downsampling benchmarks should measure raw computation speed, the overhead of maintaining pre-computed downsampled data, and the performance of hybrid queries that span both original and downsampled data.
Multi-table JOIN queries test the ability to correlate time-series data with dimension tables, align multiple time series on timestamps, and optimize JOIN algorithms for time-series access patterns.
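Whichever query patterns you test, report latency as a distribution rather than a single average. The sketch below times a query repeatedly and summarizes the samples as percentiles; `run_query` is a hypothetical zero-argument callable that wraps your own client call, and the stand-in workload is there only to make the example self-contained.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_query(run_query: Callable[[], object],
               repeats: int = 100, warmup: int = 5) -> List[float]:
    """Run the query warmup + repeats times; return latencies in ms."""
    for _ in range(warmup):  # warm caches and connections, results discarded
        run_query()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms: List[float]) -> Dict[str, float]:
    """Median, tail percentiles, and maximum of the latency samples."""
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "p50": statistics.median(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
        "max": max(samples_ms),
    }


def fake_range_query() -> int:
    # Stand-in workload (assumption): replace with a real range or
    # aggregation query issued through your database client.
    return sum(range(10_000))


stats = summarize(time_query(fake_range_query))
print({name: round(ms, 3) for name, ms in stats.items()})
```

Run the same harness once per query pattern (point, range, aggregation, downsampling, JOIN) and again under concurrent load, since tail latency often degrades before the median does.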
4. Compression ratio
Time-series data has strong temporal continuity, which creates significant compression potential. Columnar storage amplifies this: values within a column are of the same type and often within a narrow range, data locality is better than in row-oriented storage, and vectorized compression using SIMD instructions becomes practical.
Common encoding algorithms each target a different pattern. Delta encoding works for timestamps that increase monotonically. Delta-of-delta encoding is effective for stable sampling rates where the delta itself changes slowly. XOR encoding compresses floating-point values that are numerically close to their neighbors. Run-length encoding handles slowly changing status indicators and enumeration values.
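The four encodings can be illustrated with simplified Python versions. These are teaching sketches that show what each transform produces on favorable input. They are not the implementation used by any particular database, and real engines follow the transform with bit-packing or a general-purpose compressor.

```python
import struct
from itertools import groupby
from typing import List, Sequence, Tuple


def delta(values: Sequence[int]) -> List[int]:
    """First value, then the difference between each neighbor pair."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_of_delta(values: Sequence[int]) -> List[int]:
    """First value, first delta, then the change in each later delta."""
    d = delta(values)
    return d[:1] + delta(d[1:])


def float_bits(x: float) -> int:
    """Reinterpret a 64-bit float as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_encode(values: Sequence[float]) -> List[int]:
    """First value's bits, then each value XORed with its predecessor."""
    bits = [float_bits(v) for v in values]
    return bits[:1] + [a ^ b for a, b in zip(bits, bits[1:])]


def run_length(values: Sequence[str]) -> List[Tuple[str, int]]:
    """Collapse runs of repeated values into (value, count) pairs."""
    return [(key, len(list(group))) for key, group in groupby(values)]


timestamps = [1000, 2000, 3000, 4000, 5000]
print(delta(timestamps))           # [1000, 1000, 1000, 1000, 1000]
print(delta_of_delta(timestamps))  # [1000, 1000, 0, 0, 0]
print(xor_encode([21.5, 21.5, 21.5])[1:])                 # [0, 0]
print(run_length(["ON", "ON", "ON", "OFF", "OFF"]))       # [('ON', 3), ('OFF', 2)]
```

The long runs of zeros and small integers in the output are what make the subsequent bit-level packing effective.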
Real-world compression ratios for IoT time-series data typically range from 5:1 to 20:1 depending on data characteristics. High-frequency analog signals with small value changes compress better than sparse, noisy data. Advanced algorithms can achieve 10:1 or better for typical IIoT workloads.
5. Benchmarking tools and methodology
TSBS (Time Series Benchmark Suite) is a widely used benchmarking tool, supporting InfluxDB, TimescaleDB, TDengine, and other databases with standardized data generation, workload definition, and result collection. Using a common tool removes differences in test harness quality as a variable in comparisons.
Custom test sets are recommended as a complement to standard benchmarks. Production data samples capture your actual data distribution. Real business query patterns capture your actual access patterns. Edge cases that your operations team has encountered should be encoded as specific test scenarios.
Fair comparison principles: use identical hardware for all candidates, tune each database’s configuration to its documented best practices for the workload, test at the same data scale across all products, and run each test multiple times to account for variance. Comparing a tuned database against an out-of-the-box competitor proves nothing.
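Repeated runs are only useful if the variance is actually reported. The sketch below summarizes the throughput of several runs and applies a rough rule of thumb for whether two candidates differ by more than run-to-run noise. The threshold `k` and the sample figures are illustrative assumptions, and the check is a heuristic, not a formal significance test.

```python
import statistics
from typing import Dict, Sequence


def summarize_runs(throughputs: Sequence[float]) -> Dict[str, float]:
    """Mean, sample standard deviation, and coefficient of variation."""
    mean = statistics.mean(throughputs)
    stdev = statistics.stdev(throughputs) if len(throughputs) > 1 else 0.0
    return {
        "runs": len(throughputs),
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean if mean else 0.0,
    }


def gap_exceeds_noise(a: Sequence[float], b: Sequence[float], k: float = 2.0) -> bool:
    """Heuristic: is the gap between means larger than k times the noisier stdev?"""
    sa, sb = summarize_runs(a), summarize_runs(b)
    return abs(sa["mean"] - sb["mean"]) > k * max(sa["stdev"], sb["stdev"])


# Hypothetical points-per-second results from three runs per candidate.
candidate_a = [1_000_000, 1_010_000, 990_000]
candidate_b = [1_005_000, 995_000, 1_000_000]

print(summarize_runs(candidate_a))
print(gap_exceeds_noise(candidate_a, candidate_b))  # False
```

A high coefficient of variation on a single candidate is itself a finding: it usually points to background compaction, cache effects, or noisy neighbors on the test hardware.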
6. Interpreting results at different scales
At small scale (millions of data points per day), the performance differences between databases may be negligible. Focus instead on operational complexity, ecosystem compatibility, and team learning curve. The database that your team can operate well is better than the one that benchmarks slightly faster.
At medium to large scale (tens to hundreds of millions of data points per day), performance differences become material. Focus on write throughput scalability as data volumes grow, query latency stability under concurrent query loads, and cluster sharding strategies for sustained growth.
At ultra-large scale (billions of data points per day or more), distributed architecture becomes the primary concern. Compression ratio’s impact on total storage cost is magnified at this scale. High Availability (HA) and disaster recovery mechanisms move from nice-to-have to mandatory. Well-optimized domestic time-series databases now compete effectively with international products at this tier.
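Daily volumes are easier to compare with benchmark output once they are converted to points per second. The sketch below does that conversion and assigns a tier; the exact boundaries (10 million and 1 billion points per day) and the peak factor are assumptions read from the wording of the three tiers above, not fixed industry thresholds.

```python
SECONDS_PER_DAY = 86_400


def average_pps(points_per_day: float) -> float:
    """Average ingest rate implied by a daily point count."""
    return points_per_day / SECONDS_PER_DAY


def scale_tier(points_per_day: float) -> str:
    """Map a daily volume to a tier (boundaries are assumptions)."""
    if points_per_day < 10_000_000:
        return "small"
    if points_per_day < 1_000_000_000:
        return "medium to large"
    return "ultra-large"


PEAK_FACTOR = 3.0  # assumed ratio of peak to average load

for daily in (5_000_000, 200_000_000, 3_000_000_000):
    avg = average_pps(daily)
    print(f"{daily:>13,} points/day: {scale_tier(daily):<15} "
          f"avg {avg:,.0f} pps, peak ~{avg * PEAK_FACTOR:,.0f} pps")
```

Benchmark at the projected peak rate rather than the average, because sustained peaks are what determine whether the write path keeps up.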
7. Conclusion
Performance benchmarking of time-series databases is a systematic engineering effort, not a single test run. Start from the three core dimensions of write performance, query performance, and compression ratio. Use TSBS or custom test sets built from your own data. Test at the scale that matches your projected growth. A strong lab result means little if the database cannot be operated reliably in production.


