Query Performance Comparison Test Report: TDengine vs. InfluxDB

We have previously published a write performance test report for TDengine and InfluxDB, and today we will compare the query performance of the two time-series database products.

1. Preface

This report contains basic tests as well as extended test cases for specific scenarios, which allows us to show the query performance of the two database products accurately and comprehensively across different application scenarios.

1.1 Hardware Environment

In order to facilitate reproducibility of the test, we built the test environment on the Microsoft Azure cloud service. Two servers are used in the test, and the client and server are deployed separately. The client and server are connected through the intranet of the cloud service.

The specific configuration of the two servers is as follows.

Client: 8 cores, Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz, 16 GB RAM, 64 GB SSD (4k random reads with no writes, IOPS 5000)
Server: 16 cores, AMD EPYC 7452 32-Core Processor, 128 GB RAM, 1 TB SSD (4k random reads with no writes, IOPS 25000)

The operating system is Ubuntu 20.10 Linux. The version of Go used is 1.16.

1.2 Software Installation

TDengine – we used TDengine 2.1.7.2 Community edition. You can download it from the TDengine website, or build it yourself from the source code on GitHub. The git information for this version is as follows.

community version: 2.1.7.2 compatible_version: 2.0.0.0

gitinfo: c6be1bb809536182f7d4f27c0d8267b3b25c9354

InfluxDB – we used InfluxDB 1.8.4. Since the performance testing framework only supports InfluxDB versions up to 1.8, we chose InfluxDB 1.8 for the comparison. It can be downloaded from the InfluxData website.
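For reference, a typical installation on the Ubuntu servers looks roughly like the sketch below. The package file names are illustrative assumptions; use the packages you actually download from the InfluxData and TDengine download pages.

# InfluxDB 1.8.4
wget https://dl.influxdata.com/influxdb/releases/influxdb_1.8.4_amd64.deb
sudo dpkg -i influxdb_1.8.4_amd64.deb
sudo systemctl start influxdb

# TDengine 2.1.7.2 Community edition (illustrative package name)
sudo dpkg -i TDengine-server-2.1.7.2-Linux-x64.deb
sudo systemctl start taosd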

1.3 Run the script to generate the query data

1.3.1 Installation preparation

After deploying TDengine, InfluxDB, and the Go language environment, please ensure that the databases on the two servers can be reached and used normally. At a minimum, you should be able to create and delete a database and insert and query some data; if you do, please delete the database you created afterwards. Please resolve any installation or other issues before proceeding with the steps below.

The following points should be noted during testing (the default TDengine configuration file is /etc/taos/taos.cfg); a sketch of the corresponding configuration entries follows the list.

  1. The fsync settings should be kept consistent. InfluxDB defaults to fsync without delay, so the two TDengine parameters walLevel=2 and fsync=0 need to be modified to achieve the same configuration environment. All subsequent tests are completed under this setting.
  2. On the TDengine client, maxSQLLength should be raised to its maximum of 1048576.
  3. The TDengine client needs to be installed on the client server (note: compiling bulk_load_tdengine depends on the TDengine client).
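A minimal sketch of the corresponding configuration entries, assuming the default file locations (taos.cfg uses space-separated "parameter value" lines):

# /etc/taos/taos.cfg on the server
walLevel 2
fsync 0

# /etc/taos/taos.cfg on the client server
maxSQLLength 1048576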

1.3.2 Download the code from GitHub on the client server:

git clone https://github.com/taosdata/timeseriesdatabase-comparisons

1.3.3 Prepare the compilation environment and generate the write programs. The root directory of timeseriesdatabase-comparisons is the working directory:

cd timeseriesdatabase-comparisons/build/tsdbcompare/
./compilation.sh

1.3.4 Switch to the build/tsdbcompare directory, generate the test data and insert it into the databases.

Execute ./write_to_server.sh -a "test217" under build/tsdbcompare

# The client server hostname of this test is "test217"
# Run ./write_to_server.sh -h to view the supported parameters:

The parameters for generating data (the total number of records = (e - t) * 24 * 3600 / i * s, with e - t measured in days)
i : data interval, default 10s
s : number of samples (simulated devices), default 100
t : start time of the data, default '2018-01-01T00:00:00Z'
e : end time of the data, default '2018-01-02T00:00:00Z'
g : whether to generate data, default 1 (1: generate, 0: do not generate)
T : data directory of TDengine, default "/var/lib/taos"
I : data directory of InfluxDB, default "/var/lib/influxdb"

The write parameters
b : batch size, default 5000
w : number of workers, default 16
n : the writing mode of TDengine (false: cgo, true: rest), default false
a : hostname or IP address of TDengine and InfluxDB, default 127.0.0.1

If you want to customize the argument values, you can read the code in write_to_server.sh.
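For example, a hypothetical run that keeps the default dataset but sets the write parameters explicitly would look like this (the hostname is the one used in this test; the other values shown are simply the defaults):

./write_to_server.sh -a "test217" -i 10 -s 100 -b 5000 -w 16 -n false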

1.3.5 Generate the query files and run the query test by executing the script under build/tsdbcompare:

./read_all.sh -a "test217" 
The script parameters are the same as write_to_server.sh

2. Running the Test

Running this test requires that the TDengine system log be turned off. Run read.sh, and the test executes automatically.

When switching between scenarios, the database backend services (influxd/taosd) are restarted and the Linux system cache is cleared.
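The scripts take care of this, but if you need to reproduce it manually, the typical commands are sketched below (service names assume the default systemd units installed by the two packages):

sudo systemctl restart taosd        # restart the TDengine server
sudo systemctl restart influxdb     # restart the InfluxDB server
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches    # drop the Linux page cache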

3. The Comparison of Test Results

This subsection describes the comparative test results obtained by running the test script, and provides a preliminary analysis of them.

For the test results, all response times are recorded automatically by the test script; that is, each figure is not the response time of a single query execution, but the total time to complete 1,000 repetitions of the query. It should be noted that, because the whole test takes a long time, the data points are not collected at the same moment. The test program runs at different times and is affected by the maximum performance the cloud server can deliver, so slight jitter can be seen in the results, but the overall trend is consistent.

The 1-day dataset simulated with 100 devices produced 900 sub-tables in TDengine, with each device generating data at 10-second intervals, for a total of 7,776,000 records.

The 1-day dataset simulated with 1,000 devices produced 9,000 sub-tables in TDengine, with each device generating one record every 10 seconds, for a total of 77,760,000 records.

The test consists of four scenarios, as follows (sample queries illustrating their shape are sketched after the list):

  1. After 8 timelines are randomly filtered by label, the maximum value is taken.
  2. Randomly select a 1-hour time interval, randomly filter 8 timelines by label, and get the maximum value of them.
  3. Randomly select a 12-hour time interval, randomly filter 8 timelines by label, use 10 minutes as a time window, and get the maximum value of each.
  4. Randomly select a 1-hour time interval, randomly filter 8 timelines by label, use 1 minute as a time window, and obtain the maximum value of each.
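For example, scenario 4 roughly corresponds to statements of the following shape. This is only a sketch: the actual queries are generated by the framework, the field and tag names assume its devops cpu dataset, and the test filters 8 hostnames rather than the 2 shown here.

InfluxDB (InfluxQL):
SELECT max(usage_user) FROM cpu WHERE (hostname = 'host_0' OR hostname = 'host_1') AND time >= '2018-01-01T01:00:00Z' AND time < '2018-01-01T02:00:00Z' GROUP BY time(1m)

TDengine:
SELECT MAX(usage_user) FROM cpu WHERE (hostname = 'host_0' OR hostname = 'host_1') AND ts >= '2018-01-01 01:00:00' AND ts < '2018-01-01 02:00:00' INTERVAL(1m)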

The test results show that TDengine outperforms InfluxDB in most scenarios (both single-threaded and multi-threaded). In a few cases the difference between the two is very small, while in most scenarios TDengine has a performance advantage of several times, reaching 40 times in some scenarios.

3.1 100-device simulation

Run the test program on the result set generated by simulating 1 day of data on 100 devices. This section presents the performance comparison test results.

Figure 1. Comparison of query results on the 100-device dataset

As can be seen from Figure 1, only in the third test scenario does the single-threaded TDengine query response time exceed that of InfluxDB; in all other scenarios TDengine outperforms InfluxDB, and some scenarios (scenario 2) show a huge advantage of 40 times in query performance. The specific response data is shown in Appendix 1.

Figure 2. Response time when adjusting the number of query threads in different scenarios

The test results on 1,000 devices show that TDengine still has a large performance advantage. Even in the scenarios where it is slower than InfluxDB (scenario 4 in the multi-threaded case), the difference is very small, while where it leads it retains a huge performance advantage, with the largest difference reaching nearly 20 times. The specific query response data is shown in Appendix 2.

3.2 Extended Tests

In addition to the two standard tests mentioned above, we designed a series of extended tests based on the existing datasets in order to evaluate the performance of the two database products more comprehensively and accurately under different scenarios. In the following tests, we only use results from runs in cgo mode.

3.2.1 Impact of label filter volume on query performance

Adjust the number of filtered hosts, set the number of parallel workers to 16, and execute all queries against the 1,000-device dataset. Query response time is in seconds; the smaller the value, the better.

Figure 3. Query response time when changing the number of filtered timelines in different scenarios

It can be seen that, as the number of filtered timelines increases, the query response time of InfluxDB rises rapidly in all four test scenarios, while TDengine remains relatively stable at larger filter sizes and shows a greater advantage as the number of filtered timelines grows. It can be inferred that InfluxDB's query response time would continue to increase rapidly as the query size grows further. The specific query response times for the various scenarios are shown in Appendix 3.

3.2.2 The impact of query time window on query performance

To evaluate the impact of time windows of different lengths on query performance, we selected the fourth query scenario, set the number of parallel workers to 16, and randomly selected continuous time periods of 1h/2h/4h/8h/12h, with the single aggregation time window kept at 1 minute. The query response times are shown in Figure 4.

Figure 4. The effect of changing the query time interval range on the query response

As Figure 4 shows, TDengine has better query performance than InfluxDB, and its lead continues to expand as the query time interval grows: when the query time interval is 1 hour, TDengine is only about 8% faster than InfluxDB, but when the interval increases to 12 hours, TDengine's advantage grows to nearly two times. The specific query response times are shown in Appendix 4.

3.2.3 Performance of complex query

Considering that only simple query functions are used in the standard test, we also evaluated query performance using complex queries that combine multiple query functions. We chose the fourth scenario: a randomly selected 1-hour time period, an aggregation time window of 1 minute, and 8 filtered timelines.

The combinations we test are as follows (a sketch of the corresponding statements follows the list):

  1. max(value), count(value), first(value), last(value)
  2. top(value, 10)
  3. max(value), count(value), first(value), last(value), integral(value)/1000; there is no integral function in TDengine, so we use TWA(value) * spread(ts) / 1000 instead, since the time-weighted average multiplied by the time range reproduces the integral.
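As an illustration, the third combination roughly takes the following shape; again, the field and tag names assume the framework's devops cpu dataset, and the statements are only a sketch rather than the exact queries the framework issues.

InfluxDB (InfluxQL):
SELECT max(usage_user), count(usage_user), first(usage_user), last(usage_user), integral(usage_user)/1000 FROM cpu WHERE hostname = 'host_0' AND time >= '2018-01-01T01:00:00Z' AND time < '2018-01-01T02:00:00Z' GROUP BY time(1m)

TDengine:
SELECT MAX(usage_user), COUNT(usage_user), FIRST(usage_user), LAST(usage_user), TWA(usage_user) * SPREAD(ts) / 1000 FROM cpu WHERE hostname = 'host_0' AND ts >= '2018-01-01 01:00:00' AND ts < '2018-01-01 02:00:00' INTERVAL(1m)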
Figure 5. Comparison of response time under different query scenarios

Figure 5 shows that TDengine's query performance outperforms InfluxDB's under all three complex function combinations; in the first combination in particular, TDengine's performance is 2.5 times that of InfluxDB. The specific query response times are shown in Appendix 5.

3.2.4 Data Reading Test

In this test we measured the data reading performance of TDengine and InfluxDB. For the entire dataset, we do not limit the query time range; we adjust the tag filter conditions and run projection queries to retrieve the full data content. The results are shown in Figure 6.
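A projection query of this kind roughly looks as follows in both systems (the names are again illustrative, and the number of hostnames in the tag filter is what varies between runs):

SELECT * FROM cpu WHERE hostname = 'host_0' OR hostname = 'host_1'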

Figure 6. Projection query response time for different data sizes

As can be seen, the total time overhead of TDengine is stable at around 11% of InfluxDB's across the different extraction ratios, i.e. TDengine outperforms InfluxDB by a factor of 8.78 in this test, and the advantage widens as the number of timelines increases, reaching 9.37 times at 128 timelines. In other words, TDengine shows a greater performance benefit at larger data sizes. At 256 timelines, InfluxDB failed to complete the test run, with the server refusing connections, while TDengine finished the test in only 365.61 seconds.

4. Summary

In the tests run under this comparative testing framework, TDengine demonstrates a large performance advantage over InfluxDB, and the extended tests, with their more diverse conditions and controlled variables, confirm that this advantage holds consistently.

Appendix

The specific data from the comparison test runs are summarized below.

Appendix 1. Comparison of query results on the 100-device dataset

Appendix 2. Comparison of query results on the 1000-device dataset

Appendix 3. Query response when adjusting the number of filtered tags

Appendix 4. Time range query response of different lengths (seconds)

Appendix 5. Performance of complex queries

Appendix 6. Performance of data reads of different sizes (seconds)