Writing Performance Comparison of TDengine and InfluxDB

Introduction

Performance is a key metric that users evaluate when choosing a time-series database. In this report we compare the write performance of InfluxDB and TDengine.

To make the comparison more convincing, this test reuses the scenario and dataset that InfluxDB used in its earlier performance comparison with Graphite (https://www.influxdata.com/blog/influxdb-outperforms-graphite-in-time-series-data-metrics-benchmark/).

After comprehensive preparations and repeated tests, we have concluded that:

  1. The writing speed of TDengine is about twice that of InfluxDB under the optimal conditions published by InfluxDB.
  2. When the number of devices is increased to 1000, the writing speed of TDengine is 5.2 times that of InfluxDB.
  3. TDengine achieves a higher data compression ratio: it requires only 63% of the storage space that InfluxDB uses.

In addition to publishing the test results, we have documented the configuration, environment and detailed steps, so that any developer or user can reproduce the results.

InfluxDB is an open-source time-series database written in Go. Its core is a custom-built storage engine optimized for time-series data. It is currently the most popular TSDB, ranking first in the DB-Engines TSDB list.

TDengine is an open source time-series database with high performance, scalability and SQL support. In addition, it has built-in caching, stream processing, data subscription and other functions to reduce the complexity and costs of development and operations. TDengine built its own storage engine from scratch.

Now, let’s dive into the testing session.

1. Basic Information:

Item | TDengine | InfluxDB
Brief Intro | High-Performance, Scalable Time-Series Database with SQL support | A Time-Series Database designed for time series, event and indicator data management
Official Site | https://www.tdengine.com | https://www.influxdata.com
Development Language | C | Go
Test Version | 2.0.18.0 | 1.8.4
Download | www.tdengine.com | influxdata
Write Method | cgo (go1.16) | rest

The dataset used in this test is modeled on a DevOps metrics monitoring use case. In this scenario, a group of servers needs to report system and application metrics periodically. The specific implementation involves sampling 100 values from 9 subsystems (CPU, memory, disk, disk I/O, kernel, network, Redis, PostgreSQL, and Nginx) of a server every 10 seconds. Since InfluxDB chose a period of 24 hours and a setting of 100 devices for its comparison with Graphite, we reuse this relatively modest deployment.

The important parameters are as follows, which can be seen in the links above:

2. Environment Preparation

To make the tests easier to reproduce, all of them were performed on two Azure virtual machines running Ubuntu 20.10, configured as follows:

  1. Database server: Standard E16as_v4, AMD EPYC 7452 32-Core Processor (2345.608 MHz), 16 vCPU, 128 GB RAM, 5000 IOPS SSD, 1024 GB.
  2. Database client: Standard F8s_v2, Intel(R) Xeon(R) Platinum 8272CL (2.60 GHz), 8 vCPU, 16 GB RAM.

It is important to note that although the server CPU above is displayed as 32 cores, the cloud service only allocated 16 processors.

3. Specific Test Methods and Steps:

You can reproduce the results of this test by following the steps below:

3.1. Installation preparation:

After deploying TDengine, InfluxDB and the Go language environment, make sure that both databases are reachable from the client and work normally. At a minimum, you should be able to create and delete a database, and insert and query some data. If you create a test database for this check, delete it afterwards. Resolve any installation or other issues before proceeding with the steps below.

The following points should be observed during testing (the default TDengine configuration file is /etc/taos/taos.cfg):

  1. The fsync settings should be the same for both databases. InfluxDB defaults to fsync without delay, so two TDengine parameters must be changed to match: walLevel=2 and fsync=0. All subsequent tests are run with this setting.
  2. On the TDengine client, set maxSQLLength to its maximum value, 1048576.
  3. The TDengine client must be installed on the client server (note: compiling bulk_load_tdengine depends on the TDengine client).
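
Before starting a run, it can help to confirm that the settings listed above are actually present in the configuration file. The following is a minimal sketch, not part of the test repository. The helper names get_cfg_value and check_cfg are hypothetical, and the sketch assumes each setting sits on one line as "key value" (or "key=value"), with "#" starting a comment.

```shell
#!/bin/sh
# Hypothetical helpers for checking a taos.cfg-style file.
# Assumption: one setting per line, written as "key value" or "key=value";
# anything after "#" is a comment.

# Print the value configured for KEY in FILE (last occurrence wins).
get_cfg_value() {
    awk -v key="$2" '
        { sub(/#.*/, "") }        # strip comments
        { gsub(/=/, " ") }        # treat "=" like whitespace
        $1 == key { value = $2 }  # remember the most recent match
        END { if (value != "") print value }
    ' "$1"
}

# Succeed only if KEY in FILE is set to EXPECTED.
check_cfg() {
    [ "$(get_cfg_value "$1" "$2")" = "$3" ]
}
```

For example, `check_cfg /etc/taos/taos.cfg walLevel 2 && check_cfg /etc/taos/taos.cfg fsync 0` returns success only when both write-ahead-log settings match the values used in this test.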

3.2. Download code from Github and execute on client server:

git clone https://github.com/taosdata/timeseriesdatabase-comparisons

3.3. Prepare the compilation environment and build the write program. The root directory of timeseriesdatabase-comparisons is the working directory:

cd timeseriesdatabase-comparisons/build/tsdbcompare/
./compilation.sh

3.4. Switch to the build/tsdbcompare directory and run the script to reproduce the test results:

cd timeseriesdatabase-comparisons/build/tsdbcompare/  && ./loop_scale_to_server.sh -a "test217"  

# The hostname of my client server here is "test217"
# loop_scale_to_server.sh invokes write_to_server.sh to finish the entire testing process
# ./write_to_server.sh -h lists the available parameters (loop_scale_to_server.sh accepts the same parameters)

The parameters to generate data (total records = (e - t) * 24 * 3600 / i * s, with e - t measured in days)
i : data interval, default 10s
s : the number of samples, default 100
t : the start time of data, default '2018-01-01T00:00:00Z'
e : the end time of data, default '2018-01-02T00:00:00Z'
g : whether to generate data, default 1 (1: generate, 0: do not generate)
T : the default data directory of TDengine is "/var/lib/taos"
I : the default data directory of InfluxDB is "/var/lib/influxdb"
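
As a quick check of the record-count formula above, here is a minimal sketch in shell arithmetic. The function name total_records is hypothetical, and it assumes the time span (e - t) is given as a whole number of days.

```shell
#!/bin/sh
# total_records DAYS INTERVAL_SECONDS SAMPLES
# Hypothetical helper: evaluates (e - t) * 24 * 3600 / i * s,
# assuming (e - t) is expressed in whole days.
total_records() {
    echo $(( $1 * 24 * 3600 / $2 * $3 ))
}

# Defaults from the parameter list: 1 day, 10 s interval, s = 100.
total_records 1 10 100    # prints 864000
```

With the default parameters, one day at a 10-second interval gives 8640 samples, and multiplying by s = 100 gives 864000 records.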

The write parameters 
b : batchsize (default 5000)
w : workers (default 16)
n : the writing mode of TDengine (false: cgo, true: rest, default false)
a : TDengine and InfluxDB's hostname or IP address, default 127.0.0.1

To customize other argument values, read the code of write_to_server.sh.

(Note: If a write fails because of outside interference, you can rerun it manually with explicit arguments to obtain the test result. For example: write_to_server.sh -a "test217" -T "/var/lib/taos/" -I "/var/lib/influxdb/" -t '2018-01-01T00:00:00Z' -e '2018-01-02T00:00:00Z')

4. Actual Measurement Data

After the tests, we aggregated and tabulated the results. TDengine consistently writes about twice as fast as InfluxDB, whether the test is single-threaded or multi-threaded and whether the batch is small or large.

With a batch size of 5000 and 16 workers (the configuration tested in the comparison report between InfluxDB and Graphite), InfluxDB took 35.04 seconds, while TDengine took only 17.92 seconds.

In addition, InfluxDB's report only tested 100 devices and 900 monitoring points. In our opinion, practical deployments have far more devices and monitoring points than this. We therefore adjusted the script parameters, gradually increasing from 100 devices to 200, 400, 600, 800, and 1000. By increasing the data volume equally for both databases, we obtained a fair comparison of write performance with more connected devices.
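
The scaling sweep can be pictured as one run per device count. The sketch below only prints the commands instead of executing them, and it is an illustration rather than the content of loop_scale_to_server.sh. It assumes that the -s parameter (default 100) is what controls the number of devices; the function name print_scale_runs is hypothetical.

```shell
#!/bin/sh
# Hypothetical dry run: print one write_to_server.sh invocation per device count.
# Assumption: -s (default 100) sets the number of devices.
# -a, -b and -w are the host, batch size and worker parameters listed earlier.
print_scale_runs() {
    for devices in 100 200 400 600 800 1000; do
        echo "./write_to_server.sh -a $1 -s $devices -b 5000 -w 16"
    done
}

print_scale_runs test217
```

Printing the commands first makes it easy to review the sweep before starting a long-running test.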

Note that the data table is attached at the end of this article. A single thread writing 1 row at a time from 1000 devices naturally takes too long, so that case is not included in the table. This does not affect the overall result: as the number of devices multiplies, TDengine maintains a steady lead.

Also, for the same amount of raw data, InfluxDB uses around 145 MB of storage and TDengine uses around 92 MB. TDengine achieves a higher data compression ratio and requires only 63% of the storage that InfluxDB uses.
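
The ratios behind these figures can be recomputed from the numbers quoted in this section. This is a small sketch using awk for floating-point arithmetic; the function names ratio and percent are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers that recompute the ratios from the quoted measurements.

# ratio A B: print A / B with two decimals.
ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# percent A B: print A as a whole-number percentage of B.
percent() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 100 * a / b }'
}

ratio 35.04 17.92    # InfluxDB time / TDengine time, prints 1.96
percent 92 145       # TDengine storage as a share of InfluxDB storage, prints 63
```

For the 5000 batch size, 16 worker case, the time ratio works out to just under 2, and the storage share matches the 63% quoted above.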

5. Conclusion 

The test results support the first two conclusions stated in the introduction:

  1. Under the optimal conditions published by InfluxDB, the writing speed of TDengine is about twice that of InfluxDB.
  2. When the number of devices is increased to 1000, the writing speed of TDengine is 5.2 times that of InfluxDB.

Below we show two line graphs for the test condition with a batch size of 5000 and 16 workers. The horizontal axis shows the number of devices.

Figure 1 shows the number of seconds each database takes to write the same amount of data, and Figure 2 shows the number of rows each database writes per second.

These two graphs illustrate one point: the more devices and the larger the amount of data, the more obvious the advantage of TDengine.

Because the interfaces used in this performance test differ (TDengine uses the cgo interface and InfluxDB uses REST), some performance fluctuation is expected. However, this does not fundamentally change the results, and we will test other interfaces and scenarios in follow-up comparison articles.

If you are interested in more details, you can use the test code above to reproduce the results yourself. We welcome your suggestions.

The full record of test data is attached: