Scaling vertically (or scaling up) refers to increasing hardware resources on an existing server. Making a database management system vertically scalable, so that it can actually use these new resources, is not an easy task: it requires a well-designed framework whose own internal resource consumption is minimal. Considering the large datasets typically processed by a time-series database, scaling up and scaling out are both essential capabilities.
To test whether TDengine is vertically scalable, we can construct an environment in which a specific hardware resource is the performance bottleneck of the system, and then gradually add resources to the system to see how performance is affected. In this article, the vertical scalability of TDengine will be tested in terms of CPU and disk resources.
Scaling Up CPU Cores
This test is intended to prove that, in a scenario where memory and disk resources cannot become bottlenecks, there is a direct relationship between the number of cores available to TDengine and the write speed of TDengine.
Methodology
In this test, TDengine 3.3.0.0 was deployed from the official Docker image on Docker Hub. A fixed number of cores were bound to the Docker container. The taosBenchmark tool was used to write data to TDengine using the smart meters schema described in the official documentation. The number of vgroups and write threads were both set to 4 initially, but when testing 5 or more cores, these parameters were increased to 12. This ensures that CPU resources remain the performance bottleneck.
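The setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how cores were bound, so the use of Docker's `--cpuset-cpus` flag, the image tag `tdengine/tdengine:3.3.0.0`, the container name, and both helper function names are assumptions rather than the original test scripts. The vgroup and write-thread rule follows the description above.

```python
def docker_run_command(cores, image="tdengine/tdengine:3.3.0.0", name="tdengine-test"):
    """Build a `docker run` command that pins the container to the first `cores` CPUs.

    Assumption: cores are bound with --cpuset-cpus; the image tag and
    container name are illustrative, not taken from the article.
    """
    if cores < 1:
        raise ValueError("at least one core is required")
    cpuset = "0" if cores == 1 else f"0-{cores - 1}"
    return ["docker", "run", "-d", "--name", name, "--cpuset-cpus", cpuset, image]


def benchmark_parallelism(cores):
    """Return (vgroups, write_threads) as described in the methodology:
    4 of each for up to 4 cores, 12 of each for 5 or more cores."""
    return (4, 4) if cores <= 4 else (12, 12)


if __name__ == "__main__":
    for cores in (1, 4, 5, 12):
        print(" ".join(docker_run_command(cores)), benchmark_parallelism(cores))
```

The command is only built, not executed, so the sketch can be inspected without Docker installed.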
The test data is described as follows:
- Databases: 1
- Supertables: 1
- Subtables: 100
- Rows per subtable: 1 million
- Total records: 100 million
Results
CPU Cores | Vgroups | Write Threads | Write Speed (Rows per Second) | Disk I/O (MBps) | Available Memory (GB) | CPU Usage |
---|---|---|---|---|---|---|
1 | 4 | 4 | 380,000 | 13 | 200 | 96% |
2 | 4 | 4 | 880,000 | 36 | 200 | 95% |
3 | 4 | 4 | 1,480,000 | 45 | 198 | 95% |
4 | 4 | 4 | 1,930,000 | 60 | 193 | 88% |
5 | 12 | 12 | 2,450,000 | 73 | 104 | 96% |
6 | 12 | 12 | 2,880,000 | 85 | 104 | 94% |
12 | 12 | 12 | 5,730,000 | 167 | 102 | 90% |
The disk drive in the test scenario was a solid-state disk (SSD) with a maximum throughput of 350 MBps, meaning that I/O resources were sufficient throughout the test. The server and client in the test scenario were run on the same machine, meaning that network bandwidth could not become a bottleneck.
Each time a core was added, write performance increased by approximately 500,000 rows per second (between 430,000 and 600,000 in each measured step). Disk I/O increased at approximately the same rate as write speed. From this it can be concluded that TDengine is vertically scalable in terms of CPU cores.
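The per-core gain can be checked directly against the results table. The following is a minimal sketch that uses only the measured write speeds; the function name `marginal_gain_per_core` is introduced here for illustration.

```python
# Measured write speeds from the results table: CPU cores -> rows per second.
results = {
    1: 380_000,
    2: 880_000,
    3: 1_480_000,
    4: 1_930_000,
    5: 2_450_000,
    6: 2_880_000,
    12: 5_730_000,
}


def marginal_gain_per_core(results):
    """Return (from_cores, to_cores, extra rows/s per added core) for consecutive rows."""
    points = sorted(results.items())
    gains = []
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        gains.append((c0, c1, (r1 - r0) / (c1 - c0)))
    return gains


if __name__ == "__main__":
    for c0, c1, gain in marginal_gain_per_core(results):
        print(f"{c0} -> {c1} cores: {gain:,.0f} rows/s per added core")
```

The step from 6 to 12 cores spans six cores, so its gain is averaged over that range; it works out to 475,000 rows per second per core.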
Scaling Up Disk Drives
This test is intended to prove that, in a scenario where CPU and memory resources cannot become bottlenecks, there is a direct relationship between the number of disks available to TDengine and the write speed of TDengine.
Methodology
In this test, TDengine 3.3.0.0 was deployed on Ubuntu 20.04 with 24 cores and 64 GB of RAM. 5400 RPM HDDs were mounted to this system in the same tier; these relatively slow disks were chosen to ensure that disk I/O remained the bottleneck during the test. The taosBenchmark tool was used to write data to TDengine using the following schema:
Number of Columns | Data Type | Length (Bytes) |
---|---|---|
1 | TIMESTAMP | 8 |
1 | DOUBLE | 8 |
1 | BIGINT | 8 |
5 | BINARY | 5000 |
3 | NCHAR | 2500 |
In this schema, each row of data occupies 32,524 bytes. These relatively large rows consume I/O resources faster.
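The row size follows from the schema table by multiplying each column count by its length and summing. A short sketch of that arithmetic, which takes the lengths in the table at face value and ignores any per-row storage overhead (an assumption of this example):

```python
# Schema from the table above: (number of columns, data type, length in bytes).
schema = [
    (1, "TIMESTAMP", 8),
    (1, "DOUBLE", 8),
    (1, "BIGINT", 8),
    (5, "BINARY", 5000),
    (3, "NCHAR", 2500),
]


def row_size_bytes(schema):
    """Sum count * length over all column groups in the schema."""
    return sum(count * length for count, _data_type, length in schema)


if __name__ == "__main__":
    print(row_size_bytes(schema))  # 8 + 8 + 8 + 5*5000 + 3*2500 = 32524
```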
The test data is described as follows:
- Databases: 1
- Vgroups: 16
- Supertables: 1
- Subtables: 100
- Rows per subtable: 10,000
- Total records: 1 million
taosBenchmark was additionally configured as follows:
- Write threads: 16
- Write method: parameter binding mode
Results
Disks | Committed Threads | Write Speed (Rows per Second) | CPU Usage | Disk I/O (MBps) |
---|---|---|---|---|
1 | 4 | 1024 | 5% | 80–90 |
2 | 4 | 1929 | 11% | 150–190 |
3 | 8 | 4098 | 19% | 380–450 |
The 24-core CPU used in the test environment was mostly idle throughout the tests, meaning that compute resources were sufficient. The server and client in the test scenario were run on the same machine, meaning that network bandwidth could not become a bottleneck.
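It was necessary to increase the number of committed threads once three disks were mounted to ensure that I/O performance remained the bottleneck. Committed threads are the threads used to write data to disk.
Each time a disk was mounted, write speed increased at least in proportion to the number of disks (from 1,024 to 1,929 and then 4,098 rows per second), and disk I/O rose accordingly, indicating that the new disk was used fully. Note that the larger gain at three disks coincided with the increase in committed threads from 4 to 8. From this it can be concluded that TDengine is vertically scalable in terms of disk drives.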
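To make the scaling claim concrete, the speedup relative to a single disk can be computed from the results table. This sketch uses only the measured write speeds; the function names are introduced here for illustration.

```python
# Measured write speeds from the results table: disks -> rows per second.
disk_results = {1: 1024, 2: 1929, 3: 4098}


def speedup(results):
    """Write speed of each configuration relative to the single-disk run."""
    base = results[1]
    return {disks: rows / base for disks, rows in results.items()}


def per_disk_throughput(results):
    """Average rows per second contributed by each mounted disk."""
    return {disks: rows / disks for disks, rows in results.items()}


if __name__ == "__main__":
    for disks, factor in speedup(disk_results).items():
        per_disk = per_disk_throughput(disk_results)[disks]
        print(f"{disks} disk(s): {factor:.2f}x, {per_disk:,.0f} rows/s per disk")
```

Two disks deliver about 1.88 times the single-disk speed and three disks about 4 times. The three-disk run also used 8 committed threads instead of 4, so that figure reflects both changes.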