TDengine uses several caching techniques to write and query data efficiently. This article describes the caching components of TDengine.
Write Cache
TDengine uses an insert-driven cache management policy, known as first in, first out (FIFO). This policy differs from read-driven “least recently used (LRU)” cache management. A FIFO policy stores the latest data in cache and flushes the oldest data from cache to disk when the cache usage reaches a threshold. In IoT use cases, the most recent data or the current state is most important. The cache policy in TDengine, like much of the design and architecture of TDengine, is based on the nature of IoT data.
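The FIFO behavior described above can be illustrated with a minimal Python sketch. This is not TDengine's implementation: the class name `FifoWriteCache`, the row-count threshold, and the in-memory `flushed` list that stands in for disk are all assumptions made for illustration.

```python
from collections import deque


class FifoWriteCache:
    """Toy insert-driven cache: new rows stay in memory, the oldest are flushed."""

    def __init__(self, threshold_rows):
        self.threshold_rows = threshold_rows  # assumed limit, counted in rows
        self.rows = deque()   # oldest row on the left, newest on the right
        self.flushed = []     # stands in for data already written to disk

    def insert(self, row):
        self.rows.append(row)
        # Once usage exceeds the threshold, flush the oldest rows first.
        while len(self.rows) > self.threshold_rows:
            self.flushed.append(self.rows.popleft())

    def latest(self):
        """Return the newest cached row. Reads never change eviction order."""
        return self.rows[-1] if self.rows else None


cache = FifoWriteCache(threshold_rows=3)
for ts in range(5):
    cache.insert((ts, 20.0 + ts))
```

After the five inserts, the two oldest rows have moved to `cache.flushed` and the three newest remain in `cache.rows`. Unlike an LRU cache, calling `latest()` has no effect on which row is flushed next.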
When you create a database, you can configure the size of the write cache on each vnode. The vgroups parameter determines the number of vgroups that process data in the database, and the buffer parameter determines the size of the write cache for each vnode.
CREATE DATABASE db0 VGROUPS 100 BUFFER 16MB
In theory, a larger cache is always better. In practice, beyond a certain size, increasing the cache no longer improves performance. In most scenarios, you can retain the default cache settings.
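Because the buffer is allocated per vnode, the total memory reserved for write caching grows with the number of vgroups. The following Python sketch estimates that total for the statement above. The helper name `total_write_cache_mb` is hypothetical, and the `replica` parameter reflects the assumption that each vgroup contains one vnode per replica.

```python
def total_write_cache_mb(vgroups, buffer_mb, replica=1):
    """Estimate the write cache reserved across the cluster, in MB."""
    # Assumption: each vgroup holds `replica` vnodes, and every vnode
    # gets its own buffer of `buffer_mb` megabytes.
    return vgroups * replica * buffer_mb


# Values taken from the CREATE DATABASE example: VGROUPS 100 BUFFER 16MB.
single_replica_mb = total_write_cache_mb(vgroups=100, buffer_mb=16)
print(single_replica_mb)  # 1600
```

With a single replica, the example database reserves about 1600 MB of write cache across the cluster, which is worth checking against the memory available on your nodes before you raise either parameter.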
Read Cache
When you create a database, you can configure whether the latest data from every subtable is cached. To do so, set the cachemodel parameter as follows:
none: Caching is disabled.
last_row: The latest row of data in each subtable is cached. This option significantly improves the performance of the LAST_ROW function.
last_value: The latest non-null value in each column of each subtable is cached. This option significantly improves the performance of the LAST function in normal situations, such as queries that use GROUP BY.
both: Rows and columns are both cached. This option is equivalent to enabling the last_row and last_value options simultaneously.
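The difference between the last_row and last_value options is easiest to see when a recent row contains NULL values. The following Python sketch models the cache for a single subtable. The class `SubtableLastCache` and its method names are hypothetical, and the sketch assumes that rows arrive in timestamp order; it illustrates the semantics described above, not TDengine's internal data structures.

```python
class SubtableLastCache:
    """Toy model of the per-subtable cache selected by cachemodel."""

    MODELS = ("none", "last_row", "last_value", "both")

    def __init__(self, cachemodel="both"):
        if cachemodel not in self.MODELS:
            raise ValueError(f"unknown cachemodel: {cachemodel!r}")
        self.cachemodel = cachemodel
        self.last_row = None     # the most recent row, NULLs included
        self.last_value = {}     # the most recent non-null value per column

    def on_insert(self, row):
        """row maps column name to value; None represents NULL."""
        if self.cachemodel in ("last_row", "both"):
            self.last_row = dict(row)
        if self.cachemodel in ("last_value", "both"):
            for column, value in row.items():
                if value is not None:
                    self.last_value[column] = value


cache = SubtableLastCache("both")
cache.on_insert({"ts": 1, "current": 10.5, "voltage": 220})
cache.on_insert({"ts": 2, "current": None, "voltage": 221})
```

After the second insert, `cache.last_row` holds the row with `ts` 2, including its NULL `current`, while `cache.last_value["current"]` still holds 10.5 from the earlier row.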
Metadata Cache
To improve query and write performance, each vnode caches the metadata that it receives. When you create a database, you can configure the size of the metadata cache through the pages and pagesize parameters.
CREATE DATABASE db0 PAGES 128 PAGESIZE 16kb
The preceding SQL statement allocates 128 metadata cache pages on each vnode in the db0 database. Each page is 16 KB, so each vnode has 128 × 16 KB = 2 MB of metadata cache.
File System Cache
TDengine implements data reliability by means of a write-ahead log (WAL). Writing data to the WAL is essentially writing data to the disk in an ordered, append-only manner. For this reason, the file system cache plays an important role in write performance. When you create a database, you can use the wal parameter to choose higher performance or higher reliability.
1: This option writes to the WAL but does not enable fsync. New data written to the WAL is stored in the file system cache and is not immediately flushed to disk. This provides better performance.
2: This option writes to the WAL and enables fsync. New data written to the WAL is immediately written to disk. This provides better data reliability.
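The trade-off between the two options can be sketched with ordinary file operations in Python. This is only an illustration of append-only writes with and without fsync: the function `append_to_wal` and its `wal_level` argument are hypothetical names that mirror the two options above, and the record format is arbitrary.

```python
import os


def append_to_wal(path, record, wal_level):
    """Append one record (bytes) to a log file, optionally forcing it to disk."""
    if wal_level not in (1, 2):
        raise ValueError("wal_level must be 1 or 2")
    # O_APPEND makes every write land at the end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # The data is now in the file system cache, not necessarily on disk.
        os.write(fd, record)
        if wal_level == 2:
            # Block until the operating system reports the data as persisted.
            os.fsync(fd)
    finally:
        os.close(fd)
```

With `wal_level=1` the call returns as soon as the operating system accepts the bytes, so a power failure may lose recent records. With `wal_level=2` each call also waits for `os.fsync`, which is slower but narrows that window.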
Client Cache
In addition to the server-side caching discussed previously, the core client library libtaos.so also makes use of caching. The TDengine client caches the metadata of all databases, supertables, and subtables that it has accessed, as well as the cluster topology.
If one client modifies metadata while multiple clients are accessing a TDengine cluster, the metadata caches on the other clients may become stale or out of sync. If this occurs, run the RESET QUERY CACHE command on the affected clients to force them to obtain fresh metadata and reset their caches.
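The stale-cache situation can be sketched in a few lines of Python. This is a conceptual model only and does not reflect how libtaos.so is implemented: the class `ClientMetaCache`, its `reset` method, and the `fetch_from_server` callable are hypothetical.

```python
class ClientMetaCache:
    """Toy client-side cache of table schemas."""

    def __init__(self, fetch_from_server):
        # Hypothetical callable: table name -> list of column names.
        self._fetch = fetch_from_server
        self._cache = {}

    def get_schema(self, table):
        # Only contact the server the first time a table is accessed.
        if table not in self._cache:
            self._cache[table] = self._fetch(table)
        return self._cache[table]

    def reset(self):
        """Drop all cached metadata so the next access fetches it again."""
        self._cache.clear()


server_schemas = {"d1001": ["ts", "current"]}
client = ClientMetaCache(lambda table: list(server_schemas[table]))

before = client.get_schema("d1001")
server_schemas["d1001"].append("voltage")   # another client alters the table
stale = client.get_schema("d1001")          # still served from the local cache
client.reset()
fresh = client.get_schema("d1001")          # fetched again after the reset
```

In this model, `stale` still lacks the new `voltage` column because the client never asked the server again, and `fresh` includes it once the cache has been reset.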