TDengine 3.0 Data Subscription

Haojun Liao

December 1, 2022

TDengine provides data subscription and consumption interfaces similar to those of message queue products. These interfaces make it easier for applications to obtain data written to TDengine in real time and to process data in the order in which events occurred. This simplifies your time-series data processing system and can reduce your costs, because you no longer need to deploy a separate message queue product such as Kafka.

Overview

To use TDengine data subscription, you define topics, as you would in Kafka. However, a topic in TDengine is based on query conditions for an existing supertable, table, or subtable; in other words, it is a SELECT statement. You can use SQL to filter data by tag, table name, column, or expression and then apply a scalar function or user-defined function to the data. (Note that aggregate functions are not supported.)

This gives TDengine data subscription more flexibility than similar products. The granularity of data can be controlled on demand by applications, while filtering and preprocessing are handled by TDengine instead of the application layer. This implementation reduces the amount of data transmitted and the complexity of applications.

By subscribing to a topic, a consumer can obtain the latest data in that topic in real time. Multiple consumers can be formed into a consumer group that consumes messages together. Consumer groups increase throughput through multi-threaded, distributed data consumption. Note that consumers in different groups that subscribe to the same topic do not consume messages together: each group consumes the topic independently.
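
The group semantics described above can be sketched with a small in-memory model. This is only an illustration, not TDengine code: the function names, the (shard, payload) message shape, and the round-robin assignment of shards to consumers are all assumptions made for the example.

```python
from collections import defaultdict


def assign_shards(shards, consumers):
    """Spread shards over one group's consumers (round-robin is an assumed policy)."""
    assignment = defaultdict(list)
    for i, shard in enumerate(shards):
        assignment[consumers[i % len(consumers)]].append(shard)
    return dict(assignment)


def deliver(messages, groups):
    """Model delivery of one topic to several consumer groups.

    messages: list of (shard, payload) tuples in write order (hypothetical shape).
    groups:   {group_name: [consumer_name, ...]}.
    Returns   {group_name: {consumer_name: [payload, ...]}}.
    """
    shards = sorted({shard for shard, _ in messages})
    result = {}
    for group, consumers in groups.items():
        # Within a group, each shard is owned by exactly one consumer.
        owner = {}
        for consumer, owned in assign_shards(shards, consumers).items():
            for shard in owned:
                owner[shard] = consumer
        received = {consumer: [] for consumer in consumers}
        for shard, payload in messages:
            received[owner[shard]].append(payload)
        # Every group sees the full topic, independently of the other groups.
        result[group] = received
    return result
```

In this model, the members of a group split the messages among themselves, while a second group subscribed to the same topic still receives every message.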

A single consumer can subscribe to multiple topics. If the data in a supertable is sharded across multiple vnodes, consumer groups can consume it much more efficiently than single consumers. TDengine also includes an acknowledgement mechanism that ensures at-least-once delivery in complicated environments where machines may crash or restart.
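
At-least-once delivery generally follows from committing progress only after a message has been processed. The sketch below shows that idea with a hypothetical in-memory Topic class; the class, its methods, and the batch size are assumptions for illustration and do not reflect TDengine's actual client API or internals.

```python
class Topic:
    """In-memory stand-in for a topic: an ordered log plus one committed offset per group."""

    def __init__(self, messages):
        self.log = list(messages)
        self.committed = {}  # group -> offset of the next message to deliver

    def poll(self, group, max_messages=2):
        """Return (start_offset, batch) beginning at the group's committed offset."""
        start = self.committed.get(group, 0)
        return start, self.log[start:start + max_messages]

    def commit(self, group, offset):
        """Acknowledge everything before `offset` for this group."""
        self.committed[group] = offset


def consume(topic, group, handler, crash_after=None):
    """Drain the topic, committing only after a whole batch has been handled.

    crash_after simulates a consumer crash once that many messages were handled.
    """
    handled = 0
    while True:
        start, batch = topic.poll(group)
        if not batch:
            return
        for message in batch:
            if crash_after is not None and handled == crash_after:
                raise RuntimeError("simulated crash before commit")
            handler(message)
            handled += 1
        topic.commit(group, start + len(batch))
```

If the consumer crashes partway through a batch, the offset for that batch is never committed, so a restarted consumer receives the batch again. No message is lost, but some may be seen twice, which is why handlers in an at-least-once system are usually written to tolerate duplicates.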

Implementation

To implement these features, TDengine indexes its write-ahead log (WAL) file for fast random access and provides configurable methods for replacing and retaining this file. You can define a retention period and size for this file. In this way, the WAL file is transformed into a persistent storage engine that remembers the order in which events occur.
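
As a rough illustration of a log that supports random access and is trimmed by age and size, consider the following sketch. The RetainedLog class, its parameter names, and its trimming policy are hypothetical and far simpler than a real WAL; they only show how a version number can serve as an index and how retention limits bound what consumers can still read.

```python
class RetainedLog:
    """Append-only log addressed by a monotonically increasing version number."""

    def __init__(self, retention_seconds, retention_bytes):
        self.retention_seconds = retention_seconds
        self.retention_bytes = retention_bytes
        self.entries = []       # (version, timestamp, payload), oldest first
        self.first_version = 0  # version of entries[0], if any
        self.next_version = 0
        self.size = 0           # total payload bytes currently retained

    def append(self, payload, now):
        """Append a payload written at time `now` and return its version."""
        version = self.next_version
        self.entries.append((version, now, payload))
        self.next_version += 1
        self.size += len(payload)
        self._trim(now)
        return version

    def _trim(self, now):
        # Drop the oldest entries while they are too old or the log is too large.
        while self.entries and (
            now - self.entries[0][1] > self.retention_seconds
            or self.size > self.retention_bytes
        ):
            _, _, payload = self.entries.pop(0)
            self.size -= len(payload)
            self.first_version += 1

    def read(self, version):
        """Random access: the position is version minus the first retained version."""
        index = version - self.first_version
        if index < 0 or index >= len(self.entries):
            raise KeyError(version)
        return self.entries[index][2]
```

A consumer that remembers the last version it processed can resume from exactly that point, as long as the entry has not yet been removed by the retention limits.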

TDengine then uses the WAL file instead of the time-series database as its storage engine for queries in the form of topics. TDengine reads the data from the WAL file; uses a unified query engine instance to perform filtering, transformations, and other operations; and finally pushes the data to consumers.
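
The read, filter, transform, and push flow can be pictured as a simple streaming pipeline. The sketch below is a conceptual model only: serve_topic, the row dictionaries, and the column names are assumptions, and a generator stands in for the stream that is handed to consumers.

```python
import math


def serve_topic(wal_rows, where, select):
    """Yield one output row per matching input row, preserving log order.

    wal_rows: iterable of row dicts read from the log (hypothetical shape).
    where:    predicate playing the role of the topic's WHERE clause.
    select:   per-row projection or scalar function; no aggregation is modeled,
              so the output never combines several input rows.
    """
    for row in wal_rows:
        if where(row):
            yield select(row)


# Hypothetical rows and a topic that keeps high-voltage readings only.
rows = [
    {"ts": 1, "voltage": 198, "current": 1.0},
    {"ts": 2, "voltage": 220, "current": 2.5},
    {"ts": 3, "voltage": 230, "current": 4.0},
]
stream = serve_topic(
    rows,
    where=lambda r: r["voltage"] > 200,
    select=lambda r: (r["ts"], math.sqrt(r["current"])),
)
```

Because filtering and the scalar transformation happen before anything is handed to the consumer, only the matching, already-projected rows need to be transmitted.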

Usage

You can subscribe to a database, supertable, or column. For example, to create a subscription to a column, use the following SQL syntax:

CREATE TOPIC topic_name AS topic_query;

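Replace topic_query with the SELECT statement that obtains the desired data from the column. For example, assuming a hypothetical supertable named meters with columns ts, current, and voltage, topic_query could be SELECT ts, current FROM meters WHERE voltage > 200.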

Once you have created topics in TDengine, you can write consumers that consume the messages in the queue for each topic.

For more information, see the official documentation.


Haojun Liao is Co-Founder & Query Engine Architect at TDengine and is responsible for developing the product's query processing component. He has a Ph.D. in Computer Applied Technology from the Institute of Computing Technology (Chinese Academy of Sciences) and focuses on the analysis and processing of time-series and spatial data.