Standard SQL Support

Jeff Tao

August 1, 2023 / Engineering

TDengine supports standard SQL for querying data, using a model that is familiar to users of relational databases. SQL support makes it easier to fit TDengine into your existing data stack and reduces the learning curve for new users. This is a departure from many time-series databases (TSDB) and traditional data historians, which often use proprietary query languages and sometimes provide only APIs for querying. In addition, TDengine has extended the standard SQL syntax to facilitate analysis and processing of time series data.

Query Syntax: InfluxDB vs. TDengine

We can demonstrate the simplicity of SQL as opposed to proprietary query languages with a quick example. Suppose we want to query the average voltage per minute of all smart meters in Chicago in the past hour from the database power. The query statement in InfluxDB’s Flux language would be written as follows:

from(bucket:"power")
|> range(start:-1h)
|> filter(fn:(r) =>
r._measurement == "smeter" and
r._field = "voltage" and
r.location == "chicago"
)
|> aggregateWindow(every: 1m, fn: mean)

In TDengine, the same query would be written as follows:

SELECT AVG(voltage) FROM power.smeter WHERE ts > now -1h AND location = "chicago" INTERVAL(1m);

This shows how much easier it can be to compose SQL statements for common time series operations instead of using proprietary query languages.

Data Ingestion

TDengine uses the traditional relational model for data ingestion. In this model, a schema is defined for the data when tables are created. Although this does require extra effort, its advantages quickly become clear as datasets increase in size: querying data remains a simple task even for large-scale data. In addition, because most IoT and industrial devices transmit data with a consistent structure, the benefits of schemaless ingestion are much less evident than in other scenarios where data structure may be unknown or mutable.

For use cases that require schemaless ingestion, however, TDengine supports the InfluxDB Line Protocol and the OpenTSDB telnet line protocol and JSON format protocol.

SQL Extensions

The analysis of time series data has its own characteristics and requires some special analysis functions that are not supported by traditional relational databases. To make it easier to perform analytics, TDengine provides the following extensions to standard SQL:

INTERVAL: TDengine aggregates the collected data by time interval, such as calculating the average voltage every 5 minutes
FILL: At a specific time point, when there is no data, TDengine can perform interpolation calculation based on the data before and after this time point and provide it to the application.
SLIDING: Sliding window. TDengine can perform query calculation at specified time intervals.
LAST_ROW: Returns the last non-NULL value of a column in a table. This is very important for IoT applications, because the current state of the device needs to be quickly obtained.
TAVG: Time-weighted average, which is often used in the Industrial Internet.
SPREAD: count the difference between the maximum and minimum values of a column.