It’s not unusual for analysts to spend a significant share of their time wrangling data, especially in time-series analytics. One way to address this challenge is to put time-series data into a relational database. Because inserting time-series data directly into a relational database poses significant performance challenges, the data is often written to a time-series database (TSDB) first. It is then exported from the TSDB, formatted and transformed for analysis, and finally loaded into the relational database or data warehouse. This solution works, but it can be complicated and costly.
Another way is to use a time-series database that supports the SQL query language. SQL is a powerful tool for selecting, filtering and linking data for analytics. TDengine is a time-series database that supports SQL, so for data analysts, using TDengine is like using a relational database. Through supertables, separation of storage and compute, data partitioning by time interval, pre-computation and other means, TDengine provides powerful and easy-to-use data analytics capabilities. Specifically, TDengine’s data analytics capabilities have the following highlights:
- Efficient aggregation across multiple data collection points: Based on the characteristics of time-series data, TDengine introduces the concept of a supertable, which is a template for data collection points (devices) of the same kind. TDengine stores time-series data and label data (called tags in TDengine) separately. No JOIN is required: you only need to specify label filter conditions on the supertable to efficiently aggregate data collection points of the same type, which makes it easier to organize and find data. In addition, TDengine allows you to add up to 128 labels to each data collection point, and you can delete and update them later. These labels provide a powerful way to slice data into cubes for multidimensional analysis.
- Separation of storage and compute: Since version 3.0, TDengine supports the separation of storage and compute. The system can start one or more query nodes as needed to increase computing resources, speed up complex queries, and reduce latency. On cloud platforms, a computing node can be a container, which can be started or stopped quickly. Separating storage from compute takes full advantage of the elastic computing resources of the cloud platform.
- Unified analysis of historical and real-time data: TDengine automatically partitions data by time interval. Even with 10 years of data, there is no need to spread it across multiple databases or tables, and there is no such thing as archived data in a TDengine system. To reduce storage costs, multi-level storage is applied according to the age of the data, but it is fully managed by TDengine. Whether you query the latest data or data from 10 years ago, only the start and end times of the query differ.
- Functions designed for time-series data analytics: Besides the basic functions of standard SQL, TDengine extends the processing of time-series data, providing cumulative sum, time-weighted average, moving average, rate of change, time window, session window, state window, interpolation and many other time-series analytics functions. Through time windows and interpolation, the timestamps of data from different data collection points can be aligned at fixed time intervals to facilitate further analysis. To learn more, see the SQL Manual.
- Real-time data analytics: TDengine provides time-driven stream processing (continuous query) and event-driven stream processing. Stream processing can be performed not only on the data stream generated by a single data collection point, but also on the aggregated data stream from multiple collection points. Support for user-defined functions (UDFs) lets stream processing handle pre-processing, transformation or any other complex computation on the data. For details, see the stream processing section of the user documentation.
- Python support: TDengine not only provides Python connectors but also supports Pandas and data frames, so data analysts who prefer Python can easily use a wide range of Python libraries for time-series data analytics.
- Other convenient means of data access and analysis: TDengine provides a command-line interface (CLI) that you can use to run ad hoc queries or to import and export data. TDengine also provides R and MATLAB connectors and integrates seamlessly with Grafana and Google Data Studio.
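To make the idea of aligning timestamps through time windows and interpolation more concrete, here is a minimal sketch in plain Python. It is not TDengine code and does not reflect how TDengine implements these features. The function names `window_average` and `align`, the 10-second window and the sample readings are all assumptions made up for illustration. The sketch averages readings inside fixed-width windows and linearly fills windows that have no data, so that two devices sampling at different moments end up on the same time grid.

```python
from collections import defaultdict


def window_average(readings, interval):
    """Average (timestamp, value) pairs inside fixed-width windows.

    Returns {window_start: mean value} for windows that contain data.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[ts - ts % interval].append(value)
    return {start: sum(vals) / len(vals) for start, vals in buckets.items()}


def align(readings, start, end, interval):
    """Return one value per window in [start, end), filling gaps linearly."""
    averaged = window_average(readings, interval)
    known = sorted(averaged)
    aligned = {}
    for w in range(start, end, interval):
        if w in averaged:
            aligned[w] = averaged[w]
            continue
        before = [k for k in known if k < w]
        after = [k for k in known if k > w]
        if before and after:
            lo, hi = before[-1], after[0]
            frac = (w - lo) / (hi - lo)
            aligned[w] = averaged[lo] + frac * (averaged[hi] - averaged[lo])
        else:
            aligned[w] = None  # no neighbour on one side: leave the gap
    return aligned


# Hypothetical readings (seconds, value) from two devices with different sampling times.
device_a = [(2, 10.0), (7, 12.0), (21, 16.0), (33, 20.0)]
device_b = [(5, 1.0), (14, 2.0), (25, 3.0), (31, 5.0), (38, 7.0)]

a = align(device_a, 0, 40, 10)
b = align(device_b, 0, 40, 10)
for w in range(0, 40, 10):
    print(w, a[w], b[w])
```

Device A has no reading in the 10 to 20 second window, so its value there is interpolated from the neighbouring windows. After alignment, both devices have exactly one value per window and can be compared row by row.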
In typical IoT and IIoT scenarios, TDengine can be used as a time-series data warehouse, so it is no longer necessary to import time-series data into a separate data warehouse or data lake for processing and analysis. This can greatly reduce the cost of the data platform.
For a specific example of analyzing a time-series data set, see Easy Time-Series Analysis with TDengine.
Learn more about TDengine: