A time-series database (TSDB) is a database management system that is optimized to store, process, and analyze time-series data.
Time-series data is a sequence of data points that record changes in a measurement, or events, over a period of time. Each data point carries a timestamp, and the sequence is indexed or listed by timestamp. Sensor readings from industrial equipment, data from smart devices, metrics from IT monitoring systems, and stock market trades are all examples of time-series data.
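As a minimal sketch of this idea, the Python snippet below models a series as a list of timestamped points kept in timestamp order, and uses a binary search to pull out a time range. The `DataPoint` class, the `points_between` helper, and the sample readings are all hypothetical names and values chosen for illustration; a real time-series database uses far more elaborate storage structures.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPoint:
    timestamp: int  # Unix time in seconds (an assumed unit)
    value: float    # the measurement, e.g. a temperature reading


# Hypothetical readings from one sensor, kept sorted by timestamp.
series = [
    DataPoint(1700000000, 21.5),
    DataPoint(1700000060, 21.7),
    DataPoint(1700000120, 22.1),
    DataPoint(1700000180, 22.0),
]


def points_between(points, start, end):
    """Return the points with start <= timestamp <= end.

    Because the list is ordered by timestamp, a binary search finds
    both ends of the range without scanning every point.
    """
    timestamps = [p.timestamp for p in points]
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    return points[lo:hi]


recent = points_between(series, 1700000060, 1700000120)
```

Here `recent` holds the two readings taken at 1700000060 and 1700000120. Keeping points ordered by time is what makes this kind of range lookup cheap.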
It is possible to process time-series data with relational or NoSQL databases, but purpose-built time-series databases are optimized to handle the special characteristics of time-series data. As a result, they are typically more efficient in terms of ingestion rate, query latency, and data compression. In addition, time-series databases include special analytic functions and data management features so that you can develop applications more easily.
How Is Time-Series Data Used?
Time-series data is often used to look for insights into operations, raise alerts based on real-time analysis, and forecast future trends. The following characteristics are found in time-series data applications:
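- High write-read ratio: In Internet applications like Twitter and LinkedIn, a single post is written once and then read by millions of users. Time-series workloads are the opposite: data points are written continuously, and the raw points are scanned and analyzed mainly by applications and algorithms rather than read one by one by people.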
- Retention policy: In general, time-series data is not stored forever. Organizations have a retention policy that defines the data lifecycle, and the data is deleted once its lifecycle is over.
- Real-time analytics and computing: To detect abnormal behavior and raise alerts based on the collected data or aggregation results, time-series data must be computed in real time.
- Query scope: Time-series data is typically queried over a period of time or a set of data sources, with filters applied so that not all historical data is scanned. In addition, aggregation is usually applied to all of the data sources or to a subset selected by a filter condition.
- Trends: In time-series data, single data points are usually not important. Instead, the focus is on how data trends over a period of time, such as changes in the past hour or day.
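Several of the characteristics above (retention, query scope, aggregation, and real-time alerting) can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the row layout, the source names such as `meter-1`, the helper functions, and the threshold are hypothetical, and the in-memory list stands in for what a time-series database would do natively.

```python
from collections import defaultdict

# Hypothetical rows: (timestamp in seconds, source id, value).
rows = [
    (0,   "meter-1", 10.0),
    (60,  "meter-1", 12.0),
    (60,  "meter-2", 30.0),
    (120, "meter-1", 14.0),
    (120, "meter-2", 34.0),
    (180, "meter-2", 90.0),
]


def apply_retention(rows, now, keep_seconds):
    """Retention policy: keep only rows no older than keep_seconds."""
    cutoff = now - keep_seconds
    return [r for r in rows if r[0] >= cutoff]


def average_by_source(rows, start, end, sources=None):
    """Query scope: filter by time range and optional source set, then aggregate."""
    totals = defaultdict(lambda: [0.0, 0])  # source -> [sum, count]
    for ts, source, value in rows:
        if start <= ts <= end and (sources is None or source in sources):
            totals[source][0] += value
            totals[source][1] += 1
    return {s: total / count for s, (total, count) in totals.items()}


def alerts(averages, threshold):
    """Real-time analytics: flag sources whose aggregate exceeds a threshold."""
    return sorted(s for s, avg in averages.items() if avg > threshold)


kept = apply_retention(rows, now=180, keep_seconds=120)  # drops the row at ts=0
averages = average_by_source(kept, start=60, end=180)
flagged = alerts(averages, threshold=50.0)
```

With this sample data, only `meter-2` is flagged, because its average over the window is pulled above the threshold by the final reading. The point of the sketch is that the interesting result comes from the window and the aggregate, not from any single data point.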
Time-series solutions like TDengine optimize their design based on these characteristics, which enables more efficient processing of time-series data and better performance compared with general databases.
Why Does Time-Series Data Require Specialized Databases?
Today, everything is online: meters, cars, elevators, assembly lines, and even bicycles are connected to the Internet, and all of them emit a relentless stream of metrics and events. With the advent of IoT and the cloud, the volume of time-series data has been growing at an unprecedented rate. The massive size of time-series data sets is a major challenge for general database management systems like relational and NoSQL databases. In particular, the following aspects of time-series data are difficult for non-specialized databases to handle:
- Data ingestion rate: In many time-series data scenarios, millions of data points are produced every second and need to be ingested in real time.
- Query latency: Time-series applications often need to scan a huge number of data points to get an aggregation result, which can result in high latency.
- Storage cost: Because relational and NoSQL databases are not designed to compress time-series data efficiently, storage costs can rise very quickly.
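To give a feel for why time-series data can compress well when a system is designed for it, here is a minimal Python sketch of delta-of-delta encoding for timestamps. This is one commonly used general technique, shown under the assumption of regularly sampled data; it is not presented as the method used by any particular product, and the function names are hypothetical.

```python
def delta_of_delta_encode(timestamps):
    """Encode as [first timestamp, then the change in each successive delta]."""
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev = timestamps[0]
    prev_delta = 0
    for ts in timestamps[1:]:
        delta = ts - prev
        encoded.append(delta - prev_delta)  # zero when the interval is steady
        prev, prev_delta = ts, delta
    return encoded


def delta_of_delta_decode(encoded):
    """Invert delta_of_delta_encode and recover the original timestamps."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    prev_delta = 0
    for dod in encoded[1:]:
        delta = prev_delta + dod
        timestamps.append(timestamps[-1] + delta)
        prev_delta = delta
    return timestamps


# Hypothetical samples taken every 10 seconds, with the last one a second late.
timestamps = [1700000000, 1700000010, 1700000020, 1700000030, 1700000041]
encoded = delta_of_delta_encode(timestamps)
decoded = delta_of_delta_decode(encoded)
```

For these samples, `encoded` is `[1700000000, 10, 0, 0, 1]`. After the first two entries, steady sampling produces mostly zeros and small integers, which a subsequent bit-packing or general-purpose compression step can store in far fewer bits than the original ten-digit values.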
These issues mainly involve efficiency in processing large data sets, but there are also areas where general databases often do not support even the basic requirements of time-series applications, like data retention and specialized analytics. With general databases, developers are forced to write custom code to implement these features. Different data workloads require different database solutions; one size does not fit all. For time-series data, no matter the size of your data set, a purpose-built time-series database is the best tool for the job.