Time-Series Databases: A Game Changer for the Renewable Energy Industry

Jeff Tao

August 29, 2024 / Renewable Energy, Industrial Data

With significant increases in installed capacity every year, the renewable energy sector is experiencing a period of rapid development. According to the U.S. Energy Information Administration, operators intend to increase utility-scale capacity by 62.8 gigawatts in 2024, of which solar and battery storage will contribute 81%.¹ And the International Energy Agency (IEA), as shown in the following figure, predicts that worldwide solar photovoltaic (PV) and wind energy production will “more than double by 2028 compared with 2022,” reflecting strong and consistent growth across regions.²

Renewable electricity capacity additions by technology and segment. (IEA 2023)

But as enterprises add new sites and expand existing ones, they face a new challenge: how to manage the massive amounts of data that their growing deployments are generating.

Unlike the manufacturing sector, where data historians have been used for decades to manage large datasets, leaders in renewable energy have yet to converge on an optimal solution for their data management needs. Many newer enterprises simply choose a solution with which they are familiar, typically a popular open-source relational database like MySQL, a NoSQL document database like MongoDB, or a data warehouse like Snowflake. In the early stages, with a limited number of devices reporting data and few data-intensive applications, these solutions can indeed process time-series data acceptably.

However, as the business grows and the number of devices increases, the amount of data being ingested can quickly overwhelm these general-purpose systems. Processing limited data for a proof-of-concept (PoC) deployment is a completely different task from monitoring entire solar or wind farms in real time, and enterprises are constantly forced to purchase additional hardware or cloud resources and scale up their database systems just to maintain existing performance. As costs skyrocket and formerly simple systems grow into complex, unmanageable behemoths, it becomes clear to enterprises in the renewables industry that general-purpose databases are not ideal solutions for their operations data.

What Is a Time-Series Database?

A time-series database (TSDB) is a database management system that is optimized to store, process, and analyze time-series data: a sequence of data points that represent the changes of a measurement or events over a time period. The data generated by sensors on industrial equipment, smart devices, and IT monitoring systems are all examples of time-series data. In the renewable energy industry in particular, devices like solar panels, wind turbines, and energy management systems (EMS) generate huge numbers of metrics as time series.

Time-series data is often used to look for insights into operations, raise alerts based on real-time analysis, and forecast future trends; in fact, many cutting-edge industrial applications being deployed by renewables operators today, from predictive maintenance and remote monitoring to artificial intelligence (AI) and machine learning (ML), rely on large-scale, high-quality time-series datasets.

With the expansion of the renewables sector, the amount of time-series data generated at solar and wind farms as well as energy storage installations is growing to an extent that even traditional analysis is becoming difficult for legacy data historians and general-purpose databases. Instead, forward-thinking enterprises in the industry are increasingly adopting the purpose-built time-series database as the platform for processing their operations data.

By accounting for the characteristics of time-series data, purpose-built time-series databases are much more efficient in terms of ingestion rate, query latency, and data compression. In addition, time-series databases include special analytics functions and data management features that make application development easier.

Different data workloads require different database solutions — one size does not fit all. For most renewable energy operators and producers, a purpose-built time-series database is the best tool for processing operations data.

Why Does Time-Series Data Require Specialized Databases?

It is possible to process time-series data with general-purpose relational or NoSQL databases, but these systems are not optimized for this use case. With the advent of the cloud and the Industrial Internet of Things (IIoT), as the cost of communication continues to decrease and smart devices and sensors become commonplace, the volume of time-series data has begun growing exponentially across industries in an unprecedented way.

While general-purpose databases may seem to perform acceptably in the early stages of a project, as the business expands, they are inevitably overwhelmed by the sheer volume of time-series datasets. To make use of these invaluable datasets (to monitor devices, generate reports, trigger alarms, make predictions, and more), businesses need data platforms that can handle their scale.

In particular, the following three aspects of time-series data processing are difficult for non-specialized databases to handle:

Data ingestion rate: In many time-series data scenarios, millions of data points are produced every second and need to be ingested in real time. Relational databases are not designed to process this volume of data, and while NoSQL databases can be scaled to handle it, the amount of resources required quickly becomes prohibitively expensive.
Query latency: Time-series applications often need to scan a huge number of data points to get an aggregation result, which can result in high latency. With the suboptimal performance offered by general-purpose databases, aggregation results on large datasets are often outdated before they are even returned to the user.
Storage cost: Internet-connected devices and applications are generating data nonstop, sometimes exceeding a terabyte in a single day. Because relational and NoSQL databases cannot compress this data efficiently, storage costs can become high very fast.
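To put the ingestion and storage figures above in perspective, the following back-of-the-envelope sketch estimates the raw daily volume produced at a steady ingestion rate. The 16-byte point size (an 8-byte timestamp plus an 8-byte floating-point value) and the rate of one million points per second are assumptions chosen for illustration, not measurements from any particular deployment.

```python
# Back-of-the-envelope estimate of raw time-series volume.
# All figures below are illustrative assumptions, not measurements.

BYTES_PER_POINT = 16             # assumed: 8-byte timestamp + 8-byte float value
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400


def raw_bytes_per_day(points_per_second: int) -> int:
    """Uncompressed bytes written per day at a steady ingestion rate."""
    return points_per_second * BYTES_PER_POINT * SECONDS_PER_DAY


def terabytes(num_bytes: int) -> float:
    """Convert bytes to decimal terabytes (10**12 bytes)."""
    return num_bytes / 10**12


if __name__ == "__main__":
    rate = 1_000_000  # assumed: one million data points per second
    daily = raw_bytes_per_day(rate)
    # Prints: 1,000,000 points/s -> 1.38 TB/day uncompressed
    print(f"{rate:,} points/s -> {terabytes(daily):.2f} TB/day uncompressed")
```

Under these assumptions, a sustained rate of one million points per second already exceeds a terabyte of raw data per day before indexes, tags, or replication are counted.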

Time-series databases offer significantly higher ingestion, query, and compression performance than other database management systems because they take advantage of the characteristics of modern time-series datasets discussed in the previous section. Although the specifics will differ by system, renewables operators can expect approximately 10 times faster performance and 5 times higher compression by replacing a general-purpose database with a purpose-built time-series database.

Using a general-purpose database for time-series data processing requires a significant amount of custom coding to suit the unique qualities of this data and its analysis. Although time-series databases may also incorporate custom code to meet the needs of an organization, the features required for time-series data are already built into the system. These features often include the following:

Data lifecycle management: Time-series data is generally removed in bulk, not one data point at a time, as it ages out.
Rollup: In most cases, time-series data is collected and rolled up over a set period before being stored in a new table. Raw data and rolled-up data can have distinct life cycles and retention policies.
Special analytics functions: Relational and NoSQL databases typically lack the specialized analytical functions that are built into time-series databases. These include time-weighted average, moving average, cumulative sum, rate of change, elapsed time for a specific state, and the delta between two consecutive data points.
Interpolation: Time-series applications and algorithms often require data to be interpolated based on adjacent data points and specified rules so as to regularize datasets.
Windowed queries: Aggregations and analytical procedures may be performed on a session, state, or sliding window, not just time: for example, consider an application that calculates average power generation of wind turbines only when wind speed exceeds a certain minimum threshold.
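The wind turbine example in the last item can be sketched in a few lines of application code, which also shows the kind of logic that would otherwise have to be written by hand on top of a general-purpose database. This is a minimal sketch of a state window and not the implementation used by any particular time-series database; the Reading type, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Reading:
    """Hypothetical turbine reading; field names are illustrative."""
    ts: int            # seconds since the start of the sample
    wind_speed: float  # metres per second
    power_kw: float    # generated power in kilowatts


def state_windows(readings: Iterable[Reading], min_wind: float) -> Iterator[List[Reading]]:
    """Yield runs of consecutive readings whose wind speed exceeds min_wind."""
    window: List[Reading] = []
    for r in readings:
        if r.wind_speed > min_wind:
            window.append(r)
        elif window:
            # The state changed, so the current window is closed.
            yield window
            window = []
    if window:
        yield window


def avg_power_per_window(
    readings: Iterable[Reading], min_wind: float
) -> List[Tuple[int, int, float]]:
    """Return (first_ts, last_ts, average power) for each state window."""
    return [
        (w[0].ts, w[-1].ts, mean(r.power_kw for r in w))
        for w in state_windows(readings, min_wind)
    ]


if __name__ == "__main__":
    sample = [
        Reading(0, 2.0, 0.0),
        Reading(60, 4.0, 100.0),
        Reading(120, 5.0, 200.0),
        Reading(180, 2.5, 10.0),
        Reading(240, 6.0, 300.0),
    ]
    for start, end, avg in avg_power_per_window(sample, min_wind=3.0):
        print(f"window {start}-{end}s: average power {avg} kW")
```

Here the window boundaries come from the data itself (the wind speed crossing a threshold) rather than from fixed time intervals, which is what distinguishes a state window from an ordinary time-bucketed aggregation.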

Why Are Time-Series Databases a Game Changer?

Total Cost of Data Operations

Performance metrics like ingestion speed and query latency may seem abstract, but they have a direct effect on the capital and operational expenditures of an organization. If storage solutions are unable to keep up with the rate of data being generated by IIoT devices, valuable data can be lost forever, affecting the accuracy of analytics. And especially in critical areas like energy generation and distribution, internal and external customers cannot afford to wait for slow queries: imagine if an equipment fault required resolution within 5 minutes, but the system monitoring the equipment could only return reports every half hour.

When general-purpose databases are scaled to meet the needs of time-series applications, hardware and cloud resource costs can quickly spiral out of control. Specialized time-series databases significantly reduce these costs because they require fewer resources to deliver the same results — in some cases, dozens of servers can be replaced with fewer than 10 while providing equivalent or even superior ingestion and query performance. The reduction in hardware resources has a positive effect down the line on operational costs as well, because fewer personnel are required to manage the smaller number of databases and servers in a time-series deployment, not to mention savings on electrical bills and data center leasing or cloud service costs, among others.

While storage media have become less costly, the size and scope of time-series datasets — often spanning several years of data being collected at high frequencies — mean that data storage costs can become astronomical as the number of devices increases. Unlike general-purpose databases, specialized time-series databases can efficiently compress these datasets, often achieving 10:1 compression ratios. This is because time-series databases index data by timestamp and store it in a column-oriented manner, aligning datasets on disk for optimal compression.
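The effect of column-oriented, timestamp-aligned storage on compression can be illustrated with a small experiment. The sketch below is an assumption-laden toy and not how any specific time-series database compresses data: it uses synthetic readings at a fixed one-second interval, stores timestamps as deltas from the previous timestamp, and uses zlib from the Python standard library as a stand-in for the specialized codecs that real systems employ. All function names are hypothetical.

```python
import itertools
import math
import struct
import zlib
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in milliseconds, measured value)


def make_points(n: int, start_ms: int = 1_700_000_000_000, step_ms: int = 1_000) -> List[Point]:
    """Synthetic, slowly varying readings at a fixed interval (assumed data)."""
    return [
        (start_ms + i * step_ms, round(20.0 + 5.0 * math.sin(i / 500.0), 1))
        for i in range(n)
    ]


def row_layout(points: List[Point]) -> bytes:
    """Row-oriented: each timestamp is stored next to its value."""
    return b"".join(struct.pack("<qd", ts, v) for ts, v in points)


def column_layout_with_deltas(points: List[Point]) -> bytes:
    """Column-oriented: all timestamps (as deltas) first, then all values.

    Assumes at least one point.
    """
    timestamps = [ts for ts, _ in points]
    # The first entry is the absolute start time; the rest are differences.
    deltas = [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]
    ts_col = struct.pack(f"<{len(deltas)}q", *deltas)
    val_col = struct.pack(f"<{len(points)}d", *(v for _, v in points))
    return ts_col + val_col


def decode_column_layout(blob: bytes, n: int) -> List[Point]:
    """Inverse of column_layout_with_deltas, to show the encoding is lossless."""
    deltas = struct.unpack_from(f"<{n}q", blob, 0)
    values = struct.unpack_from(f"<{n}d", blob, 8 * n)
    timestamps = itertools.accumulate(deltas)  # running sum restores timestamps
    return list(zip(timestamps, values))


if __name__ == "__main__":
    pts = make_points(10_000)
    raw = len(row_layout(pts))
    row = len(zlib.compress(row_layout(pts)))
    col = len(zlib.compress(column_layout_with_deltas(pts)))
    print(f"raw: {raw} bytes, row-oriented: {row} bytes, column + delta: {col} bytes")
```

With regularly sampled data, the timestamp column collapses into a long run of identical deltas and similar values sit next to each other, which gives a general-purpose compressor far more redundancy to exploit than interleaved rows do. The actual ratio depends heavily on the data and the codec.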

Features and Applications

Another area in which time-series databases benefit organizations is in their feature sets. While general-purpose databases do provide simple functions like taking the sum or the average of a range of data, they lack the advanced functions — interpolation, downsampling, time-weighted average, cumulative sum — that are basic requirements for time-series analytics in the real world.
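To make the difference from a plain average concrete, the following sketch implements two of the functions named above for irregularly spaced samples. It is a simplified illustration under stated assumptions: timestamps are strictly increasing, values are assumed to change linearly between samples (the trapezoidal rule), and the function names are hypothetical. Individual time-series databases may define these functions differently, for example by carrying the last value forward instead.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def time_weighted_average(samples: List[Sample]) -> float:
    """Trapezoidal time-weighted average over irregularly spaced samples.

    Assumes at least two samples with strictly increasing timestamps.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    area = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # Area under the straight line between two adjacent samples.
        area += (v0 + v1) / 2.0 * (t1 - t0)
    return area / (samples[-1][0] - samples[0][0])


def interpolate(samples: List[Sample], t: float) -> float:
    """Linearly interpolate the value at time t from its adjacent samples."""
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the sampled range")


if __name__ == "__main__":
    # Two samples close together, then a long gap before the third.
    data = [(0.0, 10.0), (10.0, 10.0), (40.0, 40.0)]
    plain = sum(v for _, v in data) / len(data)
    print(f"plain average:         {plain}")                        # 20.0
    print(f"time-weighted average: {time_weighted_average(data)}")  # 21.25
    print(f"value at t=25:         {interpolate(data, 25.0)}")      # 25.0
```

The plain average treats every sample equally, while the time-weighted average accounts for how long each value was in effect, which matters whenever devices report at uneven intervals.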

By building time-series-specific functionality into the database management system, time-series databases free enterprises from the burden of having to reimplement basic features in their applications. This is especially important for industrial enterprises like renewables operators that often do not have the luxury of large software development teams.

Additionally, the performance and functionality brought by time-series databases enable the renewable energy sector to implement new applications and derive more value from existing data. Real-time condition monitoring and predictive maintenance are examples of modern applications that have strict requirements on database performance, and it is unlikely that general-purpose databases, already struggling just to ingest IIoT data, would be able to support them.

Not All Time-Series Databases Are Created Equal

This is a pivotal time in the renewable energy industry, and the consequences of choices that enterprises make in these early stages can be far-reaching. Planning ahead and designing scalable, future-ready systems is essential to ensure that business development is not stymied by platforms and architectures that cannot support the exponential growth that the sector is expected to see. The critical role played by data in enabling that growth is already clear, and enterprises must be sure that their data architecture is up to the task.

The features and capabilities of time-series databases are highly complementary to the applications and use cases of contemporary renewables operators, and the ability of time-series data systems to adapt and scale guarantees that this will continue to be the case going forward. From predictive maintenance and condition monitoring to AI and ML, the applications that operators need to minimize cost per megawatt and optimize production all rely on large, high-quality IIoT datasets. By storing these datasets in a time-series database, renewable energy operators gain a solid foundation for enabling cutting-edge apps and unlocking the most value from their data.

TDengine is the only time-series database designed and optimized for IIoT applications, and renewable energy operators from wind giants like Mingyang to innovators in solar trackers like Nevados are already using TDengine to optimize their data workflows and provide value to their customers. Contact us or email business@tdengine.com today to speak with an account representative and learn how TDengine, the time-series database purpose-built for the IIoT, can help you overcome data challenges in your operations. Our team would be happy to arrange a demo for your specific industry segment or use case so that you can see the high performance and efficiency of TDengine for yourself.

¹ U.S. Energy Information Administration, “Solar and battery storage to make up 81% of new U.S. electric-generating capacity in 2024,” https://www.eia.gov/todayinenergy/detail.php?id=61424.
² International Energy Agency, Renewables 2023: Analysis and forecast to 2028 (January 2024), 15.

Jeff Tao
With over three decades of hands-on experience in software development, Jeff has had the privilege of spearheading numerous ventures and initiatives in the tech realm. His passion for open source, technology, and innovation has been the driving force behind his journey.

As one of the core developers of TDengine, he is deeply committed to pushing the boundaries of time series data platforms. His mission is crystal clear: to architect a high performance, scalable solution in this space and make it accessible, valuable and affordable for everyone, from individual developers and startups to industry giants.