Energizing Innovation: IIoT Data Challenges in Renewable Energy

Jim Fan

August 9, 2024

As the power industry continues its transition toward affordable and sustainable methods of electricity generation, renewable energy has become the fastest-growing segment of the industry, and this trend is expected to persist for the foreseeable future. However, the rapid increase in capacity and new installations has made it harder for renewables operators to store and manage their data. To achieve their goals of optimizing production, minimizing cost per megawatt, and implementing cutting-edge applications like generative AI, operators require a data infrastructure that can handle the massive volumes of data being generated by Industrial IoT (IIoT) devices at power plants and farms.

Existing data solutions such as general-purpose database management systems and data warehouses may be sufficient for smaller sites, but as power capacity continues to increase and business scenarios become more complex, these data solutions inevitably fail to support the needs of renewable energy operators in terms of performance and price. Without a robust data infrastructure, modern applications like predictive maintenance and condition monitoring are not achievable, and operators are unable to benefit from the insights that IIoT data can offer.

TDengine is a time-series database designed and optimized for Industrial IoT scenarios. As a core component of the industrial data infrastructure, it includes all features that renewable energy operators need to store and process the data generated by their equipment, and it enables a wide variety of applications that maximize the value of that data. With its high performance and efficient storage, TDengine reduces data-related expenditures while ensuring low latency for data operations. And by delivering a comprehensive solution that covers all parts of the data management lifecycle — from ingestion and ETL, to contextualization and real-time processing, and finally to sharing with team members and third-party applications — TDengine makes operations data for the renewables industry accessible, valuable, and affordable.

Background

The power generation sector has been moving rapidly toward renewables in the past decade, not only for the environmental benefits but also for financial reasons. By 2023, renewable energy accounted for over 21% of electricity generation in the United States,[1] and that number is only set to increase: according to the American Clean Power Association (ACP), 5,585 megawatts of new capacity were added in the first quarter of 2024, representing 28% growth year over year.[2] This is not an outlier; in fact, as shown in the following figure, new capacity in renewables has outpaced fossil fuels in the United States every year since 2019.

Utility-scale power capacity additions. (ACP 2024)

Solar, wind, and battery storage have become driving forces in the transformation of the global power industry and are expected to comprise the majority of power capacity additions in the near future. With clean power procurement seeing significant growth as well, it’s clear that the renewable energy market is stronger than ever. But as new installations come online and operators handle increasingly large and complex deployments, ensuring that all sites work at peak efficiency has become more of a challenge. In particular, storing and processing the vast amounts of data required to optimize operations and enable cutting-edge applications is often overwhelming for data engineering teams and for the database management systems that they use.

Data Challenges

With the advent of the IIoT, devices are generating more data than ever before. The scale of this data enables a wealth of new possibilities; indeed, the remarkable achievements in artificial intelligence (AI) and machine learning (ML) in recent years are possible only because of the massive datasets available to these applications. However, not every data solution is capable of storing and managing IIoT data due to its velocity and scale. To get new insights and derive the most value from their data, renewable energy operators must overcome a number of challenges.

Data Volume and Velocity

Sites generate massive amounts of data from numerous sensors and monitoring devices, often at high frequencies (e.g., every second or every minute). The resulting volumes of data must be ingested, stored, and managed efficiently, which requires robust storage solutions.
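To get a feel for the volumes involved, the following back-of-the-envelope sketch estimates how many data points a site produces per day. The device count, sensor count, and sampling intervals are hypothetical figures chosen for illustration, not measurements from any real site.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_points(devices: int, sensors_per_device: int, interval_s: int) -> int:
    """Data points generated per day when every sensor reports once per interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return devices * sensors_per_device * (SECONDS_PER_DAY // interval_s)


# Hypothetical site: 100 inverters with 50 sensors each.
per_second = daily_points(100, 50, interval_s=1)
per_minute = daily_points(100, 50, interval_s=60)
print(f"1 s sampling:  {per_second:,} points/day")
print(f"60 s sampling: {per_minute:,} points/day")
```

Moving from per-minute to per-second sampling multiplies the daily volume by 60, which is why sampling frequency matters as much as the number of devices when sizing storage.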

Data Architecture Scalability

As the number of monitored assets increases, the database must scale seamlessly without degradation in performance. A distributed design is necessary to enable the horizontal scalability required to handle massive time-series datasets like those seen in the renewables industry.
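As a rough illustration of horizontal scaling, the sketch below spreads devices across storage nodes by hashing each device ID. This is a generic partitioning technique shown under simplifying assumptions, not a description of how any particular database distributes data; the device names and the node count are hypothetical.

```python
import hashlib
from collections import Counter


def node_for(device_id: str, node_count: int) -> int:
    """Map a device ID to a node index using a stable hash."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# Hypothetical fleet of 1,000 turbines spread over 4 nodes.
devices = [f"turbine-{i:05d}" for i in range(1000)]
load = Counter(node_for(d, 4) for d in devices)
print(sorted(load.items()))
```

Because all data for one device lands on the same node, writes and per-device queries stay local. Simple modulo hashing reassigns most devices when the node count changes, so real distributed systems typically use more elaborate placement schemes.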

Real-Time Processing and Analysis

Streaming data from sensors and devices must be handled and analyzed in real time. This requires low-latency systems that can quickly ingest, process, and analyze data streams to provide immediate insights and trigger immediate actions, especially for condition monitoring. For example, to prevent fires, a battery management system must shut down power to a battery within 5 seconds after off-gassing is detected.
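The battery example can be expressed as a simple deadline check. The sketch below is a minimal illustration only: the 5-second window comes from the text above, while the gas threshold, the reading format, and the function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

SHUTDOWN_WINDOW_S = 5.0       # window stated in the text
OFF_GAS_THRESHOLD_PPM = 50.0  # hypothetical detection threshold


def shutdown_deadline(readings: Iterable[Tuple[float, float]]) -> Optional[float]:
    """Return the latest allowed shutdown time for the first off-gassing
    reading, or None if no reading reaches the threshold.

    Each reading is a (timestamp_seconds, gas_ppm) pair in time order.
    """
    for ts, gas_ppm in readings:
        if gas_ppm >= OFF_GAS_THRESHOLD_PPM:
            return ts + SHUTDOWN_WINDOW_S
    return None


def shutdown_in_time(detected_at: float, shut_down_at: float) -> bool:
    """True if the shutdown happened within the window after detection."""
    return 0.0 <= shut_down_at - detected_at <= SHUTDOWN_WINDOW_S


readings = [(0.0, 3.1), (1.0, 4.0), (2.0, 72.5), (3.0, 90.0)]
print(shutdown_deadline(readings))
```

Every stage between the sensor and the shutdown command (ingestion, storage, query, and alerting) spends part of that window, which is why end-to-end latency matters more than the speed of any single component.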

Data Consolidation and Consistency

Integrating time-series data from various sources and sites (e.g., inverters, weather stations, and SCADA systems) into a cohesive dataset can be complex, and ensuring good governance and high quality for these complex datasets is critical.

Data Compression and Efficiency

Data systems must provide quick access to relevant subsets of data for analysis, especially in large datasets, while also scaling to store years of historical data for trend analysis and predictive maintenance. Efficient data compression is also critical for reducing storage requirements without losing important information.
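One reason time-series data compresses well is that timestamps and readings usually change in small, regular steps. The sketch below shows delta encoding, a common lossless technique, applied to evenly spaced timestamps and then passed through a general-purpose compressor. It is a simplified illustration using Python's standard library, not the compression scheme of any specific product, and the timestamps are synthetic.

```python
import struct
import zlib
from typing import List


def pack_int64(values: List[int]) -> bytes:
    """Serialize integers as little-endian signed 64-bit values."""
    return struct.pack(f"<{len(values)}q", *values)


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each difference from the previous one."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original values by accumulating the differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Synthetic millisecond timestamps, one reading per second.
timestamps = [1_700_000_000_000 + i * 1000 for i in range(10_000)]
deltas = delta_encode(timestamps)

raw_size = len(zlib.compress(pack_int64(timestamps)))
delta_size = len(zlib.compress(pack_int64(deltas)))
print(f"compressed raw: {raw_size} bytes, compressed deltas: {delta_size} bytes")
```

Decoding the deltas reproduces the original timestamps exactly, so the space saving comes without any loss of information.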

Existing Solutions

Unlike traditional industries that have used purpose-built data systems for decades, the relatively new industry of renewable energy has not yet standardized around any particular data infrastructure. At present, sites may use any of the following systems:

  • Relational databases like MySQL and PostgreSQL

  • Traditional data historians like PI System and AVEVA Historian

  • Data warehouses like Snowflake and Databricks

  • General-purpose time-series databases like InfluxDB and Timescale

While all four categories can be used to store and process data, they each have drawbacks, especially as datasets grow in scale and deployments become more complex.

Relational databases seem viable in proof-of-concept scenarios but are actually ill-equipped to handle the volume of data being generated by IIoT devices. Operators find that they are constantly required to increase the hardware resources available to the database management system, resulting in skyrocketing costs. Because the horizontal scalability of traditional relational databases is lacking, extremely powerful and expensive servers are required even to attempt to handle operations data from solar plants or wind farms. In addition, time-series computations are not supported by the DBMS itself and instead must be performed by applications, greatly adding to the workload of development teams.
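To illustrate the kind of work that falls to application developers when the database has no built-in time-series functions, the sketch below computes a per-window average in application code. The row format and sample values are hypothetical; in practice the rows would first have to be fetched from the database with an ordinary query.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def windowed_average(rows: Iterable[Tuple[int, float]], window_s: int) -> Dict[int, float]:
    """Average values in fixed, non-overlapping time windows.

    Each row is a (timestamp_seconds, value) pair. The result maps the
    start time of each window to the mean of the values that fall in it.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in rows:
        window_start = ts - ts % window_s
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical power readings as (seconds, kW).
rows = [(0, 10.0), (20, 20.0), (59, 30.0), (60, 40.0), (119, 60.0)]
print(windowed_average(rows, 60))
```

Even this simple case requires pulling every raw row out of the database first. Interpolation, gap filling, and handling of late-arriving data add further code that each team has to write and maintain.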

Data historians like those traditionally used in non-renewable power generation are also seen in the renewables sector, especially in larger enterprises that have different types of plants. Although historians deliver integrated solutions purpose-built for industrial scenarios, many historians in use today are built on outdated architectures that greatly limit scalability as well as operators’ ability to run modern applications. They are often closed systems without strong support for interoperability, meaning that enterprises have few options for sharing operations data with other parts of their data stack and have to resort to custom code or manual operations — even getting data into the cloud can be a challenge with traditional historians. And historians can be prohibitively expensive, charging per tag and requiring enterprises to renegotiate licenses just to add capacity.

Data warehouses are optimized for complex analytical queries but may introduce latency in processing and retrieving IIoT data, which can be problematic for time-sensitive applications. The cost of maintaining and scaling a data warehouse to handle IIoT data can be significant as well, including costs for hardware, software licenses, and skilled personnel. And data warehouses are typically cloud-only platforms that do not provide a solution for the edge.

Finally, while general-purpose time-series databases can theoretically handle IIoT workloads, it’s important to remember that these databases are designed for IT scenarios, not for industry. Ingesting data from industrial sources can be a challenge as these databases do not provide out-of-the-box connectors or ETL, and data contextualization capabilities are weak, if they exist at all. Typical time-series databases do not offer real-time analytics or data sharing functionality on their own, instead requiring the user to deploy third-party components to implement their core business requirements.

Why TDengine

TDengine is the only time-series database designed for the Industrial IoT, and it has been carefully architected such that it performs well with massive industrial datasets, is lightweight and easy to use even for non-technical teams, and provides a rich feature set including components that are critical for renewable energy use cases.

TDengine as the core component of the industrial data architecture

As shown in the figure, TDengine acts as a central repository for operations data, aggregating and storing the data from OPC, MQTT, traditional data historians like PI System, and other sources into a unified platform. TDengine’s natively distributed architecture provides a scalable solution for storing your data efficiently and affordably. It supports standard SQL with powerful time-series extensions so that you can analyze your data in real time, and also integrates seamlessly with a wide variety of tools like Seeq and Grafana. Finally, TDengine can stream data in real time to authorized consumers, easily distributing data internally and externally with fine-grained access controls.

Key Capabilities

  • With its distributed scalable architecture that grows together with your business, TDengine can store, process, and monitor petabytes of data per day from billions of data collectors and sensors, all while providing the split-second latency that your real-time visualization and reporting apps demand.

  • With its unique design and data model, TDengine provides the most cost-effective solution for storing your operational data, including tiered storage, S3, and industry-leading compression ratios, ensuring that you can get valuable business insights from your data without breaking the bank.

  • With built-in connectors for a wide variety of industrial sources — MQTT, Kafka, OPC, PI System, and more — TDengine delivers zero-code data ingestion and extract, transform, and load (ETL) in a centralized platform that acts as a single source of truth for your business.

  • With out-of-the-box data subscription, caching, and stream processing, TDengine is more than just a time-series database — it includes all key components needed for industrial data storage and processing built into a single product and accessible through familiar SQL statements.

Edge–Cloud Synchronization

By making use of TDengine’s connectors, you can easily ingest data from each of your sites into TDengine instances deployed on the edge. All data being generated at a site can be collected in a single system even when multiple protocols are in use: just connect your MQTT brokers, OPC servers, and other data sources to TDengine.

TDengine as a cloud repository for edge data

You can then implement edge–cloud synchronization by configuring TDengine instances on the edge to send their data to a centralized TDengine deployment in the cloud. With TDengine as your central repository for operational data, shown in the figure, you can easily integrate visualization and business intelligence tools to build company-wide dashboards and reports. Your applications and algorithms have access to all data in real time, enabling global insights and efficiency without custom code or manual operations.

TDengine in the Field

TDengine is already in use at a number of leading solar, wind, and energy storage enterprises as the foundation of their industrial data infrastructure. Earlier this year, Mingyang Group, one of the world’s top wind operators, chose TDengine as the foundation for their smart wind energy system. TDengine’s high performance and scalability now enable Mingyang’s real-time monitoring and prediction applications for their wind turbines, delivering a wealth of data-driven possibilities.

  • Mingyang has over 15,000 wind turbines with hundreds of sensors generating data every second. That adds up to a massive scale of hundreds of millions of data points every day — which a six-node TDengine deployment can ingest and store with no performance deterioration.

  • Their databases now contain over 4 billion records with 700 columns each, but only occupy 24 TB on disk, coming in at a compression ratio of 10%.

  • Even on this massive scale, the complex aggregate queries that Mingyang uses to monitor turbine status information return in a fraction of a second.

Summary

From predictive maintenance and condition monitoring to AI and ML, the applications that you need to minimize cost per megawatt and optimize production all rely on large, high-quality datasets. By storing IIoT datasets in TDengine, renewable energy operators gain a solid foundation for enabling cutting-edge apps and unlocking the most value from their data.

Contact us or email business@tdengine.com today to speak with an account representative and learn how TDengine can help you overcome data challenges in your operations. Our team would be happy to arrange a demo for your specific industry segment or use case so that you can see the high performance and efficiency of TDengine for yourself.

  1. U.S. Energy Information Administration, Electric Power Monthly (June 2024), 16, table 1.1.
  2. American Clean Power Association, Clean Power Quarterly Market Report (2024 Q1), 3.
  • Jim Fan

    Jim Fan is the VP of Product at TDengine. With a Master's Degree in Engineering from the University of Michigan and over 12 years of experience in manufacturing and Industrial IoT spaces, he brings expertise in digital transformation, smart manufacturing, autonomous driving, and renewable energy to drive TDengine's solution strategy. Prior to joining TDengine, he worked as the Director of Product Marketing for PTC's IoT Division and Hexagon's Smart Manufacturing Division. He is currently based in California, USA.