What Is a Data Historian?

Jeff Tao

August 10, 2023 / Engineering

In industrial scenarios, data collected from SCADA systems and other equipment is often stored in a specialized database known as a data historian. Similar to a time series database (TSDB), a historian is optimized for storing time series data: it can quickly ingest and query such data, though it lacks some of the features of general-purpose databases.

Data historian software was developed in response to increasing industrial automation: devices were collecting ever larger amounts of data, and industrial enterprises were recognizing the value of analyzing and processing that data. This article discusses the typical requirements of a traditional data historian and offers insight into the industrial data processing solutions that we expect to see going forward.

Typical Data Historian Requirements

There are a number of requirements for a traditional data historian to be effective in its role as the storage and processing center for industrial data. First, to ensure data transmission speed is adequate, historians need to be physically close to data collectors. At present, a data historian is normally deployed in each plant to collect the data for that plant, and there may be a centralized historian to collect data for the entire company as well.

Because historians are responsible for obtaining and storing the data collected by controllers, they need to include interfaces for any data protocols used by equipment in the plant. The protocols supported out of the box differ by vendor; PI System is an example of a product that offers rich protocol support for its historian. While supporting standard protocols such as OPC and Modbus may be sufficient for newer plants, brownfield deployments often require communication over specific legacy protocols.

In addition to protocol support, historians need to be able to obtain data using a variety of methods. Polling (actively requesting data from devices on a predefined schedule) is a necessity for all historians. Some protocols also enable devices to transmit data to the historian in an event- or time-based manner, in which case the data historian must be prepared to receive and process such unsolicited transmissions.
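The difference between polled and unsolicited collection can be illustrated with a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `Collector` class, the `run_polling` helper, and the tag names are assumptions, not part of any historian product, and simulated timestamps stand in for a real scheduler and real protocol drivers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, str, float]  # (timestamp, tag, value)


@dataclass
class Collector:
    """Hypothetical collector that accepts polled and unsolicited samples."""
    samples: List[Sample] = field(default_factory=list)

    def poll(self, readers: Dict[str, Callable[[], float]], now: float) -> None:
        # Polling: the historian asks each device for its current value.
        for tag, read in readers.items():
            self.samples.append((now, tag, read()))

    def on_unsolicited(self, timestamp: float, tag: str, value: float) -> None:
        # Push: the device decides when to send (event- or time-based).
        self.samples.append((timestamp, tag, value))


def run_polling(collector: Collector,
                readers: Dict[str, Callable[[], float]],
                start: float, interval: float, cycles: int) -> None:
    """Poll on a fixed schedule using simulated time instead of sleeping."""
    for i in range(cycles):
        collector.poll(readers, start + i * interval)


collector = Collector()
# Assumed device: a temperature sensor that always reads 21.5.
run_polling(collector, {"temp": lambda: 21.5}, start=0.0, interval=5.0, cycles=3)
# An assumed alarm event arrives between two polling cycles.
collector.on_unsolicited(7.0, "alarm", 1.0)
```

In a real deployment the polling loop would be driven by a timer and the unsolicited handler by a protocol callback; the point of the sketch is only that both paths must feed the same store.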

Some additional features are required of historians to achieve high performance and support industrial applications, including the following:

Data filtering is often configured based on a deadband, which is a range around the previously collected value. A new value that falls within the deadband is considered insignificant and is not collected by the historian.
Data compression is a key element of reducing storage size. Historians typically include a lossless compression engine and may offer lossy options as well.
Backfill and interpolation are essential features for ensuring that all necessary data points exist for applications or algorithms that require them.
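Deadband filtering and interpolation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular historian implements these features: the function names are hypothetical, the deadband is treated as an absolute difference from the last stored value, and interpolation is plain linear interpolation between the two neighboring stored points.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def deadband_filter(points: Iterable[Point], deadband: float) -> List[Point]:
    """Keep a point only if it differs from the last kept value by more than the deadband."""
    kept: List[Point] = []
    for ts, value in points:
        if not kept or abs(value - kept[-1][1]) > deadband:
            kept.append((ts, value))
    return kept


def interpolate(points: List[Point], ts: float) -> Optional[float]:
    """Linearly interpolate a value at ts from points sorted by timestamp.

    Returns None if ts is outside the stored range or fewer than two points exist.
    """
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= ts <= t1:
            if t1 == t0:
                return v0
            return v0 + (v1 - v0) * (ts - t0) / (t1 - t0)
    return None


# Assumed raw readings, one per second.
raw = [(0.0, 20.0), (1.0, 20.2), (2.0, 20.4), (3.0, 21.0), (4.0, 21.1), (5.0, 22.5)]
stored = deadband_filter(raw, deadband=0.5)
estimate = interpolate(stored, 4.0)
```

With these sample readings only three of the six points are stored, and a query at a timestamp that was filtered out is answered from the neighboring stored points. The estimate will generally differ from the discarded raw reading, which is the trade-off a deadband makes between storage size and fidelity.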

The Future of Industrial Data

As digital transformation continues across the industrial sector, legacy data historians are becoming bottlenecks in the industrial data infrastructure. They are generally closed systems that cannot adapt to modern technological concepts such as cloud computing, and they are often tied to vendors that are averse to innovation. Because of this closed nature, sites that use traditional data historians often become silos whose data cannot be easily shared within the company, much less with third-party software and tools.

With leaders in industry looking to cutting-edge technology such as AI to improve operational efficiency and reduce costs, it is essential that industrial data platforms support open ecosystems and integration with the latest analytics and visualization tools. Traditional data historians are unable to meet these needs, and new solutions will be required in the near future.

TDengine: A Next-Generation Data Historian

TDengine is a next-generation data historian built to help industrial customers fully embrace the Industrial IoT. It can ingest, query, and compress large volumes of industrial data with higher performance than other systems and enables the centralization and sharing of data from disparate systems. With data centralization and high performance from TDengine, industrial enterprises finally have access to AI-enabled analytics and modern visualization tools.

In addition, TDengine includes a variety of connectors for industrial customers, including a connector for PI System, so that it is not necessary to rip and replace existing systems. Instead, TDengine can be deployed together with a traditional historian to provide additional services while retaining investment in existing infrastructure.

Jeff Tao
With over three decades of hands-on experience in software development, Jeff has had the privilege of spearheading numerous ventures and initiatives in the tech realm. His passion for open source, technology, and innovation has been the driving force behind his journey.

As one of the core developers of TDengine, he is deeply committed to pushing the boundaries of time series data platforms. His mission is crystal clear: to architect a high-performance, scalable solution in this space and make it accessible, valuable, and affordable for everyone, from individual developers and startups to industry giants.