What Is a Data Historian?

Sean Ely

In industrial scenarios, data collected from SCADA systems and other equipment is often stored in a specialized database known as a data historian. Similar to a time series database (TSDB), a historian is optimized for storing time series data: it can quickly ingest and query such data, though it lacks some of the features of general-purpose databases.

The data historian was developed in response to increasingly automated industrial operations, in which devices collect ever greater amounts of data and companies see the value of analyzing and processing that data. This article discusses the typical requirements of a data historian today and offers insight into the industrial data processing solutions that we expect to see going forward.

Typical Data Historian Requirements

A modern data historian must meet a number of requirements to be effective as the storage and processing center for industrial data. First, to keep data transmission fast enough, historians need to be physically close to the data collectors. At present, a data historian is normally deployed in each plant to collect the data for that plant, and there may be a centralized historian to collect data for the entire company as well.

Because historians are responsible for obtaining and storing the data collected by controllers, they need to include interfaces for any data protocols used by equipment in the plant. The protocols supported out of the box differ by vendor; PI System is an example of a product that offers rich protocol support for its historian. While supporting standard protocols such as OPC and Modbus may be sufficient for newer plants, brownfield deployment often requires communication over specific legacy protocols.

In addition to protocol support, historians need to be able to obtain data using a variety of methods. Polling – actively requesting data from devices on a predefined schedule – is a necessity for all historians. Some protocols also enable devices to transmit data to the historian in an event- or time-based manner, in which case the data historian must be prepared to receive and process such unsolicited transmissions.
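The two acquisition methods described above can be sketched as follows. This is a minimal illustration rather than any vendor's implementation: the `Collector` and `PolledTag` classes, the `read` callback (standing in for a protocol driver call), and the tag names are all hypothetical, and the current time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Sample = Tuple[str, float, float]  # (tag name, timestamp, value)


@dataclass
class PolledTag:
    """A tag the historian reads actively on a fixed interval (hypothetical)."""
    name: str
    read: Callable[[], float]  # stand-in for a driver call, e.g. a register read
    interval_s: float
    next_due: float = 0.0


@dataclass
class Collector:
    """Hypothetical collector that accepts both polled and unsolicited samples."""
    polled: List[PolledTag] = field(default_factory=list)
    buffer: List[Sample] = field(default_factory=list)

    def poll_due(self, now: float) -> int:
        """Read every polled tag whose schedule has come due; return how many were read."""
        count = 0
        for tag in self.polled:
            if now >= tag.next_due:
                self.buffer.append((tag.name, now, tag.read()))
                tag.next_due = now + tag.interval_s
                count += 1
        return count

    def on_unsolicited(self, name: str, timestamp: float, value: float) -> None:
        """Accept a sample that a device sent on its own, event- or time-based."""
        self.buffer.append((name, timestamp, value))


collector = Collector(polled=[PolledTag("pump1.flow", read=lambda: 12.5, interval_s=5.0)])
collector.poll_due(now=0.0)                          # due immediately, so the tag is read
collector.poll_due(now=2.0)                          # not due yet, nothing is read
collector.on_unsolicited("valve3.state", 2.4, 1.0)   # pushed by the device
```

Both paths write into the same buffer, so downstream storage does not need to know how a sample arrived.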

A number of additional features are required of historians to achieve high performance and support industrial applications.

  • Data filtering is often configured based on a deadband, which is a range around the previously collected value. A new value that falls within the deadband is considered insignificant and is not collected by the historian.
  • Data compression is key to reducing storage size. Historians typically include a lossless compression engine and may offer lossy options as well.
  • Backfill and interpolation are essential features for ensuring that all necessary data points exist for applications or algorithms that require them.
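As a rough illustration of the deadband and interpolation features listed above, the sketch below filters a series against a fixed deadband and then linearly interpolates a value at a timestamp where no point was stored. The function names, the absolute (rather than percentage) deadband, and the choice of linear interpolation are assumptions made for the example; commercial historians may use different algorithms.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def deadband_filter(points: Iterable[Point], deadband: float) -> List[Point]:
    """Keep a point only if it differs from the last kept value by more than `deadband`."""
    kept: List[Point] = []
    for ts, value in points:
        if not kept or abs(value - kept[-1][1]) > deadband:
            kept.append((ts, value))
    return kept


def interpolate(points: List[Point], ts: float) -> Optional[float]:
    """Linearly interpolate a value at `ts` from time-sorted points; None outside the range."""
    if not points or ts < points[0][0] or ts > points[-1][0]:
        return None
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= ts <= t1:
            if t1 == t0:
                return v0
            return v0 + (v1 - v0) * (ts - t0) / (t1 - t0)
    return points[0][1]  # single-point series queried at its own timestamp


raw = [(0.0, 20.0), (1.0, 20.2), (2.0, 20.4), (3.0, 21.5), (4.0, 21.6)]
stored = deadband_filter(raw, deadband=1.0)  # only the points at t=0.0 and t=3.0 are kept
estimate = interpolate(stored, 1.5)          # value reconstructed between stored points
```

Note that the interpolated value at t=1.5 differs from what the device actually reported around that time, which is the trade-off a deadband makes in exchange for storing fewer points.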

The Future of Industrial Data

As digital transformation continues across the industrial sector, legacy data historians are likely to become bottlenecks in industrial data systems. They are generally closed systems that cannot adapt to modern technologies such as cloud computing, and they are often tied to vendors that are averse to innovation. Going forward, the cloud-native time series database (TSDB) is likely to become the new core component of industrial data systems, either integrating with historians or replacing them completely.

Purpose-built time series databases offer the specialized data processing and high performance of traditional historians, but their open, cloud-oriented architectures enable expanded data sharing and access to cutting-edge analytics and visualization tools at a lower cost than historian-based solutions. To achieve success in the contemporary industrial landscape, enterprises large and small must begin considering their options for new data solutions or risk limiting business growth due to outdated systems.

For an example of a product combining a traditional historian with a cloud-native time series database, see the TDengine for PI System solution – a fully integrated hybrid system that retains the advantages of PI System while scaling and offering new features based on TDengine.


  • Sean Ely

    Sean Ely is Head of Product at TDengine, focused on making TDengine the best time-series database for Industrial IoT. He has spent over a decade working with time-series data, starting as an integrator working on industrial controls and later establishing himself as a subject matter expert in emerging technology, innovation, and data science for the energy industry.