Virtual Table: A Paradigm Shift in Industrial Data Governance

Joel Brass

May 21, 2025 / Engineering

Modern industrial device data is often collected as a table of four items — a tag name, metric, quality value, and timestamp — derived from the format of traditional industrial control protocols. This is seen when collecting data via OPC: the underlying data reported by the equipment is typically in this single-column format rather than the multi-column data model typical of IT devices, which include multiple metrics. However, this approach creates a disconnect between the database and applications that use its data, making it difficult to unify business logic with underlying data. In business queries, such as displaying a wide table with 30 columns, using SQL becomes extremely complex and often requires multiple JOIN operations.

To address this issue, TDengine has introduced an innovative bridging approach known as the virtual table. With a virtual table, there is no need for special modeling when aggregating industrial data: the data is collected and stored directly in a single-column model. Virtual tables can then be created and re-created to arrange the data as needed for evolving business requirements. Using an I/O point table and SQL statements, virtual subtables can be created in bulk, significantly reducing the workload involved in data collection, modeling, and query analysis.

The Challenge of Evolving Industrial Data Structures

Industrial control systems follow the international standard IEC 62264-1, which adopts a five-layer architecture model to describe the functional hierarchy of industrial automation systems. From I/O channel data collection at the device level, to multi-column models in MES systems, and eventually to multi-column table displays in ERP systems, the requirements for data organization evolve as you move up each layer.

At the I/O device layer, data is organized based on analog signal channels. Sensors detect measured variables such as temperature and pressure, convert them into analog signals (e.g. 4–20 mA or 0–5 V), and transmit them to the PLC or DCS systems on the process control layer. In these systems, analog signals are converted into computer-recognizable digital signals through analog-to-digital conversion. At this stage, data is still organized based on I/O channels.

To enable remote monitoring and control by the dispatch center, SCADA systems collect signals from controllers and bind them to graphical applications within the computer, creating real-time simulations through digitally driven animations. SCADA systems internally package functional modules related to measurement variables, such as reports and trend charts, making it easy to generate daily reports, monthly reports, and comparison curves. Some SCADA systems have also introduced the concept of data models; however, when forwarding data externally, they still primarily rely on the single-column model.

When data enters the MES or MOM systems at the production management level, the requirements shift to wide-table, multi-column models, where work orders, batch numbers, and process information need to be linked together. However, since data historians typically use a single-column model, and MES systems generally require multi-column tables, relational databases are often chosen for MES construction. Massive volumes of historical data must then be extracted from the historian.

To address the inconvenience of using multiple databases, some MES vendors have introduced time-series databases that support wide tables. However, this creates a new challenge in data collection and transformation: the transformation of single-column data models collected from the SCADA systems, controllers, or devices on the OT layer into the multi-column models required by MES systems on the IT layer.

Current solutions often rely on custom development to map single-column data into predefined wide-table formats. This approach incurs high costs due to customization, later-stage business changes, and ongoing maintenance. It also raises data integrity issues, such as data loss, expiration, or out-of-order writes.

A single production line often needs to manufacture different products and batches, while recording dozens of process parameters such as temperature, voltage, and current. These parameters must be linked with the product type and batch information to provide a data foundation for subsequent quality traceability and process analysis.

However, under traditional methods, operators need to query batch start times from a relational database and retrieve corresponding data from the metrics in the historian. This approach is not only complex but also poses significant challenges for later-stage management. If a factory has tens or hundreds of thousands of measurement points, even minor design flaws can lead to exponentially increasing difficulty in application development and system maintenance.

For an industrial data system to be effective, business applications cannot be tightly coupled to changes in the underlying. In a mining scenario, for example, when the face being mined changes, the underlying collection points may remain the same — only the identifier of the face changes.

If the traditional approach to data modeling is followed, creating a new table for each face could eventually lead to the database containing hundreds of millions of tables. This would place a heavy burden on both the database and on business applications. As the number of tables increases, the applications would need to remap them, which not only increases workload and risk of error but also makes historical data queries significantly more difficult.

Virtual Table: A Bridge from OT to IT

TDengine introduces the innovative concept of the virtual table, allowing flexible combinations of different measurement variables to generate the required table formats and eliminating the need for excessive concern over modeling and business applications.

With its zero-code connectors for industrial data protocols, TDengine can collect data from devices and store it into the database. As an example, three metrics collected over OPC are shown as follows:

Subtable	Timestamp	Metric	Quality	Location (Tag)	Type (Tag)
batchNo	‘2025/3/18 00:00:01.055	250303A001	192	Workshop 1	Batch
batchNo	‘2025/3/18 10:00:02.089	250303A002	192	Workshop 1	Batch
batchNo	‘2025/3/18 18:03:06.790	250303A003	192	Workshop 1	Batch
batchNo	…	…	…	…	…

Subtable	Timestamp	Metric	Quality	Location (Tag)	Type (Tag)
Temp	‘2025/3/18 00:00:01.050	25.12	192	Workshop 1	Thermocouple
Temp	‘2025/3/18 00:00:02.060	25.12	192	Workshop 1	Thermocouple
Temp	‘2025/3/18 00:00:03.070	25.11	192	Workshop 1	Thermocouple
Temp	…	…	…	…	…

Subtable	Timestamp	Metric	Quality	Location (Tag)	Type (Tag)
Current	‘2025/3/18 00:00:01.100	15.12	192	Workshop 1	Ammeter transformer
Current	‘2025/3/18 00:00:02.200	15.15	192	Workshop 1	Ammeter transformer
Current	‘2025/3/18 00:00:03.300	14.32	192	Workshop 1	Ammeter transformer
Current	…	…	…	…	…

When an application needs to query batch information for different products along with the corresponding process parameters, using these single-column tables can be difficult, but virtual tables simplify the process. Using the three metrics described above, you could implement the query in TDengine as follows:

Create a virtual table:

CREATE VTABLE v1（ts TIMESTAMP, batchNo.val, Temp.val, Current.val);

Query the aligned data in the virtual table:
```
SELECT LAST(*) FROM v1 INTERVAL(1s）FILL(PREV);
```
ts batchNo Temp Current

‘2025/3/18 00:00:02.000 250313A001 25.12 15.12

‘2025/3/18 00:00:03.000 250313A001 25.12 15.15

‘2025/3/18 00:00:04.000 250313A001 25.11 14.32

… … … …
Query the raw data in the virtual table:
```
SELECT * FROM v1;
```
ts batchNo Temp Current

‘2025/3/18 00:00:01.050 NULL 25.12 NULL

‘2025/3/18 00:00:01.055 250313A001 NULL NULL

‘2025/3/18 00:00:01.100 NULL NULL 14.32

… … … …

ts	batchNo	Temp	Current
‘2025/3/18 00:00:02.000	250313A001	25.12	15.12
‘2025/3/18 00:00:03.000	250313A001	25.12	15.15
‘2025/3/18 00:00:04.000	250313A001	25.11	14.32
…	…	…	…

ts	batchNo	Temp	Current
‘2025/3/18 00:00:01.050	NULL	25.12	NULL
‘2025/3/18 00:00:01.055	250313A001	NULL	NULL
‘2025/3/18 00:00:01.100	NULL	NULL	14.32
…	…	…	…

TDengine enables easy relational queries across different measurement points simply by creating virtual tables and using SQL statements. Compared with traditional methods, TDengine eliminates the need to construct complex JOIN queries through views or predefine data models. This avoids the cumbersome processes of data integration and transformation upfront, as well as the difficulties of modification and maintenance later on. By introducing the virtual table, TDengine simplifies and streamlines data organization, providing powerful data support for business applications while making operations and maintenance more convenient.

Conclusion

With the virtual table and other features designed for industry, TDengine is now able to act as a bridge between OT and IT scenarios. In the future, business applications will no longer be constrained by differences in format between OT data and IT business data, enabling smoother integration and application of data. As business scale continues to grow, TDengine’s ease of use will become increasingly evident, laying a solid data foundation for digital transformation and empowering enterprises on their path to Industry 4.0 and beyond.

Joel Brass
Joel Brass is a Solutions Architect at TDengine, bringing extensive experience in real-time data processing, time-series analytics, and full-stack development. With a 20 year background in software engineering and a deep focus on scalable applications and solutions, Joel has worked on a range of projects spanning joke databases, IoT, self-driving vehicles, and work management platforms. Prior to joining TDengine, Joel worked in Advisory Services for Enterprise customers of Atlassian and the Systems Engineering team at Waymo. He is currently based in the San Francisco Bay Area.