DJI Automotive, a subsidiary of industry-leading drone manufacturer DJI, announced its entrance into the field of autonomous driving last April. Considering the high frequency at which its smart cars report data back to the system, DJI Automotive quickly realized that a time-series database (TSDB) management system would be necessary to process this massive set of autonomous driving data.
After thorough research, DJI Automotive determined that a suitable database management system (DBMS) would need to meet the following criteria:
- The DBMS must be able to store tens of millions of records per table in order to contain the massive amounts of autonomous driving data that DJI Automotive will generate.
- The DBMS must be able to quickly filter this huge data set by means of aggregate functions.
- The DBMS must support clustering for high availability and backup functionality.
- The DBMS must use an industry-standard query language so that technical personnel can use the system without special training.
With its relatively recent entrance into the market, DJI Automotive was not particularly burdened by legacy data, and all database solutions were on the table. In the end, DJI Automotive chose TDengine to process its time-series data.
In addition to meeting the four criteria listed above, TDengine also provides strong compression capabilities, enables high concurrency, and reduces O&M costs. It is open-source software and even its community edition supports clustering. Finally, the TDengine concept of “one table per device” provides a database structure that is particularly suited to DJI Automotive – the data for each vehicle is stored in a separate table.
Table Creation in TDengine
In DJI Automotive’s solution, a cloud platform monitors vehicle status information, such as GPS coordinates, speed, RPM, and mileage. This information is streamed over MQTT to a TDengine cluster, where it is stored for real-time monitoring or for replaying at a later date.
The following is an example of the data collected:
{"message_id": "a78b6d9a","device_key": "deviceKey2","ts": "2022-03-01 15:01:59","longitude": 123.9795647512915,"latitude": 21.58338210717887,"altitude": 51.47800064086914,"signal_strength": 12,"satellites_in_view": 21,"speed": 72.798225,"acceleration": 12,"rpm": 2190,"gear": "D","direction": -91.32959,"mileage": 10020,"ip": "10.1.2.3","create_time": "2022-03-01 15:02:03"}
To store this data, DJI Automotive defined two supertables: device_stat to store metrics reported by vehicles and mqtt_msg to store MQTT messages.
The structure of the mqtt_msg supertable is shown in the following figure.
For each vehicle, one subtable is created in device_stat and one in mqtt_msg to hold all of the data for that vehicle. These subtables are created using the DeviceKey field as a unique identifier, for example device_stat_$deviceKey.
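As a rough illustration of this naming scheme, the following Python sketch derives the subtable name from a record's device_key and builds an INSERT statement that relies on TDengine's automatic table creation (INSERT INTO ... USING ... TAGS ...). The column list, the use of device_key as the only tag, and the helper names are assumptions made for illustration; the actual device_stat schema is not given in this article.

```python
import json

# A few fields from the sample record in the article (trimmed for brevity).
SAMPLE = (
    '{"device_key": "deviceKey2", "ts": "2022-03-01 15:01:59", '
    '"speed": 72.798225, "rpm": 2190, "gear": "D"}'
)

# Hypothetical column order; the real device_stat schema may differ.
COLUMNS = ("ts", "speed", "rpm", "gear")


def subtable_name(device_key: str) -> str:
    """Return the per-vehicle subtable name, e.g. device_stat_deviceKey2."""
    if not device_key.isalnum():
        raise ValueError(f"unexpected characters in device key: {device_key!r}")
    return f"device_stat_{device_key}"


def sql_literal(value) -> str:
    """Render a Python value as SQL text; strings are single-quoted."""
    if isinstance(value, str):
        if "'" in value:
            # Escaping is left out of this sketch, so reject such values.
            raise ValueError("quotes are not supported in this sketch")
        return f"'{value}'"
    return str(value)


def build_insert(record: dict) -> str:
    """Build an INSERT that creates the subtable on first write."""
    key = record["device_key"]
    values = ", ".join(sql_literal(record[col]) for col in COLUMNS)
    return (
        f"INSERT INTO {subtable_name(key)} USING device_stat "
        f"TAGS ('{key}') VALUES ({values});"
    )


statement = build_insert(json.loads(SAMPLE))
print(statement)
```

Because the statement names both the subtable and its supertable, the first record from a new vehicle can create that vehicle's subtable without a separate setup step.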
Architecture and Migration
At present, DJI Automotive is running TDengine 2.2.1.3 on a single machine. The TDengine deployment is described in the following two figures.
The overall system and the components that work together with TDengine are shown in the following figure.
In this architecture, vehicles send their information to the MQTT broker, and the application sends commands to vehicles through the same broker. The MQTT traffic between devices and the cloud and between the cloud and the application is forwarded to a Kafka message queue, from which the business system consumes it. The business system parses each message into a TraceID, message ID, version, type, timestamp, and body, all of which are stored in the mqtt_msg supertable in TDengine.
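The parsing step might look roughly like the sketch below. The JSON envelope layout and the key names (trace_id, message_id, and so on) are hypothetical, since the article does not describe the wire format; only the list of extracted fields comes from the text.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MqttMsg:
    """Fields extracted from each message (attribute names are assumed)."""
    trace_id: str
    message_id: str
    version: str
    type: str
    ts: str
    body: str


def parse_message(raw: bytes) -> MqttMsg:
    """Parse one queued payload; raises KeyError if a field is missing."""
    envelope = json.loads(raw)
    return MqttMsg(
        trace_id=envelope["trace_id"],
        message_id=envelope["message_id"],
        version=envelope["version"],
        type=envelope["type"],
        ts=envelope["ts"],
        # Keep the body as a compact JSON string so it fits in one column.
        body=json.dumps(envelope["body"], separators=(",", ":")),
    )


# Hypothetical payload that reuses the message_id from the sample record.
raw = (
    b'{"trace_id": "t-1", "message_id": "a78b6d9a", "version": "1.0", '
    b'"type": "status", "ts": "2022-03-01 15:01:59", "body": {"speed": 72.8}}'
)
msg = parse_message(raw)
print(msg.message_id, msg.body)
```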
For the time being, data is written into both TDengine and MySQL databases, which map to each other. In this way, historical data is migrated to TDengine by means of log replay. The two databases act as backups for each other in the present architecture, but once the migration to TDengine has been completed, a TDengine cluster with multiple nodes will be used for backups.
The TDengine team indicated that data could also be migrated in CSV format or by using the open-source database migration tool DataX. The TDengineReader and TDengineWriter plug-ins for DataX make migrating from a relational database to TDengine a simple process of modifying JSON configuration files as appropriate.
Performance
TDengine has made querying real-time status information a much simpler process. DJI Automotive commonly uses the following SQL statements to obtain needed information:
- To query the latest position reported by a single vehicle:
select last_row(*) from device_stat_deJgTAEzInsZeGLM\G;
- To query the latest position reported by multiple vehicles:
select last_row(*) from device_stat where device_key in ('mpVOGpaHqAxGiHWo','HEChzTCZeIWSUysB','HgsIdzvJPeFlVDuT','LVaPHOXkEeTGjTpm','PFHnQCkcXCIBnbsC') group by device_key;
- To query data for a past time period for analysis on the front end:
select * from device_stat_mpVOGpaHqAxGiHWo where ts >'2022-03-17 00:00:00' and ts < '2022-03-18 00:00:00';
- To trace an MQTT stream (by querying the latest message received by the MQTT broker):
select last_row(*) from mqtt_msg\G;
- To trace information based on the request ID:
select * from mqtt_msg where request_id = 'f90c46d4-22a3-4ab9-b50a-aad8b237fc57'\G;
- To query information based on time:
select * from mqtt_msg where ts >'2022-03-18 12:00:00' and ts < '2022-03-18 13:00:00';
These examples show that TDengine has implemented lightweight querying for the types of data that this use case required. The results of these queries were all returned in milliseconds, and even a query of over 30,000 records returned results in only 1.1 ms.
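For readers who issue these queries from application code, the sketch below shows one way to generate the time-range and multi-vehicle statements listed above. The helper names are hypothetical, and the functions only build SQL text; running the statements would require a TDengine client, which is outside the scope of this sketch.

```python
from datetime import datetime
from typing import Sequence

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def _check_key(device_key: str) -> str:
    # The device keys shown in the article are alphanumeric; reject anything
    # else so that keys can be embedded in SQL text safely.
    if not device_key.isalnum():
        raise ValueError(f"unexpected device key: {device_key!r}")
    return device_key


def history_query(device_key: str, start: datetime, end: datetime) -> str:
    """Build a query for one vehicle's records with start < ts < end."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    return (
        f"select * from device_stat_{_check_key(device_key)} "
        f"where ts > '{start.strftime(TS_FORMAT)}' "
        f"and ts < '{end.strftime(TS_FORMAT)}';"
    )


def latest_positions_query(device_keys: Sequence[str]) -> str:
    """Build a query for the latest row of each listed vehicle."""
    if not device_keys:
        raise ValueError("at least one device key is required")
    keys = ", ".join(f"'{_check_key(k)}'" for k in device_keys)
    return (
        "select last_row(*) from device_stat "
        f"where device_key in ({keys}) group by device_key;"
    )


query = history_query(
    "mpVOGpaHqAxGiHWo", datetime(2022, 3, 17), datetime(2022, 3, 18)
)
print(query)
print(latest_positions_query(["mpVOGpaHqAxGiHWo", "HEChzTCZeIWSUysB"]))
```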
Conclusion
DJI Automotive chose TDengine as the time-series database management system for its autonomous driving cloud platform and has been completely satisfied with its experience. TDengine has reduced storage and training costs at DJI Automotive while also showing excellent read and write performance.
As DJI Automotive continues to research and explore time-series and spatial data, it hopes that TDengine will be further improved to include the following:
- Better support and new features for reading and writing spatial data
- More authentication and authorization mechanisms for finer-grained permissions management
- More systematic logging options for better troubleshooting