TSDB User-Defined Functions in C: Development Guide

Juno Qiu

June 29, 2026

UDF overview and classification

A time-series database (TSDB) can support User-Defined Functions (UDFs) for specialized computation that is not covered by built-in functions. Once registered in the cluster, UDFs can be called directly in SQL like native functions.

UDFs fall into two types:

  • Scalar functions: output one value per input row, such as data type conversions or mathematical operations
  • Aggregate functions: output one value across multiple input rows, such as SUM or AVG

UDFs can be written in two programming languages: C and Python. C UDFs can deliver performance close to that of built-in functions, which makes them a good choice for performance-sensitive scenarios. Python UDFs can draw on Python's library ecosystem to implement complex algorithms quickly.

Process isolation for safety

To prevent UDF execution anomalies from affecting the database service, the system runs UDFs in a separate process. If UDF code leaks memory or crashes, the failure stays confined to that process, which helps protect the core database service and overall system stability.
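The isolation idea can be illustrated with a small POSIX sketch. This is not the database's actual implementation: the helper run_isolated and the two sample functions are hypothetical names, used only to show that a crash in a child process leaves the parent process running.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical UDF-like callback type: returns 0 on success. */
typedef int (*udf_like_fn)(void);

/* A well-behaved function. */
int good_fn(void) { return 0; }

/* A misbehaving function that terminates its own process abnormally. */
int crashing_fn(void) {
    abort();   /* raises SIGABRT */
    return 0;  /* not reached */
}

/*
 * Run fn in a child process.
 * Returns  0 if the child exited normally with status 0,
 *          1 if it exited with a nonzero status or was killed by a signal,
 *         -1 if the child could not be created or waited for.
 */
int run_isolated(udf_like_fn fn) {
    assert(fn != NULL);

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: run the function and report its result via the exit status. */
        _exit(fn() == 0 ? 0 : 1);
    }

    /* Parent: wait for the child and classify how it ended. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return 0;
    return 1;
}
```

Calling run_isolated(crashing_fn) reports a failure to the caller, while the calling process itself keeps running and can go on to call run_isolated(good_fn).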

C language UDF interface specification

A scalar function implements the scalarfn interface function, which takes one row of input data and returns one output value. Developers only need to write the computation logic; the framework handles data reading and writing automatically.
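As a rough illustration of the scalar contract, here is a minimal sketch that ANDs together the integer columns of a single row. The RowInput type and the parameter list are assumptions made for this example; the real signature and data structures come from the database's UDF header, which is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one input row: an array of integer column values. */
typedef struct {
    const int64_t *cols;   /* column values of this row */
    size_t         ncols;  /* number of columns */
} RowInput;

/*
 * Scalar-style function: combines all columns of one row with bitwise AND
 * and writes a single output value.
 * Returns 0 on success, -1 if the row has no columns.
 */
int32_t scalarfn(const RowInput *row, int64_t *out) {
    assert(row != NULL && out != NULL);
    if (row->ncols == 0) return -1;

    int64_t acc = row->cols[0];
    for (size_t i = 1; i < row->ncols; ++i) {
        acc &= row->cols[i];
    }
    *out = acc;
    return 0;
}
```

The function holds no state between calls, which is what separates a scalar function from an aggregate one: each row is processed on its own.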

An aggregate function implements three interface functions that together form the complete aggregation lifecycle:

  • aggfn_start: initializes the aggregation state, called at the start of aggregation
  • aggfn: processes each input row and updates the aggregation state
  • aggfn_finish: outputs the final aggregation result
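The three-step lifecycle can be sketched with a bitwise-AND aggregate. The BitAndState struct and the parameter lists below are assumptions for illustration only; in a real UDF the framework defines how state and rows are passed in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical aggregation state carried between calls. */
typedef struct {
    int64_t acc;       /* running bitwise AND of all rows seen so far */
    int32_t has_rows;  /* 0 until the first row has been processed */
} BitAndState;

/* Step 1: initialize the state at the start of aggregation. */
int32_t aggfn_start(BitAndState *state) {
    assert(state != NULL);
    state->acc = ~(int64_t)0;  /* all bits set: the identity element for AND */
    state->has_rows = 0;
    return 0;
}

/* Step 2: fold one input row into the state. */
int32_t aggfn(BitAndState *state, int64_t value) {
    assert(state != NULL);
    state->acc &= value;
    state->has_rows = 1;
    return 0;
}

/*
 * Step 3: produce the final result.
 * Returns 0 if a result was written, 1 if no rows were seen.
 */
int32_t aggfn_finish(const BitAndState *state, int64_t *result) {
    assert(state != NULL && result != NULL);
    if (!state->has_rows) return 1;
    *result = state->acc;
    return 0;
}
```

Starting from the identity element means aggfn needs no special case for the first row. The has_rows flag lets aggfn_finish distinguish an empty input from an input whose AND happens to have all bits set.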

Two more functions handle initialization and cleanup of the UDF itself:

  • udf_init: called when the UDF is loaded
  • udf_destroy: called when the UDF is unloaded, for resource cleanup
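One plausible use of this pair is to build a shared resource once at load time and release it at unload. The sketch below assumes both functions take no arguments and return 0 on success. The lookup table and the count_bits helper are hypothetical and exist only to give the functions something to manage.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical library-wide resource: bit counts for every byte value. */
static uint8_t *g_bitcount_table = NULL;

/* Called when the UDF is loaded: allocate and fill the table once. */
int32_t udf_init(void) {
    if (g_bitcount_table != NULL) return 0;  /* already initialized */

    g_bitcount_table = malloc(256);
    if (g_bitcount_table == NULL) return -1;

    for (int i = 0; i < 256; ++i) {
        uint8_t bits = 0;
        for (int v = i; v != 0; v >>= 1) {
            bits += (uint8_t)(v & 1);
        }
        g_bitcount_table[i] = bits;
    }
    return 0;
}

/* Called when the UDF is unloaded: release what udf_init acquired. */
int32_t udf_destroy(void) {
    free(g_bitcount_table);   /* free(NULL) is a no-op */
    g_bitcount_table = NULL;  /* allows a later udf_init to start clean */
    return 0;
}

/* Hypothetical helper used by the computation functions after udf_init. */
int32_t count_bits(uint8_t byte) {
    assert(g_bitcount_table != NULL);
    return g_bitcount_table[byte];
}
```

Resetting the pointer in udf_destroy keeps the pair safe to call repeatedly, so a load, unload, load sequence does not reuse freed memory.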

Compilation and deployment

After writing the C language UDF source code, compile it into a shared library. Example compilation command:

gcc -g -O0 -fPIC -shared bit_and.c -o libbitand.so

Compilation flags explained:

  • -g: generate debug information
  • -O0: disable optimization, useful for debugging
  • -fPIC: generate position-independent code (required for shared libraries)
  • -shared: produce a shared library file

Use GCC 7.5 or later for compatibility. After compilation, copy the .so file to a path on the database server; the registration statement refers to the library by this path.

Registration and usage

After deploying the shared library, register the UDF with a SQL statement:

CREATE AGGREGATE FUNCTION max_vol AS '/root/udf/libmaxvol.so' OUTPUTTYPE BINARY(64) BUFSIZE 10240 LANGUAGE 'C'

The registration statement specifies the function name max_vol (the name used in SQL queries), the library file path, the output type BINARY(64), the buffer size 10240, and the programming language C. Once registered, the UDF can be called in SQL queries like a built-in function.
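The two numeric parameters constrain the C code on the other side. The sketch below assumes that BUFSIZE bounds the intermediate aggregation state and that BINARY(64) caps the result at 64 bytes. The MaxVolState layout, the macro names, and the max_vol_format helper are hypothetical and are not taken from the actual max_vol source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_VOL_BUFSIZE 10240  /* assumed to match BUFSIZE in the statement */
#define MAX_VOL_OUT_LEN 64     /* assumed to match OUTPUTTYPE BINARY(64) */

/* Hypothetical intermediate state for the max_vol aggregate. */
typedef struct {
    double max_value;  /* largest value seen so far */
    char   label[32];  /* NUL-terminated name of the column that held it */
} MaxVolState;

/* Fail the build if the state could not fit in the registered buffer. */
_Static_assert(sizeof(MaxVolState) <= MAX_VOL_BUFSIZE,
               "aggregation state must fit in BUFSIZE");

/*
 * Format the final result as "label=value" into out (not NUL-terminated).
 * Returns the number of bytes written, or -1 if the text would exceed
 * MAX_VOL_OUT_LEN bytes.
 */
int max_vol_format(const MaxVolState *st, char out[MAX_VOL_OUT_LEN]) {
    assert(st != NULL && out != NULL);

    char tmp[MAX_VOL_OUT_LEN + 1];  /* room for the terminating NUL */
    int n = snprintf(tmp, sizeof tmp, "%s=%.2f", st->label, st->max_value);
    if (n < 0 || n > MAX_VOL_OUT_LEN) return -1;

    memcpy(out, tmp, (size_t)n);
    return n;
}
```

Checking the size at compile time and the length at run time keeps the library consistent with the values declared at registration, so a mismatch shows up as a build error or an explicit failure rather than a buffer overrun.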

Summary

The UDF mechanism gives TDengine TSDB a flexible extension path for specialized computation. C UDFs are suited to performance-sensitive logic, while process isolation helps reduce risk to the core database service. The workflow from interface implementation to compilation, deployment, registration, and invocation gives developers a clear path for adding custom SQL functions.