TDengine Concepts: User-Defined Functions (UDF)

Avatar
TDengine Team
/

The built-in functions of TDengine may not be sufficient for the use cases of every application. In this case, you can define custom functions for use in TDengine queries. These are known as user-defined functions (UDF). A user-defined function takes one column of data or the result of a subquery as its input.

TDengine supports user-defined functions written in C or C++. This document describes the usage of user-defined functions.

User-defined functions can be scalar functions or aggregate functions. Scalar functions, such as abs, sin, and concat, output a value for every row of data. Aggregate functions, such as avg and max output one value for multiple rows of data.

When you create a user-defined function, you must implement standard interface functions:

  • For scalar functions, implement the scalarfn interface function.
  • For aggregate functions, implement the aggfn_start, aggfn, and aggfn_finish interface functions.
  • To initialize your function, implement the udf_init function. To terminate your function, implement the udf_destroy function.

Interface Functions

There are strict naming conventions for interface functions. The names of the start, finish, init, and destroy interfaces must be <udf-name>_start, <udf-name>_finish, <udf-name>_init, and <udf-name>_destroy, respectively. Replace scalarfn, aggfn, and udf with the name of your user-defined function.

Interface functions return a value that indicates whether the operation was successful. If an operation fails, the interface function returns an error code. Otherwise, it returns TSDB_CODE_SUCCESS. The error codes are defined in taoserror.h and in the common API error codes in taos.h. For example, TSDB_CODE_UDF_INVALID_INPUT indicates invalid input. TSDB_CODE_OUT_OF_MEMORY indicates insufficient memory.

For information about the parameters for interface functions, see Data Model

Compile UDF

To use your user-defined function in TDengine, first compile it to a dynamically linked library (DLL).

For example, the sample UDF bit_and.c can be compiled into a DLL as follows:

gcc -g -O0 -fPIC -shared bit_and.c -o libbitand.so

The generated DLL file libbitand.so can now be used to implement your function. Note: GCC 7.5 or later is required.

Manage and Use User-Defined Functions

Create UDF

SQL command can be executed on the host where the generated UDF DLL resides to load the UDF DLL into TDengine. This operation cannot be done through REST interface or web console. Once created, any client of the current TDengine can use these UDF functions in their SQL commands. UDF are stored in the management node of TDengine. The UDFs loaded in TDengine would be still available after TDengine is restarted.

When creating UDF, the type of UDF, i.e. a scalar function or aggregate function must be specified. If the specified type is wrong, the SQL statements using the function would fail with errors. The input data type and output data type must be consistent with the UDF definition.

Create Scalar Function

CREATE FUNCTION function_name AS library_path OUTPUTTYPE output_type;
  • function_name: The scalar function name to be used in SQL statement which must be consistent with the UDF name and is also the name of the compiled DLL (.so file).
  • library_path: The absolute path of the DLL file including the name of the shared object file (.so). The path must be quoted with single or double quotes.
  • output_type: The data type of the results of the UDF.

For example, the following SQL statement can be used to create a UDF from libbitand.so.

CREATE FUNCTION bit_and AS "/home/taos/udf_example/libbitand.so" OUTPUTTYPE INT;

Create Aggregate Function

CREATE AGGREGATE FUNCTION function_name AS library_path OUTPUTTYPE output_type [ BUFSIZE buffer_size ];
  • function_name: The aggregate function name to be used in SQL statement which must be consistent with the udfNormalFunc name and is also the name of the compiled DLL (.so file).
  • library_path: The absolute path of the DLL file including the name of the shared object file (.so). The path must be quoted with single or double quotes.
  • output_type: The output data type, the value is the literal string of the supported TDengine data type.
  • buffer_size: The size of the intermediate buffer in bytes. This parameter is optional.

For example, the following SQL statement can be used to create a UDF from libl2norm.so.

CREATE AGGREGATE FUNCTION l2norm AS "/home/taos/udf_example/libl2norm.so" OUTPUTTYPE DOUBLE bufsize 8;

Manage UDF

The following statement deleted the specified user-defined function.

DROP FUNCTION function_name;
  • function_name: The value of function_name in the CREATE statement used to import the UDF for example bit_and or l2norm.

Show Available UDF

SHOW FUNCTIONS;

Call UDF

The function name specified when creating UDF can be used directly in SQL statements, just like builtin functions. For example:

SELECT bit_and(c1,c2) FROM table;

The above SQL statement invokes function X for column c1 and c2 on table. You can use query keywords like WHERE with user-defined functions.

For more information about user-defined functions, see User-Defined Functions.