Writing a Procedural Language Handler
All calls to functions that are written in a language other than the current “version 1” interface for compiled languages (this includes functions in user-defined procedural languages and functions written in SQL) go through a call handler function for the specific language. It is the responsibility of the call handler to execute the function in a meaningful way, such as by interpreting the supplied source text. This chapter outlines how a new procedural language's call handler can be written.
The call handler for a procedural language is a “normal” function that must be written in a compiled language such as C, using the version-1 interface, and registered with PostgreSQL as taking no arguments and returning the type language_handler. This special pseudo-type identifies the function as a call handler and prevents it from being called directly in SQL commands. For more details on C language calling conventions and dynamic loading, see Section Native Language Functions.
The call handler is called in the same way as any other function: It receives a pointer to a FunctionCallInfoBaseData struct containing argument values and information about the called function, and it is expected to return a Datum result (and possibly set the isnull field of the FunctionCallInfoBaseData structure, if it wishes to return an SQL null result). The difference between a call handler and an ordinary callee function is that the flinfo->fn_oid field of the FunctionCallInfoBaseData structure will contain the OID of the actual function to be called, not of the call handler itself. The call handler must use this field to determine which function to execute. Also, the passed argument list has been set up according to the declaration of the target function, not of the call handler.
It's up to the call handler to fetch the entry of the function from the
pg_proc system catalog and to analyze the argument and return types of the
called function. The AS clause from the CREATE FUNCTION command for the
function will be found in the prosrc column of the pg_proc row. This is
commonly source text in the procedural language, but in theory it could be
something else, such as a path name to a file, or anything else that tells the
call handler what to do in detail.
Often, the same function is called many times per SQL statement. A call handler can avoid repeated lookups of information about the called function by using the flinfo->fn_extra field. This will initially be NULL, but can be set by the call handler to point at information about the called function. On subsequent calls, if flinfo->fn_extra is already non-NULL then it can be used and the information lookup step skipped. The call handler must make sure that flinfo->fn_extra is made to point at memory that will live at least until the end of the current query, since an FmgrInfo data structure could be kept that long. One way to do this is to allocate the extra data in the memory context specified by flinfo->fn_mcxt; such data will normally have the same lifespan as the FmgrInfo itself. But the handler could also choose to use a longer-lived memory context so that it can cache function definition information across queries.
When a procedural-language function is invoked as a trigger, no arguments are passed in the usual way, but the FunctionCallInfoBaseData's context field points at a TriggerData structure, rather than being NULL as it is in a plain function call. A language handler should provide mechanisms for procedural-language functions to get at the trigger information.
A template for a procedural-language handler written as a C extension is provided below. This is a working sample demonstrating one way to create a procedural- language handler, process parameters, and return a value.
Although providing a call handler is sufficient to create a minimal procedural
language, there are two other functions that can optionally be provided to make
the language more convenient to use. These are a validator and an inline
handler. A validator can be provided to allow language-specific checking to be
done during [CREATE FUNCTION]. An inline handler can be provided to allow the
language to support anonymous code blocks executed via the [DO] command.
If a validator is provided by a procedural language, it must be declared as a
function taking a single parameter of type oid. The validator's result is
ignored, so it is customarily declared to return void. The validator will be
called at the end of a CREATE FUNCTION command that has created or updated a
function written in the procedural language. The passed-in OID is the OID of the
function's pg_proc row. The validator must fetch this row in the usual way,
and do whatever checking is appropriate. First, call CheckFunctionValidatorAccess()
to diagnose explicit calls to the validator that the user could not achieve
through CREATE FUNCTION. Typical checks then include verifying that the
function's argument and result types are supported by the language, and that the
function's body is syntactically correct in the language. If the validator finds
the function to be okay, it should just return. If it finds an error, it should
report that via the normal ereport() error reporting mechanism. Throwing an
error will force a transaction rollback and thus prevent the incorrect function
definition from being committed.
Validator functions should typically honor the check_function_bodies
parameter: if it is turned off then any expensive or context-sensitive checking
should be skipped. If the language provides for code execution at compilation
time, the validator must suppress checks that would induce such execution. In
particular, this parameter is turned off by qhb_dump so that it can load
procedural language functions without worrying about side effects or dependencies
of the function bodies on other database objects. (Because of this requirement,
the call handler should avoid assuming that the validator has fully checked the
function. The point of having a validator is not to let the call handler omit
checks, but to notify the user immediately if there are obvious errors in a
CREATE FUNCTION command.) While the choice of exactly what to check is mostly
left to the discretion of the validator function, note that the core CREATE FUNCTION code only executes SET clauses attached to a function when
check_function_bodies is on. Therefore, checks whose results might be
affected by GUC parameters definitely should be skipped when check_function_bodies
is off, to avoid false failures when restoring a dump.
If an inline handler is provided by a procedural language, it must be declared
as a function taking a single parameter of type internal. The inline handler's
result is ignored, so it is customarily declared to return void. The inline
handler will be called when a DO statement is executed specifying the procedural
language. The parameter actually passed is a pointer to an InlineCodeBlock
struct, which contains information about the DO statement's parameters, in
particular the text of the anonymous code block to be executed. The inline handler
should execute this code and return.
It's recommended that you wrap all these function declarations, as well as the
CREATE LANGUAGE command itself, into an extension so that a simple
CREATE EXTENSION command is sufficient to install the language. See Section
Packaging Related Objects into an Extension for information about writing
extensions.
The procedural languages included in the standard distribution are good references
when trying to write your own language handler. The [CREATE LANGUAGE] reference
page also has some useful details.
Example of the Handler Writing
/*
* Handler for the PL/Sample procedural language
*/
#include "qhb.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
#include "commands/event_trigger.h"
#include "commands/trigger.h"
#include "executor/spi.h"
#include "funcapi.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
PG_MODULE_MAGIC;
PG_FUNCTION_INFO_V1(plsample_call_handler);
static Datum plsample_func_handler(PG_FUNCTION_ARGS);
static HeapTuple plsample_trigger_handler(PG_FUNCTION_ARGS);
/*
* Handle function, procedure, and trigger calls.
*/
Datum
plsample_call_handler(PG_FUNCTION_ARGS)
{
Datum retval = (Datum) 0;
/*
* Many languages will require cleanup that happens even in the event of
* an error. That can happen in the PG_FINALLY block. If none is needed,
* this PG_TRY construct can be omitted.
*/
PG_TRY();
{
/*
* Determine if called as function or trigger and call appropriate
* subhandler.
*/
if (CALLED_AS_TRIGGER(fcinfo))
{
/*
* This function has been called as a trigger function, where
* (TriggerData *) fcinfo->context includes the information of the
* context.
*/
retval = PointerGetDatum(plsample_trigger_handler(fcinfo));
}
else if (CALLED_AS_EVENT_TRIGGER(fcinfo))
{
/*
* This function is called as an event trigger function, where
* (EventTriggerData *) fcinfo->context includes the information
* of the context.
*
* TODO: provide an example handler.
*/
}
else
{
/* Regular function handler */
retval = plsample_func_handler(fcinfo);
}
}
PG_FINALLY();
{
}
PG_END_TRY();
return retval;
}
/*
* plsample_func_handler
*
* Function called by the call handler for function execution.
*/
static Datum
plsample_func_handler(PG_FUNCTION_ARGS)
{
HeapTuple pl_tuple;
Datum ret;
char *source;
bool isnull;
FmgrInfo *arg_out_func;
Form_pg_type type_struct;
HeapTuple type_tuple;
Form_pg_proc pl_struct;
volatile MemoryContext proc_cxt = NULL;
Oid *argtypes;
char **argnames;
char *argmodes;
char *proname;
Form_pg_type pg_type_entry;
Oid result_typioparam;
Oid prorettype;
FmgrInfo result_in_func;
int numargs;
/* Fetch the function's pg_proc entry. */
pl_tuple = SearchSysCache1(PROCOID,
ObjectIdGetDatum(fcinfo->flinfo->fn_oid));
if (!HeapTupleIsValid(pl_tuple))
elog(ERROR, "cache lookup failed for function %u",
fcinfo->flinfo->fn_oid);
/*
* Extract and print the source text of the function. This can be used as
* a base for the function validation and execution.
*/
pl_struct = (Form_pg_proc) GETSTRUCT(pl_tuple);
proname = pstrdup(NameStr(pl_struct->proname));
ret = SysCacheGetAttr(PROCOID, pl_tuple, Anum_pg_proc_prosrc, &isnull);
if (isnull)
elog(ERROR, "could not find source text of function \"%s\"",
proname);
source = DatumGetCString(DirectFunctionCall1(textout, ret));
ereport(NOTICE,
(errmsg("source text of function \"%s\": %s",
proname, source)));
/*
* Allocate a context that will hold all the Postgres data for the
* procedure.
*/
proc_cxt = AllocSetContextCreate(TopMemoryContext,
"PL/Sample function",
ALLOCSET_SMALL_SIZES);
arg_out_func = (FmgrInfo *) palloc0(fcinfo->nargs * sizeof(FmgrInfo));
numargs = get_func_arg_info(pl_tuple, &argtypes, &argnames, &argmodes);
/*
* Iterate through all of the function arguments, printing each input
* value.
*/
for (int i = 0; i < numargs; i++)
{
Oid argtype = pl_struct->proargtypes.values[i];
char *value;
type_tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(argtype));
if (!HeapTupleIsValid(type_tuple))
elog(ERROR, "cache lookup failed for type %u", argtype);
type_struct = (Form_pg_type) GETSTRUCT(type_tuple);
fmgr_info_cxt(type_struct->typoutput, &(arg_out_func[i]), proc_cxt);
ReleaseSysCache(type_tuple);
value = OutputFunctionCall(&arg_out_func[i], fcinfo->args[i].value);
ereport(NOTICE,
(errmsg("argument: %d; name: %s; value: %s",
i, argnames[i], value)));
}
/* Type of the result */
prorettype = pl_struct->prorettype;
ReleaseSysCache(pl_tuple);
/*
* Get the required information for input conversion of the return value.
*
* If the function uses VOID as result, it is better to return NULL.
* Anyway, let's be honest. This is just a template, so there is not much
* we can do here. This returns NULL except if the result type is text,
* where the result is the source text of the function.
*/
if (prorettype != TEXTOID)
PG_RETURN_NULL();
type_tuple = SearchSysCache1(TYPEOID,
ObjectIdGetDatum(prorettype));
if (!HeapTupleIsValid(type_tuple))
elog(ERROR, "cache lookup failed for type %u", prorettype);
pg_type_entry = (Form_pg_type) GETSTRUCT(type_tuple);
result_typioparam = getTypeIOParam(type_tuple);
fmgr_info_cxt(pg_type_entry->typinput, &result_in_func, proc_cxt);
ReleaseSysCache(type_tuple);
ret = InputFunctionCall(&result_in_func, source, result_typioparam, -1);
PG_RETURN_DATUM(ret);
}
/*
* plsample_trigger_handler
*
* Function called by the call handler for trigger execution.
*/
static HeapTuple
plsample_trigger_handler(PG_FUNCTION_ARGS)
{
TriggerData *trigdata = (TriggerData *) fcinfo->context;
char *string;
volatile HeapTuple rettup;
HeapTuple pl_tuple;
Datum ret;
char *source;
bool isnull;
Form_pg_proc pl_struct;
char *proname;
int rc PG_USED_FOR_ASSERTS_ONLY;
/* Make sure this is being called from a trigger. */
if (!CALLED_AS_TRIGGER(fcinfo))
elog(ERROR, "not called by trigger manager");
/* Connect to the SPI manager */
if (SPI_connect() != SPI_OK_CONNECT)
elog(ERROR, "could not connect to SPI manager");
rc = SPI_register_trigger_data(trigdata);
Assert(rc >= 0);
/* Fetch the function's pg_proc entry. */
pl_tuple = SearchSysCache1(PROCOID,
ObjectIdGetDatum(fcinfo->flinfo->fn_oid));
if (!HeapTupleIsValid(pl_tuple))
elog(ERROR, "cache lookup failed for function %u",
fcinfo->flinfo->fn_oid);
/*
* Code Retrieval
*
* Extract and print the source text of the function. This can be used as
* a base for the function validation and execution.
*/
pl_struct = (Form_pg_proc) GETSTRUCT(pl_tuple);
proname = pstrdup(NameStr(pl_struct->proname));
ret = SysCacheGetAttr(PROCOID, pl_tuple, Anum_pg_proc_prosrc, &isnull);
if (isnull)
elog(ERROR, "could not find source text of function \"%s\"",
proname);
source = DatumGetCString(DirectFunctionCall1(textout, ret));
ereport(NOTICE,
(errmsg("source text of function \"%s\": %s",
proname, source)));
/*
* We're done with the pg_proc tuple, so release it. (Note that the
* "proname" and "source" strings are now standalone copies.)
*/
ReleaseSysCache(pl_tuple);
/*
* Code Augmentation
*
* The source text may be augmented here, such as by wrapping it as the
* body of a function in the target language, prefixing a parameter list
* with names like TD_name, TD_relid, TD_table_name, TD_table_schema,
* TD_event, TD_when, TD_level, TD_NEW, TD_OLD, and args, using whatever
* types in the target language are convenient. The augmented text can be
* cached in a longer-lived memory context, or, if the target language
* uses a compilation step, that can be done here, caching the result of
* the compilation.
*/
/*
* Code Execution
*
* Here the function (the possibly-augmented source text, or the result of
* compilation if the target language uses such a step) should be
* executed, after binding values from the TriggerData struct to the
* appropriate parameters.
*
* In this example we just print a lot of info via ereport.
*/
PG_TRY();
{
ereport(NOTICE,
(errmsg("trigger name: %s", trigdata->tg_trigger->tgname)));
string = SPI_getrelname(trigdata->tg_relation);
ereport(NOTICE, (errmsg("trigger relation: %s", string)));
string = SPI_getnspname(trigdata->tg_relation);
ereport(NOTICE, (errmsg("trigger relation schema: %s", string)));
/* Example handling of different trigger aspects. */
if (TRIGGER_FIRED_BY_INSERT(trigdata->tg_event))
{
ereport(NOTICE, (errmsg("triggered by INSERT")));
rettup = trigdata->tg_trigtuple;
}
else if (TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
{
ereport(NOTICE, (errmsg("triggered by DELETE")));
rettup = trigdata->tg_trigtuple;
}
else if (TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event))
{
ereport(NOTICE, (errmsg("triggered by UPDATE")));
rettup = trigdata->tg_trigtuple;
}
else if (TRIGGER_FIRED_BY_TRUNCATE(trigdata->tg_event))
{
ereport(NOTICE, (errmsg("triggered by TRUNCATE")));
rettup = trigdata->tg_trigtuple;
}
else
elog(ERROR, "unrecognized event: %u", trigdata->tg_event);
if (TRIGGER_FIRED_BEFORE(trigdata->tg_event))
ereport(NOTICE, (errmsg("triggered BEFORE")));
else if (TRIGGER_FIRED_AFTER(trigdata->tg_event))
ereport(NOTICE, (errmsg("triggered AFTER")));
else if (TRIGGER_FIRED_INSTEAD(trigdata->tg_event))
ereport(NOTICE, (errmsg("triggered INSTEAD OF")));
else
elog(ERROR, "unrecognized when: %u", trigdata->tg_event);
if (TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
ereport(NOTICE, (errmsg("triggered per row")));
else if (TRIGGER_FIRED_FOR_STATEMENT(trigdata->tg_event))
ereport(NOTICE, (errmsg("triggered per statement")));
else
elog(ERROR, "unrecognized level: %u", trigdata->tg_event);
/*
* Iterate through all of the trigger arguments, printing each input
* value.
*/
for (int i = 0; i < trigdata->tg_trigger->tgnargs; i++)
ereport(NOTICE,
(errmsg("trigger arg[%i]: %s", i,
trigdata->tg_trigger->tgargs[i])));
}
PG_CATCH();
{
/* Error cleanup code would go here */
PG_RE_THROW();
}
PG_END_TRY();
if (SPI_finish() != SPI_OK_FINISH)
elog(ERROR, "SPI_finish() failed");
return rettup;
}