Skip to content

Latest commit

 

History

History
57 lines (37 loc) · 3.59 KB

procedure.md

File metadata and controls

57 lines (37 loc) · 3.59 KB

Writing the component procedure

The component's logic should be implemented in the fullrun.sql and dryrun.sql files.

Dry runs

Workflows needs to perform a dry-run query before the actual execution of the workflow, in order to determine the resulting schema of each node.

For this, we need to create a dryrun.sql file that generates an empty table with the same schema as the actual component's result (defined in fullrun.sql) and returning 0 rows.

💡 Tip

The dry run code doesn't really need to be exactly the same as the full run. Functions that take longer to run can be avoided as long as the resulting schema is the same. For example, using "uuid_string" AS uuid" generates the same schema as "GENERATE_UUID() AS uuid" would generate.

Below you can see an example of a stored procedure built following the approach defined above. This procedure takes a table and generates a new one that includes and additional column with a unique identifier.

fullrun.sql

        EXECUTE IMMEDIATE '''
        CREATE TABLE IF NOT EXISTS ''' || output || '''
        OPTIONS (expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY))
        AS SELECT *, GENERATE_UUID() AS uuid
        FROM ''' || input || ';';

dryrun.sql

        EXECUTE IMMEDIATE '''
        CREATE TABLE IF NOT EXISTS ''' || output || '''
        OPTIONS (
                expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY))
        AS SELECT *, "uuid_string" AS uuid
        FROM ''' || input || '''
        WHERE 1 = 0;
        ''';

Variables

For implementing the logic of the component, you will have available a set of variable matching the names of the parameters declared in the metadata.json file, with both input and output parameters. These variables will contain the values selected by the user when configuring the component as part of a workflow.

If you have declared environment variables in your component metadata (see here), they will also be created and they will be available to be used from your code.

The input and output variables will contain the names of the input table that was configured for the component, and the name of the output one generated by the component. Parameters with those same names should have been declared in the corresponding metadata.jsonfile.

Do not generate tables with names others than the ones provided in the variables corresponding to output parameters. Otherwise, those tables will not be used when the component output is connected to another component in the workflow.

Table names and API execution

When a workflow is run from the Workflows UI, table names of the tables created by its components are fully qualified. That means that, if your custom component is connected to an upstream components and uses a table from it, the name that it will received in the corresponding parameter will be a FQN (that is, in the form project.dataset.table). Output names received in the output parameters will also be FQNs.

However, when the workflow is run as a stored procedure (when exported or when executed via API), all tables created in components are session tables that are single names (that is, something like tablename instead of project.dataset.table). That means that inputs that come from other components, and also output table names, will be single-name tables.

You should prepare your component to deal with this situation. Check the input/output table names to see whether they are fully-qualified or not, and implement the corresponding logic to run in each case.