CREATE TASK

Documentation

VoltDB Home » Documentation » Using VoltDB

CREATE TASK

CREATE TASK — Schedules a procedure to run periodically.

Synopsis

CREATE TASK task-name
ON SCHEDULE {CRON cron-definition | DELAY time-interval | EVERY time-interval | FROM CLASS class-path}
PROCEDURE { procedure-name | FROM CLASS class-path } [WITH (argument [,...])]
[ON ERROR {LOG | IGNORE | STOP} ]
[RUN ON {DATABASE | HOSTS | PARTITIONS} ]
[AS USER user-name]
[ENABLE | DISABLE]

CREATE TASK task-name
FROM CLASS class-path [WITH (argument[,...])]
[ON ERROR {LOG | IGNORE | STOP} ]
[RUN ON {DATABASE | HOSTS | PARTITIONS } ]
[AS USER user-name]
[ENABLE | DISABLE]

time-interval: integer {MILLISECONDS | SECONDS | MINUTES | HOURS | DAYS}

Description

The CREATE TASK statement schedules a stored procedure to run iteratively on a set schedule. In its simplest form, the CREATE TASK statement schedules a specified stored procedure to be run at a regular interval. The PROCEDURE clause specifies the stored procedure and any arguments it requires. The ON SCHEDULE clause specifies when the procedure will be run. You can schedule a procedure to run on three types of schedule:

  • CRON — Specifies a cron-style schedule to run the procedure as set times per day or week.

  • DELAY — Specifies a time interval between each run of the stored procedure, where the time interval starts at the end of each run.

  • EVERY — Specifies a time interval between the start of each run of the stored procedure.

The difference between DELAY and EVERY is how the interval is measured. For example, if you specify EVERY 5 SECONDS, the stored procedure runs every 5 seconds, no matter how long it takes to execute (assuming it does not take more than 5 seconds). If, on the other hand, you specify DELAY 5 SECONDS, each run starts 5 seconds after the previous run completes. In other words, EVERY results in invocations at a regular interval no matter how long they take, while DELAY results in a regular interval between when one run ends and the next begins.

For DELAY and EVERY you specify the interval as a positive integer and a time unit, where the supported time units are milliseconds, seconds, minutes, hours, and days. For EVERY, if the previous run takes longer than the interval to run, the schedule is reset at the end of the previous run. So, for example, if the schedule specifies EVERY 2 SECONDS but the procedure takes 2.5 seconds to run, the next scheduled interval will already be past when the previous run ends. In this case, the next invocation of the task is reset to 2 seconds after the previous run ends.

The CRON option requires a standard cron schedule, which consists of six values separated by spaces. Cron schedules set specific times of day, week, or month, rather than an interval. The six values of the cron string represent seconds, minutes, hours, day of the month, month, and day of the week. Asterisks indicate all possible values. For example, the cron specification ON SCHEDULE CRON 0 0 * * * * schedules the task on the hour, every hour of every day. More information about scheduling tasks with cron can be found on the web.

You can also specify details about how the procedure is run:

  • ON ERROR specifies how errors are handled. The default is ON ERROR STOP.

    • ON ERROR LOG — The error is logged but the procedure continues to be scheduled and run.

    • ON ERROR IGNORE — The procedure continues to be scheduled and run and the error is ignored and not logged.

    • ON ERROR STOP — The error is logged and the scheduling process stops. No further invocations of the procedure will occur until the task is explicitly re-enabled (by using ALTER TASK to disable and then enable the task) or the database restarts.

  • RUN ON specifies where the procedure executes. The default is RUN ON DATABASE.

    • RUN ON DATABASE — For multi-partitioned procedures, each invocation of the procedure is run as a single transaction coordinated across all partitions.

    • RUN ON PARTITIONS — For directed procedures, the procedure is scheduled and run independently on all partitions in the database. Directed procedures are useful for performing distributed tasks that are transactional on each partition but do not need to be coordinated and therefore are less disruptive to the ongoing database workload.

  • AS USER specifies the user account under which the procedure is run. When security is enabled, you must specify a valid username and that user must have sufficient privileges to run the procedure.

When using passive database replication (DR), the replica cluster will automatically pause any scheduled tasks that might modify the database (that is, procedures that are not read-only). If the cluster is promoted, the tasks are resumed.

Finally, you can use the ENABLE and DISABLE keywords to specify whether the task is enabled or not. (The task is enabled by default.) If the task is disabled, the procedure is not invoked. If the task is enabled, the procedure is invoked according to the schedule until the database shuts down or the task is disabled by an ALTER TASK statement or an error while ON ERROR STOP is active.

Creating Custom Tasks

If the standard schedules do not meet your needs — you want to change the interval between runs, modify the arguments to the procedure , or the procedure itself — you can define a custom task using Java classes that implement one of three special interfaces:

  • When you only want to dynamically control the schedule of the procedure but keep the procedure and its parameters the same, you can use the ON SCHEDULE FROM CLASS clause specifying a Java class that implements the IntervalGenerator interface.

  • When you want to use a regular schedule but dynamically change the procedure and/or its parameters, you can use the PROCEDURE FROM CLASS clause specifying a Java class that implements the ActionGenerator interface.

  • When you want to dynamically control both the schedule and the procedure being invoked, you can use the second form of the CREATE TASK syntax which replaces both the ON SCHEDULE and PROCEDURE clauses with a single FROM CLASS clause specifying a Java class that implements the ActionScheduler interface.

Before declaring a custom task, you must load the specified Java class, the same way you load Java classes before declaring a user-defined stored procedure, by packaging it in a JAR file and using the LOAD CLASSES directive in sqlcmd. It is also important to note that the classes used for custom tasks are not stored procedures and do not run in the normal transactional path for VoltDB transactions. The custom task classes run in a separate thread to identify the characteristics of the next task invocation before the specified stored procedure is run. For all three task interfaces, the task management infrastructure provides the results from the previous run as input to the callback method, which can then use that information to determine how to modify the next instantiation of the task's procedure, parameters, or run interval.

Many of the CREATE TASK statement's clauses — ON ERROR, AS USER, and ENABLE|DISABLE — operate exactly the same for both custom tasks and the simple case of scheduling a single stored procedure. The two exceptions are the WITH and RUN ON clauses.

For custom tasks that alter the procedure and procedure parameters, the arguments in the WITH clause are passed to the custom task's initialize() method rather than to the stored procedure that it runs. The custom task can then decide what to do with those arguments. For example, it may use them as initial, maximum, and minimum values for adjusting arguments to the stored procedure.

The RUN ON clause for a custom task has one additional option beyond just DATABASE and PARTITIONS. Custom tasks can also be RUN ON HOSTS, which means one instance of the task is run on each server in the cluster.

Examples

The following example declares a procedure to reset the DailyStats view, and a task scheduled as a cron event at midnight every night to run the procedure.

CREATE PROCEDURE ResetDailyStats AS
   DELETE FROM DailyStats;
CREATE TASK nightly 
   ON SCHEDULE CRON 0 0 0 * * *
   PROCEDURE ResetDailyStats
   RUN ON DATABASE;

The next example creates a custom task that dynamically changes the interval between invocations of the stored procedure. The example first loads the JAR file containing a custom task class that implements the IntervalGenerator interface and then declares the task using PROCEDURE FROM CLASS clause.

sqlcmd
1> LOAD CLASSES mytasks.jar;
2> CREATE TASK DailyNoHolidays 
   ON SCHEDULE FROM CLASS mytasks.NoHolidays
   PROCEDURE ResetDailyStats
   RUN ON DATABASE;