TaskMan

Overview

TaskMan is a simple task management API. It is convenient in systems where various tasks must be performed in the background, because it avoids creating a separate thread for each task. It provides basic task management functionality, without the overhead of a complex, cron-like scheduling system.

Features

  • embeddable: you create a task manager, add tasks to it, and voilà;
  • supports synchronous and asynchronous tasks;
  • tasks are executed sequentially, in a single thread by the task manager, sparing thread resources;
  • allows for the clean centralization of background tasks that would otherwise be handled by separate daemon threads - as is often observed;
  • supports "transient" tasks (executed once and discarded) and periodic ones (executed every n milliseconds);
  • allows task output to be captured (through the implementation of the TaskOutput interface); applications can thus redirect the output to a destination of their choice (a Log4j logger, a JMS queue, a database...).

Architecture

The architecture is quite simple, and involves the following classes (all in the org.sapia.taskman package):

  • TaskManager: executes tasks sequentially in a single thread (the class in fact extends java.lang.Thread).
  • Task: this interface specifies a single method (exec()) in which task implementations must implement their logic.
  • TaskContext: a given task's execution context, created on a per-task basis.
  • TaskOutput: an instance of this interface is available to running tasks so that the latter can log runtime information to it.

Usage

Instantiating a TaskManager

The first step when using TaskMan is to instantiate a TaskManager and then start it (TaskManager extends Thread). One of the constructors allows you to pass a name (which will be assigned to the thread). If you wish to run the TaskManager as a daemon thread, call its setDaemon(true) method before startup. The code below shows how to instantiate a TaskManager:

import org.sapia.taskman.TaskManager;

public class TaskManagerMain{

  public static void main(String[] args){
    TaskManager mgr = new TaskManager("SomeTest");
    mgr.setDaemon(true);
    mgr.start();
    // use it ...
  }
}
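
Because TaskManager extends Thread, the usual java.lang.Thread rule applies: the daemon flag can only be changed before start() is called. The sketch below illustrates that rule with a plain Thread standing in for a TaskManager; DaemonFlagDemo and its method are hypothetical names used only for this illustration.

```java
// Demonstrates the java.lang.Thread rule that TaskManager inherits:
// setDaemon() is only legal before start().
class DaemonFlagDemo {

  // Returns true if the daemon flag could still be changed after start().
  static boolean canSetDaemonAfterStart() throws InterruptedException {
    // A plain Thread stands in for a TaskManager (assumption for this sketch).
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "SomeTest");

    worker.setDaemon(true); // fine: the thread has not been started yet
    worker.start();
    try {
      worker.setDaemon(false); // rejected while the thread is alive
      return true;
    } catch (IllegalThreadStateException e) {
      return false;
    } finally {
      worker.join();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("changeable after start: " + canSetDaemonAfterStart());
  }
}
```

In practice this simply means keeping the setDaemon(true) call ahead of start(), as in the example above.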
        

Executing Tasks

Once the TaskManager instance has been created, tasks can be added to it (this operation is internally synchronized, so tasks can be added at any time, by multiple threads). The following code adds a "periodic" task to the TM:

import org.sapia.taskman.TaskManager;
import org.sapia.taskman.PeriodicTaskDescriptor;

public class TaskManagerMain{

  public static void main(String[] args){
    TaskManager mgr = new TaskManager("SomeTest");
    mgr.setDaemon(true);
    mgr.start();
    PeriodicTaskDescriptor ptd = 
      new PeriodicTaskDescriptor("someTaskName", 
                                 5000, 
                                 new SomeTask());
    mgr.execTaskFor(ptd);
    
    while(true){
      try{
        Thread.sleep(1000); // keep the main thread alive so that the daemon TM can run
      }catch(InterruptedException e){
        System.out.println("Thread interrupted, exiting...");
        System.exit(0);
      }
    }
  }
}
        

The above code adds a new task to the TaskManager through the latter's execTaskFor() method, which places the task on the TM's internal task queue. The task will be executed as soon as possible once at least 5 seconds have elapsed since it was added to the TM (notice the 5000 value specified in the constructor of the task's descriptor). Since it is a periodic task, it will be executed every 5 seconds, until the TM is disposed of. A "transient" task (one that is executed once and then discarded) is created and added to the TM as follows:

TransientTaskDescriptor ttd = 
  new TransientTaskDescriptor("someTaskName", 
                              new SomeTask());
mgr.execTaskFor(ttd);        

The execTaskFor() method is non-blocking. It returns as soon as the task is added. Internally, the TM adds the task to an internal list, where it sits until it is executed.

Once you are done with the TM instance, call its shutdown() method to shut it down cleanly:

// the shutdown method waits for the currently 
// executing tasks to terminate. 
taskManager.shutdown();
        

Implementing your Tasks

Basics

It is quite easy to implement your own task. The Task interface imposes a single method:

public interface Task {
  /**
   * Executes this task.
   *
   * @param ctx a TaskContext.
   */
  public void exec(TaskContext ctx);
}
        

The following shows how a "Hello World" task has been implemented:

public class HelloWorldTask implements Task{

  /**
   * Constructor for HelloWorldTask.
   */
  public HelloWorldTask() {
    super();
  }
  
  /**
   * @see org.sapia.taskman.Task#exec(TaskContext)
   */
  public void exec(TaskContext ctx) {
    ctx.getTaskOutput().debug("Hello World");
  }
}    
        

A task should not throw exceptions; if it must signal an error, it should do so by logging to the TaskOutput instance that is encapsulated within its context.
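
As an illustration of this rule, the sketch below wraps some work in a try/catch and reports failures through the TaskOutput instead of letting them propagate. The Task, TaskContext and TaskOutput declarations are reduced stand-ins that only mirror the signatures shown in this document, so that the sketch compiles on its own; SafeTask and RecordingOutput are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Reduced stand-ins for the org.sapia.taskman types (assumptions, not the
// library's sources); only the methods used by this sketch are declared.
interface TaskOutput {
  TaskOutput debug(Object message);
  TaskOutput error(Object message, Throwable err);
}

interface TaskContext {
  TaskOutput getTaskOutput();
}

interface Task {
  void exec(TaskContext ctx);
}

// Hypothetical task: any runtime failure is caught and logged to the output.
class SafeTask implements Task {
  private final Runnable work;

  SafeTask(Runnable work) {
    this.work = work;
  }

  public void exec(TaskContext ctx) {
    try {
      work.run();
      ctx.getTaskOutput().debug("Work completed");
    } catch (RuntimeException e) {
      // Report the failure instead of letting it reach the task manager.
      ctx.getTaskOutput().error("Work failed", e);
    }
  }
}

// In-memory output, used here only to make the behaviour observable.
class RecordingOutput implements TaskOutput {
  final List<String> lines = new ArrayList<>();

  public TaskOutput debug(Object message) {
    lines.add("DEBUG " + message);
    return this;
  }

  public TaskOutput error(Object message, Throwable err) {
    lines.add("ERROR " + message + ": " + err.getMessage());
    return this;
  }
}

class SafeTaskDemo {
  public static void main(String[] args) {
    RecordingOutput out = new RecordingOutput();
    TaskContext ctx = () -> out;
    new SafeTask(() -> { throw new IllegalStateException("boom"); }).exec(ctx);
    System.out.println(out.lines);
  }
}
```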

Nested Tasks

From a given task, it is possible to trigger the execution of other tasks, as in the following:

public class CompositeTask implements Task{

  /**
   * Constructor for CompositeTask.
   */
  public CompositeTask() {
    super();
  }
  
  /**
   * @see org.sapia.taskman.Task#exec(TaskContext)
   */
  public void exec(TaskContext ctx) {
    ctx.getTaskOutput().debug("Executing nested task...");
    ctx.execSyncNestedTask("nested", new NestedTask());
    ctx.getTaskOutput().debug("Execution completed");    
  }
}        
        

The above code demonstrates how a task's execution is triggered by another. The following steps are required:

  • Create the nested task instance;
  • add it to the TM using either the execSyncTask() or the execAsyncTask() method. The former executes the given task synchronously: the "calling" task blocks on the method until the nested task has been executed. With the latter, the nested task is added to the TM's internal task queue, and the execAsyncTask() method returns immediately.

Task Output

To provide runtime information about their execution, tasks are given a TaskOutput implementation as part of their context. The interface's signature is the following:

public interface TaskOutput {

  public void setTaskName(String name);

  public TaskOutput debug(Object message);

  public TaskOutput info(Object message);

  public TaskOutput warning(Object message);

  public TaskOutput error(Object message);
  
  public TaskOutput error(Throwable err);

  public TaskOutput error(Object message, Throwable err);

  public void close();
}          
          

The TaskOutput interface has been designed so that it can easily be implemented on top of existing logging toolkits. Yet this is not the only reason, since a Log4j logger could have been provided directly. Another goal was to allow client applications to asynchronously receive output from running tasks, especially in a distributed environment: imagine a TaskOutput that is implemented as a remote queue; every message is actually sent through the wire to a remote client that processes the messages asynchronously. To ensure that clients are notified when a given output is "done with" (a task has finished its job and therefore does not log anymore), the interface imposes the close() method.

A TaskOutput instance is assigned to the TaskContext of a task. A TaskManager instance calls its newTaskOutput() template method to create TaskOutput objects. The TaskManager class implements this method by creating a DefaultTaskOutput instance, which logs to System.out (having a look at the source will help you to understand how to implement your own).

You could override the newTaskOutput() method in such a way as to always return the same instance; in such a case, the returned TaskOutput implementation must plan for synchronization (multiple tasks could log to the same output).
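
A minimal sketch of such a shared, synchronized implementation follows. It assumes output goes to a PrintStream; SharedTaskOutput is a hypothetical name, and the TaskOutput interface is copied locally only so that the sketch compiles on its own. How the instance would be returned from an overridden newTaskOutput() is left out, since the exact signature of that method is not shown here.

```java
import java.io.PrintStream;

// Local copy of the TaskOutput signature shown above, declared here only so
// that the sketch compiles without the library on the classpath.
interface TaskOutput {
  void setTaskName(String name);
  TaskOutput debug(Object message);
  TaskOutput info(Object message);
  TaskOutput warning(Object message);
  TaskOutput error(Object message);
  TaskOutput error(Throwable err);
  TaskOutput error(Object message, Throwable err);
  void close();
}

// Hypothetical shared output. Every public method is synchronized so that
// lines written by different threads are never interleaved.
class SharedTaskOutput implements TaskOutput {
  private final PrintStream target;
  private String taskName = "unknown";

  SharedTaskOutput(PrintStream target) {
    this.target = target;
  }

  public synchronized void setTaskName(String name) {
    this.taskName = name;
  }

  public synchronized TaskOutput debug(Object message) {
    return write("DEBUG", message);
  }

  public synchronized TaskOutput info(Object message) {
    return write("INFO", message);
  }

  public synchronized TaskOutput warning(Object message) {
    return write("WARNING", message);
  }

  public synchronized TaskOutput error(Object message) {
    return write("ERROR", message);
  }

  public synchronized TaskOutput error(Throwable err) {
    return write("ERROR", err);
  }

  public synchronized TaskOutput error(Object message, Throwable err) {
    return write("ERROR", message + " - " + err);
  }

  public synchronized void close() {
    // The instance is shared, so one task finishing must not close the
    // underlying stream for the others; flushing is enough here.
    target.flush();
  }

  // Only called from the synchronized methods above, so the lock is held.
  private TaskOutput write(String level, Object message) {
    target.println("[" + taskName + "] " + level + " " + message);
    return this;
  }
}
```

Note that the task name is shared state as well: a name set for one task applies to whatever is logged next. That is probably adequate when tasks log one after the other, but an implementation meant for truly concurrent logging would need to track the name per caller.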

In addition, it is important to note that in the case of nested tasks, the "previous" task in the chain passes its TaskOutput on to the next one. The exception to this rule occurs in the case of asynchronous tasks: when calling execAsyncTask(), a new task output object is created.

In any case, the first task that is executed (the "root") always calls close() on the TaskOutput instance. Asynchronously executed tasks always have "root" status, so they always call the close() method on their given task output object - even if a given asynchronous task has been called by another asynchronous task.

The setTaskName() method is imposed so that the TM runtime can tell TaskOutput instances the name of the task that is currently running. This allows implementations to display that name, which is useful for tracing purposes.

The Task Context

From a given Task, it is possible to pass data to nested tasks, through the TaskContext. The latter allows tasks to export and import values (bound under a key). The code below illustrates how a given task exports a value to its context - that will be used by nested tasks:

public void exec(TaskContext ctx) {
  ctx.getTaskOutput().debug("Executing nested task...");
  ctx.exportVal("message", "Hello World");
  ctx.execSyncNestedTask("nested", new NestedTask());
  ctx.getTaskOutput().debug("Execution completed");    
}

The "message" value can be imported by nested tasks in the following way:

public void exec(TaskContext ctx) {
  String msg;
  if((msg = 
     (String)ctx.importVal("message")) != null){
    System.out.println("message: " + msg);
  }
  else{
    System.out.println("No message found!!!");      
  }
}

Good Practices

Create and release system resources in exec()

When the task manager shuts down, it waits for the currently executing task to stop and then terminates. The Task interface does not provide any hook (such as close(), dispose(), etc.) to participate in the shutdown. If tasks maintain system resources as member variables, these resources might not be released properly in the event of a TM shutdown. For this reason, if you must create system resources in your tasks, do so in the exec() method, and release these resources in that same method, preferably in a try/finally block.
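
The pattern can be sketched as follows, under the assumption that the resource is a file writer. HeartbeatFileTask is a hypothetical task, and the Task, TaskContext and TaskOutput declarations are reduced stand-ins that mirror the signatures shown in this document so that the sketch compiles on its own.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Reduced stand-ins for the org.sapia.taskman types (assumptions).
interface TaskOutput {
  TaskOutput error(Object message, Throwable err);
}

interface TaskContext {
  TaskOutput getTaskOutput();
}

interface Task {
  void exec(TaskContext ctx);
}

// Hypothetical task: the writer is opened and closed inside exec(), so no
// open resource survives between two executions or across a TM shutdown.
class HeartbeatFileTask implements Task {
  private final Path file; // plain configuration, not an open resource

  HeartbeatFileTask(Path file) {
    this.file = file;
  }

  public void exec(TaskContext ctx) {
    BufferedWriter writer = null;
    try {
      writer = Files.newBufferedWriter(file);
      writer.write("alive");
    } catch (IOException e) {
      ctx.getTaskOutput().error("Could not write " + file, e);
    } finally {
      // Always release the resource, whether or not the write succeeded.
      if (writer != null) {
        try {
          writer.close();
        } catch (IOException e) {
          ctx.getTaskOutput().error("Could not close " + file, e);
        }
      }
    }
  }
}
```

On Java 7 and later, a try-with-resources statement would express the same idea more compactly; the explicit finally block is kept here to match the wording above.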

Use external services for core business logic

Have your tasks implement lightweight operations. If, in the course of these operations, your tasks need access to more resource-consuming core business logic, implement the latter as services defined at the application level, and give your tasks a reference to these services through their constructor.
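
A minimal sketch of this arrangement is shown below. ReportService and PurgeReportsTask are hypothetical names invented for the illustration, and the Task, TaskContext and TaskOutput declarations are reduced stand-ins that mirror the signatures shown in this document.

```java
// Reduced stand-ins for the org.sapia.taskman types (assumptions).
interface TaskOutput {
  TaskOutput info(Object message);
}

interface TaskContext {
  TaskOutput getTaskOutput();
}

interface Task {
  void exec(TaskContext ctx);
}

// Hypothetical application-level service holding the heavy business logic.
interface ReportService {
  // Returns the number of reports that were removed.
  int purgeExpiredReports();
}

// The task stays lightweight: it only delegates to the service it was given
// through its constructor and logs the outcome.
class PurgeReportsTask implements Task {
  private final ReportService service;

  PurgeReportsTask(ReportService service) {
    this.service = service;
  }

  public void exec(TaskContext ctx) {
    int purged = service.purgeExpiredReports();
    ctx.getTaskOutput().info("Purged " + purged + " expired reports");
  }
}
```

Keeping the logic behind an interface also makes the task easy to exercise in isolation, since a stub service can be passed to the constructor.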

Keep your tasks simple

A task should accomplish an atomic unit of work. This is important for two reasons: first, in terms of code, it is much easier to develop, maintain, and debug; second, remember that the tasks are executed sequentially by the task manager. To allow all tasks to be executed "as fast as possible", each task should perform as fast as possible.

Instead of implementing one large task, subdivide the latter into nested tasks. "Recursively" design your tasks, so that you have tasks that call other tasks that call other tasks, and so on.

Reuse Ant tasks

Do you need tasks that clean up given file directories, copy files from one directory to another, upload content to a web site through FTP, etc.? Why not wrap existing Ant tasks? You can easily reuse Ant tasks outside of Ant.

Persisting a Task Manager

It might be useful to resurrect a task manager instance across process boundaries, especially if tasks that have been enqueued MUST be executed - as part of some business process. In addition, it might also be necessary to distribute task workload across multiple servers, implying that tasks must be sent through the wire to remote task managers. For these reasons, and probably others, the TaskManager class implements Java's Externalizable interface, and the Task interface extends Serializable.

This introduces the possibility of reliable distributed task managers whose state can be retrieved from storage, and whose execution can be resumed. The Prevayler toolkit could be used to implement such an architecture.

Before serializing a task manager, shut it down - invoke its shutdown() method.

If you do not wish to serialize a TM per se, but only its internal list of task descriptors, you can access the latter through the getTaskDescriptors() method. You could add this list to a new task manager instance later on, through the addTaskDescriptors() method.
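
The storage side of this could look like the sketch below, which uses only standard Java serialization. It assumes that the descriptors are Serializable and handed over as a java.util.List, which this document does not state; DescriptorStore is a hypothetical helper, and the demo uses plain strings in place of real task descriptors.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that turns a list of serializable objects into bytes
// and back, using standard Java serialization.
class DescriptorStore {

  static byte[] save(List<? extends Serializable> descriptors) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      // Copy into an ArrayList so that the written list type is serializable.
      out.writeObject(new ArrayList<Serializable>(descriptors));
    }
    return bytes.toByteArray();
  }

  @SuppressWarnings("unchecked")
  static List<Serializable> load(byte[] data) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return (List<Serializable>) in.readObject();
    }
  }
}

class DescriptorStoreDemo {
  public static void main(String[] args) throws Exception {
    // Strings stand in for task descriptors in this demo.
    byte[] data = DescriptorStore.save(Arrays.asList("first", "second"));
    System.out.println(DescriptorStore.load(data));
  }
}
```

With a real task manager, the list passed to save() would presumably come from getTaskDescriptors() after shutdown(), and the list returned by load() would be handed to addTaskDescriptors() on the new instance.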

Conclusion

Use TaskMan to centralize background operations that typically end up in separate daemon threads. TaskMan also favors the clean subdivision of a given functionality into specialized, yet related units of work (nested tasks).