Scheduling Background Events

From KnowledgeTree Document Management Made Simple

Introduction

There are certain events that can be scheduled as background tasks. Various operating system environments offer different mechanisms for scheduling. This document describes how scheduling can be implemented within KnowledgeTree and various considerations when providing a background task.

Status

Proposal

Requirements

The scheduler needs to be able to run frequently, identify new work, and process it without noticeably degrading system performance.

The scheduler needs to be able to provide a mechanism for feedback to the administrator:

  • jobs that are scheduled, frequency
  • active jobs, workload, expected completion, etc

The scheduler could also provide similar feedback to users if there is any feedback relevant to them.

User Experience

The user experience is of vital importance, and long-running jobs are not suitable for running in a web browser session. Ideally, long-running jobs should be broken down into manageable workloads so that appropriate feedback can be provided to the user.

Implementing Cron

Ideally, a cross-platform scheduling application would be used. Cron, which was originally written for Unix environments, has been ported to or cloned in other environments. As cron is already available under Linux, the remaining requirement is for it to be available under Windows.

Implementations of Cron under Windows

Based on a quick search on 2007-02-14, I found the following:

This is an implementation of cron in Python.

This is a win32 service written in Delphi. Source is provided.

  • cygwin

As a Unix-like environment, it provides a cron package.

This is a command-line cron written in PHP.

This is a commercial version of cron, with what appears to be a once-off price for the source.



My preference is for either the Delphi win32 service or the PHP cron. The PHP cron could also be made into a service via the 'win32service' PHP extension.

The last alternative is to write a custom scheduler.

Possible Architecture

The scheduling service should comprise the following:

  • a schedule manager
  • the background scheduler
  • interface for scheduled tasks
  • tasks best practices guide

schedule manager

The scheduler should include a configuration page where all tasks are listed and their frequency of activation can be controlled.

background scheduler

The background scheduler will either use a cron-like service or be a custom scheduling script.

The requirements are:

1) ability to run a script at regular intervals, say once every 5 minutes

2) ability to run a script once daily, at specific time of day

3) ability to run a script once a week, at a specific time of day on a specific day

interface for scheduled tasks

As tasks are run in the background, there should be a mechanism to track the progress of running tasks, and also to review how tasks have run in the past.
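As an illustration only, progress tracking could be modelled on the status fields proposed for the scheduled_tasks table later in this document. The sketch below is in Python rather than the application's own language, and the class name `TaskStatus`, its methods and the task name `indexer` are hypothetical, not part of any existing KnowledgeTree API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TaskStatus:
    """In-memory mirror of the progress fields of one scheduled task (hypothetical)."""
    task: str
    statusid: str = "I"                      # B = Busy, I = Idle, D = Disabled
    progress: int = 0                        # value between 0 and maxwork
    maxwork: int = 0                         # amount of work waiting for the task
    laststart: Optional[datetime] = None
    lastprogress: Optional[datetime] = None
    lastend: Optional[datetime] = None

    def start(self, maxwork: int, now: datetime) -> None:
        # Mark the task busy; lastend is null while the task is running.
        self.statusid = "B"
        self.progress = 0
        self.maxwork = maxwork
        self.laststart = now
        self.lastprogress = now
        self.lastend = None

    def advance(self, units: int, now: datetime) -> None:
        # Record completed work, never exceeding maxwork.
        self.progress = min(self.progress + units, self.maxwork)
        self.lastprogress = now

    def finish(self, now: datetime) -> None:
        # Return the task to idle and record when it ended.
        self.statusid = "I"
        self.lastend = now

    def percent_complete(self) -> float:
        # Percentage suitable for display to an administrator.
        if self.maxwork == 0:
            return 0.0
        return 100.0 * self.progress / self.maxwork


status = TaskStatus("indexer")               # "indexer" is an example task name
status.start(maxwork=200, now=datetime(2007, 2, 14, 12, 0))
status.advance(50, now=datetime(2007, 2, 14, 12, 1))
print(status.statusid, status.percent_complete())
```

An administration page could poll such a record to show active jobs, workload and a rough idea of completion.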

tasks best practices

Tasks are targeted for background scheduling because they are too heavy for the 'user experience'. They should therefore also take care not to be so resource intensive that they degrade system performance.

This should be done by ensuring that the tasks run frequently and process smaller workloads on each run.
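One way to keep each run small is to cap the number of items handled per invocation and leave the remainder for the next run. The following Python sketch assumes the pending work can be held in a queue; `run_once`, `handle` and `max_items` are hypothetical names chosen for illustration.

```python
from collections import deque


def run_once(pending, handle, max_items):
    """Process at most max_items entries from pending and return how many were done."""
    processed = 0
    while pending and processed < max_items:
        handle(pending.popleft())   # take the oldest item first
        processed += 1
    return processed


# Example: 25 items of work, handled 10 at a time across successive runs.
pending = deque(range(25))
handled = []
first_run = run_once(pending, handled.append, max_items=10)
print(first_run, len(pending))
```

Whatever is left in the queue is picked up by the next scheduled invocation, so no single run holds resources for long.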


database

scheduled_tasks

Field Name      Field Type  Description
taskid          int
scheduletypeid  char        P = Minute, D = Day, W = Week, M = Month
frequency       int         This is the interval in minutes between invocations. Null if not applicable.
scheduledate    timestamp   This is the date when the job will next run. Once run, it will be updated appropriately. Null if not applicable.
task            varchar     This is the task name.
progress        int         This is the status of the current task in its work. A value between 0 and maxwork.
maxwork         int         This is the current amount of work waiting for the task.
statusid        char        B = Busy, I = Idle, D = Disabled
description     varchar     This describes the task.
laststart       timestamp   This is the time when the task last started.
lastprogress    timestamp   This is when the progress value was last updated.
lastend         timestamp   This is the time the task last ended. It will be null when the task starts.
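As a rough sketch, the scheduled_tasks table could be created as follows. SQLite (through Python's sqlite3 module) is used here only so that the example is self-contained; the database engine, the column lengths, the primary key, the NOT NULL constraints and the default values are all assumptions, as is the example task name `indexer`.

```python
import sqlite3

# Column names and types follow the table above; everything else is assumed.
SCHEMA = """
CREATE TABLE scheduled_tasks (
    taskid         INTEGER PRIMARY KEY,
    scheduletypeid CHAR(1) NOT NULL,       -- P, D, W or M
    frequency      INTEGER,                -- minutes between runs, or NULL
    scheduledate   TIMESTAMP,              -- next run, or NULL
    task           VARCHAR(100) NOT NULL,
    progress       INTEGER DEFAULT 0,
    maxwork        INTEGER DEFAULT 0,
    statusid       CHAR(1) DEFAULT 'I',    -- B, I or D
    description    VARCHAR(255),
    laststart      TIMESTAMP,
    lastprogress   TIMESTAMP,
    lastend        TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Register a hypothetical task that runs every 5 minutes.
conn.execute(
    "INSERT INTO scheduled_tasks (scheduletypeid, frequency, task, description) "
    "VALUES (?, ?, ?, ?)",
    ("P", 5, "indexer", "Example task, not part of the proposal"),
)
row = conn.execute(
    "SELECT scheduletypeid, frequency, scheduledate, statusid FROM scheduled_tasks"
).fetchone()
print(row)
```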

A task that must run frequently, say every 5 minutes, will have the configuration: scheduletypeid = P, frequency = 5, scheduledate = null

A task that must run once a day, say at 11pm, will have the configuration: scheduletypeid = D, frequency = null, scheduledate = (ignore date part) 23:00

A task that must run once a week, say at 1pm, will have the configuration: scheduletypeid = W, frequency = null, scheduledate = 2007-01-19 13:00. Subsequent runs will be at 7 day intervals.

A task that must run once a month, say at 4pm, will have the configuration: scheduletypeid = M, frequency = null, scheduledate = 2007-01-19 16:00. The job will only run on the day of the month indicated, so avoid the last few days of the month, which do not exist in every month. If something needs to be done at the end of the month, trigger it for 00:00 on the 1st of the following month.
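The four configurations above could be turned into a "next run" calculation along the following lines. This is a sketch under stated assumptions, not a definitive implementation: the function name `next_run` is hypothetical, times are naive (no time zone), and a monthly task is assumed to skip any month that lacks the configured day.

```python
import calendar
from datetime import datetime, timedelta


def next_run(schedule_type, frequency, schedule_date, now):
    """Return the next run time strictly after `now` for one scheduled task."""
    if schedule_type == "P":
        # Interval task: `frequency` minutes after the current run.
        return now + timedelta(minutes=frequency)
    if schedule_type == "D":
        # Daily task: only the time part of schedule_date is used.
        candidate = now.replace(hour=schedule_date.hour,
                                minute=schedule_date.minute,
                                second=0, microsecond=0)
        if candidate <= now:
            candidate += timedelta(days=1)
        return candidate
    if schedule_type in ("W", "M") and schedule_date > now:
        # The first run has not happened yet.
        return schedule_date
    if schedule_type == "W":
        # Weekly task: step forward in 7 day intervals.
        candidate = schedule_date
        while candidate <= now:
            candidate += timedelta(days=7)
        return candidate
    if schedule_type == "M":
        # Monthly task: same day of month, skipping months without that day.
        year, month = now.year, now.month
        while True:
            if schedule_date.day <= calendar.monthrange(year, month)[1]:
                candidate = datetime(year, month, schedule_date.day,
                                     schedule_date.hour, schedule_date.minute)
                if candidate > now:
                    return candidate
            month += 1
            if month > 12:
                year, month = year + 1, 1
    raise ValueError("unknown scheduletypeid: %r" % (schedule_type,))


now = datetime(2007, 2, 14, 12, 0)
print(next_run("W", None, datetime(2007, 1, 19, 13, 0), now))
```

After each run the scheduler would store the returned value back into scheduledate, which matches the note that the field is updated once the job has run.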

scheduled_task_history

Field Name     Field Type  Description
taskhistoryid  int
taskid         int         This is a reference to the task.
maxwork        int         This is the amount of work done.
starttime      timestamp   This is the time the work was started.
endtime        timestamp   This is the time the work completed.
comment        varchar     This is a comment that may be made regarding any observations around task behaviour.
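To illustrate how the history table might be used, the sketch below records one row per completed run and reads back the most recent runs for a task. As before, SQLite is an assumption made to keep the example self-contained, timestamps are stored as text, and the helper names `record_run` and `recent_runs` are hypothetical.

```python
import sqlite3


def create_history_table(conn):
    # Column names and types follow the table above; sizes and keys are assumed.
    conn.execute("""
        CREATE TABLE scheduled_task_history (
            taskhistoryid INTEGER PRIMARY KEY,
            taskid        INTEGER NOT NULL,
            maxwork       INTEGER,
            starttime     TIMESTAMP,
            endtime       TIMESTAMP,
            comment       VARCHAR(255)
        )
    """)


def record_run(conn, taskid, maxwork, starttime, endtime, comment=None):
    """Insert one history row and return its taskhistoryid."""
    cur = conn.execute(
        "INSERT INTO scheduled_task_history "
        "(taskid, maxwork, starttime, endtime, comment) VALUES (?, ?, ?, ?, ?)",
        (taskid, maxwork, starttime, endtime, comment),
    )
    return cur.lastrowid


def recent_runs(conn, taskid, limit=10):
    """Return the latest runs of a task, newest first."""
    return conn.execute(
        "SELECT maxwork, starttime, endtime, comment "
        "FROM scheduled_task_history WHERE taskid = ? "
        "ORDER BY starttime DESC LIMIT ?",
        (taskid, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
create_history_table(conn)
record_run(conn, 1, 40, "2007-02-14 12:00:00", "2007-02-14 12:01:30")
record_run(conn, 1, 10, "2007-02-14 12:05:00", "2007-02-14 12:05:20", "short run")
for run in recent_runs(conn, 1):
    print(run)
```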