Scheduling Background Events
From KnowledgeTree Document Management Made Simple
Introduction
There are certain events that can be scheduled as background tasks. Various operating system environments offer different mechanisms for scheduling. This document describes how scheduling can be implemented within KnowledgeTree and various considerations when providing a background task.
Status
Proposal
Requirements
The scheduler needs to be able to run frequently, identify new work, and process it without degrading system performance.
The scheduler needs to be able to provide a mechanism for feedback to the administrator:
- jobs that are scheduled, frequency
- active jobs, workload, expected completion, etc
The scheduler could also provide similar feedback to users if there is any feedback relevant to them.
User Experience
The user experience is of vital importance, and long-running jobs are not suitable for running in a web browser session. Ideally, long-running jobs should be broken down into manageable workloads so that appropriate feedback can be provided to the user.
Implementing Cron
Ideally, a cross-platform scheduling application would be used. Cron, which was originally written for Unix environments, has been ported to or cloned in other environments. Cron is already available under Linux, so the remaining requirement is for it to be available under Windows.
Implementations of Cron under Windows
Based on a quick search on 2007-02-14, I found the following:
This is an implementation of cron in Python.
This is a win32 service written in Delphi. Source is provided.
- cygwin
As a 'unix' like environment, it supports the gnu cron.
This is a command line cron written in php.
This is a commercial version of cron, with what appears to be a one-off price for the source.
My preference is for either the Delphi win32 service or the PHP cron. The PHP cron could also be made into a service via the 'win32service' PHP extension.
The last alternative is to write a custom scheduler.
Possible Architecture
The scheduling service should consist of the following:
- a schedule manager
- the background scheduler
- interface for scheduled tasks
- tasks best practices guide
schedule manager
The scheduler should include a configuration page where all tasks are listed and their frequency of activation can be controlled.
background scheduler
The background scheduler will either implement a cron-like service or be a custom scheduling script.
The requirements are:
1) ability to run a script at regular intervals, say once every 5 minutes
2) ability to run a script once daily, at specific time of day
3) ability to run a script once a week, at a specific time of day on a specific day
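All three requirements reduce to the same check on each scheduler pass: which tasks have a next run time that has already arrived? The following is a minimal sketch of that check, not part of the proposal itself. The function name `due_tasks` and the mapping of task names to next run times are assumptions made for illustration.

```python
from datetime import datetime
from typing import Dict, List


def due_tasks(next_runs: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of tasks whose next run time has arrived, earliest first."""
    # Keep only tasks whose scheduled time is at or before the current time.
    due = [(when, name) for name, when in next_runs.items() if when <= now]
    # Sorting the (time, name) pairs runs the most overdue task first.
    return [name for when, name in sorted(due)]
```

A cron entry or service loop that fires every minute could call this once per pass and run whatever it returns.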
interface for scheduled tasks
As tasks are run in the background, there should be a mechanism to track the progress of running tasks, and also see how tasks have been running.
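One way to derive feedback such as percentage complete and expected completion is to extrapolate linearly from the `progress`, `maxwork` and `laststart` fields proposed for the scheduled_tasks table. The sketch below assumes work proceeds at a roughly constant rate, which may not hold for real tasks; the function name `estimate` is hypothetical.

```python
from datetime import datetime
from typing import Optional, Tuple


def estimate(progress: int, maxwork: int, laststart: datetime,
             now: datetime) -> Tuple[float, Optional[datetime]]:
    """Return (percent complete, estimated completion time or None)."""
    if maxwork <= 0:
        # Nothing was waiting, so the task is trivially complete.
        return 100.0, now
    percent = 100.0 * progress / maxwork
    if progress <= 0:
        # No work done yet, so there is no rate to extrapolate from.
        return percent, None
    elapsed = now - laststart
    # Assumption: the remaining work takes as long per unit as the work so far.
    remaining = elapsed * (maxwork - progress) / progress
    return percent, now + remaining
```

The administrator view could show the percentage directly and present the estimated completion time only when it is not None.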
tasks best practices
Tasks are targeted for background scheduling because they tend to be too heavy for the 'user experience'. They should still take care not to be so resource intensive that they degrade system performance.
This should be done by ensuring that tasks run frequently and process a small workload on each run.
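The idea of frequent runs with small workloads can be sketched as a task that handles only a bounded batch per invocation and leaves the rest for the next run. The names `run_batch`, `handle` and `batch_size` are assumptions for illustration, and the default batch size of 50 is arbitrary.

```python
from typing import Callable, List, Sequence


def run_batch(pending: Sequence[str],
              handle: Callable[[str], None],
              batch_size: int = 50) -> List[str]:
    """Process at most batch_size items and return the items still pending."""
    for item in pending[:batch_size]:
        handle(item)
    # Anything beyond the batch is left for the next scheduled run.
    return list(pending[batch_size:])
```

With a task scheduled every 5 minutes, a backlog drains gradually instead of monopolising the server in one long run.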
database
scheduled_tasks
| Field Name | Field Type | Description |
| taskid | int | |
| scheduletypeid | char | P = minute, D = Day, W = Week, M = Month |
| frequency | int | This is the interval in minutes between invocations. Null if not applicable. |
| scheduledate | timestamp | This is the date when the job will next run. Once run, it will be updated appropriately. Null if not applicable. |
| task | varchar | This is the task name. |
| progress | int | This is how far the task has progressed through its current work: a value between 0 and maxwork. |
| maxwork | int | This is the current amount of work waiting for the task. |
| statusid | char | B = Busy, I = Idle, D = Disabled |
| description | varchar | This describes the task |
| laststart | timestamp | This is the time when the task last started |
| lastprogress | timestamp | This is when the progress value was last updated |
| lastend | timestamp | This is the time the task last ended. It is set to null when the task starts. |
A task that must run frequently, say every 5 minutes, will have the configuration: scheduletypeid = P, frequency = 5, scheduledate = null.
A task that must run once a day, say at 11pm, will have the configuration: scheduletypeid = D, frequency = null, scheduledate = 23:00 (the date part is ignored).
A task that must run once a week, say at 1pm, will have the configuration: scheduletypeid = W, frequency = null, scheduledate = 2007-01-19 13:00. Subsequent runs will be at 7-day intervals.
A task that must run once a month, say at 4pm, will have the configuration: scheduletypeid = M, frequency = null, scheduledate = 2007-01-19 16:00. The job will only run on the day of the month indicated, so avoid the last few days of the month, which do not exist in every month. If something needs to be done at the end of the month, trigger it for 00:00 on the 1st of the following month.
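The four configurations above could be turned into a "next run" calculation along the following lines. This is a sketch under assumptions rather than a definitive implementation: the function name `next_run` and the `last_run` parameter are hypothetical, and the monthly branch simply skips any month in which the configured day does not exist.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_run(schedule_type: str, frequency: Optional[int],
             scheduledate: Optional[datetime], last_run: datetime) -> datetime:
    """Return the next run time for a task that last ran at last_run."""
    if schedule_type == "P":
        # Periodic: frequency is the interval in minutes.
        return last_run + timedelta(minutes=frequency)
    if schedule_type == "D":
        # Daily: only the time part of scheduledate is used.
        candidate = last_run.replace(hour=scheduledate.hour,
                                     minute=scheduledate.minute,
                                     second=0, microsecond=0)
        if candidate <= last_run:
            candidate += timedelta(days=1)
        return candidate
    if schedule_type == "W":
        # Weekly: step forward from scheduledate in 7-day intervals.
        candidate = scheduledate
        while candidate <= last_run:
            candidate += timedelta(days=7)
        return candidate
    if schedule_type == "M":
        # Monthly: same day of month and time, skipping months without that day.
        year, month = last_run.year, last_run.month
        while True:
            try:
                candidate = scheduledate.replace(year=year, month=month)
            except ValueError:
                candidate = None  # e.g. the 31st does not exist in February
            if candidate is not None and candidate > last_run:
                return candidate
            month += 1
            if month > 12:
                year, month = year + 1, 1
    raise ValueError("unknown schedule type: %r" % schedule_type)
```

Under this sketch a monthly task configured for the 31st would silently skip shorter months, which illustrates why the last few days of the month are best avoided.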
scheduled_task_history
| Field Name | Field Type | Description |
| taskhistoryid | int | |
| taskid | int | This is a reference to the task. |
| maxwork | int | This is the amount of work done |
| starttime | timestamp | This is the time the work was started. |
| endtime | timestamp | This is the time the work completed. |
| comment | varchar | This is a comment that may be made regarding any observations around task behaviour. |
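To make the history table concrete, the sketch below creates it in an in-memory SQLite database and records one completed run. SQLite, its column types, and the function names `create_history_table` and `record_run` are assumptions chosen so the example is self-contained; the proposal does not specify a database engine or an API.

```python
import sqlite3
from datetime import datetime


def create_history_table(conn: sqlite3.Connection) -> None:
    """Create scheduled_task_history using SQLite types (an assumption)."""
    conn.execute("""
        CREATE TABLE scheduled_task_history (
            taskhistoryid INTEGER PRIMARY KEY,
            taskid INTEGER NOT NULL,
            maxwork INTEGER,
            starttime TEXT,
            endtime TEXT,
            comment TEXT
        )""")


def record_run(conn: sqlite3.Connection, taskid: int, maxwork: int,
               starttime: datetime, endtime: datetime,
               comment: str = "") -> int:
    """Insert one history row and return its taskhistoryid."""
    cur = conn.execute(
        "INSERT INTO scheduled_task_history "
        "(taskid, maxwork, starttime, endtime, comment) "
        "VALUES (?, ?, ?, ?, ?)",
        # Timestamps are stored as ISO 8601 text, since SQLite has no timestamp type.
        (taskid, maxwork, starttime.isoformat(), endtime.isoformat(), comment))
    conn.commit()
    return cur.lastrowid
```

A task would call something like `record_run` once at the end of each run, so the administrator can see how tasks have been running over time.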

