Queueing Background Tasks
Posted in Django on April 4th, 2009 by Stephen DeGrace
There are times when you want to perform a time-consuming task on the web server without either delaying the response to the user on the other end or having the request time out with the task only partially completed. I faced this problem in my gallery application with the automatic generation of thumbnails. This article describes the solution I settled on: queueing tasks for background processing.
When photos are uploaded via the web page front end, or even via the admin interface (which provides many more upload fields), thumbnail generation, while time-consuming, is hardly prohibitive for a web request, the way both are currently configured.
However, in my case it is not just thumbnail images that are being generated, but three different scaled sizes, plus a small thumbnail in a variety of directories for use by django-filebrowser. I would rather pay the cost of scaling up front than store only the one large photo. Otherwise, every single time the image is requested, I would incur the bandwidth cost of transmitting the whole file, and the user would incur the processing cost of scaling it in the browser to fit the dimensions specified in the img tag.
On top of that, the upload form is not the usual way the gallery app is used. The normal workflow is to first create the gallery in the admin interface, which also creates the associated directories. Then photos are simply moved wholesale via scp into the gallery root directory. (I use a Nautilus script on my home computer to first scale the images down from the rather overkill size produced by the typical digital camera to something a little more manageable in terms of disk space.) After that, any request for a view that instantiates the gallery (e.g., looking it up in the admin, or requesting the gallery list on the front end) causes the gallery to calculate a hash of the filenames in its directory. If that hash doesn't match the stored hash, the gallery checks that each of its associated Picture objects still has an image file (if it doesn't, the Picture is deleted) and, more importantly for this discussion, creates a Picture for any image that lacks one. When a Picture is created, it automatically generates its own thumbnail images, and when it is deleted, it removes its thumbnails. It would be nice if the thumbnail creation, at least, were queued for a background process to handle, so that the request that triggered it could return without waiting.
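The change detection described above can be sketched roughly as follows. This is not the gallery app's code, which the article does not show: the function names, the use of SHA-1, and the NUL separator are all assumptions made for illustration, and the sketch is written for Python 3 using only the standard library.

```python
import hashlib
import os
import tempfile


def directory_fingerprint(path):
    """Return a hex digest computed from the sorted filenames in path."""
    digest = hashlib.sha1()
    for name in sorted(os.listdir(path)):
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(os.fsencode(name) + b"\0")
    return digest.hexdigest()


def needs_sync(path, stored_fingerprint):
    """True when files were added to or removed from path since the last sync."""
    return directory_fingerprint(path) != stored_fingerprint


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as gallery_dir:
        stored = directory_fingerprint(gallery_dir)
        print(needs_sync(gallery_dir, stored))  # nothing has changed yet
        open(os.path.join(gallery_dir, "beach.jpg"), "w").close()
        print(needs_sync(gallery_dir, stored))  # a new file was copied in
```

Hashing only the filenames is cheap, but it means a file replaced under the same name would go unnoticed; that trade-off matches the description above, where only additions and removals matter.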
The fact that people may visit before all the images have been generated is not a problem, because images are displayed on the website with the {% thumbnail %} template tag, which is smart enough to insert a placeholder image if the file does not yet exist.
I can think of two basic approaches to this problem. One would be to use cron, or a facility like cron, to schedule the tasks. The other is to use some kind of queue system to perform the tasks on demand. While scheduled tasks are an enormously useful tool, they are the wrong tool in this case: semantically this is an on-demand activity, and that is what the user expects to see. An on-demand-like system could be created by, for example, scheduling cron to run a process that clears queued tasks every thirty seconds, but that would be needlessly inefficient, as cron would be launching the process thousands of times, mostly for nothing. Even if the overhead were negligible, this would be a very inelegant solution.
A number of solutions already exist for task scheduling in Django, but I have already rejected that approach. What I want is something like a daemon that Django apps can communicate with to have it perform arbitrary background processing, or else an application that can be woken up and run when tasks are queued. I found one app that lets Django applications talk to it over HTTP to get background processing done, but by that point I had already worked out a different solution. Besides, I'm leery of using HTTP for interprocess communication, maybe without good reason, but still.
My answer is an app I created called taskmaster. This puppy is extremely low tech, but it gets the job done well. The app consists of a Django middleware, some utility functions exported from its __init__.py, and a subdirectory where the taskmaster script and its associated files live. That directory contains a subdirectory called tasks, which starts off empty. There are also two lock files, lock and addlock.
New: You can download this app here.
To use it, an app does something like this:
# User is assumed here to be Django's built-in auth user model.
from django.contrib.auth.models import User
from django.db import models
from taskmaster import render_to_task

# Simplified version of Picture model. Gallery is assumed to be defined
# earlier in the same models module.
class Picture(models.Model):
    gallery = models.ForeignKey(Gallery, related_name='pictures')
    poster = models.ForeignKey(User, related_name='pictures', null=True, blank=True)
    image = models.ImageField(upload_to='path/to/gallery', db_index=True)
    description = models.TextField(blank=True)

    def create_thumbnails(self):
        """
        Queues the image to have its thumbnails created by the taskmaster
        middleware/script.
        """
        render_to_task('gallery/create_thumbs.py', {'picture': self})
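The gallery/create_thumbs.py template itself is not shown in this article. To give a feel for what "a template that renders to a Python script" means, here is a purely hypothetical sketch that uses string.Template from the standard library in place of Django's template engine. The template text, the render_task helper and the picture_id variable are all invented for illustration; supplicant and task_summary are the optional log variables that the taskmaster script reads. The sketch is written for Python 3.

```python
from string import Template

# Hypothetical task template. The real gallery/create_thumbs.py is a Django
# template and is not reproduced in the article.
TASK_TEMPLATE = Template(
    'supplicant = "gallery"\n'
    'task_summary = "Create thumbnails for picture $picture_id"\n'
    'picture_id = $picture_id\n'
    'print("would create thumbnails for picture", picture_id)\n'
)


def render_task(picture_id):
    """Render the template into the source code of a standalone task script."""
    return TASK_TEMPLATE.substitute(picture_id=int(picture_id))


if __name__ == "__main__":
    source = render_task(42)
    print(source)
    # compile() raises SyntaxError if the rendered script is not valid Python.
    compile(source, "<task>", "exec")
```

Because the output is executed later as code, anything substituted into such a template should be a trusted, well-formed value; the int() call above is one simple way to enforce that for a primary key.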
Here is the source of render_to_task:
from django.template import Template, loader, Context
from django.conf import settings
from fcntl import flock, LOCK_EX, LOCK_UN
import os

def render_to_task(template, context):
    """
    template must be a path to a template or a Template object.
    context must be a dictionary or a Context object.
    A Template and a Context are created or the ones passed in are used, and the
    template is rendered with the context. write_to_task is called to create
    the task.
    """
    if not isinstance(template, Template):
        template = loader.get_template(template)
    if not isinstance(context, Context):
        context = Context(context)
    write_to_task(template.render(context))

def write_to_task(task):
    """
    Writes task, which should be a string, as a task file. The output is written to
    the tasks directory under settings.TASKMASTER_PATH.
    Blocks on addlock so that it is not possible for two server processes to try
    and add the same task number at once - one must wait.
    """
    taskdir = os.path.join(settings.TASKMASTER_PATH, 'tasks')
    # Lock
    lock = open(os.path.join(settings.TASKMASTER_PATH, 'addlock'))
    flock(lock, LOCK_EX)
    try:
        tasks = os.listdir(taskdir)
    except OSError:
        tasks = []
        os.mkdir(taskdir, 0755)
    tasks.sort(cmp=lambda x,y: cmp(int(x), int(y)))
    try:
        taskno = int(tasks[-1]) + 1
    except IndexError:
        taskno = 1
    # Write the task
    h = open(os.path.join(taskdir, str(taskno)), 'w')
    print >>h, task
    h.close()
    # Unlock
    flock(lock, LOCK_UN)
    lock.close()
What happens here in normal usage is that a Django template is rendered to create a Python script, which is written to the tasks directory of the taskmaster application under a filename that is the next sequential number available in the directory. In other words, we have created a crude file-based queue. The addlock file is locked with the Unix-specific flock, so that processes seeking to add tasks must do so one at a time.
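For readers on a current interpreter, the same numbered-file queue can be sketched in Python 3 as below. This is an illustrative rewrite, not the taskmaster source: the enqueue name and its queue_root parameter are made up, and unlike the listing above it creates the lock file and the tasks directory if they are missing and ignores non-numeric filenames. Like the original, it relies on fcntl and so only runs on Unix-like systems.

```python
import fcntl
import os
import tempfile


def enqueue(queue_root, task_source):
    """Write task_source to the next numbered file under queue_root/tasks."""
    task_dir = os.path.join(queue_root, "tasks")
    os.makedirs(task_dir, exist_ok=True)
    # Opening in append mode creates the lock file if it does not exist.
    with open(os.path.join(queue_root, "addlock"), "a") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # blocks while another writer holds it
        try:
            numbers = [int(n) for n in os.listdir(task_dir) if n.isdecimal()]
            task_no = max(numbers, default=0) + 1
            with open(os.path.join(task_dir, str(task_no)), "w") as handle:
                handle.write(task_source)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
    return task_no


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(enqueue(root, "print('first')\n"))
        print(enqueue(root, "print('second')\n"))
        print(sorted(os.listdir(os.path.join(root, "tasks")), key=int))
```

The try/finally means the lock is released even if writing the task file fails, so one failed write cannot leave other server processes waiting on addlock.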
The middleware is set up to run on every response to a request, and all it does is check to see if there are any tasks queued. If there aren't, the response is just passed on. If there are, the middleware fires up the taskmaster process to begin processing the tasks. It has to get a lock on the lock file before it can do so. It attempts this in a nonblocking fashion, and if it fails it assumes that the process must already be running and just returns the response and exits. When the taskmaster process is running, one of the first things it does is get a lock on the lock file, which it holds until it's done. In this way, the taskmaster is not allowed to proliferate and consume too many system resources.
The taskmaster loads each file in sequence from lowest number to highest, executing the file as Python code and then deleting it. When it has removed the last file, it releases the lock and terminates. Because the middleware copies the Python path from the environment of the request into the environment of the taskmaster script, in which the tasks are executed, tasks have the same sys.path as the request itself, and statements like "from django.conf import settings" in the task files will work the "right" way (although tasks can simply assume the settings are already available, since taskmaster itself imports them).
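The nonblocking probe can be isolated into a small helper, sketched here in Python 3. The worker_is_running name is hypothetical, and unlike the middleware it opens the lock file in append mode so that a missing file is created rather than raising an error.

```python
import fcntl
import os
import tempfile


def worker_is_running(lock_path):
    """Return True if some other open file holds an exclusive flock on lock_path."""
    with open(lock_path, "a") as probe:
        try:
            fcntl.flock(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True  # the lock is taken, so a worker is assumed to be active
        fcntl.flock(probe, fcntl.LOCK_UN)
        return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "lock")
        print(worker_is_running(path))       # nobody holds the lock yet
        with open(path, "a") as holder:      # stands in for the taskmaster
            fcntl.flock(holder, fcntl.LOCK_EX)
            print(worker_is_running(path))   # the probe now fails to lock
```

Note that the probe releases the lock before any worker is launched, so two requests arriving together could each start a taskmaster. Since the taskmaster takes the lock in blocking mode, the second one simply waits for the first to finish, then finds the queue empty and exits.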
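The processing order described above can be sketched in Python 3 like this. It is a simplified stand-in for the taskmaster loop, not a copy of it: the drain function and its return value are invented for illustration, exec replaces the Python 2 execfile, and locking and log files are left out. It assumes every file in the directory has a purely numeric name.

```python
import os
import tempfile


def drain(task_dir):
    """Run every task file in numeric order, deleting each one afterwards."""
    completed = []
    while True:
        # Re-list on every pass so tasks queued in the meantime are picked up.
        pending = sorted(os.listdir(task_dir), key=int)
        if not pending:
            return completed
        path = os.path.join(task_dir, pending[0])
        with open(path) as handle:
            source = handle.read()
        try:
            exec(compile(source, path, "exec"), {"__name__": "__task__"})
        except Exception as exc:
            print("task %s failed: %r" % (pending[0], exc))
        # Remove the file whether or not it succeeded, so that a broken
        # task cannot block the queue forever.
        os.unlink(path)
        completed.append(pending[0])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tasks:
        for number in ("10", "2", "1"):
            with open(os.path.join(tasks, number), "w") as handle:
                handle.write("print('running task %s')\n" % number)
        print(drain(tasks))
```

Sorting with key=int matters: a plain string sort would run task 10 before task 2.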
Here is the middleware:
"""
Taskmaster middleware
Note: The TASKMASTER_PATH setting MUST be set in your settings file to the
path of the directory that contains taskmaster.py to use Taskmaster.
"""
import os
import sys
from subprocess import Popen
from fcntl import flock, LOCK_EX, LOCK_NB, LOCK_UN
from django.conf import settings

class TaskmasterMiddleware(object):
    """
    Checks whether there are tasks, if so tries to start taskmaster.py to
    process them. File lock ensures only one instance will run.
    """
    def process_response(self, request, response):
        # Check for tasks
        try:
            tasks = os.listdir(os.path.join(settings.TASKMASTER_PATH, 'tasks'))
        except OSError:
            tasks = None
            os.mkdir(os.path.join(settings.TASKMASTER_PATH, 'tasks'), 0755)
        if tasks:
            # Try and acquire the lock, if not possible then assuming
            # taskmaster already has it.
            lock = open(os.path.join(settings.TASKMASTER_PATH, 'lock'))
            try:
                flock(lock, LOCK_EX|LOCK_NB)
                flock(lock, LOCK_UN)
                lock.close()
                # No working taskmaster, so start one.
                env = os.environ.copy()
                env['PYTHONPATH'] = ":".join(sys.path)
                try:
                    Popen(['python2.5', os.path.join(settings.TASKMASTER_PATH, 'taskmaster.py')], env=env)
                except OSError:
                    pass
            except IOError:
                # Couldn't get a lock meaning taskmaster already has it.
                lock.close()
        # Return response unaltered
        return response
And here is the taskmaster itself:
import os
import sys
import settings
from fcntl import flock, LOCK_EX, LOCK_UN
from datetime import datetime
from traceback import format_exc

# Reset file handles
os.chdir(settings.TASKMASTER_PATH)
log = sys.stdout = open('log.txt', 'a')
err = sys.stderr = open('error.txt', 'a')

# Lock
lock = open('lock')
flock(lock, LOCK_EX)

# Loop tasks
while 1:
    # Get the task list
    tasks = os.listdir('tasks')
    tasks.sort(cmp=lambda x,y: cmp(int(x), int(y)))
    # If there are no tasks, break out of the loop
    if not len(tasks):
        break
    # Set supplicant and task_summary to blank values.
    supplicant = ""
    task_summary = ""
    # Take the first task, write the log files
    try:
        execfile('tasks/%s' % tasks[0])
    except:
        print >>err, "Exception -- %s" % datetime.now().isoformat(' ')
        print >>err, "Supplicant: %s" % supplicant
        print >>err, "Task Summary: %s" % task_summary
        print >>err, "File: %s" % tasks[0]
        print >>err, format_exc()
        print >>err
    else:
        print >>log, "Task -- %s" % datetime.now().isoformat(' ')
        print >>log, "Supplicant: %s" % supplicant
        print >>log, "Task Summary: %s" % task_summary
        print >>log, "File: %s" % tasks[0]
        print >>log
    # Done - remove the task
    os.unlink('tasks/%s' % tasks[0])

# Unlock
flock(lock, LOCK_UN)
lock.close()

# Close files
log.close()
err.close()
Notice that if the task defines the variables "supplicant" and "task_summary", they will be worked into the log files.
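The mechanism behind this is that the task runs in a namespace where both variables already exist with blank values, so a task that assigns them overwrites the defaults and a task that doesn't leaves them empty. A minimal Python 3 sketch of that idea, using a hypothetical run_and_describe helper and an explicit namespace dictionary rather than the script's module globals:

```python
def run_and_describe(source, filename="<task>"):
    """Execute task source and return the log fields it declared, if any."""
    # Blank defaults, overwritten only if the task assigns these names.
    namespace = {"supplicant": "", "task_summary": ""}
    exec(compile(source, filename, "exec"), namespace)
    return {
        "supplicant": namespace["supplicant"],
        "task_summary": namespace["task_summary"],
        "file": filename,
    }


if __name__ == "__main__":
    described = 'supplicant = "gallery"\ntask_summary = "Create thumbnails"\n'
    print(run_and_describe(described, "3"))
    print(run_and_describe("x = 1\n", "4"))
```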
This article also gives me my first chance to really try out my pygmentation app, which can be used to create arbitrary highlighted code blocks easily from tinymce.