The Magically Appearing Admin

Web developers using an MVC framework build their websites out of models, views and controllers. Then a few lines of magic make an admin system appear, which allows users to add/edit/delete/view/search their models.

Examples:
Django’s Magic Admin (also NewFormsAdmin – a branch of Django focused on making it easier to customise the auto-admin)
Ruby on Rails plugins:

Read the rest of this entry »

Latent Semantic Analysis (LSA) is a mathematical method that tries to bring out latent relationships within a collection of documents. Rather than looking at each document in isolation, it looks at all the documents as a whole, and at the terms within them, to identify relationships.

An example of LSA:
Using a search engine, search for “sand”.

Documents are returned which do not contain the search term “sand” but do contain terms like “beach”.

LSA has identified a latent relationship: “sand” is semantically close to “beach”.
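The sand/beach effect can be reproduced on a toy corpus. The following is only a minimal sketch, not the implementation from this post: it assumes NumPy is installed, and the corpus and the function names build_term_document_matrix and reduce_rank are made up for illustration. It builds a term-document count matrix, takes its singular value decomposition, and keeps only the k largest singular values.

```python
import numpy as np

# Hypothetical toy corpus: three beach documents and three programming ones.
DOCUMENTS = [
    "sand beach",
    "beach sea",
    "sand beach sea",
    "python code",
    "code scipy",
    "python code scipy",
]

def build_term_document_matrix(documents):
    """Return (vocabulary, matrix) with one row per term, one column per document."""
    vocabulary = sorted({term for doc in documents for term in doc.split()})
    index = {term: row for row, term in enumerate(vocabulary)}
    matrix = np.zeros((len(vocabulary), len(documents)))
    for col, doc in enumerate(documents):
        for term in doc.split():
            matrix[index[term], col] += 1.0
    return vocabulary, matrix

def reduce_rank(matrix, k):
    """Rebuild the matrix from its k largest singular values only."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]

vocabulary, matrix = build_term_document_matrix(DOCUMENTS)
reduced = reduce_rank(matrix, k=2)
sand = vocabulary.index("sand")
# Weight of "sand" in the document "beach sea", before and after the reduction.
print(matrix[sand, 1], reduced[sand, 1])
```

In this corpus the document "beach sea" never mentions "sand", so its original weight for that term is zero. After the reduction to two dimensions the weight becomes positive, while the weight of "sand" in the programming documents stays at (numerically) zero.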

There are some very good papers which describe LSA in detail:

This is an implementation of LSA in Python (2.4+). Thanks to scipy it's rather simple!

Read the rest of this entry »

A vector space search involves converting documents into vectors. Each dimension within the vectors represents a term. If a document contains a term, then the value in that term's dimension of the document's vector is greater than zero.
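As a rough illustration of the idea, here is a minimal sketch using only the standard library. It is not the implementation from the full entry. The use of raw term counts as vector values and of cosine similarity for ranking are assumptions on my part, and the function names are made up for the example.

```python
import math
from collections import Counter

def build_vocabulary(documents):
    """One dimension per distinct term found in the documents."""
    return sorted({term for doc in documents for term in doc.lower().split()})

def to_vector(text, vocabulary):
    """Term counts for the text, laid out in vocabulary order."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query, documents):
    """Return (score, document index) pairs, best match first."""
    vocabulary = build_vocabulary(documents)
    query_vector = to_vector(query, vocabulary)
    scored = []
    for position, doc in enumerate(documents):
        score = cosine(query_vector, to_vector(doc, vocabulary))
        if score > 0:
            scored.append((score, position))
    return sorted(scored, reverse=True)

documents = ["the cat sat on the mat", "the dog chased the cat", "dogs and cats"]
for score, position in search("cat", documents):
    print(round(score, 3), documents[position])
```

Only the first two documents contain the term "cat", so only they are returned. The shorter of the two ranks first because the term makes up a larger share of its vector.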

Here is an implementation of vector space searching using Python (2.4+).

Read the rest of this entry »

Funkload Build script

November 23rd, 2007

Funkload is an open source, Python-based unit testing tool. It also serves as a good tool for load testing. We can use it to create a unit test which simulates a user browsing through a site. To test load, run two simultaneous instances of the unit test, then keep scaling up the number of concurrent instances.
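The scaling idea can be sketched without Funkload at all. The following uses only the standard library and is not Funkload's API: browse_site, run_load_test and the fetch callback are hypothetical names for illustration. Each simulated user requests the same list of pages, and several users run at the same time.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def browse_site(fetch, urls):
    """One simulated user: request each page in order and time each request."""
    timings = []
    for url in urls:
        start = time.perf_counter()
        fetch(url)  # fetch is supplied by the caller, e.g. an HTTP GET
        timings.append(time.perf_counter() - start)
    return timings

def run_load_test(fetch, urls, concurrent_users):
    """Run several simulated users at once and collect their timings."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [pool.submit(browse_site, fetch, urls)
                   for _ in range(concurrent_users)]
        return [future.result() for future in futures]
```

Calling run_load_test with concurrent_users set to 2, then 4, then 8 and comparing the timings gives a rough picture of how response time changes as load grows.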

Official Site: http://funkload.nuxeo.org/

I have written a Python-based Funkload build script which:

  • Builds the Funkload configuration for multiple sites
  • Uses wget to generate a sample of pages for load testing
  • Runs load tests
  • Builds HTML documentation from test results.

Read the rest of this entry »

Keeping the Cache Hot

November 15th, 2007

Problem

The expiry of content within caching architectures is only identified when a user makes a request for expired data. Hence a percentage of the visitors to the site will not be able to take advantage of caching. Many different caching architectures are used within a typical dynamic site, so the solution needs to be cache agnostic.

Architecture

Emmao bot is the name given to the Python program which is used to keep the cache hot.

Figure 1: Emmao bot Server UML Model

Solution

Emmao bot has been built to act as a user agent and request pages. mod_python is used to make the Apache children log their requests in a special format. Emmao bot runs in the background as a daemon process, either on the webhead or on an alternative server. It examines the special Apache log files and adds an event for when each logged page expires. libevent is used to manage these events. Pages are given different rankings based on analysis done as Emmao bot runs. It uses these rankings to ensure that the most important/popular/heavy pages never expire. Also, if there is a limit on the number of pages to focus on, rank can be used to decide which pages to ignore.
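The event and ranking logic can be sketched roughly as follows. This is not Emmao bot's code: it swaps libevent for a plain heapq priority queue, uses the request count as a stand-in for the ranking, and the class and method names (RefreshScheduler, record_request, due) are made up for illustration.

```python
import heapq

class RefreshScheduler:
    """Tracks when cached pages expire and which ones are worth refreshing."""

    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.rank = {}        # url -> request count (assumed ranking measure)
        self.expires_at = {}  # url -> expiry timestamp of the pending event
        self.queue = []       # min-heap of (expiry timestamp, url)

    def record_request(self, url, now, ttl):
        """Handle one logged request: bump the rank, add an expiry event if none is pending."""
        self.rank[url] = self.rank.get(url, 0) + 1
        if url not in self.expires_at:
            self.schedule(url, now + ttl)

    def schedule(self, url, expires_at):
        """Add an expiry event; call again after each successful re-fetch."""
        self.expires_at[url] = expires_at
        heapq.heappush(self.queue, (expires_at, url))

    def managed_pages(self):
        """The max_pages highest-ranked URLs; all other pages are ignored."""
        ranked = sorted(self.rank, key=lambda url: (-self.rank[url], url))
        return set(ranked[:self.max_pages])

    def due(self, now, lead_time=0):
        """Pop every event expiring within lead_time; return the managed URLs among them."""
        managed = self.managed_pages()
        urls = []
        while self.queue and self.queue[0][0] - lead_time <= now:
            _, url = heapq.heappop(self.queue)
            self.expires_at.pop(url, None)
            if url in managed:
                urls.append(url)
        return urls

scheduler = RefreshScheduler(max_pages=2)
scheduler.record_request("/home", now=0, ttl=60)
scheduler.record_request("/home", now=5, ttl=60)
scheduler.record_request("/news", now=10, ttl=30)
scheduler.record_request("/news", now=11, ttl=30)
scheduler.record_request("/about", now=20, ttl=10)
print(scheduler.due(now=35, lead_time=5))  # prints ['/news']
```

Here /about expires first but is the lowest-ranked of three pages with a limit of two, so it is ignored. A daemon built on this would re-request each URL returned by due() shortly before it expires and then call schedule() again with the new expiry time.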

The Cost

Although the number of pages that Emmao bot manages can be set to limit load on the webserver, there is still an increase in traffic due to Emmao bot.
In live production environments with Emmao bot managing 10,000 pages, I have not found the performance cost to outweigh the benefit of reducing the maximum user fetch time.

Links

LibEvent http://monkey.org/~provos/libevent

ModPython http://www.modpython.org/