Tag Archives: Statistics

Automatic Tag Generation

22 Oct

This project looked at dynamically generating suggestion tags for content. To simplify the task, some constraints were introduced:

  • The content to be tagged is news articles with HTML markup.
  • Only English content is considered.

I experimented with suggestion tags on the following HTML page: http://news.bbc.co.uk/1/hi/entertainment/6624223.stm

To help evaluate the tagging methods I asked a sample of people to suggest what they thought the best tags would be. They came up with:

paris, hilton, paris hilton, jail, jail sentence, drink-driving
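
One way a tagging method's output could be scored against these human suggestions is with precision and recall. The sketch below is only an illustration under that assumption, not necessarily how the project evaluated its methods; the function names `normalise` and `score_tags` and the example candidate tags are hypothetical.

```python
# Reference tags suggested by the sample of people (taken from the post).
REFERENCE_TAGS = {
    "paris", "hilton", "paris hilton",
    "jail", "jail sentence", "drink-driving",
}


def normalise(tag):
    """Lower-case a tag and collapse runs of whitespace."""
    return " ".join(tag.lower().split())


def score_tags(candidates, reference=REFERENCE_TAGS):
    """Compare generated tags with the human reference set.

    Assumption: a candidate counts as correct only on an exact match
    after normalisation. Partial matches are not rewarded.
    """
    cand = {normalise(t) for t in candidates}
    ref = {normalise(t) for t in reference}
    matched = cand & ref

    precision = len(matched) / len(cand) if cand else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "matched": sorted(matched),
    }


# Hypothetical output from some tagging method, for illustration only.
example_candidates = ["Paris Hilton", "jail", "court", "Los Angeles"]
result = score_tags(example_candidates)
print(result)
```

Exact matching is the strictest choice. A looser scheme could give partial credit when a candidate such as "sentence" overlaps a reference tag like "jail sentence".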
