CodingExperiments.com

Linux, PHP, and the blogosphere

How to Avoid Some of the Problems in Writing a Working Recommendation Engine

August 24th, 2008 by Rishabh Mishra

Introduction

When you think about it, having a computer analyze your interests and tell you about other things that you might be interested in sounds seriously cool. But a lot of recommendation engines hardly ever find anything interesting for human users. This post will cover some of the issues that software developers face when trying to write a recommendation engine, and how to reduce the impact of those issues.

Developers making recommendation engines get annoyed by:

1. Speed and scalability

A lot of recommendation engines are for websites such as Amazon. The recommendations have to be served up quickly. One way to solve this is by not generating the recommendations when the page with the recommendations is being requested. The recommendations can be generated at an earlier time and, on page load, can be fetched with database queries or similar.
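
As a rough illustration of generating recommendations ahead of time, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `purchases` data, the simple "bought together" scoring, and the in-memory `RECOMMENDATION_TABLE` that stands in for a real database table. The point is only that the expensive work happens in an offline job, and the page-load path is a plain lookup.

```python
from collections import Counter, defaultdict

def precompute_recommendations(purchases, top_n=3):
    """Build a user -> recommended items table ahead of time.

    `purchases` maps each user to the set of items they already have.
    Candidate items are scored by how often they appear alongside the
    user's own items in other users' sets (simple co-occurrence counting).
    """
    co_counts = defaultdict(Counter)
    for items in purchases.values():
        for a in items:
            for b in items:
                if a != b:
                    co_counts[a][b] += 1

    table = {}
    for user, items in purchases.items():
        scores = Counter()
        for item in items:
            scores.update(co_counts[item])
        # Never recommend something the user already has.
        for item in items:
            scores.pop(item, None)
        # Highest score first; ties broken by name so the output is stable.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        table[user] = [item for item, _ in ranked[:top_n]]
    return table

# Hypothetical data. In practice this would run from a scheduled job
# and the result would be written to a database.
purchases = {
    "alice": {"book", "lamp"},
    "bob": {"book", "lamp", "desk"},
    "carol": {"book", "desk"},
}
RECOMMENDATION_TABLE = precompute_recommendations(purchases)

def recommendations_for(user):
    """Page-load path: a lookup only, no computation."""
    return RECOMMENDATION_TABLE.get(user, [])
```

The trade-off is freshness: recommendations are only as current as the last run of the offline job.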

Another way to solve this is to use the client’s processing power to generate the recommendations. For a web application, JavaScript can be used to generate recommendations, as this Delicious recommendation engine does.

There is also the issue of scalability. The system has to be able to quickly generate recommendations for large numbers of users. It would be sad for a recommendation engine to fail, Twitter style.

2. Malicious folk

This is an issue for any recommendation engine that deals with user-generated content. It doesn’t take long to find a Youtube video with literally hundreds of unrelated tags added in the hope of tricking users into watching the video.

For some recommendation engines, this isn’t a problem, but for many, it can be difficult. For the Youtube example that I gave, videos with copious amounts of tags could be ignored by the recommendation engine because it is highly doubtful that nearly every single one of those hundreds of tags is relevant.
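
The tag-count idea could look something like the following Python sketch. The cutoff of 30 tags and the shape of the video records (a dict with `title` and `tags`) are assumptions for illustration; a real site would need to pick a threshold by looking at its own data.

```python
MAX_TAGS = 30  # assumed cutoff; tune against real data

def filter_tag_spam(videos, max_tags=MAX_TAGS):
    """Keep only videos whose tag count looks plausible.

    Each video is assumed to be a dict with a "tags" list.
    """
    return [video for video in videos if len(video["tags"]) <= max_tags]

# Hypothetical input: one normal video, one stuffed with unrelated tags.
videos = [
    {"title": "Guitar lesson 1", "tags": ["guitar", "lesson", "music"]},
    {"title": "WATCH THIS", "tags": ["tag%d" % i for i in range(200)]},
]
candidates = filter_tag_spam(videos)
```

Only the filtered `candidates` list would then be handed to the recommendation engine.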

3. Little data

It would be snap-easy if all the users that your recommendation engine will analyze had large pools of data clearly describing what the users do. It is very difficult to give recommendations based on little data.

One way to attack the little data issue is to not give recommendations until enough data from the user is collected. This results in the user having to do work and wait for the recommendations to appear, which usually isn’t a good thing.

Another way is to make the recommendations based on the data of others. Items that are popular among most other users (think Digg front page) could be recommended until the user has enough data of their own.
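
Both approaches to the little data problem can be combined. The Python sketch below is one possible arrangement, not a definitive one: `MIN_HISTORY`, the list-of-lists history format, and the `personalised` callback are all hypothetical names invented for this example. Users with too little history get globally popular items; everyone else goes through the real engine.

```python
from collections import Counter

MIN_HISTORY = 3  # assumed threshold for "enough data"

def popular_items(all_histories, exclude, top_n):
    """Rank items by how many users have them, skipping `exclude`."""
    counts = Counter(
        item for history in all_histories for item in set(history)
    )
    # Most popular first; ties broken by name so the output is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in exclude][:top_n]

def recommend(history, all_histories, personalised, top_n=2):
    """Fall back to popular items when the user has little history.

    `personalised` is a stand-in for the real engine: any function
    that takes a history and returns a ranked list of items.
    """
    if len(history) < MIN_HISTORY:
        return popular_items(all_histories, set(history), top_n)
    return personalised(history)[:top_n]
```

A new user sees something right away, and the recommendations become personal once enough data has been collected.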

4. Nothing good to recommend

Recommendation engines work really well for big sites such as Amazon, Youtube, Digg, and Delicious because those websites have a lot to recommend. It doesn’t matter what the user’s interests are, because those websites probably have something that would interest the user.

Smaller applications typically do not have as much to recommend, unless they are drawing upon the content of larger applications.

For the programmer to solve this, more content must exist so that at least something interesting can be recommended to the user. If the content is user-generated, encourage users to generate more of it by providing some sort of benefit for doing so.

5. Duplication

Youtube is a great example of content duplication. Multiple people will frequently upload the same copyrighted content, creating duplicate content on the Youtube servers. How can a recommendation engine give relevant recommendations, but not have the recommendations be so relevant that the user sees duplicate content?

In the case of Youtube, having the recommendation engine ignore suspiciously similar video titles might help reduce the duplicate recommendations.
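
One way to sketch the "suspiciously similar titles" check in Python is with `difflib.SequenceMatcher` from the standard library. The 0.85 cutoff and the simple lowercase/whitespace normalisation are assumptions for the example, and comparing every title against every kept title will not scale to a large catalogue, so treat this as a starting point only.

```python
from difflib import SequenceMatcher

SIMILARITY_CUTOFF = 0.85  # assumed; higher means stricter matching

def normalise(title):
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(title.lower().split())

def drop_similar_titles(titles, cutoff=SIMILARITY_CUTOFF):
    """Keep the first of any group of near-identical titles."""
    kept = []
    for title in titles:
        candidate = normalise(title)
        is_duplicate = any(
            SequenceMatcher(None, candidate, normalise(other)).ratio() >= cutoff
            for other in kept
        )
        if not is_duplicate:
            kept.append(title)
    return kept

# Hypothetical recommendation list containing re-uploads.
titles = [
    "Funny Cat Video",
    "funny cat  video",
    "Funny Cat Video!",
    "Cooking Pasta at Home",
]
unique_titles = drop_similar_titles(titles)
```

Titles alone will miss re-uploads that were renamed, so this reduces duplicate recommendations rather than eliminating them.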


Posted in Programming

  • andymurd
    There's some good analysis of music recommendation engines on the Duke Listens blog at http://blogs.sun.com/plamere/. It's well worth searching out his analyses of the last.fm engine for some good detail.
  • adondai
    The problem with recommendation engines is that humans are too unpredictable... I always get recommendations for stuff I already know or like... there has to be some common factor between what we like and other don't but at the moment no one has figured it out...