Thursday, May 17, 2007

Why the ELO Rating System Rocks

The ELO Rating System has been around for decades. It was originally used for rating chess players but is portable to other games and really fits in well with web games. We've been using it in various projects for 5 or 6 years.

Here's what it does:
  • First of all, ELO is used to rate players based on skill level. It is generally used for two-player games like chess, but with some modifications it can be used for more than that.
  • A new player is assigned a default rating, say 1200.
  • Two players compete and end with one of three results: player 1 wins, player 2 wins, or a tie.
  • The two players' ratings are fed into an algorithm along with the end state of the game, and a new rating for each player is returned.

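Say two players both rated 1200 play and player 1 wins. Each rating moves by half of the K-factor, the constant that sets how many points a single game can shift a rating. With K = 10, player 1 ends up at 1205 and player 2 at 1195.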
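
The steps above can be sketched in a few lines. This is a rough Python sketch rather than AS2; the names `expected_score` and `update_ratings`, and the default K of 10, are assumptions picked so the numbers match the 1200 vs. 1200 example.

```python
def expected_score(rating, opponent_rating):
    """Expected result for `rating` against `opponent_rating`, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_ratings(rating1, rating2, score1, k=10):
    """Return both players' new ratings as a tuple.

    score1 is player 1's result: 1.0 for a win, 0.0 for a loss, 0.5 for a tie.
    k=10 is an assumed constant, not a recommendation.
    """
    score2 = 1.0 - score1
    new1 = rating1 + k * (score1 - expected_score(rating1, rating2))
    new2 = rating2 + k * (score2 - expected_score(rating2, rating1))
    return round(new1), round(new2)


print(update_ratings(1200, 1200, 1.0))  # player 1 wins
print(update_ratings(1200, 1200, 0.5))  # tie between equally rated players
```

A tie between two equally rated players leaves both ratings where they were, because each player scored exactly what was expected.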

What makes ELO so cool is what happens over time. Players who win a lot end up with higher ratings, but a higher-rated player sees diminishing returns for defeating lower-rated players. So for a high-rated player to raise his rating, he must defeat other high-rated players. If a high-rated player loses to a low-rated player, he loses much more of his rating than he'd have gained by winning the match.
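
To put rough numbers on that asymmetry, here is a small Python sketch (not the AS2 from this post). The helper names, the K of 30, and the sample ratings are all assumptions chosen for illustration.

```python
def expected_score(rating, opponent_rating):
    """Expected result for `rating` against `opponent_rating`, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rating_change(rating, opponent_rating, won, k=30):
    """Points gained (positive) or lost (negative) from one game; k=30 is assumed."""
    score = 1.0 if won else 0.0
    return k * (score - expected_score(rating, opponent_rating))


# A 2000-rated player against weaker, equal, and stronger opponents.
for opponent in (1400, 1700, 2000, 2300):
    gain = rating_change(2000, opponent, True)
    loss = rating_change(2000, opponent, False)
    print(opponent, round(gain, 1), round(loss, 1))
```

Under these assumptions the 2000-rated player gains less than one point for beating a 1400-rated player, but would lose about 29 points for losing to him.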

Over time, players end up rated by their skill level rather than by other factors.

Last year we (at Electrotank) created a series of 10 multiplayer minigames for The Daily Show with Jon Stewart. These multiplayer games support 2 to 4 players per game. Mike Grundvig wrote a modified ELO system to support this. The end result is a pretty cool rating system. Players can view their own ratings and the ratings of other players in the game.
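
Mike's algorithm isn't shown in this post, so the following is not it. It is only a sketch of one simple way to stretch ELO past two players: treat the game as a set of pairwise matchups based on finishing order and average the results. It's written in Python, and the function names, the `placements` format, and the K of 30 are all assumptions.

```python
def expected_score(rating, opponent_rating):
    """Expected result for `rating` against `opponent_rating`, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_multiplayer(ratings, placements, k=30):
    """Return new ratings for every player in one multiplayer game.

    ratings:    current rating of each player
    placements: finishing position of each player (1 = first; equal numbers = tie)
    """
    n = len(ratings)
    new_ratings = []
    for i in range(n):
        delta = 0.0
        for j in range(n):
            if i == j:
                continue
            # Score player i against player j as if it were a two-player game.
            if placements[i] < placements[j]:
                score = 1.0
            elif placements[i] > placements[j]:
                score = 0.0
            else:
                score = 0.5
            delta += score - expected_score(ratings[i], ratings[j])
        # Average over opponents so one game never moves a rating by more than k.
        new_ratings.append(round(ratings[i] + k * delta / (n - 1)))
    return new_ratings


print(update_multiplayer([1200, 1200, 1200, 1200], [1, 2, 3, 4]))
```

With only two players this reduces to the ordinary two-player update.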

You can download an example ELO algorithm written in AS2 by me here: DOWNLOAD

Or here is the code:



//Returns the K constant for a rating. Lower rated players' ratings move faster.
function determineRatingConstant(rating:Number):Number {
    var returnVal:Number;
    if (rating<2000) {
        returnVal = 30;
    } else if (rating<2400) {
        returnVal = 20;
    } else {
        returnVal = 10;
    }
    return returnVal;
}
//Takes the rating of two players, the outcome of the game, and returns the new rating of the first player
function calculateNewRating(player1Rating:Number, player2Rating:Number, player1Victory:Boolean):Number {
    //outcome is 1 if player 1 won, 0 if player 1 lost
    var outcome:Number = 0;
    if (player1Victory) {
        outcome = 1;
    }
    var d:Number = player1Rating-player2Rating;
    var exponent:Number = -d/400;
    //expected_outcome is between 0 and 1, and is 0.5 when the ratings are equal
    var expected_outcome:Number = 1/(1+(Math.pow(10, exponent)));
    var k:Number = determineRatingConstant(player1Rating);
    var new_rating:Number = Math.round(player1Rating+k*(outcome-expected_outcome));
    return new_rating;
}

//Example - create a 2000-rated player and make him beat a 2100-rated opponent 1000 times. See how his rating plateaus
var player:Object = new Object();
player.rating = 2000;
for (var i:Number = 0; i<1000; ++i) {
    player.rating = calculateNewRating(player.rating, 2100, true);
    trace("i: "+i);
    trace(player.rating);
}



ELO is O.G.!

16 comments:

lostbythelake said...

Hi there Jobe,

I was wondering if there is a way to hook this code up to a ColdFusion backend and have it update players' ELO ratings during a live match, so administrators could update ratings in real time?

Jobe Makar said...

Hi,

Yes, definitely. You wouldn't want the client application to calculate ELO in this case. You'd let the server do it. The code should be easy to port to pretty much any language.

Matijs said...

My javascript is a little rusty, but doesn't determineRatingConstant always return 10?

Anonymous said...

[quote]Two players compete and end with one of three results: player 1 wins, player 2 wins, tie.[/quote]

Is it possible to implement an ELO rating if there are more than 3 victory/defeat conditions?

I'd like to use this rating for another gamesystem, with 7 victory/defeat conditions.

Anonymous said...

Hello Jobe

For new entrant players what should be the default rating?

Thanks
Andres

name said...

Very interesting post, helped me a lot!

Anonymous said...

Would love to see the multiplayer algorithm Mike came up with.

Attila said...

Hi Jobe,

I'd like to ask you something. I run a gomoku (mind game) school and I want to set up an ELO rating system. I have every formula I need except one, the most important: how to begin the whole thing. Suppose none of the players has a rating, but we've already played some tournaments and I have the results. How do I start calculating ELO ratings? What is the formula at the very beginning? I hope I've explained my question :)

Thanks

Jobe Makar said...

Hi

Generally you start with a default value. 1200 is one we've used.

cruelbone said...

Hey, Jobe

I play pool online at MSN games and they use some variety of an Elo system to calculate ratings. I've been playing for about 3 years and consider myself an intermediate-level player. My rating can easily vary from 1850 to 2100 over the course of a couple dozen games, which seems like too much volatility to be very useful. I have assumed that the problem is that pool is not a game of pure skill, particularly 9-ball. Do you agree and, if so, are there ways to tweak the calculation constants to better compensate for the effect of chance?

Jobe Makar said...

Hi,

I think pool is a game of skill. Seems like ELO would fit well. There is some chance of course, but still.

One thing I've seen done in the past is to put clamps on the max point change that can happen as the result of one game. That could do it.
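
A minimal sketch of that clamping idea, in Python rather than AS2. The function names, the K of 30, and the cap of 10 points are assumptions for illustration, not values from any real system.

```python
def expected_score(rating, opponent_rating):
    """Expected result for `rating` against `opponent_rating`, between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def clamped_update(rating, opponent_rating, score, k=30, max_change=10):
    """New rating after one game, with the change limited to +/- max_change.

    score is 1.0 for a win, 0.0 for a loss, 0.5 for a tie.
    """
    change = k * (score - expected_score(rating, opponent_rating))
    # Clamp the swing so a single result can't move the rating too far.
    change = max(-max_change, min(max_change, change))
    return round(rating + change)


print(clamped_update(1200, 1200, 1.0))  # unclamped change would be 15 points
```

Small changes pass through untouched; only swings larger than `max_change` are cut down.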

Daniel Taylor said...

Hi Guys, I need help.

I have implemented the Elo algorithm on a gaming website, but we have a system where two players band together allowing one to win repeatedly, resulting in an inflated score. The winner generally wins the tournaments and collects the prizes. Is there a standard fix for this? How do systems, especially in online gaming, react to organised cheating like this?

Help would be appreciated.

Jobe Makar said...

Hi Daniel,

Protection for this is already built into ELO. You say that A and B compete and B lets A win repeatedly. If used properly, A will get lower and lower rating bumps the bigger the distance between their two ratings gets. So eventually A will get 0 for beating B, but B would get a lot if A loses.

Other than that, you could put in a cooldown period. They can only be rated against each other X times per day.
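
A rough sketch of such a cooldown, in Python rather than AS2. The class name `PairCooldown`, the in-memory counter, and the limit of 3 rated games per pair per day are all assumptions; a real site would keep these counts on the server, in its database.

```python
from collections import defaultdict


class PairCooldown:
    """Tracks how many rated games each pair of players has had on a given day."""

    def __init__(self, max_rated_games_per_day=3):
        self.max_rated = max_rated_games_per_day
        self.counts = defaultdict(int)

    def should_rate(self, player_a, player_b, day):
        """Return True if this game should count toward ratings, and record it."""
        # frozenset makes (A, B) and (B, A) the same pair.
        key = (day, frozenset((player_a, player_b)))
        if self.counts[key] >= self.max_rated:
            return False
        self.counts[key] += 1
        return True


cooldown = PairCooldown(max_rated_games_per_day=2)
for game in range(3):
    print(game, cooldown.should_rate("A", "B", "2007-05-17"))
```

Games past the limit can still be played; they just wouldn't change anyone's rating.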

Daniel Taylor said...

The system does in fact post progressively smaller increments to their scores, however, the system still allows the cheaters to gain a score in excess of 1800 before plateauing. The cheating players still get scores higher than any of the legal players can generate in the same time-frame.

I think the cool-down approach sounds the best.

Thanks for the advice.

BJ said...

Elo chess ratings. Hmm ..... How strange that the modern chess masters have higher ratings than Capa, Lasker, Rubinstein, Morphy, Steinitz, et al. Does this mean that the moderns would have defeated the old luminaries if the old guys had been blessed with the advantages so readily available to-day? Hmm ... I have my doubts. Imagine Lasker's being able to devote all his time and energy to chess study! What would his Elo rating have been? Imagine the wealth of modern chess literature and computer riches being at the disposal of Capa or Alekhine! How would Anand and the guys have coped with them?

Never forget Newton's words: "If I have seen farther than others it is because I have stood on the shoulders of giants."

paul said...

Thanks for the article!

Clear and concise and it really helped me out of a tight spot.

Cheers,

P