Elo Statistical Rating System


#1

hey everybody–

I’m wondering if anyone has ever considered applying the Elo rating system to our fighting game tournaments.

(for anyone curious, the Elo system is the system used to rate chess players).

I’ve read about it, and it seems easy enough to implement. I would only need the statistics (match outcomes from a decent sized history of tournaments).

How far back does the Evolution staff keep tournament information? The system becomes more effective as the number of statistically significant matches increases. Ideally, I'd have a complete history of match outcomes (including qualifier matches).

Anyway, to sum it up:

Pros: Based on statistics, semi-objective, not too hard to implement. “accuracy” converges as more statistics are accrued.

Cons: Need to assume a rating for “new” players. The system (for chess) has it that the average competitor has a rating of ~1500-1750. A best guess scenario is to give a rating similar to this to any player who has no statistical history for calculations. We probably don’t have a very large database to work with here.

Has anyone tried this before in our community? Thoughts?


#2

The closest thing was Apex rankings


#3

Elo is fairly easy to implement. I’d personally write a computer program to do it. (I’m also pretty sure it’s out there for free).

The only time someone’s score changes is when a match is complete. Here’s briefly what happens:

Player A (rating: A) vs. Player B (rating: B)
suppose A>B,
the Elo system allows you to compute the expected outcome of the match (i.e., a probability that A wins, B wins, etc.).
You then compare this expected outcome to the real outcome. If the expected outcome was not accurate, the ratings are updated. (and there’s a simple formula for how much the ratings change, but the change is proportionate to how badly the expected outcome missed the mark).
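
For anyone who wants to see it spelled out, here is a rough Python sketch of the step described above. It assumes the standard logistic Elo formulas used in chess (the 400-point scale) and a K-factor of 32. The function names and the K value are my own picks for illustration, not taken from any existing tool.

```python
def expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k (assumed to be 32 here) caps how many points one match can move a rating.
    """
    expected_a = expected_score(rating_a, rating_b)
    # The change is proportional to how far the real result was from the expectation.
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```

With equal ratings the expected score is 0.5, so the winner gains 16 points and the loser drops 16. An upset win over a higher-rated player moves more points, and a win the system already expected moves fewer.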

So, there’s no punishment for not playing. (your score will stay wherever it is). You really only get punished for losing to someone with a lower rating, or perhaps for not beating them as badly as your rating implies you should etc.

The problem is that all players would have to start with the same rating when someone runs this program over the historical results to get current ratings. So the system takes a while to differentiate good players from bad players. With enough data, the trend will become clear and the ratings will be more “accurate”.
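
Running the historical results through the system could look something like the sketch below. It assumes the match history is a chronological list of (winner, loser) pairs with no draws, that every player starts at 1500, and that K is 32. The data format, the function name, and both constants are assumptions for illustration.

```python
from collections import defaultdict

START_RATING = 1500  # assumed starting rating for every player
K = 32               # assumed K-factor


def replay_history(matches, start=START_RATING, k=K):
    """Compute current ratings from a chronological list of (winner, loser) pairs."""
    # Any player not seen before gets the same starting rating.
    ratings = defaultdict(lambda: float(start))
    for winner, loser in matches:
        # Expected score of the eventual winner, from the ratings before the match.
        gap = ratings[loser] - ratings[winner]
        expected_win = 1.0 / (1.0 + 10 ** (gap / 400.0))
        delta = k * (1.0 - expected_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


# Hypothetical three-player history, oldest match first.
history = [("A", "B"), ("A", "C"), ("B", "C")]
ratings = replay_history(history)
```

After only three matches the ratings are still bunched up near 1500, which is the slow-start effect mentioned above. The order comes out right (A above B above C), but the gaps only widen as more results are fed in.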


#4

Yes, yes, and yes. It beats the APEX ratings (sucks that you lose points for not doing anything).

Also, Elo more accurately reflects each player’s strength.

What do you guys say we implement Elo?