Set-based Elo?

DeinFreund 9 years ago Hey Brackman, I just read this article about an alternative to IEEE 754 that can represent real numbers exactly: each value is either an exact real number or the open set of real numbers between two exact numbers. Now I'm wondering, would it be possible to do something similar for Elo? A new player starts with (-inf, inf) as their rating, and this set keeps shrinking or growing so that there's a > X% chance of the player's actual strength being an element of it. Of course this can't be exact, as strength can only be inferred from multiple matches, but is there something in this style that works numerically? +0 / -0
Shadowfury333 9 years ago Glicko basically does this, and AFAIK Microsoft's TrueSkill uses the same idea. It's not range based so much as statistical, where instead of the Elo value, you have a mean and standard deviation. This does produce something of a range, but it also provides distribution information, rather than assuming a uniform distribution in the range. +0 / -0
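To illustrate how a mean and standard deviation yield the kind of "> X% chance" range asked about above, here is a minimal sketch. It assumes the player's strength is modelled as a normal distribution; the function name `rating_interval` is made up for this example, and the 1500 / 350 starting values are the defaults from the original Glicko description, not anything ZK or FAF is confirmed to use.

```python
from statistics import NormalDist


def rating_interval(mean, deviation, coverage=0.95):
    """Return (low, high) so that a normally distributed strength
    falls inside the range with probability `coverage`."""
    tail = (1.0 - coverage) / 2.0
    dist = NormalDist(mu=mean, sigma=deviation)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# A new player (large deviation) versus an established one (small deviation).
new_low, new_high = rating_interval(1500, 350)
old_low, old_high = rating_interval(1500, 50)
print(f"new player:         {new_low:.0f} .. {new_high:.0f}")
print(f"established player: {old_low:.0f} .. {old_high:.0f}")
```

The range is never literally (-inf, inf), but a large starting deviation plays the same role: the interval starts wide and narrows as games are played.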
Fealthas 9 years ago Yes. I believe Forged Alliance Forever uses this system. Your "elo" is the average of your two bound values. A new player has a "high variance", so the bound is pretty big. If two players have the same elo, no elo change occurs, as the w/l is supposed to be 50%. Each time a player wins or loses, the bound shrinks if the w/l ratio is around 50% at your "elo". If you win against someone who's at your lower bound, no elo gain happens either, but your bound shrinks. Elo change depends on your expected win chance: the farther the opponent is from your "elo", the more change there will be, unlike in our Elo, where change happens even if the game is supposed to be 50/50. I don't remember much else, but that's the gist of it. +0 / -0
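For comparison, the remark that plain Elo changes ratings even in a 50/50 game can be checked with the textbook Elo formulas. This is only a sketch of standard Elo: the K-factor of 32 and the 400-point scale are the conventional chess values, assumed here, and ZK's actual weighting-based calculation is not described in this thread.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo win expectation for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """New rating for A; score_a is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Equal ratings: the expectation is 50%, yet the winner still gains k / 2 points.
print(elo_update(1500, 1500, 1.0))  # 1516.0

# Beating a much weaker opponent is almost "expected", so the gain is small.
print(round(elo_update(1900, 1500, 1.0) - 1900, 2))
```

In plain Elo the step size is the same for every player. Glicko-style systems scale it by each player's deviation, which is the main difference being discussed here.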
DeinFreund 9 years ago That's kind of what I've been thinking about. Including such information in team balance should help reduce randomness in team games further (assuming the balancer takes in-team Elo deviation into account). "Trolls" (high-risk players) could then be paired with low-deviation players to make games more interesting and fun. +1 / -0
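The pairing idea above could be sketched roughly as follows. This is not how the ZK balancer works; it is a hypothetical brute-force split that assumes every player has a (mean, deviation) pair, an even player count, and an arbitrary cost that adds the gap in summed means to the square root of the gap in summed variances. The names `balance_teams` and `players` are invented for the example.

```python
from itertools import combinations


def balance_teams(players):
    """players maps name -> (mean, deviation); expects an even player count.

    Tries every equal-size split and returns the (team_a, team_b) pair with
    the smallest combined gap in total rating and total variance.
    """
    names = sorted(players)
    half = len(names) // 2
    best_split, best_cost = None, float("inf")
    for team_a in combinations(names, half):
        team_b = tuple(n for n in names if n not in team_a)
        mean_gap = abs(sum(players[n][0] for n in team_a)
                       - sum(players[n][0] for n in team_b))
        var_gap = abs(sum(players[n][1] ** 2 for n in team_a)
                      - sum(players[n][1] ** 2 for n in team_b))
        cost = mean_gap + var_gap ** 0.5  # arbitrary weighting, for illustration
        if cost < best_cost:
            best_split, best_cost = (team_a, team_b), cost
    return best_split


# Two high-deviation and two low-deviation players with equal means.
players = {
    "troll_1": (1500, 300), "troll_2": (1500, 300),
    "steady_1": (1500, 50), "steady_2": (1500, 50),
}
print(balance_teams(players))
```

With equal means, the only zero-cost splits put one high-deviation and one low-deviation player on each side. Brute force is fine for small lobbies; larger games would need a smarter search.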
natch 9 years ago Honestly, some of the best games are stacks of high-risk players vs stacks of low-risk players; the tide can turn so quickly. +0 / -0
DeinFreund 9 years ago natch We can check that once we have a way of measuring risk willingness. Should it really turn out to be more interesting, the balancer could be modified to favor those games instead. +1 / -0
Brackman 9 years ago The current system already does something similar by giving every player a "weighting". The current ZK calculation is a bit strange, though. Anyway, it is very good to have Elo change depend on something like weighting. But I wouldn't balance around the weightings themselves. Also, what is considered a risky playstyle is not necessarily the same as low weighting. An experienced troll who always uses the same risky strategies might have a more distinct weighting than someone whose skill varies because he doesn't play that often. What could really be improved is the way ZK calculates weightings. Maybe it isn't done like in TrueSkill because of Microsoft copyrights, while FAF has a license? +0 / -0
Shadowfury333 9 years ago (edited 9 years ago) Unless something has changed, though this is from somewhat cursory research, FAF doesn't use TrueSkill exactly (apparently that involves some kind of skill tiering that FAF doesn't do), but rather Glicko with the rating means normalized to Elo ratings, so µ=1200 is still provisional and µ=2000+ is still master level. +0 / -0
aeonios 9 years ago Switching from Elo to Glicko-2 would be doable. It wouldn't really make much difference, but Glicko-2 is at least more informative than Elo is. +0 / -0
