AJ Repost #2

#0 - Jan. 22, 2009, 6:26 p.m.
Blizzard Post
Source: http://www.arenajunkies.com/showthread.php?t=59023

"Background:
------------------

A persistent player rating system was first hinted at back in TBC as something that might be introduced in WotLK. When I first read about it I was shocked because (as with anyone who played Wc3 competitively) I was aware of how easily the system can be abused, and how it kills competitive PvP.

I sincerely hoped that this was merely an inexperienced member of their team getting a little over-excited with a naïve idea for improving match-making. I sincerely hoped that their development team had the competence to stop this ever going live.

The primary motivation for a persistent rating system was to avoid pairing poorly skilled players against highly skilled players because bad players don't enjoy being stomped. We will later see that this is impossible to achieve, since a player can project any "skill" level he wishes so long as it is below his actual skill level.

Anyway, 3.0.8 came along and out-of-the-blue the new and untested system was dumped upon us. What made it even harder to swallow was that it came with major bugs leading to widespread chaos with skyrocketing ratings, prestigious achievements being awarded for mediocre play, and general inconsistency.

However this post is not concerned with these bugs, and in fact these bugs are currently clouding a far more serious issue. The issues I'm trying to highlight exist with any persistent rating system, and they would still exist even if Blizzard had smoothly implemented the system. Even if they had kept the elegant zero-sum scoring system for team ratings and used the persistent (or hidden) rating solely for match-making, the problem would still exist!




The fundamental flaw with a persistent rating system
-------------------------------------------------------------------------

Essentially a persistent rating attaches a sense of "state" to a player that persists between arena teams. This "state" affects how the player is to be paired, and therefore affords the player a means by which to manipulate how he is paired. A player will therefore aim to be in the optimal "state" before beginning a competitive team.

Laughably, Blizzard have hidden this persistent rating (and gone so far as to call it a "hidden" rating) in the naïve hope that players will not attempt to manipulate that which they cannot see.

A simple way to abuse this system would be to deliberately play badly for many games: It need not require losing every game, even winning 50-50 at a low rating will be enough to establish a low persistent rating. When this low rating is established, the player can then simply begin a new team, start playing well, and reap the benefits of beating players who are now judged (by a flawed system) to be more skillful.
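
The incentive described above can be illustrated with a standard Elo update. This is an assumption made for illustration: the post does not state the formula the matchmaker uses, and the K-factor of 32 and all ratings below are hypothetical. The sketch compares what a win is worth when a player's recorded rating matches the opponent's with what it is worth once the recorded rating has been deflated.

```python
def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_gain_on_win(rating_a: float, rating_b: float, k: float = 32.0) -> float:
    """Points A gains for beating B. k is a hypothetical K-factor."""
    return k * (1.0 - elo_expected(rating_a, rating_b))


# Recorded rating equal to the opponent's: a win is worth half of k.
honest = elo_gain_on_win(2000, 2000)

# Same opponent, but the player's recorded rating was deflated to 1500:
# the system treats the win as an upset and pays out more.
deflated = elo_gain_on_win(1500, 2000)

print(f"honest win: {honest:.1f} points, deflated win: {deflated:.1f} points")
```

Under these assumptions the deflated player is paid close to the full K-factor for a win that his real skill makes routine.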




Flawed regardless of the rate of evolution
---------------------------------------------------------------

A persistent rating attempts to judge a player's skill level by looking at his current team or personal rating. This means that the persistent rating must evolve according to actual ratings. For a persistent rating to be meaningful and persistent it must evolve slowly and remain stable. However this makes establishing a false persistent rating more potent, and so the system may be abused to a greater degree.

A fast evolving persistent rating undermines the whole desired purpose of a persistent rating system since a player can easily "reset" his persistent rating by playing poorly: Suppose a gladiator has been playing at 2300+, and wishes to boost a friend to 1800. The gladiator need only quit his team, begin a new team and play badly for however long it takes to decrease his persistent rating before helping his friend out.

Either way the system does not work - whether it evolves quickly or slowly or anywhere in between it will be susceptible to establishing a false persistent rating. Even if the rating were non-decreasing, you would see players deliberately maintaining low ratings (or even rerolling characters!) to establish a convincing low persistent rating so that they could capitalize towards the end of the season.

In fact it's impossible for a persistent rating system to succeed: if it's potent enough to pair low skilled players consistently and award them for occasionally beating more skillful players, then it will be potent enough to be heavily abused. If it is not potent enough to achieve this, then it serves no purpose other than to require a "reset time" for players wishing to help out lower-rated players.
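
The "reset time" trade-off can be made concrete with a toy model. Everything here is an assumption made for illustration: the persistent rating is modeled as moving a fixed fraction (`rate`) of the way toward the level the player is currently performing at after each game, and the ratings and rates are hypothetical. The sketch counts how many thrown games a 2300 player needs before his persistent rating drops to 1800.

```python
def games_to_reset(start: float, target: float, performance: float, rate: float) -> int:
    """Games needed for a rating to fall from `start` to `target` or below,
    when each game moves it a fraction `rate` toward `performance`."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    if performance >= target:
        raise ValueError("performance must be below target, or the loop never ends")
    rating, games = start, 0
    while rating > target:
        rating += rate * (performance - rating)
        games += 1
    return games


# A gladiator at 2300 deliberately plays at a 1500 level.
fast = games_to_reset(2300, 1800, 1500, rate=0.20)  # fast-evolving rating
slow = games_to_reset(2300, 1800, 1500, rate=0.02)  # slow-evolving rating

print(f"fast system: {fast} games, slow system: {slow} games")
```

In this model a fast-evolving rating is cheap to reset, while a slow-evolving one costs more games to deflate but then stays deflated for just as long once the player starts playing well again.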



------------------------------------------------------------
Stop this folly!
------------------------------------------------------------

I play this game for arenas, and the one thing they really had going for them was a solid zero-sum rating system that was stable, elegant, and truly reflectiv
#115 - Jan. 22, 2009, 10:20 p.m.
Blizzard Post
Although I can't go into great detail at the moment, I'd like to make a few comments.

- The team/personal ratings and the hidden ratings are related to each other. Your team/personal ratings essentially drift toward your hidden rating, so if you deliberately lower your hidden rating and remake a team, your team/personal ratings simply won't get up to high ratings until your hidden rating does. His exploit as proposed simply doesn't work because his assumptions about the relationship between the visible and hidden rating are incorrect.

- Yes, a group of good players could still deliberately stomp nubs by intentionally losing lots of games to lower their hidden rating. A rating system won't stop that, but it still solves most cases of high rated players beating up on lower rated players. First, many of the cases where high rated players were being matched against low rated players weren't because the high rated players were deliberately trying to stomp people; it was often accidental, as high rated players had normal/natural reasons to change teams on occasion (helping friends, trying out new players, new comps, new strategies, etc). Second, it takes far more effort to deliberately deflate your rating down to low levels than it did to simply create a new team.

- Something that many of you have mentioned is that the rating adjustments are now not always "zero sum". However, a rating system doesn't need to be zero sum in order to prevent inflation. The new system uses a Bayesian prior distribution in order to prevent inflation, which in simpler terms means there's math that enforces that players won't have ratings outside of some range (when working without bugs, of course). For the population size we have for arenas, this means we should see ratings range from around 600 to 2400 (again, assuming no bugs).

- There were two bugs with the system that we're in the process of fixing before we reactivate arenas (should be soon) that were causing serious problems (and distorting everyone's perspective on the system). One was in the functionality that seeded everyone's hidden rating based on their pre-3.0.8 patch performance, causing players to "drift" toward wildly inaccurate ratings, the other being in the way that personal ratings were adjusted.
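
The first point above, that team/personal ratings drift toward the hidden rating, can be pictured with a toy model. This is not the actual formula, which is not given in the post: the per-win step of 15 points, the `pull` fraction of 0.1, and the starting ratings are all hypothetical values chosen for illustration.

```python
def simulate_wins(hidden: float, visible: float, wins: int,
                  hidden_step: float = 15.0, pull: float = 0.1):
    """Toy model: each win raises the hidden rating by a fixed step, and the
    visible rating then moves a fraction `pull` of the way toward it."""
    for _ in range(wins):
        hidden += hidden_step
        visible += pull * (hidden - visible)
    return hidden, visible


# A player deflates his hidden rating to 1000, then starts a fresh team
# at a visible rating of 1500 and wins 20 games in a row.
final_hidden, final_visible = simulate_wins(hidden=1000.0, visible=1500.0, wins=20)

print(f"hidden: {final_hidden:.0f}, visible: {final_visible:.0f}")
```

In this model the new team's visible rating is dragged down toward the deflated hidden rating despite the win streak, and can only climb as fast as the hidden rating itself recovers.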
#132 - Jan. 22, 2009, 10:34 p.m.
Blizzard Post
Quote:
Do any of you think ratings really matter with the current state of PvP?



PvP definitely needs to become less bursty, and we will be addressing that. However, I still think the rating system matters.
#147 - Jan. 22, 2009, 10:40 p.m.
Blizzard Post
Quote:
For future reference Kalgan, doing something like this mid season is pretty terrible to a lot of your players.


I think you say that with the assumption that if it had been the start of the season, the bugs wouldn't have been there (not a good assumption in this case imo). The new system (when working) doesn't significantly change how hard it is to get the gear or who you have to beat to get the title/mount/etc (that's still entirely relative). It also tends to be mid/late in the season when the problems with the old system were the most exposed (with highly rated players starting new teams, helping friends, etc).
#171 - Jan. 22, 2009, 10:50 p.m.
Blizzard Post
Quote:


Why not put this on the PTR... Your internal tests are obviously horribly wrong.



It was on the PTR. I blame you for not catching the bugs. jk ;]

As you may have noticed though, players don't participate in the arena ladder on one PTR server the way they do on a live battlegroup.
#187 - Jan. 22, 2009, 10:56 p.m.
Blizzard Post
Quote:
Hey kalgan is there gonna be a roll back?!


I'm avoiding this question until we're ready to re-activate arenas (in case something we want to do turns out to not be do-able).
#294 - Jan. 23, 2009, 12:08 a.m.
Blizzard Post
Quote:


Based on what we saw yesterday and the day before, the 2300 rated team that lost will lose very few points due to the Spooky Ghost Rating, henceforth known as SGR.

:)



Correct. And yes, re-acquiring your personal rating for that team is under consideration (although some of the considerations are technical, not purely design).
#340 - Jan. 23, 2009, 1:14 a.m.
Blizzard Post
Quote:
Tom, there's several teams that I personally know of and verified to have 3000 personal rating as well as team ratings above 2500 when we previously had the #1 rated 2v2 in the world at 2387 for comparison's sake. Do you not see how much of a joke the ladder would become if things are not completely rolled back (achievements, rating, gear earned, etc)?


I don't expect them to keep their rating.
#341 - Jan. 23, 2009, 1:16 a.m.
Blizzard Post
Quote:
No one's gonna adopt "SGR" ;/


The SGR says it likes that name, not sure whether we do anything to stop it at this point.
#372 - Jan. 23, 2009, 1:56 a.m.
Blizzard Post
Quote:


Please just explain it Kalgan. From what I understand, it's exactly the same as Personal Rating... except it doesn't reset when you leave a team and spans across the entirety of arena. Used for queueing and win/loss points only, leaving the visible personal rating required to buy gear so you still just can't team-hop to buy whatever you want.

Correct?


It is not like personal rating. The hidden rating uses a different approach to determining player ratings. It's more complex, but it tends to figure out a player's skill level far more quickly/accurately than an Elo system does.

However, changing players' actual team/personal ratings extremely rapidly isn't exactly what we want. So, we're using an approach where we use the hidden rating (internally we refer to it as the GDF rating since it uses Gaussian density filtering) for matchmaking, since it creates better matchups than Elo, and we use it as an anchor to figure out where the team/personal ratings should be moving toward.
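
For readers unfamiliar with the term, here is a minimal sketch of what a rating update based on Gaussian density filtering can look like. It is not the arena formula, which is not given in this thread. It borrows the two-player, no-draw update from the published TrueSkill model, a Bayesian rating system of the same family, and the `beta` value and starting ratings are hypothetical. Each player is a Gaussian belief (mean skill, uncertainty), and each result shifts the means and shrinks the uncertainties.

```python
import math


def _pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_update(winner, loser, beta: float = 150.0):
    """TrueSkill-style update for one game without draws.

    winner, loser: (mean, std_dev) skill beliefs before the game.
    beta: hypothetical per-game performance noise.
    Returns the updated (mean, std_dev) pairs for winner and loser.
    """
    (mu_w, sigma_w), (mu_l, sigma_l) = winner, loser
    c = math.sqrt(2.0 * beta ** 2 + sigma_w ** 2 + sigma_l ** 2)
    t = (mu_w - mu_l) / c
    v = _pdf(t) / _cdf(t)   # how surprising the result was
    w = v * (v + t)         # how much uncertainty the result removes
    new_winner = (mu_w + sigma_w ** 2 / c * v,
                  sigma_w * math.sqrt(1.0 - sigma_w ** 2 / c ** 2 * w))
    new_loser = (mu_l - sigma_l ** 2 / c * v,
                 sigma_l * math.sqrt(1.0 - sigma_l ** 2 / c ** 2 * w))
    return new_winner, new_loser


# Two unknown players with identical, very uncertain starting beliefs.
new_w, new_l = gaussian_update((1500.0, 300.0), (1500.0, 300.0))

# An upset: the lower-rated player wins, so the means move much further.
upset_w, upset_l = gaussian_update((1500.0, 300.0), (2000.0, 300.0))
```

Because the size of each step scales with the player's uncertainty, a system of this kind moves a new or mis-rated player quickly and an established player slowly, which fits the description above of finding a player's level faster than Elo.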
#378 - Jan. 23, 2009, 2:02 a.m.
Blizzard Post
Quote:


Kalgan, straight-forward question:

Were you implicating that ratings, etc will get rolled back or implying that teams with exploited/cheesed ratings will likely not be able to retain them due to not being skilled enough?

Oh, and GDF it is! Bad luck, Affix.


Yes, I'm implying that we hope to roll team/personal ratings back to what they were right before 3.0.8 went live.