An Extremely Unofficial Dota Matchmaking FAQ part 2

Gonna jump right back in where we left off yesterday.

5. So what’s all this about smurf detection?

The short of it is that there are many reports of new accounts getting sent into High and even Very High games after an extremely short time of being active.  Under 5 games in a lot of cases.

Let’s first examine why something like this isn’t usually possible.

Going back to the LoL example, let’s say we have a new ranked player who starts at 1200 but belongs at 1800, which is pretty close to the platinum ranking.  How many games would it take to reach 1800 if we assume +20 per win and -20 per loss?

The answer of course is that it depends on the win rate, but if matchmaking is working properly the win rate shouldn’t be a constant.  For the sake of simplicity we’ll claim that the win rate of the player by the time they hit 1800 is 60%, which implies their win rate at 1200 might be something closer to 65% or possibly even higher.

So at a 60% win rate the player wins 6 and loses 4 of every 10 games, a net gain of 40 rating, for an average of 4 rating a game.  Starting from 1200, they need to gain 600 rating, so they would have to play 150 games before they hit their proper rating.
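The arithmetic above fits in a few lines of Python.  This is only a sketch under the same simplifying assumptions (a constant 60% win rate, a flat 20 points per game, a 1200 start), and the function name is made up for this example.

```python
def games_to_reach(start, target, win_pct, points_per_game=20):
    """Expected games to climb from `start` to `target`.

    Assumes a constant win percentage and a flat +/- points_per_game,
    which is a simplification: real win rates fall as rating rises.
    """
    # Net rating gained over 100 games: wins add points, losses subtract them.
    net_per_100 = points_per_game * (2 * win_pct - 100)
    if net_per_100 <= 0:
        raise ValueError("the player never climbs at this win rate")
    needed = target - start
    # Integer ceiling division, so a partial game counts as a full one.
    return -(-needed * 100 // net_per_100)


games = games_to_reach(1200, 1800, win_pct=60)
hours = games * 0.5  # assuming half an hour per game
print(games, hours)
```

Raising the win rate to 65% the whole way still leaves the player with 100 games to play, which is why win/loss results alone are slow.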

150 games is a long time.  At a generous estimate of a half hour a game that’s 75 hours of playtime before you hit your proper rating.  This isn’t that far-fetched.  Microsoft’s description of their TrueSkill system claims that in a 4v4 game it can detect each player’s rating in 46 games.  But it goes on to say that “The actual number of games per gamer can be up to three times higher depending on several factors such as the variation of the performance per game, the availability of well-matched opponents, the chance of a draw, etc.”  Dota can’t have draws, but Dota and its genre companions have huge issues with the other factors.

Variation of performance per game is a no-brainer.  There are nearly 100 heroes.  They have ideal line-ups, and less ideal line-ups.  They have vastly different playstyles, and your average player is going to not be equally skilled with all of them.  From the perspective of consistency the game is a nightmare, and while this makes for a better game it wreaks hell on effective matchmaking.

But think about the “availability of well-matched opponents.”  If it ends up taking nearly a hundred hours for the game to figure out your skill level, then that’s also true for everyone else in every game you get matched with.  If the average playtime of the 10 players in the game is under 25 hours, the matchmaking is essentially a weak educated guess.  This undermines the effectiveness of the results, which makes the matchmaking take even longer to detect player ratings, which undermines the effectiveness of even more future games, and so on, and so on.

So wouldn’t it be ideal if we had a way to estimate a player’s skill based on their individual performance without having to even consider the match results at all?

6. But you can’t do that!  Stats without context are meaningless!

Not technically a question, but it’s a legitimate complaint.  Dota is an extremely complicated game.  We have detailed stats, but the value of any particular stat is extremely situational because of all the feedback loops built into the game.  Judging players exclusively by CS/min or K+A/D is dumb, and I do not in any way believe this is what Valve is doing.

So look at it a different way.  Suppose we have detailed stats of literally millions of games, and we now have a good idea of the relative skill level of some of our players.  Let’s make some measurement, say creep score per minute by 5 minutes on a hero by hero basis.  If we know that 99.5% of the time, any player achieving 25 CS by 5:00 with Lina happened to have an MMR of at least 1500, and in your first 3 games at 1200 you do this all three times, we could make a pretty good argument that you deserve to be at least at 1500.  You might be a 1500 player.  You might be a 2000 player.  We’re not trying to make an exact prediction of your final rating.  We’re just trying to make a prediction of your minimal expected rating.

Is CS/min by 5 minutes an effective metric?  I have no idea.  It’s just an example of how the system might work.  The actual metric or metrics could be literally anything, and would have to be backed by a ton of data, but hypothetically something like this could work and could be a potential explanation for how Valve is rapidly accelerating the MMR of certain players.
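Here is a rough Python sketch of that kind of "minimum expected rating" check.  Everything in it is hypothetical: the benchmark table, the three-game requirement, and the function name are illustration only, not anything Valve has described.

```python
# Hypothetical benchmark table: (hero, metric) -> (threshold, floor_rating).
# Each entry reads: "in historical data, nearly every player who reached
# `threshold` on this metric had an MMR of at least `floor_rating`".
BENCHMARKS = {
    ("Lina", "cs_at_5min"): (25, 1500),
}


def minimum_expected_rating(current_rating, games, required_hits=3):
    """Return a conservative rating floor for an account.

    `games` is a list of dicts such as {"hero": "Lina", "cs_at_5min": 27}.
    A benchmark only raises the floor once it has been met in at least
    `required_hits` games, which keeps false positives rare.
    """
    hits = {}
    for game in games:
        for key, (threshold, _floor) in BENCHMARKS.items():
            hero, metric = key
            if game.get("hero") == hero and game.get(metric, 0) >= threshold:
                hits[key] = hits.get(key, 0) + 1
    floors = [BENCHMARKS[key][1] for key, count in hits.items()
              if count >= required_hits]
    # Never lower a rating; only promote when a floor exceeds it.
    return max([current_rating] + floors)


first_games = [
    {"hero": "Lina", "cs_at_5min": 25},
    {"hero": "Lina", "cs_at_5min": 30},
    {"hero": "Lina", "cs_at_5min": 26},
]
print(minimum_expected_rating(1200, first_games))
```

Note that the function only ever returns a floor.  It says nothing about where above 1500 the player actually belongs, which is left to ordinary win/loss results.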

Will the system miss players?  Sure, but the rate of false negatives isn’t that big of a deal.  Those players will move through the lower ranks more quickly by virtue of having fewer underrated players to randomly get matched against.  False positives are, in my opinion, more problematic so I’d expect the system to be very conservative.

Is it fair?  Who cares.  MMR is a system to make good matches.  It’s not a reward scheme, and it doesn’t tell you your value as a player.  By keeping it hidden Valve can do whatever it wants to the math in order to create good matches.  They can also make hidden adjustments to the system without having players complaining why they’ve suddenly lost 100 rating in the last week.  Hidden MMR is for the best, and I hope it continues indefinitely, or at least until team matchmaking is in play.

Whatever the mechanism, a smurf promoter is a good thing for the genre, provided it can be calibrated properly.  Now that that’s out of the way, let’s move on to some other subjects.

7. Why does Valve keep me at 50% by matching me with worse players as I win?

If you’re in High matchmaking or below, this isn’t what’s happening.  I mean what, do you think Valve has a list of players who they’re secretly conspiring to keep out of Very High by handicapping them with 4 of the faceless mass of terrible players they keep on a server somewhere?

Here’s what happens.  You enter the game at the starting rating.  You win doing whatever because the quality of play at the starting rating is terrible.  Eventually your rating rises and so do the ratings of the players on both teams.  Your current level of play is no longer high enough to effortlessly carry, so you start looking for reasons why you’re suddenly no longer winning (because it couldn’t be that you’re just not as good as you think you are).  Maybe you find your teammates making mistakes, which happens a lot because the people you’re playing with still aren’t very good (but neither are the opponents), or maybe you just do a Dotabuff search and pick some meaningless metrics to explain why your team never really had a chance.

But the real problem is that you’re simply not good enough to affect the outcomes of games at the MMR you’ve reached.  Every game feels like part of a random walk, and it probably is.  But it’s a random walk because you are missing opportunities to affect the outcome of the game.

“But my teammates always feed!”  Maybe they feed because you’re a non-issue and they’re playing 4v5 for the first 30 minutes.  You see this a lot with players who crush the lower brackets with something like jungle Naix because literally no one can farm in the lower brackets.  But against decent opponents jungle farm isn’t impressive, and staying in the jungle puts your team at a disadvantage that decent opponents will punish.  Maybe it looks like your teammates were feeding, but in reality your opponents have just realized that the most consistent way to win in a pub is to force the other team to feed, and until you adapt to that you’re not going to climb any higher.

Now, if you’re in Very High, you might have a case.  If you’re at the top of the distribution, you’re there with little company.  If matchmaking can’t find appropriate opponents it either has to skyrocket your queue times or reach downward.  This should improve as more invites go out and the playerbase expands.  Valve has to deal with a tradeoff between tightly matched player ratings and queue times.  In the middle of the distribution this isn’t that big of a deal, but at the top end it’s a huge complication that will be alleviated as more players enter the system.
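One common way to handle that tradeoff is to widen the acceptable rating gap the longer a player waits.  The Python sketch below is purely illustrative: the function names and every number in it (starting gap, growth rate, cap) are assumptions, not Valve’s actual values or method.

```python
def allowed_rating_gap(seconds_in_queue, base_gap=50,
                       growth_per_minute=25, max_gap=400):
    """Rating difference the matchmaker will tolerate after a given wait.

    Starts tight and loosens by `growth_per_minute` for each full minute
    spent in queue, up to `max_gap`.  All numbers are made up.
    """
    widened = base_gap + growth_per_minute * (seconds_in_queue // 60)
    return min(widened, max_gap)


def eligible_opponents(player_rating, seconds_in_queue, pool):
    """Ratings from `pool` that fall inside the current search window."""
    gap = allowed_rating_gap(seconds_in_queue)
    return [r for r in pool if abs(r - player_rating) <= gap]


# A player near the top of the distribution with few peers in queue.
pool = [1200, 1850, 2000, 2300]
print(eligible_opponents(2350, 0, pool))
print(eligible_opponents(2350, 14 * 60, pool))
```

In the middle of the distribution the first window already contains plenty of players, so the widening never kicks in.  At the top it has to reach downward, and the only real cure is a bigger pool.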

8. How do High and Very High compare to the rating systems in other games like LoL and HoN?

Going to assume that 2.3% of the population is capable of solo queuing into Very High games, and about 16% of the population solo queue into High or Very High rated games.  This is my best guess based on the match distributions I have right now.

I found an unofficial measurement of the MMR population for HoN.  Based on their data I would estimate that High is roughly equivalent to a 1650 rating in HoN, and Very High is 1800.
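The conversion from "top X% of players" to a rating is just a percentile lookup on a ladder.  Here is a minimal Python sketch of that lookup.  The ladder in it is synthetic placeholder data, not the HoN numbers, and the function name is made up.

```python
def rating_at_top_fraction(ratings, top_fraction):
    """Lowest rating that is still inside the top `top_fraction` of players."""
    ranked = sorted(ratings, reverse=True)
    # Number of players inside the bracket, at least one.
    cutoff_count = max(1, round(len(ranked) * top_fraction))
    return ranked[cutoff_count - 1]


# Synthetic ladder: 1000 players rated 1000..1999, one player per rating.
ladder = list(range(1000, 2000))
high_cutoff = rating_at_top_fraction(ladder, 0.16)        # High or better
very_high_cutoff = rating_at_top_fraction(ladder, 0.023)  # Very High
print(high_cutoff, very_high_cutoff)
```

With a real ladder dump in place of the synthetic list, the same two calls would give the ratings that line up with my guessed 16% and 2.3% cutoffs.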

LoL is…messier.  The only LoL information seems specific to ranked matches, which would leave out the many players who do not play ranked matches (and likely trend to the lower end of the bracket).  I previously said that I thought High was equivalent to 1350, but sources have since been conflicting and I no longer have confidence in any of them.

11 Responses to An Extremely Unofficial Dota Matchmaking FAQ part 2

  1. xdv says:

    You don’t need to estimate HoN’s percentile ranking vs rating, it’s publicly available on the site, all registered players are ranked by their rating and %. I remember checking up on an old friend who was really bad at playing and he was 1250 rating which was 1.6% out of 100%.

    HoN used to give new players double MMR gain / loss for the placement games, and additionally rewarded large kill streaks with extra MMR.

    Valve might be using the velocity element from Microsoft’s Trueskill, you do mention it in the post but don’t go into much detail. I’ll just chime in with what I know from having played games with the Trueskill system – the key thing they have that’s different is the uncertainty variable, also called velocity. It rewards streaks: basically, if you’re on a streak, the system realises that it’s matching you incorrectly – thus increasing its uncertainty of your rating, and increasing your “velocity” – and then widens the MMR gain or loss in order to more quickly find you a fair match.

    So let’s say Dendi makes a new account, he goes on a crazy win streak. The first win may gain him 10 MMR. The 5th consecutive win may give him 30 MMR. The 10th consecutive win may give him 100 MMR. Basically the system gets increasingly desperate to find him a match he will lose. Same thing for a player who goes on a crazy loss streak. This velocity element moves a player much quicker towards his true MMR than the normal Elo system.

    On the other hand, players with say 300 wins and 300 losses will barely have their MMR changed even with a 5 game streak. However, a sustained streak will make the system reevaluate its estimation of the player’s true skill and increase the uncertainty.

    As a nice consequence of this, once a player reaches close to their true MMR, their uncertainty decreases as their win rate reaches 50%, and the MMR gains and losses get very small, so player MMR fluctuations on a day to day basis become smoothed out and don’t jump around as much in minor streaks, which used to annoy players – it’s not uncommon to have a streak of bad luck and lose 5 games in a row, but that’s clearly not your “true” skill going down and you don’t want to drop that player’s MMR by that much.

    • phantasmal says:

      It’s good to know that about the HoN percentiles. According to http://www.heroesofnewerth.com/player_ladder.php 97.7% is around 1765 and 84% is 1643. Of course the Dota2 side of things is still at best an educated guess.

      The velocity description is helpful, and yeah, it’s a feature I wouldn’t be surprised if Valve has included since it strikes at the heart of one of the most difficult challenges facing a team oriented game like Dota. Many of the complaints about matchmaking being streaky could easily have been an oversensitive velocity setting, since as you say Dota can be very streaky for reasons other than personal player performance. It’s easy to imagine a player getting stuck in a loop where velocity orbits them above and below their true rating range, leading to win and loss streaks until chance offsets the influence of the velocity variable again.

      I do believe that there’s something more going on here than just velocity though. By many of the reports the system is capable of detecting and adjusting new accounts into high and very high even before enough of a win streak has built up for velocity to come into play.

      Edit: Oh yeah, I have one more part coming up and I was wondering, does HoN have any restrictions that force solo-queuing or small group queuing in its ranked mode? Search has been unhelpful, and I’m not familiar enough with its recent changes to know myself.

  2. xdv says:

    I haven’t played HoN in many months either, what sort of restrictions were you thinking of?

    All they do is the same as DOTA2 or LOL, where if there’s a 2 stack on one team they’ll make sure there’s a 2 stack on the other (search time permitting). If you queue as a 5 stack, there’s a good chance you’ll get a 5 stack as well, or a 4 stack + 1 random, or a 3 stack + 2 stack.

    • phantasmal says:

      Well LoL, unless they’ve changed it in the past couple months, prevents you from queuing into ranked matches with a premade group larger than 2. I was wondering if HoN has anything comparable. And if it doesn’t I’m a little curious how they handle matchmaking larger pre-made groups at the top of the ranked MMR distribution, since by my estimate that’s the thing Dota’s matchmaking struggles with the most.

      • xdv says:

        Oh that… no there is no restriction in HoN. LoL can do that because if you have more than 2 players, they want you to join the team-queue instead. I suspect DOTA2 is moving along this direction as well.

        I’ve actually been of the opinion that solo queue be abolished instead. Guild Wars 1 took this approach and I thought it was quite successful. People actually used the lobby and friendlists… you could only queue as a full team, and it didn’t stop it from having a very large competitive following, and the pubs worked fine. Pub-leaders would advertise spots for specific positions to fit the strategy they were going to run, or players would advertise availability to play certain roles or heroes. It certainly worked a lot better than just pressing “Play” and getting a randomly bad team composition, and it didn’t take more than a few minutes to get into a game in any case.

      • phantasmal says:

        I expect Valve to take steps to promote group organization over solo queue, but I don’t think they’ll ever go so far as to abolish solo queue. Guild Wars can get away with it because there’s an underlying game that doesn’t require grouping, and PvP can be treated as a separate, emergent ecosystem. Dota only has PvP, so there needs to be a variant of PvP that’s available to someone without social connections in order for the game to not be immediately off-putting to new players.

  3. Fredrick says:

    I solo queue into the low end of High in DotA 2 almost every game. (I say low end because I still end up in a normal game from time to time.) My LoL ranked rating before I switched over varied from 1300-1500 any given week. So the 1350 estimate might not be too far off.

  4. NOPE says:

    Another thing with the smurf detection you didn’t touch on: It seems that it tries to match smurfs with smurfs. Going through some matches a friend played on an account that got put into very high in ~15 games, the average # of games played across the teams is much lower than what I see on my very high 500 games account.

    • xdv says:

      It’s not smurf detection per se: the developers have stated on their forums that the matchmaking system matches players on two factors – their MMR rating, and their # games played.

      This is to prevent the problem that HoN and LoL initially had – in a more or less zero sum MMR system, all players enter the game at the “average” MMR of the entire system, roughly at the 50th percentile, while their skill level would more likely be in the bottom 10%.

      This meant there was a huge variance in player skill if you were finding matches in the middle percentiles of skill – like between the 40th and 60th percentile – someone with a 1500 MMR (in HoN, this is the default starting MMR) could be a total newbie with 0 games played, or he could be a genuine 1500 MMR player who’s played 1000 games and the system just thinks he’s middle of the road.

      Eventually HoN also implemented this 2 factor matching, where it looks at your games played as a factor in matching too. This ends up segregating players effectively into 4 buckets instead of just 2.

      High Skill + High Games played = actual pro players
      High Skill + Low Games played = smurfs
      Low Skill + High Games played = actual bad players
      Low Skill + Low Games played = newbies (it’s important that all newbies play in the same pool together)

      • phantasmal says:

        I hadn’t seen the game details of enough smurf accounts to notice that trend. xdv might be on the right track in that the system tries to make placement games of a sort at multiple skill levels, but I still feel that there’s some additional smurf detection mechanism at work on top of this. The game just seems remarkably fast at figuring out that certain accounts are high skill, too fast for match results alone to be the determining factor.

        It’d be kinda interesting to see a collection of the links to a bunch of different smurf accounts along with the opening skill selection and the game that first sent them to high/very high. Another possible but time-intensive test would be to create two separate smurf accounts, one at home and one at a friend’s house who has never played Dota, and see if the system is equally quick at promoting both. Alternatively, varying the account creation by the source of the Dota2 invite. These might turn up nothing, but even that would help narrow the range of possibilities.

      • ganknamstyle says:

        Smurfs on winning streaks can go to High Skill faster based on their performance in each game. Your kills+assists relative to your team’s performance might be used as a supplement in increasing your velocity. A smurf who is involved in most team plays is likely better than one who only farms first before carrying his team to victory.
