Possibly old news for many of you, but a new Valve red name has shown up on the Dota 2 Dev Forum answering questions about how matches are created. You should be able to find his entire post history here. For the most part it confirms things we’ve already suspected, but let’s go over the more noteworthy revelations.
>Match ID: 246510650
>All 10 players in this game were in the 93rd-94th percentile MMR range. The difference in Elo’s between the highest and lowest player was 50 Elo points. The “noob” with only 13 wins actually had the highest Elo of any player in the match. (This was a smurf.) They did play poorly in this match, but in the previous match (246456658), in which they played against several 4000 Elo players, they had 13 kills and 1 death.
First we have the basic confirmation that matchmaking uses Elo (sidestepping the semantic debate on whether it’s “real” Elo or just similar to Elo). Also of note here, the match in question is rated Very High in-game. This suggests that the player percentile rating necessary to be in Very High is larger than what I’ve found the match percentile to be. There are many possible explanations for this, such as relative activity levels in different percentiles, but no real way to test any of them. Regardless, it appears that players as low as the 93rd percentile will queue into games that are rated Very High. Finally, this answer also confirms that Valve does attempt to identify smurfs…
>Match ID: 246567215
>The 14-win player did have the highest MMR in the game. However, it probably wasn’t really high enough to push him into smurf territory. So this was not a great match. There was a player who had waited 5 minutes, which is why a match of this relatively low quality was accepted. Thanks for reporting this. It’s very helpful.
…and that this smurf detection is still undergoing tuning.
>Just to clarify, there was never any “win rate” calculation. Ever. It is true that a goal of matchmaking is to make even teams, so that the odds of the Radiant winning any given match is 50%. The matchmaker also will raise your Elo and try to put you in players of equivalent skill, which indirectly tries to get the win rate to 50%. However, it has never looked at your historical win rate and, for example, put you in a game where it knew that you were expected to lose, to end a winning streak, or given you a stomp to end a losing streak.
Just another confirmation that players tending towards 50% win rates is an indirect result of the matchmaking system’s two goals, (1) creating even matches (2) among players of a similar skill level, and never something that the system strives for directly.
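To see how a 50% win rate can fall out of rating-based matching without anyone targeting it, here is a minimal sketch using the textbook Elo formulas. Everything specific in it is my assumption, not something Valve has confirmed: the 400-point scale, the K-factor of 32, the `simulate` helper, and the simplification that each opponent's true skill equals the player's current rating.

```python
import random


def expected_score(rating, opponent, scale=400.0):
    """Textbook Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / scale))


def simulate(true_skill, start_rating, games, k=32.0, seed=0):
    """Hypothetical player with fixed true skill, always matched on rating.

    Assumption: the opponent's real skill equals the player's current
    rating. No historical win rate is ever consulted.
    """
    rng = random.Random(seed)
    rating = start_rating
    results = []
    for _ in range(games):
        opponent = rating  # the matchmaker only looks at the rating
        # The actual outcome depends on true skill, not on the rating.
        won = rng.random() < expected_score(true_skill, opponent)
        # Equal ratings mean the rated expectation is exactly 0.5.
        rating += k * ((1.0 if won else 0.0) - expected_score(rating, opponent))
        results.append(won)
    return rating, results


final_rating, results = simulate(true_skill=3000, start_rating=2000, games=2000)
late = results[1000:]
print(f"final rating: {final_rating:.0f}")
print(f"win rate over the last {len(late)} games: {sum(late) / len(late):.3f}")
```

The underrated player wins nearly everything at first, the rating climbs toward their real skill, and from then on the win rate hovers around one half. Nothing in the loop looks at past results.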
>The game looks pretty balanced to me. The Elos are all relatively close. Here are the Elo’s on the two teams:
>Now it’s a pretty big spread between 2409 and 3172, and it is a legitimate question to ask why in the world would we put people together with that big of a skill differential. The answer is that we didn’t. The Radiant had a 4-stack which covered that range (the highest and lowest Elos on the Radiant were in the same party). We matched them with two 2-stacks on the Dire. One of the two-stacks had the highest and lowest Elos for the Dire, and the other two stack was in 2 of the middle slots. You were a single who was also in the middle, and there was also a single on the Radiant, also in the middle of the Elo range.
>So, the average Elos of all the parties were pretty close. And, player-by-player, each team had somebody on the opposing team of roughly equal skill. Given that 4 stack with the big skill spread, I think it’s hard to come up with a better way to get them into a game.
Many of the Dota 2 matches with a wide spread in skill are created to accommodate pre-made groups that themselves have a wide skill spread. The only possible alternatives would be to either intentionally give these teams bad games or to disallow parties with this skill spread from queuing entirely. If you happen to prefer the latter, you can always enable the solo queue only option.
Also of note, in this game the Radiant team was 4+1 with an average rating of 2819.4. The Dire team was 2+2+1 with an average rating of 2807.4. So in this case, the larger premade group did not receive much of a handicap and still got stomped.
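For a rough sense of how small that gap is, the sketch below plugs the two team averages into the textbook Elo expected-score curve. Treating a team average as a single rating and using the 400-point scale are both my assumptions; the posts don't say how Valve converts a rating gap into win odds or how any party handicap is applied.

```python
def expected_score(rating, opponent, scale=400.0):
    """Textbook Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / scale))


# Team averages quoted above; everything else is an assumption.
radiant_avg = 2819.4
dire_avg = 2807.4

gap = radiant_avg - dire_avg
p_radiant = expected_score(radiant_avg, dire_avg)
print(f"gap = {gap:.1f}, Radiant expected win chance = {p_radiant:.3f}")
# prints: gap = 12.0, Radiant expected win chance = 0.517
```

Under those assumptions a 12-point edge is worth less than two percentage points, which is about as close to a coin flip as ratings alone can get.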
>Just one last comment on this. Elo is a TERRIBLE way to give players a sense of “progress.” Many (most?) people reach a plateau, and their Elo stabilizes. It is simply not mathematically possible for Elo to keep increasing in general for players indefinitely as more and more games are played.
>Given this reality, if players used Elo to measure “progress”, we would constantly be reminding them that they are NOT making any. That would be really bad.
There are people out there who won’t like this answer, but many of these complainers also think that plateauing is something that exclusively applies to other people.
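The "not mathematically possible" part is easy to illustrate with textbook Elo, where every point one player gains is a point the opponent loses. The sketch below is a toy closed pool under my own assumptions (1v1 games, a K-factor of 32, coin-flip outcomes, a hypothetical `play` helper), not a model of Valve's system.

```python
import random


def expected_score(rating, opponent, scale=400.0):
    """Textbook Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / scale))


def play(ratings, a, b, a_won, k=32.0):
    """Apply a zero-sum Elo update: whatever `a` gains, `b` loses."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta


rng = random.Random(1)
ratings = [3000.0] * 100  # hypothetical closed pool of 100 players

for _ in range(10_000):
    a, b = rng.sample(range(len(ratings)), 2)
    # Coin-flip outcomes for simplicity; the pool total is conserved
    # no matter who wins.
    play(ratings, a, b, a_won=rng.random() < 0.5)

mean_rating = sum(ratings) / len(ratings)
print(f"mean: {mean_rating:.2f}, min: {min(ratings):.0f}, max: {max(ratings):.0f}")
```

Individual ratings drift apart, but the pool average stays where it started, so one player can only climb if others fall. Not everyone can keep "progressing" at once.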
>“Noob” is a relative term. We don’t consider a person with 150 games to be a “noob”. We have some good data that by 75 wins (approx 150 games), Elo is pretty accurate, and so we rely on it almost exclusively at around that point. If you are getting matched with those players, it should be because your Elo is approximately the same. Parties can complicate things considerably. I might be able to provide further insight into why it thought the match would be a good one if you provide a MatchID.
Finally, we have this post that states that the matchmaking system relies on Elo almost exclusively for player evaluation once the account has around 150 games played. I find this one fun because 150 was the number I used way back in my matchmaking FAQ. I guess if you write enough you eventually nail something.
But anyway, this summary is biased towards the pieces I find most interesting, and that judgment is often at odds with the populace as a whole, so feel free to read his posts in their entirety and reach your own conclusions.
On a sad note, the other post I wanted to run with this week has to be delayed. My script is choking on the last bit of essential data, and I have to suspect that the load of the Steam sale might be the culprit. We’ll have to see if things are going a bit more smoothly early next week.