A few weeks ago there was a post on reddit advertising this site: http://siyobik.info.gf/misc/dota2/
Basically they grabbed a large batch of matches off Dotabuff and computed the win and usage rates of various team compositions. They split the compositions in several ways, including Ranged vs Melee, Strength vs Agility vs Intelligence, and the in-game hero descriptions.
Because they were grabbing matches off Dotabuff, they couldn’t separate the matches by skill level. My sample, on the other hand, is divided by skill level, so I decided to recreate their Carry data (the category I found to be most interesting and worthwhile) by skill level to see what would happen. Instead of learning anything interesting about how the importance of carries varies by skill level, I ended up questioning the validity of their results.
First, their results:
| Carries | Win Ratio | Popularity |
|---|---|---|
| 0 | 64.1% | 0.4% |
| 1 | 57.5% | 7.5% |
| 2 | 54.1% | 30.6% |
| 3 | 49.2% | 38.3% |
| 4 | 44.2% | 19.4% |
| 5 | 38.2% | 3.8% |
We all joke about AP pubs picking too many carries, but these results were still astonishing. Carries appeared to be complete poison to a team’s win chances. And keep in mind that they claim to be using the in-game definition of carry, which includes nearly half of the currently released heroes. Check it out for yourself.
So I ran (what I believe to be) the same test on my sample. Here are my results:
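Normal

| Carries | Win Ratio | Popularity |
|---|---|---|
| 0 | 48.84% | 0.66% |
| 1 | 48.49% | 9.37% |
| 2 | 51.25% | 32.37% |
| 3 | 50.08% | 37.05% |
| 4 | 49.08% | 17.58% |
| 5 | 45.85% | 2.97% |

High

| Carries | Win Ratio | Popularity |
|---|---|---|
| 0 | 54.63% | 0.55% |
| 1 | 51.41% | 13.05% |
| 2 | 50.68% | 39.53% |
| 3 | 49.92% | 35.03% |
| 4 | 46.51% | 10.93% |
| 5 | 42.54% | 0.91% |

Very High

| Carries | Win Ratio | Popularity |
|---|---|---|
| 0 | 46.24% | 0.88% |
| 1 | 49.42% | 15.27% |
| 2 | 50.25% | 40.46% |
| 3 | 50.47% | 33.82% |
| 4 | 48.64% | 9.01% |
| 5 | 46.79% | 0.56% |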
Comparing their total sample with my Normal sample, the popularity percentages are reasonably close, with the differences plausibly down to sampling error. The win rates, on the other hand, bear no resemblance whatsoever. Their win rates essentially claim “fewer carries = better”. Mine are mostly hump-shaped rather than a steady decline: the ideal number of carries (as defined in-game) is around 2, with either 1 or 3 being fairly acceptable.
So I decided to try again. This time I used the number of melee heroes on a team. Their results first:
| Melee | Win Ratio | Popularity |
|---|---|---|
| 0 | 55.3% | 1.6% |
| 1 | 51.8% | 14.1% |
| 2 | 50.5% | 36.5% |
| 3 | 50.0% | 34.7% |
| 4 | 46.3% | 11.8% |
| 5 | 42.6% | 1.3% |
Again, a fairly simple story: “fewer melee = better.” And here are mine:
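Normal

| Melee | Win Ratio | Popularity |
|---|---|---|
| 0 | 42.53% | 2.03% |
| 1 | 44.80% | 15.43% |
| 2 | 49.52% | 37.75% |
| 3 | 52.19% | 33.89% |
| 4 | 53.94% | 10.05% |
| 5 | 49.39% | 0.84% |

High

| Melee | Win Ratio | Popularity |
|---|---|---|
| 0 | 42.60% | 3.07% |
| 1 | 48.18% | 20.79% |
| 2 | 49.46% | 41.80% |
| 3 | 52.29% | 27.60% |
| 4 | 52.69% | 6.38% |
| 5 | 58.57% | 0.35% |

Very High

| Melee | Win Ratio | Popularity |
|---|---|---|
| 0 | 44.83% | 4.14% |
| 1 | 48.94% | 26.32% |
| 2 | 50.55% | 42.97% |
| 3 | 50.57% | 22.55% |
| 4 | 53.47% | 3.82% |
| 5 | 48.72% | 0.20% |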
Again the popularity results are similar, but this time the win ratio results are even more at odds with each other. In fact, if you look at their win ratios and my High win ratios the trends are almost perfectly inverted.
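One rough way to put a number on "almost perfectly inverted" is the Pearson correlation between their melee win ratios and the High-bracket ones. This is only a sketch: it weights all six team sizes equally and ignores how rare the 0 and 5 melee teams are, and the helper name `pearson` is made up for illustration.

```python
from math import sqrt

# Win ratios (%) for 0 through 5 melee heroes, copied from the tables above.
their_win = [55.3, 51.8, 50.5, 50.0, 46.3, 42.6]
my_high_win = [42.60, 48.18, 49.46, 52.29, 52.69, 58.57]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(their_win, my_high_win)
print(f"correlation: {r:.2f}")
```

A value close to -1 means the two trends move in opposite directions nearly in lockstep, which is what these two columns show.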
But unlike in the carries test, my claim here is more startling. My stats tell me that a 3 or 4 melee team is at no significant disadvantage, and this is squarely at odds with conventional wisdom. Can I really be confident that I didn’t just screw up somewhere?
So I did some double checking. I worked out the overall win rate of melee heroes implied by this test. The method was pretty simple. If I had 30 games in the sample with 2-melee teams, and they went 10-20, then I know the melee characters collectively had a record of 20-40, because 2 melee heroes won in each win, and 2 lost in each loss. Repeat this for 0, 1, 3, 4, and 5 and you have the total win rate, which in the Normal bracket ended up being just over 51%.
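The bookkeeping above can be sketched in a few lines. The raw game counts aren't published here, so this version assumes the popularity percentages from the Normal melee table can stand in for them (they are proportional to the counts, which is all a ratio needs). The function names are hypothetical.

```python
# Normal-bracket melee table: melee count -> (team win ratio %, popularity %).
normal_melee = {
    0: (42.53, 2.03),
    1: (44.80, 15.43),
    2: (49.52, 37.75),
    3: (52.19, 33.89),
    4: (53.94, 10.05),
    5: (49.39, 0.84),
}

def hero_record(melee_count, team_wins, team_losses):
    """Every team win (or loss) counts once for each melee hero on that team."""
    return melee_count * team_wins, melee_count * team_losses

def melee_hero_win_rate(table):
    """Win rate of melee heroes, weighting each row by melee count * popularity."""
    hero_wins = 0.0
    hero_games = 0.0
    for melee_count, (win_pct, popularity_pct) in table.items():
        weight = melee_count * popularity_pct  # assumption: popularity ~ game count
        hero_games += weight
        hero_wins += weight * win_pct / 100
    return hero_wins / hero_games

print(hero_record(2, 10, 20))  # (20, 40)
rate = melee_hero_win_rate(normal_melee)
print(f"melee hero win rate: {rate:.1%}")
```

With the Normal numbers this lands just over 51%, matching the figure quoted above. The 0-melee row drops out on its own because its weight is zero.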
Then I went to the hero spreadsheet that I posted about previously. I filtered out the ranged heroes and found the melee-only win rate, and it was very close to 51% both with and without adjusting for usage rates. The numbers appear to check out.
And realistically, why shouldn’t they? Are melee-heavy line-ups really that bad? There are plenty of viable melee mids. Several melee junglers. Many of the best off-laners are melee, including Bounty Hunter, Dark Seer, Tidehunter, and the potential up-and-comer Magnus. All you need is a decent all-melee safe lane like Phantom Lancer + Ogre Magi and you have a perfectly viable all-melee team. And if you allow for a single ranged hero your options open up dramatically. The real problem with melee-heavy line-ups is when you do something like send Juggernaut and Anti-Mage to the off-lane against a proper safe lane setup and jungler, which is bad for a multitude of reasons beyond just being double melee.
So we have two tests in disagreement with each other. One possibility is that at least one of the tests is flawed. The other big possibility is that the disagreement is just a reflection of different samples. Mine are purely early 6.74. His samples could be from 6.75, or they could have come from a different hero release. If his sample came from the day of Meepo’s release, that could easily warp the results tremendously. If his sample came from 6.75c (which I’m almost certain it didn’t, but hypothetically) then it wouldn’t be that surprising if his melee vs ranged results differed from mine given the surge of Drow Ranger in popularity and performance.
At the moment I don’t have the information to do much more than speculate. But what I can say is that I wouldn’t put a lot of stock in his more exotic categories. For instance, the win rate of 5 Durable teams is 61.4%? That’s great, but they make up 0.2% of his sample, which comes out to about 80 games. If I counted correctly, there are 33 different Durable heroes currently in Dota 2. You can figure out the number of combinations if you want, but I can guarantee you that 80 games is not an acceptable sample. And don’t even get me started on the absurdity of adding up Carry/Disabler/Initiator points. Carry works as an evaluator because it estimates farm requirements (and farm availability is a fairly zero-sum game). Melee vs Ranged works as a category because it estimates lane control. Initiator as a category includes characters as absurdly varied as Faceless Void, Meepo, Tidehunter, Sand King, and Silencer. You might as well categorize teams by how many of their heroes have names that begin with letters in the first third of the alphabet.
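For anyone who does want the combination count, here is a quick sketch, along with a rough margin of error on that 61.4%. It assumes the count of 33 Durable heroes is right, treats the 80 games as independent, and uses the normal approximation to the binomial, so take the margin as a ballpark rather than anything exact.

```python
from math import comb, sqrt

durable_heroes = 33  # count from the post ("if I counted correctly")
games = 80           # roughly 0.2% of his sample
win_rate = 0.614     # reported win rate of 5-Durable teams

# Number of distinct 5-hero line-ups that can be drawn from the Durable pool.
lineups = comb(durable_heroes, 5)

# Standard error of a proportion, and an approximate 95% margin (1.96 * SE).
std_err = sqrt(win_rate * (1 - win_rate) / games)
margin_95 = 1.96 * std_err

print(f"possible 5-Durable line-ups: {lineups}")
print(f"95% margin of error: +/- {margin_95:.1%}")
```

That is 237,336 possible line-ups against about 80 games, and a margin of error of roughly ten percentage points either way, so a 61.4% win rate is not clearly distinguishable from something much closer to 50%.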