The first script I wrote for the API was more of a practice run really, but the results turned out to be rather interesting. The goal was to count the number of games played in each skill level (Normal, High, and Very High, according to the client’s recent game search) on a given day.
The way I accomplished this was by running a search in a specific skill bracket over a narrow range of time, say an hour, to get the total number of matches that took place in that skill bracket. I then shifted the search parameters back an hour until I had repeated the same search for all 24 hours. Add together all the results and you have the number of games that took place in a day at that skill level.
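The windowed search described above can be sketched roughly as follows. This is a minimal illustration, not the original script: `count_matches` is a hypothetical callable standing in for whatever API request returns the number of matches for a bracket and time range, and `daily_total` is a name made up for this example.

```python
from datetime import datetime, timedelta
from typing import Callable

# Hypothetical callable: returns how many matches a search finds for one
# skill bracket between start and end. It stands in for the real API call.
CountFn = Callable[[str, datetime, datetime], int]


def daily_total(count_matches: CountFn, skill: str, day_end: datetime,
                window: timedelta = timedelta(hours=1)) -> int:
    """Sum per-window match counts over the 24 hours ending at day_end."""
    day_start = day_end - timedelta(days=1)
    total = 0
    end = day_end
    while end > day_start:
        # Step the search window back in time, clamping at the start of the day.
        start = max(end - window, day_start)
        total += count_matches(skill, start, end)
        end = start
    return total
```

With an hour-long window this makes 24 searches per bracket; with a 2.5 minute window it makes 576.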
One potential complication was that the API can only return a max of 500 results per search, so I knew that any search returning 500 results was probably dropping matches. To avoid this I adjusted the range size by skill level. Very High worked fine with hour-long searches. Normal required a new search for every two and a half minutes.
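I picked the window sizes by hand, but the same cap check could be automated. The sketch below is an alternative technique, not what my script did: it treats any search that hits the 500 cap as truncated and splits the window in half until every piece comes in under the cap. `count_matches` is again a hypothetical stand-in for the API request, and `count_window` and `min_window` are names invented for this example.

```python
from datetime import datetime, timedelta
from typing import Callable

RESULT_CAP = 500  # maximum results a single search can return, per the text

# Hypothetical callable standing in for the real API request.
CountFn = Callable[[str, datetime, datetime], int]


def count_window(count_matches: CountFn, skill: str,
                 start: datetime, end: datetime,
                 min_window: timedelta = timedelta(seconds=1)) -> int:
    """Count matches in [start, end), splitting windows that hit the cap."""
    n = count_matches(skill, start, end)
    if n < RESULT_CAP:
        return n
    # Exactly RESULT_CAP results cannot be told apart from a truncated
    # search, so treat it as truncated and retry with smaller windows.
    if end - start <= min_window:
        raise RuntimeError("window still hits the result cap at minimum size")
    mid = start + (end - start) / 2
    return (count_window(count_matches, skill, start, mid, min_window)
            + count_window(count_matches, skill, mid, end, min_window))
```

The trade-off is extra requests in busy periods, in exchange for not having to guess a safe window size per bracket in advance.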
The other complication, which I discovered later, is that Normal serves as a sort of catch-all bracket. It includes bot matches and all matches with fewer than 10 players, which I assume are matches with early abandons that don’t count for matchmaking stats. My original stats did not take this into account, but I’ve since established rough estimates of the average proportion of real Normal matches to bot and abandon matches.
The original stats were:
- 103,799 – Normal
- 17,702 – High
- 3,652 – Very High
From my experience since then I estimate that 13% of Normal game returns are actually bot matches. Of the remaining 87%, around 2% are abandoned games. Adjusting the original results for this gives us:
- 88,499 – Normal (80.6%)
- 17,702 – High (16.1%)
- 3,652 – Very High (3.3%)
- 109,853 – Total
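For anyone who wants to check the arithmetic, the adjustment works out as below. The counts and the 13% and 2% estimates are the ones given above; the variable names are only for illustration.

```python
# Raw counts from the original run.
raw = {"Normal": 103_799, "High": 17_702, "Very High": 3_652}

BOT_SHARE = 0.13      # estimated share of Normal results that are bot matches
ABANDON_SHARE = 0.02  # estimated share of the remainder that are abandons

adjusted = dict(raw)
# Remove bot matches first, then abandoned games from what is left.
adjusted["Normal"] = round(raw["Normal"] * (1 - BOT_SHARE) * (1 - ABANDON_SHARE))

total = sum(adjusted.values())
shares = {skill: 100 * count / total for skill, count in adjusted.items()}

for skill, count in adjusted.items():
    print(f"{count:,} - {skill} ({shares[skill]:.1f}%)")
print(f"{total:,} - Total")
```

Only the Normal count changes, since the bot and abandon matches all land in that bracket.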
It’s important to keep in mind that this is a distribution of the games when what we really want is the distribution of the players. However, it’s likely that the player distribution is fairly close to the game distribution. There are a few effects that might shift the game distribution away from the player distribution (differences in game frequency, group queues, and players near the border getting pulled up or down for matchmaking), but it’s unlikely that these add up to a swing larger than a percentage point or two.
So based on this, here’s some wild speculation:
1. The current Dota2 skill brackets are probably 80/16.5/3.5, since those feel like the kind of numbers a human might pick. These would only be approximate population percentages. That is to say, ‘x’ was chosen as the minimum rating for Very High skill because 3.5% of players tend to be above that rating (assuming a period with no rating inflation).
2. The previous Dota2 skill brackets were likely either 33/33/33 or 30/40/30. This is why people who were formerly in the highest skill bracket because they were in the 70th through 80th percentiles can now be in the lowest.
3. If High is the top 20 percent of players, then solo queuing into High games is comparable to somewhere around a 1350 rating in League of Legends, if Riot’s description of their rating system is to be trusted. Unfortunately I’ve been unable to find a similar statistic for HoN.
And finally, a caveat about this test. I was unfortunately only able to run it once before the recent API bug made it impossible to check against other days. The day I chose for testing was the day after the release of Keeper of the Light, Nyx Assassin, and Visage, so the level of activity may have been unusually high. Hopefully once the International is over and the API changes come out I’ll be able to run it again to see if the results are consistent. In the future I’d also like to run the same test over multiple months to see if there’s any inflation or deflation in the number of High and Very High skill games.