Updated the Table of Contents page and added a new Sample Methodology page so that new visitors to the site could easily find a description of what I do without it having to be tacked on to each article.
There will probably be no new testing update next week. I’ve decided I want to expand my general sample for the upcoming tests, and my target is March 20th. Maybe next week will be devoted to some theory ramblings in preparation for some of the future tests.
A lot of questions have come up over this chart.
Yes, that High 15-20 point is total BS. I took for granted that this would be obvious. I probably shouldn’t have.
Personally, I don’t like looking at the data this way, but I know most people are more visual than I am, so I thought including it would be a good way of showing the general trend. The problem you run into when displaying data like this is that you’re cutting the sample into thinner slices, and a smaller sample in each slice can dramatically increase the error on each point.
In the Radiant vs Dire table, we can see that the sample size for High 15-30 is 1763. This isn’t huge, but it’s workable. More importantly, the trend that High 15-30 displayed was repeated in Normal 15-30, Very High 15-30, and 15-30 in all brackets in the 6.74 sample. Given all this, we can be reasonably confident in a ~56% Radiant win rate for that time window.
But if we’re slicing down to 15-20, we’re looking at a third of the time window, so about 600 games (1763 / 3 ≈ 588) is the most we could expect, and 600 isn’t very good. What’s worse is that 600 would be an extremely generous estimate. Why? Let’s visit the wayback machine for that.
Match duration appears to follow a normal distribution, and 15-20 minutes is on the extreme edge of that distribution. Without even looking at the sample numbers, we can expect samples around 40 minutes to be the largest, and therefore the most reliable. Samples under 20 minutes or over 60 minutes will be tiny and therefore prone to extreme error.
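To put a rough number on why a third is too generous, here is a minimal sketch. It assumes match duration really is normal with a mean of 40 minutes (taken from the text above) and a standard deviation of 10 minutes. That standard deviation is purely a guess for illustration, not something measured from my sample, and the helper names are made up for this example.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def share_between(lo, hi, mean, sd):
    """Fraction of matches expected to end between lo and hi minutes."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)

MEAN_MINUTES = 40.0  # from the text: samples around 40 minutes are the largest
SD_MINUTES = 10.0    # ASSUMPTION: illustrative guess, not a measured value

wide = share_between(15, 30, MEAN_MINUTES, SD_MINUTES)
narrow = share_between(15, 20, MEAN_MINUTES, SD_MINUTES)
fraction_of_wide = narrow / wide

print(f"15-20 holds {fraction_of_wide:.0%} of the 15-30 games under these assumptions")
```

Under these assumed parameters the 15-20 slice holds roughly a tenth of the 15-30 games, not a third. The exact figure moves with whatever spread you plug in, but the direction doesn't: the slice nearest the tail is always the thinnest.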
And as it turns out, I did actually look at the sample sizes for the chart yesterday, and the entire 15-20 bracket was composed of only 200 games at every skill level. In the interest of clarity I probably should have just started the chart at 20 minutes, but oh well, mea culpa.
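For a sense of how much the error grows as the slices get thinner, here is a rough sketch using the standard normal-approximation margin of error for a proportion at 95% confidence. The 56% win rate and the sample sizes of 1763, 600, and 200 come from the discussion above. The function name and the choice of a 95% interval are my own assumptions for the example, and it takes the 200 figure at face value as the sample behind a single point.

```python
import math

def win_rate_margin(p, n, z=1.96):
    """Approximate margin of error for a win rate p observed over n games.

    Uses the normal approximation to the binomial; z=1.96 gives ~95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

samples = [
    ("High 15-30", 1763),
    ("15-20, generous guess", 600),
    ("15-20, actual", 200),
]

for label, n in samples:
    moe = win_rate_margin(0.56, n)
    print(f"{label}: n={n}, 56% +/- {moe * 100:.1f} points")
```

That works out to roughly +/- 2.3 points at 1763 games, +/- 4.0 at 600, and +/- 6.9 at 200. At 200 games, a true 56% win rate could plausibly show up as anything from about 49% to 63%, which is why a single point on that end of the chart shouldn't be read as meaningful.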
I do have one more thing to add about Radiant vs Dire. A lot of people have responded with their theory on how to explain it. That’s great. I encourage this. But just keep in mind that your theory, no matter how great it sounds, is still unproven. “Explanations exist; they have existed for all time; there is always a well-known solution to every human problem — neat, plausible, and wrong.”
On a whim I watched Gabe Newell’s acceptance speech at Games Fellowship 2013. Everyone else is ooohing and aaaahing at 3.5 terabytes per second or whatever. Meanwhile, my reaction was an angry, “No! It was T.S. Eliot!” Props anyway for mentioning Super Mario 64 as one of his three favorite games. I feel it doesn’t get enough love for being the first of Nintendo’s only 3 good 3-D conversions.
Finally, DotaMetrics hit 100k views earlier this week, which I guess is some kind of a milestone. Thanks for all the support, and as a reward, have some sexy, wrestling Invoker.