[SBA] Necrolyte — Heartstopper Aura Outperforming Sadist

February 28, 2013

Might as well just tl;dr myself in the title.  You all know how this works by now (if you don’t, check out the beginning of the Silencer Skill Build Analysis), so let’s cut to the chase.

Necrolyte: Dota 2 Wiki page

Q – Death Pulse: Hero-Centered AoE Nuke and Heal

Scaling — Damage, Healing, Cooldown, Increased Mana Cost

W – Heartstopper Aura: Passive HP Removal to All Enemies in 1,000 Unit Radius

Scaling — Health Lost per Second

E – Sadist: Temporary HP and Mana Regen from Creep and Hero Kills

Scaling — Regen Rate

And the basic usage table

[Necrolyte]Builds

It’s no surprise that Death Pulse(Q) starts are as dominant as they are, but it does create problems for our Win Rate chart since most of it is now greyed out and unreliable.  Notice how E->W goes from 0% to 100% between High and Very High?  That’s because there are literally zero E->W builds in High and only one in Very High.  If we want to make sense of Necrolyte’s build priorities we’re going to have to look exclusively at Q Prime builds.  There’s just not enough of the rest to judge them reliably, but from what we have the outlook is not especially positive.

Also of note, Necrolyte is a pretty successful hero in public matchmaking, though this success wanes in Very High.  Neither fact is surprising.  He’s been significantly above average in win rates for as long as I can remember.  He’s a simple and reliable source of teamfight contribution between Death Pulse and Heartstopper Aura, but better players will find ways to punish him early on if you leave him 1v1.  This also makes it difficult for teams to lane him in tournament play.

But anyway, we know that Q Prime is the dominant build, so we need to look at the success of the sub-builds.  The chart suggests that despite being less common, Q->W outperforms Q->E, so let’s look at it in detail.

[Necrolyte]byPoint9

Since this chart is based off Q Prime builds at level 9, that leaves 4 points available to be split between W, E, and Stats assuming the builds take their ult (spoiler alert, the vast majority do).  What we see is that at every level Death Pulse -> Heartstopper outperforms Death Pulse -> Sadist.  This advantage gets significantly smaller in higher skill level games, likely because Sadist requires good last hitting skills to be of any worth early and Heartstopper is perfect for those terribly passive 2v2 lanes that lower level players run all the time.

Based off this, I’d suggest that both Heartstopper and Sadist are viable as secondaries.  If you’re in a low level game or a game where you won’t be getting much CS for whatever reason, you’re probably better off with Heartstopper.  Otherwise it’s up to your own personal judgment, but Death Pulse -> Sadist builds appear to perform better when they treat Heartstopper as a one point wonder rather than holding off until 10 to level it.  Keep in mind that the first point you put into Heartstopper is twice as effective as every additional point.

I’ve made this case before, but Heartstopper is the kind of skill that people undervalue because they don’t immediately see its effects.  Yeah, it’s a small amount at the beginning of the game when health pools haven’t filled out.  But .6% of 600 is 3.6 HP per second.  3 iron branches are worth 57 hp.  After just 16 seconds of standing within 1000 units of a 600 HP opponent you’ve nullified 3 iron branches worth of HP, and 50 HP alone is more than enough to reverse the outcome of an early game fight.  And don’t even start with the “It barely counters their natural regen” line.  That one’s wrong on so many levels it doesn’t even deserve a response.
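To make that arithmetic easy to re-run for other health pools, here is a minimal sketch.  The 0.6% rate, the 600 HP target, and the 57 HP for three branches come from the paragraph above; the function names are made up for illustration, and the sketch deliberately ignores regeneration and healing.

```python
import math

# Figures taken from the paragraph above; the names are illustrative only.
AURA_PCT_PER_SEC = 0.006   # one point of Heartstopper: 0.6% of max HP per second
TARGET_MAX_HP = 600        # example early-game health pool
HP_PER_BRANCH = 19         # 57 HP / 3 iron branches, per the text


def hp_lost_per_second(max_hp, pct=AURA_PCT_PER_SEC):
    """HP removed each second from a target standing inside the aura."""
    return max_hp * pct


def seconds_to_remove(hp, max_hp, pct=AURA_PCT_PER_SEC):
    """Whole seconds of aura exposure needed to remove at least `hp` HP."""
    return math.ceil(hp / hp_lost_per_second(max_hp, pct))


branch_hp = 3 * HP_PER_BRANCH
print(seconds_to_remove(branch_hp, TARGET_MAX_HP))  # 16
```

Because the aura is percentage based, the per-second loss grows with the target’s max HP while the HP value of the branches stays fixed.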

The real take home from all of this is that Heartstopper is underrated as a skill.  This doesn’t mean that you must go Pulse->Heartstopper or you’re doing it wrong, but it does show that Pulse->Heartstopper builds are surprisingly viable and that even for a Pulse->Sadist build you should consider grabbing that first point of Heartstopper sometime before 5.

This will probably be the last SBA for a week or two.  I’ve been working on some other stuff, and until the API gets changed I can only grab so many matches at a time.  Tune in next week for some (hopefully) interesting stats about overall hero usage rates plus a new roundabout study into the sizes of the Normal/High/Very High brackets.


[SBA] Huskar and Ancient Apparition -or- Maybe Burning Spear Isn’t So Terrible After All

February 24, 2013

When I changed methodologies on my skill build analysis a week or two ago, one of the driving reasons was that the old version didn’t work so well when it came to heroes with very similar early game builds but divergent late game builds.  Huskar and Ancient Apparition aren’t particularly hot in either pubs or tournament play at the moment, but they’re useful to me as examples of how this new analysis method is way more flexible.  In these cases, the big question for both of these heroes is whether their often neglected skills Burning Spears and Chilling Touch are even worth taking at all.  The answer, particularly in Huskar’s case, is a bit less than straightforward, but I’m getting ahead of myself.

Let’s start with Huskar (Dota 2 Wiki page):

Q – Inner Vitality: Scaling Single Target Heal

Scaling — Rate of Scaling

W – Burning Spear: Magical DoT Orb Effect

Scaling — Damage per Second

E – Berserker’s Blood: Stacking Damage and Attack Speed Passive per HP Missing

Scaling — Attack Speed and Damage per 7% HP

Now for the table:

[Huskar]Builds

There are two big stats I need to draw your attention to:

The first is that Huskar’s win rate drops pretty significantly the higher you go in the bracket (48.23% -> 45.78%).  This is hardly surprising, but it becomes important later on.

The second is that maxing Berserker’s Blood(E) first definitely appears to be the way to go.  This should be even less surprising, but just to really hammer home the point, I made a quick little chart comparing the win rate of Huskar in games where he maxes E by level 8 to games where he doesn’t max E.

[Huskar]4Eby8

So Huskars in the Normal bracket who do not max E see their winning percentage drop by 8.5 percentage points, which is a pretty huge dip.  It also points out that Huskar’s performance gap between Normal and Very High is ~5% when you look exclusively at builds that max Blood first.
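For anyone who wants to run the same kind of with/without comparison on their own sample, a rough sketch follows.  It is not the code behind the chart: the match layout (a "build" string with one pick per level and a "won" flag) and every function name are assumptions made for the example.

```python
def win_rate(matches):
    """Fraction of matches won, or None for an empty group."""
    return sum(m["won"] for m in matches) / len(matches) if matches else None


def compare_groups(matches, predicate):
    """Win rate with and without a build feature, plus the gap in percentage points."""
    with_feature = [m for m in matches if predicate(m)]
    without = [m for m in matches if not predicate(m)]
    a, b = win_rate(with_feature), win_rate(without)
    gap = None if a is None or b is None else (a - b) * 100
    return a, b, gap


def maxes_e_by_8(match):
    """True if the first 8 picks contain all 4 points of E."""
    return match["build"][:8].count("E") == 4


# Tiny made-up sample, only to show the call shape.
sample = [
    {"build": "EQEQEREW", "won": True},
    {"build": "EQEQEREW", "won": False},
    {"build": "QEQEQRQE", "won": False},
]
print(compare_groups(sample, maxes_e_by_8))
```

Reporting the gap in percentage points rather than as a relative percentage keeps it directly comparable to the win rates in the tables.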

With Huskar’s ideal primary easily determined, the real question about Huskar builds is how he should spend the rest of his points.  To examine this more closely, I created a modified version of the Win% by Skill Point table from the previous SBAs.  But instead of looking at Q/W/E by level 8, I restricted the search to builds that max Blood first and looked at Q/W/Stats by level 10.  Since these builds have all spent 4 points in E and the vast majority one point in Huskar’s Ult, this leaves 5 points available spread between our three options.  So how did things shake out?

[Huskar]byPoint10

At all levels, Blood(E) into Inner Vitality(Q) is the most popular.  Blood into Burning Spear(W) is the next most popular, but loses ground as you move up through the brackets.  Blood + Stats is always the rarest of the three, but still reasonably common.  The win rates paint a rather curious picture.  At Normal and High, E+Stats is the strongest followed by E->Q, but in Very High E->W looks like the clear winner.  In light of this, I did another focused test comparing E Prime builds with 4 points in Burning Spear by 10 to E Prime builds without 4 points in Spears.

[Huskar]4Wby10

It doesn’t occupy a huge percentage of his Very High usage, but E->W definitely experiences the least Very High Win Percentage Decay of any of Huskar’s major builds.  It’s not definitive since these builds make up a relatively small portion of the entire sample (~487 games total in Very High), and it could be a trick related to E->Q just being more popular.

On the other hand, it could be that E->Q and E+Stats do well in the Normal bracket because the HP edge helps Huskar win 1v1 fights reliably and snowball from there.  In Very High the importance of 1v1 fights is diminished and builds that max Spear second have better overall damage output for skirmishes, early teamfights, and non-Ancient Neutral farming.  Maybe.  Whatever the case, there appears to be nothing wrong with a 0/4/4/1 build.  Other builds appear to outperform it in the lower brackets, but it remains resilient in all 3 brackets.

Now Ancient Apparition on the other hand paints a much more straightforward picture.  For the sake of time I won’t describe his abilities, but you can check them out on his Dota 2 Wiki Page.

[AA]Builds

Cold Feet(Q) -> Ice Vortex(W) is the unambiguous winner here.  So is Chilling Touch(E) still a skill worth skipping for stats?  To test this we just replicate the tests we did with Huskar, only this time we look at QW builds at level 10 and see how they fare when they’ve put points into Chilling Touch.

[AA]anyE

It’s not a huge effect, but it does look like Chilling Touch might still be a bit lackluster.  It does do well enough in Normal though, and it’s at worst not an active hindrance.

What needs to be emphasized though is that this is not an argument that Chilling Touch should never be skilled; it’s at best an argument that it usually does not pay off to skill it early.  As always, there could be situations where early Chilling Touch is actually quite viable but that are statistically rare enough that they aren’t showing up in my analysis.  What this means is that if you have a good argument for trying to integrate Chilling Touch early, by all means test it out.  If you don’t have any reason to believe that Chilling Touch is going to work particularly well in a given game/lineup/lane matchup, then you’re probably better off playing it safe with a standard Q->W->E build.

So to summarize:

  • Always max Berserker’s Blood first on Huskar
  • Blood -> Stats appears to be the strongest Huskar build at low levels of play, followed by Blood->Inner Vitality
  • If you’re doing either of those builds, only grab 1 point of Burning Spear if you need it for specific orb-walking purposes (such as attacking through CM’s Frostbite)
  • On the other hand, 0/4/4/1 looks perfectly viable for Huskar, especially at higher levels of play.
  • Ancient Apparition appears to generally perform the best with a straightforward Q->W->E build order.  The E vs Stats comparison is inconclusive, though Chilling Touch does appear to be the stronger option in lower bracket play.

Rate-Driven Match Analysis

February 23, 2013

I talk a lot about how end of match data isn’t always good enough.  What I really want to see are the growth rates of CS, GPM, XPM, and Net Worth, along with the order and timing for item progression and the match context of individual Kills.  datadrivendota on reddit put together a pretty cool image set describing a match.  This is the kind of direction I’d like to take match statistics in, though the presentation currently is still a bit rough.

Ideally there would be two variants.  The first would be specific to the match, highly visual, and user friendly.  You search for a small set of matches that interest you, like say Meepo tournament victories, and then get a detailed breakdown of how each match developed over time.

The second variant would be highly abstracted.  It wouldn’t be very useful for looking at specific matches, but it would allow you to easily compare a huge set of matches in useful ways.  Say we take 5,000 Meepo matches in the Very High bracket.  We could then try to determine what features most impact his success based on measurements of their creep killing patterns, teamfight participation, initial laning location, teammate behaviors, item progression, etc.

For now the API will do, but it’s always going to be limited compared to the power of detailed parsing.  Even then, you can still use the API to quickly create a controlled sample of games that have the features you’re looking for, so it’s never going to become completely obsolete.


Silencer Skill Build Analysis

February 19, 2013

With Silencer being enabled in both Captain’s Mode and Tournament lobbies, I thought it’d be a good time to examine his popular skill builds.  He also saw a partial overhaul in 6.76, so it’s certainly worth a look at how the public have adjusted their builds in the intervening months.

Before we get to that, I need to mention that I’ve changed my build grouping scheme some.  The big problem I was running into with the Single/Double/Split classifications is that a lot of interesting stuff was happening precisely on the edges of my categories.  Take a 4/4/1/1 build.  It’s fairly common among many heroes, but in my old scheme it could be grouped as a Single Q, Single W, or Double QW depending on the path it took to get there.  There was no easy way of saying “This is classified as a Single Q build, but actually looks a lot like a Double QW build,” and that was clearly an undesirable outcome.

Instead I’m now using a Primary->Secondary classification.  Primary is what skill gets maxed first by level 8.  Secondary is what skill gets maxed second during levels 8 through 10.  If no skill gets maxed during those periods, I classify it as a split build.  Basically we’re left with 4 primary groups (Q,W,E,Split) and each of the non-Split primary groups having 3 subgroups depending on their secondary (skill 1, skill 2, split).
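The Primary->Secondary rule can be sketched roughly as follows.  This is an illustration and not the code used for the tables: it assumes a build is stored as a string with one pick per hero level ("Q", "W", "E", "R" for the ult, "S" for stats), and the function name is made up.

```python
MAX_RANK = 4
BASIC_SKILLS = ("Q", "W", "E")


def classify(build):
    """Return (primary, secondary) for a build string with one pick per level.

    Primary is the first basic skill maxed by level 8; secondary is the
    second basic skill maxed by level 10.  Missing either one yields "Split".
    """
    ranks = {skill: 0 for skill in BASIC_SKILLS}
    maxed = []  # (skill, level) in the order each skill reached rank 4
    for level, pick in enumerate(build[:10], start=1):
        if pick in ranks:
            ranks[pick] += 1
            if ranks[pick] == MAX_RANK:
                maxed.append((pick, level))

    if not maxed or maxed[0][1] > 8:
        return ("Split", None)  # no skill maxed by level 8
    secondary = maxed[1][0] if len(maxed) > 1 else "Split"
    return (maxed[0][0], secondary)


print(classify("QWQWQRQWWE"))  # ('Q', 'W')
print(classify("EQEWERESSS"))  # ('E', 'Split')
print(classify("QWEQWEQWER"))  # ('Split', None)
```

Since maxing two skills takes 8 points, a second skill cannot hit rank 4 before level 8, so looking at the first 10 picks is enough to cover the “levels 8 through 10” window.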

I re-did the Crystal Maiden analysis using the new method and here’s what it looks like.

newCM

The major divisions are bolded and in a larger font.  On the usage chart the sub-categories are measured by the percentage of the primary they represent.  For example, Crystal Maiden Aura(E) builds in Very High use Frostbite(W) as their secondary nearly 40% of the time, so (.20*.38 = .076) 7.6% of the Very High sample is an E->W build.

One useful thing that this build style allows for is that you can easily compare similar Primary/Secondary combinations.  For example W->E and E->W perform very similarly in Very High, with 63.19% and 63.20% win rates respectively.  It also lets us point out that while Crystal Nova(Q) primary builds as a whole don’t do especially well in Very High, they do seem to do alright so long as they put significant points into E.

On a related note, many of you complained about excessive use of Q/W/E notation in the Crystal Maiden article.  I chose to use Q/W/E in my code so that it could be generalized across all heroes, and since I end up spending a decent amount of time looking at that code, thinking of the abilities as Q, W, and E becomes a bit habit forming.

In light of that, I’m going to make a concerted effort to use the names more often in upcoming articles.  However, due to space concerns the tables will still use Q/W/E formatting, and I’ll probably tend to refer to builds in a Primary->Secondary notation using Q/W/E.  Hopefully I’ll actually use the names often enough that these exceptions won’t be a problem.

But enough about that.  Time for Silence.

First a quick ability rundown:

Q – Curse of the Silent: AoE Damage and Mana Drain DoT; Broken by Spellcasting

Scaling — Damage, Mana Drain, Cooldown, and Mana Cost

W – Glaives of Wisdom: Pure Damage Orb Effect; Scales with Intelligence; 1 Point Allows You to Steal Int on Kills and Assists

Scaling — Damage Scaling

E – Last Word: Single Target Nuke with Silence/Disarm Components

Scaling — Damage, CC Duration, Cooldown

His recent patch changes are pretty substantial, so if you’d like to check them out or just want more detailed ability descriptions check the Dota 2 Wiki Silencer Page.

Now to get back to the tables.

[Silencer]Builds

Unlike Crystal Maiden, Silencer is all over the place.  Nevertheless, two clear trends stand out immediately.

First, Curse of the Silent(Q) gets both less popular and less effective as you move up the brackets.  Q Primary builds as a whole see the clearest decline in win rate outside of the mostly ignorable Split group.

Second, Last Word(E) builds get way more popular as you move up the brackets.  E Primary goes from just under 1/3 of Silencer’s builds in Normal to nearly 2/3 in Very High.  Q->E and W->E builds also increase in popularity in the upper brackets.

What I think is happening here is that Curse of the Silent(Q) shines in Normal laning.  Partly this is because low level laning has a lot of drawn out 2v2 sidelane battles, which is a much stronger environment for Curse than any lanes involving solos.

On top of that, higher skill players know what to do to counteract Curse of the Silent.  This means that if you want to use it effectively you need to have Last Word(E) there to keep them from just breaking it immediately.  If you have a maxed Last Word and can hit them with Curse just as it triggers, you’re guaranteed a near full-duration Curse, and fully maxed out that Last Word -> Curse combo is worth 690 damage pre-mitigation.  Of course it’s often not going to be that simple, but having a strong Last Word appears to be important in landing long-duration Curses against opponents who know what they’re doing.

In light of this I think we have two build styles.  The first is nuke based and maxes Curse of the Silent(Q) and Last Word(E) in some order.  Curse first seems to work best in lower level games.  Higher level games tend to prioritize Last Word.  One point in Glaives of Wisdom(W) is typically grabbed relatively early to allow for orb-walking and intelligence steal.

The second style is more focused around being a right click semi-carry.  This style maxes Glaives of Wisdom(W) early and usually ignores Curse of the Silent(Q) in favor of some combination of Last Word(E) and Stats.  What’s interesting is that Glaives first gets significantly more successful in the upper brackets.  The most obvious explanation for this is that the Normal bracket tends to have more carries per team and nuke-based Silencer is a better choice in an environment with low or unreliable farm.  Glaive builds are also likely much more gold dependent and lower level Silencers might not have the last hitting to finance an effective Glaive build.  The best performing right click build in Normal appears to be E->W, where you’re still investing into strong lane control and can get a feel for the pace of the game before committing fully into skilling Glaives.

The farm explanation is still speculative of course, but I do have some backing evidence.  I created a measurement that was essentially Silencer’s Level divided by the Average Level of the rest of the team.  Basically if the number was higher Silencer was likely being used in a Semi-Carry role.  If the number was lower, Silencer was being used in a Supportive role.  I then created the tables based on both the top 40% and bottom 40% to get an idea of how things shifted depending on Silencer’s farm priority.  You can check out the results here, but the summary is that Glaive builds are definitely more common in the Semi-Carry sample.
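A minimal sketch of that measurement, under assumptions: each match record holds the hero’s final level and a list of the four teammates’ levels (the field names "level" and "teammates" are invented here), and the 40% cut is taken by simple rank rather than by whatever method produced the linked tables.

```python
def relative_level(hero_level, teammate_levels):
    """Hero level divided by the average level of the rest of the team."""
    return hero_level / (sum(teammate_levels) / len(teammate_levels))


def bottom_and_top(matches, fraction=0.4):
    """Rank matches by relative level and return (bottom, top) slices."""
    ranked = sorted(
        matches, key=lambda m: relative_level(m["level"], m["teammates"])
    )
    k = int(len(ranked) * fraction)
    return ranked[:k], ranked[len(ranked) - k:]


# Made-up sample: ten matches with hero levels 6..15 and level-10 teammates.
sample = [{"level": lvl, "teammates": [10, 10, 10, 10]} for lvl in range(6, 16)]
support_like, carry_like = bottom_and_top(sample)
print([m["level"] for m in support_like], [m["level"] for m in carry_like])
```

Leaving out the middle 20% keeps borderline matches from muddying either group, at the cost of a smaller sample in each.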

Like last week we have a by-Point table, but this one isn’t nearly as useful as CM’s.

[Silencer]byPoint

Only two interesting things stick out to me.

The first is that 1 point in Glaives by 8 is the plurality standard.  The second is that while the trends are mostly worthless, W(Glaives of Wisdom) in Very High is pretty interesting.  In that bracket it appears that you should either max it by 8 or leave it at 1.  Anything in-between underperforms.

Finally we have the clusters, and again, they’re a bit weak.

[Silencer]Clusters

1-1-3 stands out as the safest overall build, but Curse heavy builds keep up or outperform it in the lower brackets.  Openings that invest heavily in Glaives are enough of a minority that no Glaive-maxing level 5 cluster was popular enough to break the 5% threshold.

Finally, stat builds appear rare, but decently successful.  Most of these are probably Glaives/Stats builds that ignore Curse entirely, but I didn’t peer into this to make sure.  Ult Skipping is relatively rare, but not especially unsuccessful.  Maybe Global Silence is situational enough in public matchmaking that you can afford to hold off on it a level or two if you have other more immediate priorities.  As for Silencer’s overall win rate, it appears to decay a bit, dropping from ~52.5% in Normal and High down to 51.5% in Very High.  This is consistent with the general trend that lane bullying is a much more effective strategy in low level games, and Silencer’s kit is nothing if not annoying in lane.  I don’t think this is in any way an indictment of Silencer’s competitive viability though.  Global Silence is an ult that shines in environments where you’re communicating with your team, and is therefore likely undervalued in public matchmaking.  Besides, it’s not like his Very High win rate is bad, it’s just slightly below his overall win rate.

So to summarize the build philosophy:

If you’re playing in a low level game or don’t have a reliable source of farm, Curse/Last Word builds with a point in Glaives appear to be the strongest.  The priority between Curse and Last Word is situational, but in general Curse performs better in low level games and Last Word performs better in high level games.

Glaive builds are viable if you have a safe lane to farm and have decent last hitting.  Most successful Glaive builds neglect Curse in favor of Last Word and Stats.  But even if you do have good farm, a Curse/Last Word build might be more appropriate for your team comp, so keep that in mind before committing one way or the other.

If you’re not sure what kind of a game you’re looking at, the safest build is to go 1-1-3 and adjust from there.

Edit: I forgot to mention that the sample size for this one was ~5,000 games per bracket.


Crystal Maiden Skill Build Analysis

February 11, 2013

Last week I talked about building better skill build analysis tools.  Shortly after finishing that post, I decided to grab some new samples and build a working prototype.  Instead of doing it with Ursa though, I decided to go with Crystal Maiden.

A big part of this is that I expected Crystal Maiden builds to be a bit all over the place, and she did not disappoint.  The other part is I had just read a conversation about the value of Brilliance Aura and whether it was worth skilling early, and it occurred to me that I didn’t know.  Sure, I had inclinations, but on reflection I wasn’t really that set in them.  I also didn’t have a clue how to value the change in 6.76 that doubled the personal benefit of the aura.  In an ideal world I would have pre-patch and post-patch samples and could do a direct comparison, but the Ability Usage section in the API is far too recent to manage that.  It is however an interesting idea to keep in mind for the major changes of future balance patches, but I digress.

Sample creation was pretty simple, if a bit disappointing.  For the moment I can only get 500 samples a day per skill bracket, and the samples are not randomized for time.  Hopefully this will change in the near future.  In the meanwhile, I could only grab a sample of 3000 matches in each of the skill brackets (normal, high, very high).  After filtering out useless and broken matches I was left with ~2500 Normal matches and most of the high and very high intact.  At some point in the next few days I may go back and expand the sample.  By the end of the week, another ~7000 matches should be available.

The first step was to divide the matches by the skill build categories that I outlined last week.  If you want to read about them in depth go to the link, but for now, here’s the short version.

Single-Skill: Strongly prioritizes a single skill.  Example builds include 4/2/0/1 and 4/1/1/1.  Comes in Q, W, and E variants.

Double-Skill: Splits attention primarily between two skills.  Example builds include 4/3/0/0 and 3/1/3/1.

Split-Skill: Spreads attention fairly evenly among all 3 non-ult skills.

If you want to know the rules I’m currently using to make these divisions, here’s a flowchart describing the process.

Update: By popular request, here’s a quick write up of what Q, W, and E actually are.

Q – Crystal Nova: AoE Nuke and Attack/Movespeed Slow

Scaling — Duration, Damage, and Mana Cost

W – Frostbite: Single Target semi-Stun and DoT

Scaling — Duration, Damage (non-linearly), and Mana Cost

E – Arcane Aura: Passive Mana Regen Aura

Scaling — Mana Regeneration Rate

Future versions will hopefully be a bit more thought out.  In the meanwhile check out http://www.dota2wiki.com/wiki/Crystal_Maiden for more detailed information.

Using this method, I created a table of usage and win rates for each build strategy in all 3 brackets.

Usage is pretty unsurprising.  Q heavy builds get more popular the higher you go up.  Split builds get less popular.  What is surprising is the win rate chart.  Builds that prioritize W and E seem to do significantly better in the higher skill brackets.  I wasn’t expecting this, and so I made some extra tests to check and see whether I did something wrong.

Before moving on, I’d like to note a couple of things

  • Average win rate of Crystal Maiden on Dotabuff over the 6 days of the sample was 51.51%.  Weighted average of my sample (using .8, .15, .05) was 51.42%.
  • Early Stats is insanely unpopular, and almost certainly a dumb idea.
  • Late Stats win rate is, like the asterisk note says, unreliable.  Don’t use it for anything.  I’ll probably regret even including it.  Damn you symmetry.
  • Ult Skipping (not taking Ult at 6) gets significantly more popular in the higher brackets.
  • I include the 8 and Over Win Rate because most of my tests only include samples where CM hits at least 8.  This is the majority of the samples, but it’s still for the best to use the 8 and Over Win Rate as the comparison point.
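The weighted average in the first bullet can be reproduced along these lines.  Treat it as a sketch: the mapping of .8, .15, and .05 onto Normal, High, and Very High is an assumption (the bullet only lists the weights), and the per-bracket win rates below are placeholders, not the sample’s real numbers.

```python
# Assumed mapping of the weights onto the three brackets.
BRACKET_WEIGHTS = {"normal": 0.80, "high": 0.15, "very_high": 0.05}


def weighted_win_rate(rate_by_bracket, weights=BRACKET_WEIGHTS):
    """Weighted mean of per-bracket win rates (weights need not sum to 1)."""
    total = sum(weights.values())
    return sum(rate_by_bracket[b] * w for b, w in weights.items()) / total


# Placeholder win rates, only to show the call shape.
placeholder_rates = {"normal": 0.52, "high": 0.50, "very_high": 0.48}
print(weighted_win_rate(placeholder_rates))
```

Dividing by the weight total means the function still behaves if the bracket shares are later re-estimated and no longer add up to exactly 1.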

What I tried next was to just group the builds by how many points they have in each ability by level 8.  This would hopefully establish some trends in how valuable each point in an ability tends to be.

This appeared to confirm the initial findings.  Brilliance Aura, Crystal Maiden’s E, has the strongest positive slope in all 3 brackets.  Crystal Nova, Crystal Maiden’s Q, has the most negative slope.  I don’t know whether I like this method much, but as a relatively independent confirmation it seems to work.
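One way to get a “slope” out of a by-point table is sketched below.  The post doesn’t say how the slopes were computed, so the unweighted least-squares fit here is a stand-in, and the match layout (a "points" dict at level 8 plus a "won" flag) is assumed.

```python
from collections import defaultdict


def win_rate_by_points(matches, skill):
    """Map points-in-skill (at level 8) to the win rate of that group."""
    tally = defaultdict(lambda: [0, 0])  # points -> [wins, games]
    for m in matches:
        group = tally[m["points"][skill]]
        group[0] += m["won"]
        group[1] += 1
    return {points: wins / games for points, (wins, games) in tally.items()}


def slope(rate_by_points):
    """Unweighted least-squares slope of win rate against points invested."""
    xs = list(rate_by_points)
    ys = [rate_by_points[x] for x in xs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None  # only one point-count present; no trend to fit
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx


# Made-up sample, only to show the call shape.
sample = [
    {"points": {"E": 0}, "won": False},
    {"points": {"E": 0}, "won": True},
    {"points": {"E": 2}, "won": True},
]
print(slope(win_rate_by_points(sample, "E")))
```

An unweighted fit treats a rarely used point-count the same as a common one, so weighting each group by its number of games would be a reasonable refinement for small samples.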

Finally, I tried a new method out not related to the skill build groupings.  Instead, this one creates a skill cluster at a certain level while ignoring build order entirely.  For the purpose of this sample I chose level 5.  Larger samples in the future might allow me to be a bit more aggressive in choosing a later level or different combinations, but for now I’m trying to stay conservative in order to maintain sufficiently large sample sizes for each sub-category.

This shows the use and win rates of the top 6 level 5 clusters in each skill bracket.  Once again, W and E heavy builds tend to perform the best.  Also of note, builds that leave any of CM’s non-ult skills at 0 are very unpopular in all brackets.
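A rough sketch of the clustering step, assuming the same one-pick-per-level build string as elsewhere ("Q", "W", "E", with anything else counted as neither); the names are invented and this is not the code that produced the table.

```python
from collections import defaultdict


def cluster_at(build, level=5):
    """Points in (Q, W, E) after `level` picks, ignoring the order taken."""
    picks = build[:level]
    return tuple(picks.count(skill) for skill in "QWE")


def top_clusters(matches, level=5, top=6):
    """Return (cluster, usage share, win rate) for the most used clusters."""
    eligible = [m for m in matches if len(m["build"]) >= level]
    tally = defaultdict(lambda: [0, 0])  # cluster -> [wins, games]
    for m in eligible:
        group = tally[cluster_at(m["build"], level)]
        group[0] += m["won"]
        group[1] += 1
    ranked = sorted(tally.items(), key=lambda kv: kv[1][1], reverse=True)[:top]
    return [(c, games / len(eligible), wins / games) for c, (wins, games) in ranked]


# Made-up sample, only to show the call shape.
sample = [
    {"build": "EQEWE", "won": True},
    {"build": "EWEQE", "won": False},
    {"build": "QWQWQ", "won": True},
]
print(top_clusters(sample))
```

Because order is ignored, the first two sample builds land in the same 1-1-3 cluster even though they pick Q and W at different levels.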

Are W and E Crystal Maiden’s strongest early skills?  It’s possible that this analysis is incomplete.  Maybe the sample is too small.  Maybe the sample is biased towards the build patterns of a small selection of people who play regularly around the sample gathering times.  Maybe there’s just a bug I haven’t caught yet.

On the other hand, my findings seem to match Dotabuff’s Crystal Maiden Skill Build Page, and they almost certainly do not have issues with sample size or sample bias.

Another possibility is that Q builds under-perform because they’re so popular.  Negative correlations between popularity and win rates are certainly not unheard of.

Finally, maybe W and E really are her strongest early skills.  E is significantly better than it was a couple months ago, and we tend to drastically underestimate the value of global auras.  The Q to W comparison might also not be that bad if you think about it.  Both abilities give you an extra half a second of their respective CC, and an extra half second of W’s semi-root thing is likely more valuable than an extra half second of a 30% slow in an early fight.  In a single target scenario the damage per rank between the two abilities is actually relatively even: 150 for three ranks of Q, possibly 140 for three ranks of W?  W does 70 damage per second so I’m not sure if each rank is worth a flat 35 damage (for 105 total) or if you only get 70 at each new second, which gives you +70 at 2 ranks, 0 at 3 ranks, and +70 at 4 ranks.  playdota doesn’t explain how the scaling works, but losing 45 damage in the worst case scenario might be worth extending your stronger CC.

In any case, I’m not going to commit to saying that 1/4/4/0 is the strongest CM build, particularly because it seems to be only average in Normal level games, but I do intend to keep this in mind for the next time I play her.  W and E focused builds are at the very least worth experimenting with.

But moving beyond Crystal Maiden, during this analysis I’ve attempted to make the tools I’ve created generalizable for all heroes (with the exception of Invoker).  Sample creation still has some issues, but theoretically all of this can be used on any hero, so if there’s anyone in particular that you’d like to see get this treatment feel free to bring them up in the comment section or through e-mail.


Better Skill Build Analysis

February 4, 2013

With Skill Builds being the big new feature of the latest API upgrade, I thought I’d talk a bit about ways to use this information more effectively.  I don’t intend to touch on the technical hurdles of gathering and storing the information and that includes the technical feasibility of my own suggestions.  If something I suggest is impossible for some given system, then there might be a variant that is possible while still preserving the essential features.  For now, I want to focus entirely on finding a better way to categorize these skill builds for both display and research.

The best starting point is DotaBuff’s skill build system as theirs is the most developed public source that I know of.  To reach it go to the hero page, click on a hero, and choose the ‘Skill Builds’ tab.  For instance, here are Alchemist’s most popular skill builds.

I have two big complaints with these pages as-is.  The first is that it contains a lot of redundant information.  Let’s say I want to look at Ursa builds while researching my last article.  What I find is that Earthshock first builds cannot be found.  I know it’s a niche build, but there may be enough samples out there to still find out something useful.  Unfortunately, DotaBuff only lists the top 10 builds, and most of those are slight variants of two major build philosophies.  Some of these minor variants might be important, but most really aren’t.  And either way, it’d be preferable if the main page would list the major build styles and then have links that go to the minor variants of each build style.  Solving this requires developing a way to group similar builds together, and preferably one that can be generalized across all heroes (perhaps with the exception of Invoker).

As for the second complaint, look at the numbers on any hero page.  The total build rates of the top 10 rarely ever add up to over 20%, and all the win rates are 10-15% higher than the base win rate for the hero in question.  What I believe is happening is that DotaBuff is only counting the skill builds of players that reach level 18.  This inflates the win rates of the builds because heroes that hit level 18 or higher are more likely to be a higher level than the opponents compared to heroes that end the game somewhere between level 1 and 17, and typically the winning team will have a higher overall XPM (they’re also less likely to have lost a game due to an early abandon).  What this highlights is we want a system that can categorize a skill build well before level 18.  We’d also need to design our system so that we can control for certain factors in the way we measure and display results.  As an example, if I’m right that DotaBuff only uses level 18 skill builds, the page for each hero should display that hero’s win rate among all eligible matches.  That is, it should display the hero’s win rate in all matches where they end the game at level 18 or higher.  Given our limitations, this provides a better benchmark for judging these skill builds than just comparing their win rates to the global win rates of the hero.
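The benchmark idea can be sketched like this.  It assumes match records with a "final_level" and a "won" flag (both field names invented), and it takes the level 18 cutoff from the guess above about how DotaBuff filters builds, which is unconfirmed.

```python
def win_rate(matches):
    """Fraction of matches won, or None for an empty group."""
    return sum(m["won"] for m in matches) / len(matches) if matches else None


def eligible_win_rate(matches, min_level=18):
    """Win rate among matches where the hero finished at min_level or higher."""
    return win_rate([m for m in matches if m["final_level"] >= min_level])


def edge_over_benchmark(build_matches, all_hero_matches, min_level=18):
    """Percentage-point edge of a build over the hero's eligible-match win rate."""
    build = eligible_win_rate(build_matches, min_level)
    bench = eligible_win_rate(all_hero_matches, min_level)
    if build is None or bench is None:
        return None
    return (build - bench) * 100


# Made-up sample: the global win rate is 25%, the eligible one is 50%.
hero_matches = [
    {"final_level": 18, "won": True},
    {"final_level": 20, "won": False},
    {"final_level": 12, "won": False},
    {"final_level": 9, "won": False},
]
print(win_rate(hero_matches), eligible_win_rate(hero_matches))
```

Comparing a build against the eligible-match rate rather than the global one removes the part of its apparent advantage that comes purely from only counting long, high-level games.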

With these challenges in mind, what I propose is we categorize skill builds by their skill priorities between level 7 and 9.  This allows us to include most heroes in most games that last at least 20 minutes, and in my opinion these levels contain the information for the most important skill decisions in the vast majority of skill builds.  My proposed divisions are Single-Skill Priority, Double-Skill Priority, and Split-Skill Priority.

(For upcoming reference, my skill build listings will be something like 4/2/0/1.  The first three are Q/W/E in any order.  The last is R.  At this stage we have no particular reason to distinguish between ‘Max Q, Ignore E, Rest of Points in W’ and ‘Max W, Ignore Q, Rest of Points in E.’  We can generalize the process to work for any variation.  If it comes up I will use +X at the end to indicate early points in Stats.)

Single-Skill Priority is probably the most common skill build and represents every skill build that maxes one of its skills by level 7 (or that maxes one skill before any other skill gets 3 points or before both other skills get 2 points).

Double-Skill Priority is any skill build that has two skills with 3 points by level 7 (or two skills have 4 points before the third skill has 2 points).

Split-Skill Priority is any skill build that has no skill with 3 points by level 5 (or every skill has 2 points before any skill has 4).

It should be noted that these distinctions are not exhaustive.  There are some potential disagreements (3/3/0 becoming 3/3/2 for example), and they might not actually catch all skill builds, but they should handle the vast majority.  Any major variants that fall outside of these rules can be integrated through adding additional rules, but I’d like to see how the most simple rulesets perform before any adjustments are made.  For the sake of completeness I included some potential adjustments inside the parentheses.
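As a rough sketch of how the three primary rules could be applied mechanically, the snippet below classifies a build from its pick order.  The encoding (a string with one letter per hero level, using Q, W, E, R, and S for stats) and the function names are my own assumptions for the example.  It checks the rules in a fixed order (Single, then Double, then Split), ignores the parenthetical adjustments, and returns `None` for anything the simple rules don't catch.

```python
from collections import Counter

BASIC_SKILLS = "QWE"


def points_at_level(build, level):
    """Count points per basic skill after the first `level` picks.

    `build` is a string of picks, one per hero level: Q, W, E, R,
    or S for stats (this encoding is an assumption of the sketch).
    """
    counts = Counter(build[:level])
    return {skill: counts[skill] for skill in BASIC_SKILLS}


def classify(build):
    """Return a label such as 'Single-Skill Q', or None if no rule fits."""
    if len(build) < 7:
        return None  # not enough levels to judge
    at7 = points_at_level(build, 7)

    # Single-Skill: one skill maxed (4 points) by level 7.
    maxed = [s for s in BASIC_SKILLS if at7[s] == 4]
    if maxed:
        return "Single-Skill " + maxed[0]

    # Double-Skill: two skills with 3 points by level 7.
    three_plus = [s for s in BASIC_SKILLS if at7[s] >= 3]
    if len(three_plus) == 2:
        return "Double-Skill " + "".join(three_plus)

    # Split-Skill: no skill with 3 points by level 5.
    at5 = points_at_level(build, 5)
    if all(at5[s] < 3 for s in BASIC_SKILLS):
        return "Split-Skill"

    return None  # falls outside the simple ruleset
```

For example, `classify("QWQWQRQ")` gives `'Single-Skill Q'` and `classify("WEWEWRE")` gives `'Double-Skill WE'`.  The builds that come back as `None` are exactly the ones that would tell us whether additional rules are needed.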

What we’re left with is 7 major categories.  3 variants of Single-Skill (Q/W/E), 3 variants of Double-Skill (QW, QE, WE), and a sort of other category with Split-Skill Priority.  What we’d expect to find is that not all 7 major categories are necessary for most heroes.  For example, Single-Skill E Skeleton King is almost certainly statistically negligible.  I’m not going to specify any particular criteria for eliminating a category as statistically irrelevant.  We want to be as concise as possible, but we also should be extremely interested in detecting rare but successful builds.  Whatever criteria we use should try to create a balance between these two concerns.  In any case, once we have eliminated the useless variants, we could create a basic pie chart that describes the distribution of each hero’s skill build preferences.

To take Ursa as an example again, I’d expect to find Single-Skill E to be the most populous build followed closely by Double-Skill WE and Single-Skill W.  Single-Skill Q should have a small but significant population.  Most of the other variants will likely be statistically negligible.

With the major categorization out of the way, we now want to break each major build into its variants.

Single-Skill will tend to have one of three early structures: 4/2/0/1, 4/1/1/1, and 4/?/?/1 + stats.

Each Single-Skill build category also has 3 late build philosophies.  For Single-Skill Q they are Max W, Max E, Split Between W and E.

I define Max W and Max E as any build that puts 4 points in one of these skills before putting 2 points in the other.  They each come in two variants, hard and soft.  The hard variant skips the third skill entirely, while the soft variant puts 1 point in the third skill at some point before maxing the second.  For example, if we had Single-skill Q — Hard Max W, that would be any build that starts 4/2/0/1, and eventually achieves 4/4/0/1.  Soft variants could begin 3/1/1, or they could start 3/2/0 and put the first point in their third ability somewhere between levels 6 and 9.
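The hard/soft distinction can be sketched the same way.  This assumes the build is a string of picks, one letter per hero level (Q, W, E, R, or S for stats), and that the primary skill has already been identified; both the encoding and the function name are just assumptions for the example.

```python
def late_philosophy(build, primary):
    """Label what a Single-Skill build does with its other two skills.

    Returns e.g. 'Hard Max W', 'Soft Max E', 'Split', or None if the
    build ends before the question is settled.
    """
    others = [s for s in "QWE" if s != primary]
    points = {s: 0 for s in others}
    for pick in build:
        if pick not in points:
            continue  # primary skill, ult, or stats
        points[pick] += 1
        other = others[1] if pick == others[0] else others[0]
        # Max: 4 points in one skill before 2 points in the other.
        if points[pick] == 4 and points[other] < 2:
            kind = "Hard" if points[other] == 0 else "Soft"
            return f"{kind} Max {pick}"
        # Split: both skills reach 2 points before either is maxed.
        if points[pick] == 2 and points[other] >= 2:
            return "Split"
    return None
```

So `late_philosophy("QWQWQRQWW", "Q")` comes out as `'Hard Max W'`, while a build that sneaks in one point of E before W hits 4 comes out as `'Soft Max W'`.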

There are also the issues of +stats and ult skipping.  In both cases I would ignore these until they happen enough within a major variant that they’re statistically noteworthy.  For some examples:

  • Single-Skill Q Juggernaut and Skeleton King should both have statistically significant +stats populations
  • Various farming carries (Anti-mage, Phantom Assassin, Medusa) might have early stat builds so extreme that they’re effectively Split-Skill Priority builds.
  • Various Leshrac Double-Skill builds have large amounts of ult skipping.
  • Tinker and Dragon Knight might have noteworthy amounts of ult skipping (or in DK’s case 2nd point delaying) in all of their builds.

+stats and ult skipping may need to be handled on a case-by-case basis.  What we’d like to do regardless is be able to measure the percentage of builds that include early +stats, late +stats in lieu of maxing a skill (think of Ancient Apparition’s Chilling Touch pre-buff or the many Huskar builds that skip Burning Spear entirely), and ult skipping, and then also be able to measure those percentages within each of the major and minor build variants.

This is actually the big drive behind creating a grouping system for skill builds.  Sure, it could help sites like DotaBuff find a more effective way to display information, but the real benefit is so that we can separate two skill build groups or sub-groups and make comparisons.  What kind of things can we learn from these comparisons?  Let’s return to Ursa as an example.

One really cool thing we could do is track the usage rate of a skill build category by skill brackets.  For instance, if Single-Skill Q Ursa is a niche build overall, but becomes significantly more common in High and Very High games over the stretch of time following 6.75, we could detect the emergence of a new build before it becomes public knowledge.  There’s really a significant amount of potential here.  Does Single-Skill Q + Stats Juggernaut have a higher popularity in the upper brackets?  What about builds that max Healing Ward early, either as part of a Double-Skill build or a Single-Skill Q with a hard Max W?  Do the variants of builds that feature primary or secondary maxing of Crit become less popular or less successful?

But let’s even go beyond that.  We can create statistical distributions for each group and sub-group and compare them.  Suppose we’re looking at Leshrac.  We know he gets played both as a support and somewhat more rarely as a semi-carry.  We can create a stat called ‘Relative XPM’ that represents Leshrac’s ending XPM / rest of the team’s Average XPM.  In games where this value is high, Leshrac was most likely being played as a semi-carry.  When it’s low, he was likely played as a support.  Then we create a Relative XPM distribution for each skill group to determine which builds are seen as semi-carry builds and which as support.
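A minimal sketch of the Relative XPM stat and the per-group collection step might look like this.  The input shape (a category label, the hero's XPM, and a list of the four teammates' XPM values) is an assumption for the example, as are the function names.

```python
from collections import defaultdict
from statistics import mean


def relative_xpm(hero_xpm, teammate_xpms):
    """Hero's ending XPM divided by the average XPM of the rest of the team."""
    if not teammate_xpms:
        return None
    team_avg = mean(teammate_xpms)
    if team_avg == 0:
        return None  # avoid dividing by zero on degenerate matches
    return hero_xpm / team_avg


def relative_xpm_by_group(games):
    """Collect Relative XPM values per skill build category.

    `games` is an iterable of (category, hero_xpm, teammate_xpms).
    """
    groups = defaultdict(list)
    for category, hero_xpm, teammates in games:
        value = relative_xpm(hero_xpm, teammates)
        if value is not None:
            groups[category].append(value)
    return dict(groups)
```

Each list in the result is the raw material for one distribution.  Comparing their medians or histograms is what would suggest which groups lean semi-carry and which lean support.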

With a little more effort we can do something similar with Ursa.  Replay parsing can differentiate between lane and jungle farm, so we use our skill groupings to create replay samples of Ursa games in each of the four big categories (Single-Skill Q, Single-Skill W, Single-Skill E, and Double-Skill WE).  We can parse the replays to find out where each category tends to get most of its farm in the first 5, 10, and 15 minutes by basically creating a ratio of Jungle Creep Gold to Lane Creep Gold (and of course programming some error-catching for divide by 0).  This would tell us the percentage of the time each build gets used to jungle or lane.
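Here is one way the ratio and its divide-by-zero handling could be sketched, assuming a replay parser has already produced per-game jungle and lane creep gold totals for the time window in question.  The function names and the cutoff of 1.0 for "mostly jungling" are my own assumptions, not anything a parser provides.

```python
import math


def jungle_to_lane_ratio(jungle_gold, lane_gold):
    """Jungle Creep Gold / Lane Creep Gold, guarding against zero lane gold."""
    if lane_gold == 0:
        # Pure jungling is treated as an infinite ratio; no creep gold
        # at all gives us nothing to measure.
        return math.inf if jungle_gold > 0 else None
    return jungle_gold / lane_gold


def jungling_rate(samples, threshold=1.0):
    """Share of games whose ratio exceeds `threshold`.

    `samples` is an iterable of (jungle_gold, lane_gold) pairs.
    The threshold is an assumed cutoff for 'mostly jungling'.
    """
    ratios = [jungle_to_lane_ratio(j, l) for j, l in samples]
    ratios = [r for r in ratios if r is not None]
    if not ratios:
        return None
    return sum(r > threshold for r in ratios) / len(ratios)
```

Running `jungling_rate` once per category and per time window (first 5, 10, and 15 minutes) would give the table of how often each build is used to jungle versus lane.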

From there we could also measure average jungle efficiencies of the builds or even the time-to-first-Roshan.  This could give us a much better idea of how these different builds compare in their efficiency at neutral killing than we currently have.  We could also use this to detect tactical outliers.  Suppose Single-Skill E on average beats Single-Skill W in Jungle Creep Gold in the first 10 minutes, but that there’s a small cluster of Single-Skill W replays that are comparable to the best Single-Skill E results.  We now have a collection of replays we can watch to find out why these Single-Skill W builds are outliers.  Maybe there’s an obscure tactic that lets Single-Skill W compete that we could adopt to give ourselves greater skill diversity.  Or maybe it’s just a matter of luck with jungle spawns.  Either way, we’ve accomplished a lot by using statistical outliers to separate out interesting results.

The basic takeaway is that whatever system we use to measure skill builds, we want a system that supports a form of reverse lookup.  That is, we should be able to take our list of matches and generate the distribution of skill builds, but we should also be able to describe a skill build category and create a slice of the total matches where every match features that skill build.
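The two directions of lookup can be sketched with a single index keyed by category.  The `BuildIndex` class and the match IDs are hypothetical; a real system would presumably sit on top of a database rather than an in-memory dictionary.

```python
from collections import defaultdict


class BuildIndex:
    """Maps skill build categories to the matches that feature them."""

    def __init__(self):
        self._by_category = defaultdict(list)

    def add(self, match_id, category):
        """Record that `match_id` featured a build in `category`."""
        self._by_category[category].append(match_id)

    def distribution(self):
        """Forward direction: share of recorded builds in each category."""
        total = sum(len(ids) for ids in self._by_category.values())
        if total == 0:
            return {}
        return {cat: len(ids) / total
                for cat, ids in self._by_category.items()}

    def matches(self, category):
        """Reverse direction: every match featuring this category."""
        return list(self._by_category.get(category, []))


index = BuildIndex()
index.add(101, "Single-Skill E")
index.add(102, "Double-Skill WE")
index.add(103, "Single-Skill E")
index.add(104, "Single-Skill Q")
```

Here `index.distribution()` gives the pie chart data, while `index.matches("Single-Skill E")` returns the slice of matches to hand off to a replay parser.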

Data display is an entirely different issue (that I happen to feel a bit less passionately about), but the summary for that is as follows.

  • 7 broad skill build categories — 3 variants of Single-Skill, 3 variants of Double-Skill, and Split-Skill
  • Each broad category has sub-variants depending on how it behaves before 6, after 6, and whether it ult-skips or takes early stats
  • Remove the statistically insignificant categories and we can create a basic chart for both the broad skill categories and each sub-variant.
  • If necessary, we can then do the top 5 builds for particularly popular sub-variants.

And admittedly I have no idea whether this will all hold up in practice.  In fact I suspect much of it won’t.  I’m just presenting it as a possible starting point out of which a better system can evolve.