The Insignificance of Pub Stats, Part 3

For the final installment, we’re going to examine why having a visible MMR does not lead to better competition.

For Reference:

The Insignificance of Pub Stats, Part 1 — how popularized statistics change the nature of flaming in public matchmaking

The Insignificance of Pub Stats, Part 2 —  why KDA is a bad stat and what we could replace it with

Statistical Significance: The Value of Pub Stats — the article this is in response to

—————————————————————————————————-

Ever since online matchmaking became ‘a thing,’ gaming communities have become obsessed with treating the matchmaking rating system as a panacea for all skill evaluation.  After all, that’s what the system is for.  It determines your skill level, gives it a number, and matches you with people with similar numbers.  So if I had a list of those numbers I’d have a straightforward hierarchy of the best players in the game down to the worst, and in order to improve my own skill all I have to do is find things that make the number bigger, right?  Unfortunately, no.  MMR is a useful tool, but it’s not the limitless font of knowledge we make it out to be.  And just as with KDA, we can limit our own development when we focus entirely on raising our MMR.

For an example of the ridiculous powers ascribed to MMR let’s go directly to the article:

Consider also that the sheer existence of publicly available statistics helps to ward off ignorance in the rest of the community. Blizzard, like Valve, has adopted a guarded approach to statistics. In StarCraft II, users are sorted into leagues, but nowhere does the game explain the league system clearly enough for a player to know which percentile their league corresponds with. The only statistic shown is a player’s total wins, not their win ratio or losses. In a blog series on TeamLiquid, a higher level StarCraft II player conducted an experiment where he began smurfing in the lowest league (Bronze) using an extremely weak and fragile strategy, the worker rush. However he found astonishing success with the strategy. Even when he explained to his opponents exactly what he was going to do, and how to beat it, he still won matches. The author ultimately hypothesized that the issue with the Bronze players was that they simply didn’t realize how much help they needed because they had never figured out where they truly laid on the skill spectrum. He wrote, “Blizzard has engineered a system that in no way acknowledges failure. So when someone smashes them … they don’t understand why.” (Emphasis mine)

I spent much of today reading the Worker Rush blog series linked here, and the quote included in the article is indeed accurate.  However, the author of that series also says this a few entries earlier:

Players in the bronze league have convinced themselves that they are only in bronze because of some cosmic injustice. They are good players, they couldn’t be bronze, right? It’s just because of all the cheesers that they lose. If they could play macro games they’d be diamond for sure. It’s just that a bronze player has to prepare for even more all-ins than grand masters, because the players in bronze are less predictable. Well, that’s at least what they tell themselves. I thought the “forever bronze” meme was just a joke, but apparently it has a basis in reality: these people think they are stuck, the hopeless victims of some affliction that is anybody’s fault but their own.

He actually goes to some lengths to point out that many of the players he beats in bronze do in fact know what being bronze ranked entails, and still go to great lengths to convince themselves that they don’t really belong there.  Temporarily embarrassed diamond leaguers, if you will.  There’s no reason to think that making their ranking even more obvious than it already is would somehow convince them otherwise.  League of Legends alone proves that people can rationalize away a number just as easily as they can a vaguer placement system.  (Also of note, LoL’s ranked system appears to be moving to a more StarCraft II-like design in the upcoming season.)

And on the other end of the spectrum there are many people in the dregs of matchmaking who are completely aware of the fact that they just don’t know how to play the game.  The idea that we’re somehow helping them by rubbing in their face a number designed to quantify just how terrible they are comes off, quite frankly, as vindictive, as if they’re somehow shaming the community with their ignorance.

I mean think about it.  You show a person their rating and tell them it means they’re one of the worst players in the entire game.   Ok, now what?   You tell them they need to get their rating higher.  Ok, how?  Win more often.  Ok, how?  Raise your KDA.  Ok, how?

At no point have you given them any actual information on how to get better at the game.  All you’ve done is throw some metrics around with absolutely no context whatsoever.  You’re not really accomplishing anything, and you certainly aren’t “ward[ing] off ignorance in the rest of the community.”  If there’s one thing you should take from the Worker Rush series, it’s that teaching people how to play a complicated game can be insanely difficult.  The number of factors involved is immense, and adding a visible MMR is nowhere near enough to even make a dent in the problem.

So let’s say you concede that a visible rating isn’t going to spur competition at the bottom end of the matchmaking spectrum.  But what about the top end?  Surely a visible rating will give players a greater incentive to strive to move up the rankings and lead to people taking matchmaking games more seriously.  Possibly.  But you should take a moment to step back and ask yourself if this would really be a good thing.

All Pick is, by far, the most popular matchmaking mode in Dota 2 right now.  And competitively, All Pick is a joke.  It’s certainly a convenient mode.  That is, after all, why it’s so popular.  But when it comes to character selection, it’s a mode that rewards players for mashing down the most overpowered heroes before the other team can grab them and otherwise playing counterpick chicken with the initial creep wave.  Quite frankly, All Random would be a better competitive environment.

And let’s also face the fact that the most reliable way to win in Dota 2 matchmaking is to form a good pre-made stack.  I don’t blame Valve for this.  Allowing group queues is at worst a necessary evil because some people simply cannot enjoy the game when teamed with complete strangers.  But trying to make the current system more competitive would only hypercharge this drive to form the most unbeatable stack you can using the people you know, and this would completely leave solo queue players out in the cold.

The reality is that yes, Dota 2 still needs more outlets for playing in a competitive environment, but some kind of visible rating isn’t the answer.  When people ask for a ladder or league setup what they’re really looking for is a formal system of progression, like a sports league.  The rules for advancement are clear and you know what’s on the line when you enter the game.   Dota’s MMR on the other hand cheats.  It moves new accounts up or down in massive jumps because it isn’t interested in a formal progression.  Instead, it wants to get you as quickly as possible to a ranking where you won’t mindlessly stomp your opponents.  This is an unqualified ‘good thing,’ but it’s anathema to a ladder.
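
To make the "acceleration" idea concrete, here is a minimal sketch.  Valve has not published how Dota 2's MMR actually works, so this is not their algorithm.  It is a standard Elo update with a larger step size for new accounts, and the names `k_factor` and `provisional_games`, along with every number in it, are assumptions picked purely for illustration.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expectation: estimated chance of beating the opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def k_factor(games_played, provisional_games=20, provisional_k=64.0, settled_k=16.0):
    """Hypothetical step size: large while the account is new, small afterwards."""
    return provisional_k if games_played < provisional_games else settled_k


def update_rating(rating, opponent_rating, won, games_played):
    """Return the rating after one game (won is True or False)."""
    actual = 1.0 if won else 0.0
    delta = k_factor(games_played) * (actual - expected_score(rating, opponent_rating))
    return rating + delta


# The same win over an equally rated opponent moves a new account four times as far.
new_account = update_rating(1500.0, 1500.0, won=True, games_played=3)
veteran = update_rating(1500.0, 1500.0, won=True, games_played=500)
print(new_account - 1500.0, veteran - 1500.0)  # prints 32.0 8.0
```

Under these assumptions, two players with identical results can end up with different ratings depending only on how old their accounts are.  That is fine for finding fair matches quickly, and it is exactly what a ladder with clear rules for advancement cannot allow.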

Instead of making some new solo queue visible ranking hardcore matchmaking extravaganza, Valve would be better off putting its efforts towards creating a ranking system for 5v5 teams and establishing support for player-run leagues and in-house ladders.  That way, if you want serious play, you get your 5 together and you queue as a team in a draft environment.  As for player-run team and in-house leagues, I feel they’re a better outlet than one mammoth ranked solo queue.  Each league can evolve to try to meet the unique desires of its player base, and it helps establish a sense of local community that is increasingly lost in modern online games.

The problem with ranked solo queue matchmaking is that it’s too easy.  Why go through the trouble of forming a 5v5 team or joining a league when you can just mash the play button and see a number go up or down?  Well, plenty of reasons actually.  But the problem is that they’re good long-term reasons, and in the short term you want to just get your fix with minimal hassle.  By not implementing a ranked solo queue, Valve would be directing the competitive player base to other, more fruitful outlets that might otherwise struggle to maintain the population they need to survive.

Of course, Valve might still implement some kind of ranked matchmaking analogue, but I sincerely hope they don’t.  Matchmaking has been a boon for online gaming, and I certainly wish we had had it a decade ago.  But at the same time there’s something socially hollow to a game that’s nothing more than a series of random encounters.  This is a chance to create a better kind of gaming community.  It won’t be easy, but nothing worth doing ever is.

14 Responses to The Insignificance of Pub Stats, Part 3

  1. TC says:

    I would have liked to see just one more thing relating to MMR. If you ever looked at teamliquid forums a while back, there was thread after thread about ladder anxiety. People (myself included) get freakishly nervous playing 1s on the ladder. The reason, in my opinion, was because Sc2 showed you where you ranked and your record (the w-l originally, then they dropped losses for less than masters divisions). It was important to maintain image. After all, in Sc2 conversations, league mattered. I’ve heard one friend tell another that he didn’t know the game because he was in diamond (and not masters). The fact that one’s relative status was on display made people wary and created anxiety. It’s why I dislike that Valve allows people to see normal/high/very high. On the other hand, DotA has something going for it too. You can blame others. It’s the self serving bias. I won the game. My team lost the game.

    Sc2 didn’t have that. It required mental fortitude to play. People didn’t have the fortitude and we ended up with rampant complaints about anxiety. I’ve never been a big fan of broodwar, but perhaps they did one thing right, they allowed you to reset your stats.

    Btw, great series of posts. I really like what you’ve done in your blog. I hope the community recognizes the value of the insights here. If you need help let me know.

    • phantasmal says:

      A while back I came across the concept of Social Evaluative Threats, which are essentially instances of extreme anxiety driven by circumstances that trigger a peer evaluation. It’s definitely a real thing that plagues multiplayer gaming.

      I actually suspect that queue anxiety in Dota and similar games might be even worse than in Starcraft because the social evaluation occurs as the game is progressing. It’s a relatively high information game, so if you die everyone knows about it (unlike something like TF2 or a WoW battleground where most people are unaware of literally everything that isn’t happening directly in front of them). It’s a small group game so it’s impossible to get lost in the crowd. It’s a game where an underperforming player can actively hinder their team, and what’s worse, it becomes a game where much of the playerbase is searching for a scapegoat to blame their losses on. And it has relatively long rounds with leaver penalties, so you feel compelled to stick around in a poisonous environment for 20+ minutes.

      I think you hear more about queue anxiety in Starcraft because it’s something that sets in late after the players are already established in the game and now have a social standing to live up to. A lot of the people who have queue anxiety in Dota simply quit very early or never start playing at all. I’ve even heard stories of people being unwilling to queue for beginner bots in LoL because they got yelled at their first time. And the fact that someone would yell at people over beginner bots is insane if you’ve ever tried them, but it’s still a thing that happens.

      It’s a pretty big problem with no simple solution, but it’s a good thing to talk about because it regularly goes unrecognized. Sometimes even trivialized.

  2. xdv says:

    Hello, I’m back after a long holiday. I love this post series: I too have argued long and hard for Valve to NOT include statistics and KDR in DOTA2 after seeing its effects in HoN and LoL.

    In fact, I’m not even sure if “becoming a better player” need be a factor at all. Out of a group of 10 million players, there are always going to be 2 million players residing in the bottom 20% of skill, no matter what. The point of playing is to have fun: getting better at the game is pointless if you know that how well you play is relative to everyone around you – there will always be a bottom 20%.

    I’d almost be thinking of something like Riot does, just have convenient and effective player commendations after the match, which basically boils down to – do I like playing with this player, or do I not? Would playing with him again enhance my experience of the game or make it worse?

    Something like, would you like to play with this player again either on your team / opponent team?

    No – Too high skilled (receive enough reports and game promotes them to a higher league)
    No – Too low skilled (receive enough reports and game demotes them to a lower league)
    No – Unpleasant (receive enough reports and punishment ensues)
    Yes – skill level close to mine and friendly towards others (some kind of reward, badges, items, etc)

    With the extra restriction that you can only report your own members for “too high skill” and you can only report enemy players for “too low skill”, to avoid revenge reports. The game will encourage you to consider – if the carry on my team is stomping 30-0, would I like to be on the receiving end of that next game if we get matched on opposite teams? If there is a feeder on the enemy team going 0-15, even if I’m having fun now killing him, would I want him on my team next game?

    Ultimately the point of the whole social system is to make sure everyone you meet is in the 4th category so you can have an enjoyable game. Even if the primary way a person’s MMR moves is through win / loss and not through reports, the placebo effect of actually making a report (too high skill, etc) helps players feel like they are contributing something to improving the system, rather than feeling helplessly victimized by a faceless matchmaking system.

    • phantasmal says:

      Looks like Valve might be agreeing with you!

      I could have sworn Valve tried its commendation-esque system out before Riot’s, but maybe they were independent developments that just happened to come out around the same time. Either way, the real trick is that any system that relies on player ratings is going to have issues with players trying to game the system. Wouldn’t be surprised if there’s already a ton of research going on to get around this problem, but it’s a tricky one.

      The other concern for your system is that even if the high skill/low skill is a pure placebo, some players might get extremely paranoid that they’re receiving bad games because of a conspiracy to rate them as high/low skill. It might not be a large number of players, but if it happens they will be -loud-.

  3. paul k says:

    I feel that the community’s demand for visible MMR is not born of a desire to accurately track improvement. If the immense success of World of Warcraft has taught us anything it’s that people will happily perform mindless tasks as long as their achievements are recorded in some form (experience, gold, titles etc.). If we’re being honest there are plenty of useful metrics for tracking improvement in D2’s current form, it’s just that very few of these are persistent. You have your win/loss and you have your replay “skill bracket”, the exact meaning of which is still unclear.

    So people want something that they can keep track of and “develop” so to speak. That’s something we’ve come to accept as a part of gaming in recent years. The problem arises when said metric actually changes how people play and experience the game. Anyone who transitioned from Warcraft III DotA to HoN can tell you how much their visible MMR interfered with the actual playing of a game. Before you had even loaded the lobby judgments were made based on your rating (not to mention KDR). It certainly increased the percentage of games which were simply “win at all costs”.

    I think that the player base still ought to have a metric that is persistent in this way for the sake of Dota 2 maintaining its relevance. I guess I can only hope that Valve implements its achievements sometime soon.

    • paul k says:

      I should clarify my first paragraph by adding that I don’t believe any of the metrics are great statistical resources, but I think they are just as meaningful (if not more so) than a KDR/visible MMR would be, and not nearly as toxic.

      I read this series of articles in a reverse order, and I believe your ideas regarding useful persistent statistics are really something to aspire to. My only question would be whether you think this is something the community wants (or are aware that they want), and if not, do you think Valve is the type of company to implement a feature that they believe to be positive despite their users wanting something else?

      • phantasmal says:

        When it comes to what the community wants in a multiplayer game, I’d distinguish between features and environments. If you took a requested features list, most of them would be in direct conflict with each other, and any random sampling would be unlikely to create an enjoyable gaming environment. If a particular version of a feature is poisonous to the environment, then the trick is to develop an alternate version that fulfills the demand in a different way that actually strengthens the community. For MMR and progression tracking, people latch on to just replicating HoN and LoL because it’s clearly something Valve could implement tomorrow if they wanted, and people are impatient. I’d like to believe Dota can do better than that, and a better system would absolutely be worth the wait.

        If Portal is any indicator, Valve is the type of company willing to go to great lengths to research how people interact with what they’re creating. That gives me some hope because community development ought to be an iterative, research-driven process. It also should be close to, if not the number one priority, and I have more faith in Valve to give it that priority than many other major developers who shall remain nameless.

  4. dblu says:

    Visible MMR (or even Starcraft-esque brackets) could be useful and very relevant in the picking phase because you’re much more likely to win if your best player plays an impactful utility or carry hero instead of a hero that is less crucial for the team. You can see this in most mid to high level HoN games where people more often than not let the highest MMR person grab a carry or go mid.

    As a former HoN player (currently “very high” in DotA) I’ve also noticed how in HoN people care much more about laning than in DotA. Double melee lanes and throwing an unsuitable solo hero under the bus in the long lane are pretty rare in mid-high level HoN whereas in very high DotA this isn’t all that uncommon. Maybe a visible ranking would encourage people not to throw games away for no reason.

    It also feels aimless to play the game again and again when there’s no metric of your development, especially in the very high bracket when it feels like you’ve already “made it”.

    However KDA is a stat I would never ever like to see as it’s completely irrelevant and would have a bigger negative impact than MMR could ever have.

    Also I can agree on most points Maelk made about this.

    • phantasmal says:

      Visible MMR (or even Starcraft-esque brackets) could be useful and very relevant in the picking phase because you’re much more likely to win if your best player plays an impactful utility or carry hero instead of a hero that is less crucial for the team.

      Has this been proven or is it just a bit of unquestioned conventional wisdom? We could hypothetically (but not currently due to privacy controls) create a collection of public matchmaking games clustered around certain average MMR and controlled so that they are purely solo queue. We could then examine the correlation between each hero’s win rate and what their player’s MMR was relative to the rest of the game. If we did this are you certain that a large segment of heroes would have noticeably higher and lower win rates when they were picked by the extreme ends of the MMR spectrum? If so, would those heroes match precisely the list of heroes you would expect to behave that way? Would the effect be constant at all levels of play? Would the effect diminish or disappear so long as all the players on the team are sufficiently close in MMR, and if so, what percent of games aren’t sufficiently close and are most of these because of group queuing?

      Maybe having visible MMR just encourages players to adopt a risk-averse metagame based on perceived hierarchical differences, and looking briefly at the competition, perhaps this isn’t actually a good thing. Maybe it’s a better direction for Dota that public matchmaking is about learning and exploration more than it’s about raw, unbridled competition.

      And I can understand the feeling that public matchmaking eventually becomes unfulfilling, but is adding visible MMR the right answer to this or merely the most low effort answer? Is it actually a reasonable expectation for there to be a competitive 5v5 game where you barely have to socialize? Dota is balanced around the assumption of teamwork, but a purely solo queue driven public MMR would disproportionately reward strategies that punish low teamwork environments far more than they’d reward strategies that thrive in high teamwork environments. And now that you have two distinctly different competitive outlets, organized 5v5 and solo queue, which one do you balance for when they inevitably disagree on the worth of a hero? It’s easy to say “Oh, ignore Ursa. He’s just a pubstar,” but maybe this opinion changes once there are ‘real’ points on the line.

      I’m in no way saying your concerns are irrelevant. They’re absolutely real, and they’re absolutely serious. But when something is bothering us we tend to immediately reach for the quickest apparent solution, and in this case I don’t feel that the quickest solution is the right solution.

      • dblu says:

        To be frank I don’t really care about how the lower brackets would deal with a visible rating, or how the “community” would develop. I just want to see competition and a thriving eSports scene. Besides, I don’t really think there are such enormous amounts of negatives to a visible rating system that it would matter one way or the other that much.

        Obviously the issue is fairly complex and it’s hard to come up with any definitive answer. But personally I’d like to see some kind of a rating, whatever it may be. I feel like most experienced players share my opinion.

        However, one thing I’d definitely like to see implemented is pairing 5-man groups with 5-man groups exclusively, 4-man groups with 4-man groups exclusively and so forth. It’d make the matchmaking system much more fair.

      • phantasmal says:

        To be frank in the reverse direction, Valve does care about how the community develops, and these concerns are sometimes going to override yours.

        But to get back to the point, who said anything about ‘the lower brackets’? If we’re restricting ourselves to just the toxic nature of a visible MMR hierarchy, I’d say the real problem area is the cusp between High and Very High. It’s an area full of players who think they’re hot shit because they’ve managed to overcome the depressingly low standard of play necessary to make it to the top of high. Many of them are overly reliant on a handful of gimmicks and become irrationally outraged the moment someone picks something that doesn’t gel with their pathetically small repertoire. When those players ask for visible MMR for the picking phase it’s implicitly so they can have games where they can bully the other players on their team into picking their preferred team composition, which adds literally nothing to the competitive development of Dota as a whole.

        Now if what you sincerely want is a better competitive environment, go join the idxl or some other similar league. Help make it a better place, and post on the dev forums feature requests that’ll let these player run leagues be more directly integrated into the client itself. Fostering an actual ‘local’ scene with a familiar community will do so much more to improve the state of competitive play than encouraging a faceless player base to game some abstract number.

  5. dblu says:

    I’m forced (or bullied) to last-pick supports about every other game because another kind of hero would be nothing less than suicide considering our existing lineup. It’s not like it has anything to do with MMR or even the skill bracket.

    I talked about the lower brackets because I think you don’t really need to nanny the very high bracket of the “monstrous” effects of a visible rating.

    I’ll look into idxl, thanks.

    • phantasmal says:

      Speaking as someone who has the almost pathological urge to pick into a good team comp, try to relax and just experiment. Play some RD, SD, or AR if you have to. I’ve mentioned it before, but one example that sticks in my mind was a game where Merlini was stuck with a pretty terrible all melee + 3 carry + double invis comp. He hovers over a support for a bit, and then locks Bounty Hunter. At lower levels of play he’d get raged at hard (I know from experience), but BH actually ended up being a great pick as he completely shut down the opposing Nature’s Prophet and then force fed the rest of his teammates track gold. So sometimes the best response to a really inflexible team comp isn’t to lock the hardest support to try to create some kind of balance, because let’s face it you can’t balance a team like that, but to play around instead with those nebulous characters that hover in the grey areas between Carry-Semicarry-Support. It’s not something you’d do in a competitive game, but I believe what you learn from experiences like that will help make you a better competitive player.

      Too often players get to a point where they’ve started winning and suddenly think they know everything about the game. In reality, they do know more than the average player, but nobody knows everything about Dota. It’s an insane and beautiful mess of probabilities that I don’t think we’ll ever completely untangle. But the real trap here is that in getting to where they are they’ve learned to optimize their play, and they start thinking about optimal automatically. If you’re playing a good team, they’re not going to let you have optimal. Hell, often your own teammates won’t let you have optimal. To win consistently at every level of Dota, you need to learn how to think flexibly. And honestly, matchmaking is a pretty good practice environment for this right now.

      The idxl is cool in concept, and I hope more competitors start to spring up. Part of my issue with visible ratings is that a visible rating matchmaking risks driving these player-run leagues into extinction. It’s hard to run a league, and you’re constantly dependent on maintaining a critical mass of players. Ranked solo matchmaking is a really vicious competitor, and I honestly believe it’d kill off most leagues before they even start.

      But for creating a competitive scene those leagues are way better than matchmaking could ever hope to be. You have a local populace who you get to know. You can discuss strategies and game replays together. You get to know these people. What time zone they play in. Whether you can stand to be around them. This all-important level of familiarity that fosters the creation of more 5-man squads.

      And eventually maybe these mini-leagues start to develop their own following and their own streams. I’d liken them to college football conferences that develop loyal fanbases and rivalries and then eventually the best of these mini-leagues compete in some analogue of the bowl series, and the winners try to graduate into the full on pro scene. I feel as an evolutionary path for the competitive scene that this would be way more productive than just slapping on a rating to public matchmaking and calling it a day. Because the biggest hurdle to making a successful 5-man squad isn’t the mechanical aspect or even the knowledge aspect. It’s the social element.

      • Throwback says:

        Great response. I love Valve’s approach towards the community and I love your analysis of it.

        By Valve letting stats stay private, they are in fact helping the overall system. The beauty is that the part of the community which wants ‘more’ (ironically while still being unwilling to form a team and genuinely compete) has started to fill in those gaps – and Valve’s platform makes that incredibly easy to do.
