I enjoyed the data mining, but how many players account for those 191 games? And how often did each of those players play each faction?
Just curious to see how perceived skill levels feed into the win/loss stats. Meaning, if the "pro" players played Allies more often than the "noob" players did, the per-side win stats would skew that way.
In total, more than 100 players took part, but only 27 played more than 4 games. The top players by number of games are:
22 | RealG DevM |
20 | Theodosios |
16 | JesuliN |
12 | cruzz |
11 | VonIvan |
11 | -HOI-PauL.a.D |
10 | SoE-Overlord- |
10 | Brosras {C} |
10 | Sotjador |
10 | wmm//wuff |
Good point, though: if for some reason the "pro" always picked OH while the "noob" always picked OKW, the stats would obviously show a much better win percentage for OH and a very poor one for OKW. That said, I doubt this has a significant impact, because why would that happen? If you check out the player-based stats for ESL, you'll find that even for top players, personal preference seems to play a really important role in faction choice.
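To make the skew concrete, here is a rough sketch of that worst case. Everything in it is an assumption for illustration: the ratings (1800 / 1600 / 1400), the Elo-style win formula, and the made-up match lists are not taken from the WPC data. The factions themselves are treated as perfectly balanced, so any difference in win rate comes purely from who picks what.

```python
def win_probability(skill_a, skill_b):
    """Elo-style chance that player A beats player B (hypothetical model)."""
    return 1.0 / (1.0 + 10 ** ((skill_b - skill_a) / 400.0))


def expected_faction_win_rate(matches):
    """Expected win rate per faction when only player skill matters.

    matches: list of (faction_a, skill_a, faction_b, skill_b) tuples.
    """
    wins, games = {}, {}
    for faction_a, skill_a, faction_b, skill_b in matches:
        p = win_probability(skill_a, skill_b)
        for faction, expected in ((faction_a, p), (faction_b, 1.0 - p)):
            wins[faction] = wins.get(faction, 0.0) + expected
            games[faction] = games.get(faction, 0) + 1
    return {faction: wins[faction] / games[faction] for faction in games}


# Hypothetical ratings, not derived from any real ladder.
PRO, AVERAGE, NOOB = 1800, 1600, 1400

# Worst case: every "pro" picks OH, every "noob" picks OKW.
skewed = [("OH", PRO, "SOV", AVERAGE)] * 10 + [("OKW", NOOB, "SOV", AVERAGE)] * 10

# Mixed case: pros and noobs are spread evenly over both factions.
mixed = (
    [("OH", PRO, "SOV", AVERAGE)] * 5 + [("OH", NOOB, "SOV", AVERAGE)] * 5
    + [("OKW", PRO, "SOV", AVERAGE)] * 5 + [("OKW", NOOB, "SOV", AVERAGE)] * 5
)

skewed_rates = expected_faction_win_rate(skewed)
mixed_rates = expected_faction_win_rate(mixed)

print("skewed picks:", {f: round(r, 2) for f, r in skewed_rates.items()})
print("mixed picks: ", {f: round(r, 2) for f, r in mixed_rates.items()})
```

With the skewed picks, OH looks far stronger than OKW even though the model gives neither faction any advantage; with mixed picks, both come out even. The size of the effect depends entirely on the assumed ratings.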
Fact is, there is a significant skill gap in most matchups. And not just between the top 30 players and some rank 400 guy but even within the top 30. So, you'd have to devise some really vague criterion that defines skill and then end up with only a handful of games between players of the same skill, knowing that you probably still haven't eliminated the issue completely, because, hey, what if that player just had a really bad day? I wouldn't know how to take this into account properly.
What if DevM (playing exclusively OH and USF) hadn't participated? Well, OK, the USF and OH win rates would probably be lower. But even if you check out his personal stats, the trend is similar to what we see in the total data: he won all his USF matches, but lost 4 games as OH. Or Jesulin: he played 4 games as SOV vs. OH, winning 50%. He played OH only once, losing to USF. He won all 5 games as USF. He won 3 out of 6 games as OKW.
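For anyone who wants to tally this themselves, a minimal sketch using only the Jesulin numbers quoted above. The variable and function names are made up for the example, and the SOV entry covers just the four games against OH mentioned in this post.

```python
# (faction, games played, games won) for Jesulin, as quoted in this post.
jesulin_record = [
    ("SOV", 4, 2),  # 4 games vs. OH, 50% won
    ("OH", 1, 0),   # single game, lost to USF
    ("USF", 5, 5),  # won all 5
    ("OKW", 6, 3),  # won 3 out of 6
]


def summarize(record):
    """Return per-faction win rates plus total games and total wins."""
    per_faction = {faction: wins / games for faction, games, wins in record}
    total_games = sum(games for _, games, _ in record)
    total_wins = sum(wins for _, _, wins in record)
    return per_faction, total_games, total_wins


per_faction, total_games, total_wins = summarize(jesulin_record)

for faction, rate in per_faction.items():
    print(f"{faction}: {rate:.0%}")
print(f"overall: {total_wins}/{total_games} = {total_wins / total_games:.1%}")
```

The total comes to 16 games, which agrees with his entry in the games list above, and 10 wins overall. With only one to six games per faction, the per-faction percentages are obviously very noisy.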
TLDR: I agree, the statistics depend to some extent on which players took part and so on, so they first and foremost reflect what happened during WPC, and you have to be somewhat careful about generalizing them. And I wouldn't really know how to cover this properly...
This is kind of similar to my regular job: I get data I know is inaccurate and to some extent inconclusive. And then I have to derive models from that data. While that might seem weird, the thing is, if you don't use that less-than-optimal data, you have no model. So, this is OK as long as you know what the limitations and shortcomings are.