Simulating skill in the Premier League: Part 1

I love sports. Every week I watch Tottenham play, and just as regularly I go through the emotional roller-coaster that entails. As a sports fan I use the first person plural to describe 'my' team, and am convinced we need to keep Gareth Bale, just as I hold the conviction that Tottenham will bottle it come May. But at the same time, I get frustrated when pundits bring out stats along the lines of 'Wigan haven't lost in their last 5 trips to X' (OK, that would be quite a good stat), 'Liverpool haven't won without Suarez since February', or 'player X has scored in five games in a row, wow is he on fire'. I am convinced sports pundits highlight 'trends' that are actually almost all random. Of course skill plays a big part in sport, but when everyone has skill, how important is chance? Below I demonstrate that doubling skill levels, rather than guaranteeing victory, results in only a 50% increase in points gained.

Sparked off by a conversation about how important marginal gains due to statistics might be for sport, I decided to actually check.

I wrote a basic football simulator, the goal being to see how much variation in results is due to chance, and how much is due to skill, data analysis, tactics, having the best players, etc. It plays every game of a season and stores the results. The only rules are that each team may score at most once every 5 minutes, and that it has a fixed chance of doing so. Here is one example of a match, between Tottenham and Arsenal, with Tottenham having a 5% chance of scoring and Arsenal a 4% chance (put differently: a 95% and a 96% chance of not scoring).

I wrote the code in Python, and am happy to share it, but for now I've struggled to make IPython play nice with Blogger, so I'm pasting images of my code instead:
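For readers who prefer text, a minimal sketch of the rules described above might look like the following. It is not the original code: the 18 scoring chances per team (90 minutes split into 5-minute slots), the names `play_match` and `points`, and the seed are all assumptions made for illustration.

```python
import random

# Assumption: a 90-minute match gives each team one scoring chance per 5 minutes.
SLOTS_PER_MATCH = 90 // 5


def play_match(p_home, p_away, rng=random):
    """Return (home_goals, away_goals) for one simulated match.

    Each team scores in a slot with its own fixed probability.
    """
    home_goals = sum(rng.random() < p_home for _ in range(SLOTS_PER_MATCH))
    away_goals = sum(rng.random() < p_away for _ in range(SLOTS_PER_MATCH))
    return home_goals, away_goals


def points(goals_for, goals_against):
    """Standard league scoring: 3 for a win, 1 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0


rng = random.Random(2013)  # arbitrary seed, only so the run is repeatable
spurs, arsenal = play_match(0.05, 0.04, rng)
print(f"Tottenham {spurs} - {arsenal} Arsenal")
print("Tottenham points:", points(spurs, arsenal))
```

Running it with different seeds gives an idea of how often the nominally weaker side comes out on top in a single match.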

In this case, as sadly so often in real life, Arsenal won. And not just by a small margin, but 3-0. Pundits would have had a field day, identifying why Arsenal were so much better, what tactical decisions were crucial, etc. Except in this case, any patterns identified would be nothing but coincidence. The result is even against the odds: Tottenham had a 20% skill advantage but still failed to capitalize. The reason is that in football the odds of scoring are small, so a lot of results are determined by luck. However, you might say, 'I bet this guy just ran the simulator loads of times until he got the result he wanted. These things even out over the course of the season.' And you'd be right about the former, but not the latter. So to prove my point holds, I wrote a mini league with just Tottenham and Arsenal where each season involves only two games. I ran this simulation 100 times to see how big an effect the 20% advantage would yield. What do you think I found?
Virtually no effect. We might expect an average of 3 points over two games (evenly matched teams sharing the wins), but that would ignore the option of a draw. Over 10 000 trials in a fair simulation (both teams had a 4% chance of scoring), the average points total was 2.66. The first time I ran the simulation with the skill advantage, Tottenham picked up 2.7 points on average; that is, despite a 20% performance bonus they only got a 2% increase in results. But 100 is a small sample in statistical terms (though much larger than the number of local derbies in any footballer's career!), so I re-ran the simulation 10 000 times to get the actual points benefit, and included a histogram to show the variation in points won per season:
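A season loop of that kind could be sketched as follows. This is an illustrative reconstruction under the same assumed rules (18 chances per team per match), not the code behind the figures quoted here; `season_points`, `run_trials` and the seed are made-up names and values, and the histogram is returned as plain counts rather than plotted.

```python
import random
from collections import Counter

SLOTS = 18  # assumption: 90 minutes / one scoring chance every 5 minutes


def goals(p, rng):
    """Goals scored in one match given a per-slot scoring probability p."""
    return sum(rng.random() < p for _ in range(SLOTS))


def season_points(p_team, p_opponent, rng, games=2):
    """Points the first team collects over a mini season of `games` matches."""
    total = 0
    for _ in range(games):
        scored, conceded = goals(p_team, rng), goals(p_opponent, rng)
        if scored > conceded:
            total += 3
        elif scored == conceded:
            total += 1
    return total


def run_trials(p_team, p_opponent, trials, seed=0):
    """Return (mean points per season, Counter of season point totals)."""
    rng = random.Random(seed)
    results = [season_points(p_team, p_opponent, rng) for _ in range(trials)]
    return sum(results) / trials, Counter(results)


mean_pts, histogram = run_trials(0.05, 0.04, trials=10_000)
print(f"mean points per two-game season: {mean_pts:.2f}")
for pts in sorted(histogram):
    print(f"{pts} points: {histogram[pts]} seasons")
```

Calling `run_trials` with `trials=100` and several different seeds shows how much the average moves around at the smaller sample size.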


This gives a truer picture: now Tottenham won close to three points over two games on average, corresponding to an 11.2% increase in results. The crucial point here is that the randomness inherent in football's rules as a low-scoring game outweighs any small performance bonus. Now, just to demonstrate that the code is not wrong, here's what happens when Tottenham have a 10% chance of scoring, 2.5 times Arsenal's:
So here the large skill difference translates into a noticeable, but not unbelievable, benefit over 10 000 trials: on average Tottenham won 4.3 points out of 6, or 61% more than we might expect. The same pattern that we saw above holds: for any percentage gain in results, twice the percentage increase in skill is needed. In the first simulation a 20% increase in skill resulted in a 10% gain in results, while in the second a 120% skill bonus resulted in a 61% improvement in results. This relationship is almost linear: as a team approaches a 5x higher skill level, victory is virtually guaranteed. This model is effectively one with a tipping point, where victory goes from being in large part due to luck to becoming inevitable.
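The link between skill ratio and points can also be examined without sampling noise. The sketch below assumes, as before, 18 independent scoring chances per team, so that goals follow a binomial distribution; that modelling choice and the names `goal_pmf` and `expected_points` are mine, not taken from the original simulator, so the numbers it prints need not match the simulated ones exactly.

```python
from math import comb

SLOTS = 18  # assumption: one scoring chance per team every 5 minutes


def goal_pmf(p, n=SLOTS):
    """Binomial probability of scoring k goals from n independent chances."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def expected_points(p_team, p_opponent, games=2):
    """Exact expected points for the first team over `games` matches."""
    mine, theirs = goal_pmf(p_team), goal_pmf(p_opponent)
    # Win: the team scores i goals and the opponent scores fewer than i.
    win = sum(mine[i] * theirs[j] for i in range(SLOTS + 1) for j in range(i))
    # Draw: both sides score the same number of goals.
    draw = sum(m * t for m, t in zip(mine, theirs))
    return games * (3 * win + draw)


baseline = expected_points(0.04, 0.04)
for p in (0.04, 0.05, 0.08, 0.10, 0.20):
    pts = expected_points(p, 0.04)
    gain = pts / baseline - 1
    print(f"{p / 0.04:.2f}x skill: {pts:.2f} points ({gain:+.0%} vs fair)")
```

Sweeping the scoring probability like this is a quick way to see where extra skill stops buying many extra points, since the total is capped at 6.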

What is going on here? Well, scoring a goal is improbable. If we changed the rules to allow for tennis scores, the results would be much starker. The upshot of this is that to guarantee success in football, you have to be multiple times better than the opposition, not just marginally better. Doubling performance will increase victory likelihood by 50%. Intuitively this makes some sense: upsets in football are much more common (think of any World Cup) than in sports such as tennis, where the game structure is such that marginal skill differentials lead to predictable results. Because in reality skill differences at the elite level are small, luck will always play a considerable role.

OK, let's not get carried away here: this simulation is beyond simplistic, and arguably scoring a goal is about a chain of events all coming off, all of which require skill. Also, an infinite number of factors could be added: home advantage, red cards, star players, etc. The point I hope I made, though, is that while people focus tremendously on those variables, the variable of chance tends to be ignored. I often hear commentators term near-misses 'unlucky'. Very rarely do they say the opposite about the screamer from 30 yards that finds the top corner.
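One rough way to illustrate the contrast with high-scoring sports is to keep the per-chance probabilities at 5% and 4% but give each side many more scoring chances per match. This is only a stand-in for 'tennis scores', not a model of tennis; the binomial assumption, the name `win_probability` and the chance counts of 18, 90 and 360 are all assumptions for the sketch.

```python
from math import comb


def win_probability(p_strong, p_weak, chances):
    """Exact chance that the stronger side outscores the weaker one."""

    def pmf(p):
        return [
            comb(chances, k) * p**k * (1 - p) ** (chances - k)
            for k in range(chances + 1)
        ]

    strong, weak = pmf(p_strong), pmf(p_weak)
    total = 0.0
    weak_scores_fewer = 0.0  # P(weaker side scores fewer than i goals)
    for i in range(chances + 1):
        total += strong[i] * weak_scores_fewer
        weak_scores_fewer += weak[i]
    return total


for chances in (18, 90, 360):
    prob = win_probability(0.05, 0.04, chances)
    print(f"{chances} chances per team: stronger side wins {prob:.1%}")
```

The same skill gap becomes more decisive as the number of chances grows, because the random swings in each side's total shrink relative to the gap between their averages.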

In Part 2 I explore a somewhat more complex model, simulate a whole season with the right number of teams, and look at how much skill is needed to give a believable Premier League table, and how much better the best teams have to be to ensure a consistently high position.

