An analytical attempt to bring science to the art of selection – Can science come close to art?

Performance analysis is the main tool for measuring how football players perform, and it usually takes the form of a rating system: players are rated on what they do on the pitch. Media data companies such as OPTA rate players using 3000 real-time data metrics; algorithms built on these metrics capture every movement of a player and produce a rating. Each metric is given a weight according to its bearing on the player’s performance, and the weighted metrics are combined into the rating. These ratings help the major stakeholders in the sports industry make decisions about players.

FIFA publishes current player ratings that roughly represent the strength of a player and of a team. Football is a complex sport: players have varied roles, such as goalkeeper, defender, midfielder and forward (striker), and each role demands a different set of skills. Each group is divided further; midfielders, for example, are split into defensive, central and attacking midfielders. How the players are actually used depends on the strategies set by managers and coaches, and on the strategy they expect from the opponent. A player’s role can also change during a game: a team that is losing may switch a defensive player to an attacking role. Rating football players is therefore complex, but a club that relies on these ratings can strengthen its team by signing highly rated players and by finding undervalued players in the transfer market.

The objective of this article is to derive a new formula for rating a football player, based on 31 important metrics that affect the game.

The new rating model is then used to prepare a probable 36-man England squad for the 2018 World Cup and to check how far it overlaps with the actual selection.

The data has been collected from whoscored.com, whose data and metrics are provided by OPTA. The website lists 31 metrics that most affect the game, and its match center tab shows live player ratings, a heat map of each player’s movements on the pitch, and the change in ratings as the players’ performance changes.

OPTA*- Opta are the Official Media Data Partners to Football DataCo. Opta collects the most complete dataset, live.

31 metrics*- minutes played, shots, shots on target, aerials won, touches, unsuccessful touches, dribbles, fouled, offsides, dispossessed, total tackles, interceptions, clearances, blocked shots, fouls, key passes, passes, passing accuracy percentage, crosses, accurate crosses, long balls, accurate long balls, through balls, accurate through balls, points, assists, goals, clean sheets, red cards, yellow cards, rating provided by OPTA. (https://www.whoscored.com/Matches/1190479/MatchReport/England-Premier-League-2017-2018-Manchester-United-Watford).


Data Analysis Techniques: 

The data from the website is then transferred to Excel sheets and organized. The whoscored website presents the data under four tabs: summary, offensive, defensive and passing.


The data is then moved to SPSS (a software platform for advanced statistical analysis) and run through four regression methods: enter, stepwise, backward and forward. The output of the backward regression is used because the metrics it retains have lower significance values. Metrics with lower significance values yield more dependable weights, and these weights form the basis of the new formula.
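The backward method described above can be sketched outside SPSS. The following is a minimal illustration, not the SPSS procedure itself: it fits ordinary least squares and repeatedly drops the predictor with the smallest absolute t-statistic until every remaining predictor reaches a threshold. The threshold of 2.0, the function names and the tiny data set are all assumptions made for the example; SPSS applies its own removal criterion.

```python
import math

def _fit(X, y):
    """Solve the normal equations; return (coefficients, inverse of X'X)."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    return beta, inv

def t_statistics(X, y):
    """t-statistic of every coefficient (requires more rows than columns)."""
    beta, inv = _fit(X, y)
    n, k = len(X), len(X[0])
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return [b / math.sqrt(sigma2 * inv[i][i]) for i, b in enumerate(beta)]

def backward_eliminate(columns, y, t_min=2.0):
    """Drop the weakest predictor until all remaining |t| >= t_min.

    columns maps a metric name to its list of values; t_min is an assumed
    stand-in for a significance cut-off.
    """
    kept = list(columns)
    while kept:
        # Column 0 is the intercept; the remaining columns follow `kept`.
        X = [[1.0] + [columns[c][i] for c in kept] for i in range(len(y))]
        t = t_statistics(X, y)[1:]  # skip the intercept
        weakest = min(range(len(kept)), key=lambda i: abs(t[i]))
        if abs(t[weakest]) >= t_min:
            break
        kept.pop(weakest)
    return kept

# Invented data: the rating rises with goals, while "junk" is unrelated to it.
columns = {
    "goals": [0, 1, 2, 3, 0, 1, 2, 3],
    "junk": [1, -1, -1, 1, -1, 1, 1, -1],
}
ratings = [6.1, 6.4, 6.9, 7.6, 6.1, 6.4, 6.9, 7.6]
kept = backward_eliminate(columns, ratings)
print(kept)
```

In this toy data set the unrelated column is removed and only the goals column survives, which mirrors how the backward method trims metrics that do not explain the rating.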


From the output of the backward regression, the new formula is derived as:

0.4%*MP + 0.5%*S + 4.1%*SOT + 3.9%*DR + 1.1%*FO + 3.5%*OFF + 4.5%*AW + 0.2%*T + 2.1%*KP + 10.6%*A + 18.2%*G + 9.1%*CS - 0.7%*P + 1.5%*PA% - 0.4%*CR + 3.2%*ACR + 4.1%*LB - 4.2%*ACLB + 4.7%*TB + 2.8%*TT + 1.7%*I - 2.4%*YC - 9.9%*RC + 6.6%*PO
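The formula can be written as a small function. This is a sketch under two assumptions that the article does not state: each percentage is read as a fraction (0.4% becomes 0.004), and each metric enters as a raw count. The abbreviations are kept exactly as they appear in the formula; the example stat line is invented and reads MP, S, SOT, G, A and YC as minutes played, shots, shots on target, goals, assists and yellow cards.

```python
# Weights copied from the formula, in percent. Reading "x%" as x / 100
# is an assumption.
WEIGHTS_PERCENT = {
    "MP": 0.4, "S": 0.5, "SOT": 4.1, "DR": 3.9, "FO": 1.1, "OFF": 3.5,
    "AW": 4.5, "T": 0.2, "KP": 2.1, "A": 10.6, "G": 18.2, "CS": 9.1,
    "P": -0.7, "PA%": 1.5, "CR": -0.4, "ACR": 3.2, "LB": 4.1, "ACLB": -4.2,
    "TB": 4.7, "TT": 2.8, "I": 1.7, "YC": -2.4, "RC": -9.9, "PO": 6.6,
}

def formula_score(stats):
    """Weighted sum of the formula; metrics missing from `stats` count as 0."""
    return sum(w / 100.0 * stats.get(name, 0.0)
               for name, w in WEIGHTS_PERCENT.items())

# Hypothetical stat line, invented for illustration only.
example = {"MP": 90, "S": 3, "SOT": 2, "G": 1, "A": 1, "YC": 1}
print(round(formula_score(example), 3))
```

Keeping the weights in a dictionary makes it easy to compare them with the regression output and to replace them if the regression is rerun on more data.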

Using this new rating system, the probable 36-man England squad for the 2018 World Cup is prepared.
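The selection and the overlap check can be sketched as follows. The article does not say how ratings are aggregated per player, so the sketch simply assumes one rating per player, picks the highest-rated names, and measures what share of the actual selection they cover. The function names and the player labels are hypothetical.

```python
def select_squad(ratings, size):
    """Return the `size` highest-rated player names (ties broken by name)."""
    ranked = sorted(ratings.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:size]]

def overlap(predicted, actual):
    """Names present in both lists, and their share of the actual selection."""
    common = sorted(set(predicted) & set(actual))
    share = len(common) / len(actual) if actual else 0.0
    return common, share

# Invented ratings and selection, for illustration only.
ratings = {"Player A": 7.4, "Player B": 6.9, "Player C": 7.1, "Player D": 6.2}
squad = select_squad(ratings, 2)
common, share = overlap(squad, ["Player A", "Player B"])
print(squad, common, share)
```

With real data, `size` would be 36 and the second argument to `overlap` would be the squad that was actually named.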


Insights discovered:

According to the algorithm, a player enters the field with a base rating of 6, and his rating then increases or decreases according to his performance on the pitch.

As can be observed from a live match, from the ratings provided by any media data company, and from the new rating model, minutes played is an important factor.

Players who have played the most minutes have affected the game more than players with fewer minutes.

Key drivers of ranking:

The key drivers of the ranking, or rating, are minutes played, assists, goals, clean sheets, red cards and points.

Although the weight of the minutes-played metric is very low, it has a large effect when a player has played few minutes but has scored a goal or given an assist.

Minutes played matters because it positively affects both the game and the player’s performance: a player who is on the pitch for more minutes has more time to perform and to help the team.

Assists matter because they lead to goals and reflect a player’s passing ability and vision to see the pass.

Goals indicate a player’s ability to convert a shot; they keep the team on the safe side and help win matches.
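The relative size of these drivers can be checked with simple arithmetic on the weights in the formula. The sketch below assumes that each percentage is read as a fraction (0.4% as 0.004) and that metrics enter as raw per-match counts; both are assumptions, and the two stat lines are invented.

```python
# Weights for minutes played, goals and assists, taken from the formula
# and read as fractions (an assumption).
W_MINUTES, W_GOAL, W_ASSIST = 0.004, 0.182, 0.106

def contribution(minutes, goals=0, assists=0):
    """Split a score into the parts due to minutes, goals and assists."""
    return {
        "minutes": W_MINUTES * minutes,
        "goals": W_GOAL * goals,
        "assists": W_ASSIST * assists,
    }

full_match = contribution(90)           # plays 90 minutes, no goal or assist
late_sub = contribution(15, goals=1)    # plays 15 minutes and scores once

print(full_match, round(sum(full_match.values()), 3))
print(late_sub, round(sum(late_sub.values()), 3))
```

Under these assumptions, 90 minutes alone contributes 0.36, more than a single goal (0.182), while a substitute who plays 15 minutes and scores totals about 0.242. The small per-minute weight therefore adds up to one of the largest terms over a full match.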

Key players missed out, why analytics missed these players:

Some key players and familiar names did not appear in the list when the players were rated with the new formula. The top five teams play in more competitions than the clubs ranked sixth and below. These key players missed out because their teams had more options in the squad and rested them in the Premier League, where both the intensity of the game and the risk of injury are high; they therefore played fewer minutes, which lowered their ratings.

The new rating formula has been developed from the data of 380 matches of the 2017/2018 Premier League season. The analysis and the formula could be improved with data from more than 380 matches. If data from all the major leagues were available, the ratings from the new formula could be more accurate than they already are.

 
