How can I compare players across competitions and countries?

Ferdinand Schlatt

Comparing players across competitions, countries, and especially continents is a difficult problem. How do we at Matchmetrics approach it? TL;DR: we combine the rating changes that players show after transfers with a unified embedding of all competitions to obtain a global league strength index. Read on if you want more details.

 

What is the problem?

Everybody in the football world has asked this question at some point. A young talent is playing well in a youth league or in the second team of a major club, but will he be able to play at the same level in adult or higher leagues? What about a star player in a minor European league: does he have the skill to play in the top leagues as well? And what if a smaller club wants to expand its scouting efforts and look for players on other continents?

Essentially, the problem boils down to this: a player’s skill can only be gauged from how he performs in the league he currently plays in. To predict his performance in another league, we need an estimate of the difference in strength between the two leagues. Instead of guessing that difference, let us look for a more substantiated, data-driven approach.

 

How are we trying to solve it?

Alright, so our goal is to find a way to measure the performance difference between leagues. What kind of data do we have at our disposal? Through Scoutpanel, we have access to a large amount of event-based rating data for over 160,000 players across more than 550 competitions and multiple continents. Especially interesting are players who have played in several competitions or transferred to a different league. The change in a player’s rating after moving from one league to another gives us a direct signal of the difference in skill level between the two leagues. Extremely simplified: if a player’s rating improves after a move, we can assume his new league is weaker, and vice versa.

Of course, a player’s performance is extremely complex and depends on countless factors. However, if we consider the rating change over many transfers, these factors average out and we obtain a general trend in skill difference. Let’s look at an example: between 2013 and 2019, 514 players transferred from the English Premier League (EPL) to the Football League Championship (FLC). On average, these players gained about 0.85 in rating. In the same period, 459 players transferred from the FLC to the EPL and lost around 0.73 in rating on average. Both directions point the same way and average out to about 0.8, so we can assume the EPL is roughly 0.8 rating points stronger than the FLC.
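The arithmetic behind the EPL and FLC example can be sketched in a few lines of Python. This is only an illustration: the function name `league_gap` is made up here, and weighting each direction by its number of transfers is an assumption about how the two averages could be combined, not a description of the production method.

```python
def league_gap(gain_moving_down, n_down, loss_moving_up, n_up):
    """Estimate how much stronger the upper league is than the lower one.

    gain_moving_down: average rating gained when moving to the lower league
    loss_moving_up:   average rating lost (as a positive number) when moving up
    n_down, n_up:     number of transfers in each direction

    Assumption: both directions are combined as a weighted average,
    weighted by the number of transfers.
    """
    total = n_down * gain_moving_down + n_up * loss_moving_up
    return total / (n_down + n_up)


# Figures from the article: EPL -> FLC (514 transfers, +0.85 on average)
# and FLC -> EPL (459 transfers, 0.73 lost on average).
epl_vs_flc = league_gap(0.85, 514, 0.73, 459)
print(round(epl_vs_flc, 2))  # 0.79
```

An unweighted mean of the two directions gives the same result to two decimals, so the choice of weighting barely matters in this example.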

 

How do we obtain a global league index?

Great, so we can compare competitions based on past transfers. But what about comparing the skill level between two competitions without any transfers? For example, it is fair to assume that a transfer from the Brazilian Serie D into the Spanish Primera Division is fairly unlikely. However, by chaining multiple leagues together, we can arrive at an approximate skill level difference. Transfers between the Serie D and C are common. From the Serie C, we can jump to the Serie B and then to the Serie A and finally from the Serie A to the Primera Division.
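Chaining can be illustrated with a small graph search that sums the pairwise differences along a path of leagues. The sketch below is hypothetical throughout: the function `chained_gap` and the gap values are invented for illustration and are not figures from our database.

```python
from collections import deque


def chained_gap(gaps, source, target):
    """Return how much stronger `target` is than `source`.

    gaps maps (weaker_league, stronger_league) to the rating gap between
    them. The gaps are summed along the shortest chain of leagues that
    connects source and target; None is returned if no chain exists.
    """
    graph = {}
    for (weaker, stronger), gap in gaps.items():
        graph.setdefault(weaker, []).append((stronger, gap))
        graph.setdefault(stronger, []).append((weaker, -gap))

    queue = deque([(source, 0.0)])
    seen = {source}
    while queue:
        league, total = queue.popleft()
        if league == target:
            return total
        for neighbour, gap in graph.get(league, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, total + gap))
    return None


# Hypothetical gap values, for illustration only.
gaps = {
    ("Serie D", "Serie C"): 0.5,
    ("Serie C", "Serie B"): 0.5,
    ("Serie B", "Serie A"): 0.75,
    ("Serie A", "Primera Division"): 0.25,
}
total_gap = chained_gap(gaps, "Serie D", "Primera Division")
print(total_gap)  # 2.0
```

A single chain like this accumulates the error of every link, which is one reason to fit all leagues jointly rather than rely on one path.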

The final problem we need to solve is the extensive interdependence between leagues. The skill level of the English Premier League depends not only on its difference to the Football League Championship, but also on the strength of the French Ligue 1, the German Bundesliga, the UEFA Champions League, and so on.

To put it more visually: all football competitions worldwide form a giant web, in which competitions are connected by strands of transfers. What we can now do is stretch these strands so that, for each competition, every strand has the length of the corresponding rating difference. For example, we need to place the EPL and the FLC roughly 0.8 apart from one another. In practice, it won’t be possible to get the distances exactly right for every league. We have to settle for the web that best approximates the global network of rating differences.
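One common way to find such a best-fitting web is weighted least squares. The sketch below assumes a single strength value per league and uses NumPy; it is a simplified stand-in, not our actual embedding, which also yields the stability dimension. The function name, "League X", and all gaps except the 0.8 between EPL and FLC are hypothetical.

```python
import numpy as np


def fit_league_strengths(leagues, gaps):
    """Place leagues on one strength axis by weighted least squares.

    gaps is a list of (weaker, stronger, gap, n_transfers) tuples. The fit
    minimises sum(n * (s[stronger] - s[weaker] - gap) ** 2) and anchors the
    scale by asking the strengths to sum to zero.
    """
    index = {name: i for i, name in enumerate(leagues)}
    rows, targets = [], []
    for weaker, stronger, gap, n_transfers in gaps:
        weight = np.sqrt(n_transfers)  # sqrt because residuals get squared
        row = np.zeros(len(leagues))
        row[index[stronger]] = weight
        row[index[weaker]] = -weight
        rows.append(row)
        targets.append(weight * gap)
    # Anchor row: only differences are observed, so fix the overall offset.
    rows.append(np.ones(len(leagues)))
    targets.append(0.0)
    solution, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return {name: float(solution[i]) for name, i in index.items()}


# Hypothetical, deliberately inconsistent gaps: 0.8 + 0.6 is not 1.1,
# so no placement can match all three strands exactly.
gaps = [
    ("FLC", "EPL", 0.8, 100),
    ("League X", "FLC", 0.6, 100),
    ("League X", "EPL", 1.1, 100),
]
strengths = fit_league_strengths(["EPL", "FLC", "League X"], gaps)
print(round(strengths["EPL"] - strengths["FLC"], 2))       # 0.7
print(round(strengths["FLC"] - strengths["League X"], 2))  # 0.5
```

The fitted distances (0.7 and 0.5) differ from the observed ones (0.8 and 0.6) because the solver spreads the inconsistency across all three strands, which is the compromise described above.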

 

Result

Alright, let us compute the web for a total of nearly 1.5 million "transfers" in our database. Here, a transfer means any change in competition within a season or across two seasons. For example, suppose a player plays in the German Bundesliga and the UEFA Champions League in both the 2018 and 2019 seasons. That means a total of 4 transfers: BL’18 -> CL’18, CL’18 -> BL’18, BL’18 -> CL’19 and CL’18 -> BL’19.

And this is what it looks like:

[Figure: Competition level by Matchmetrics]

This excerpt shows the top European leagues. Stability is shown on the x-axis and rating on the y-axis. At first glance, the distribution matches the general intuition quite well. The English Premier League and Spanish Primera Division are on top in both rating and stability, with the other major European leagues following close behind.

Taking a closer look, we can gain additional insights. For example, the international and national cup competitions have intuitively correct ratings but don’t line up with the league competitions in terms of stability. The UEFA Europa League has a rating somewhere between the top 5 European leagues and minor leagues like the Swedish Allsvenskan and the Dutch Eredivisie. So far so good, as the competition is made up of teams from all of these leagues. However, its stability is a lot lower. From this, we can conclude that clubs perform less consistently in international or domestic cups than in regular league play.

 

How does this help us?

So what can we now do with this information? The current Scoutpanel ratings always need to be considered in the context of the league, and there is currently no easy way to compare players across leagues. We can now use the league strength information to estimate how well a player will perform in a different league. However, as already mentioned, a player’s performance in a different league depends on so many factors that simply subtracting the difference in league strength will rarely give a good estimate. As a next step, we are trying to collect many of these factors and combine them with the league strength into a single prediction model. We can then use this model to estimate a player’s rating irrespective of which league he is transferring to. Always wanted to know how Messi would perform in the Latvian Virsliga? Stay tuned for more!