Machine Learning Application in Sports: An Injury Forecasting in Soccer

20/01/2022 By: Víctor Escamilla & Ismael Fernández

We start the year with one of the first articles published in 2022, which analyzes the role of infrared thermography in the prevention of sports injuries. The group of Alessio Rossi and collaborators (2021) has previously published very interesting articles on data analysis, aimed at forecasting injuries from GPS data in order to prevent them (Rossi, A. et al. 2018).

In this case, the researchers conducted a narrative review with the aim of providing a guide on how to correctly train, validate and test machine learning models that predict events in sports science.

The authors highlighted the importance of these prediction models because of the economic savings they entail. The impact of injuries on the sports industry has been widely studied: they affect both the performance of the team and the finances of the club. Recent data show that an English Premier League team loses around £45m per season due to injury-related performance decline (Eliakim, E. et al. 2020).

Previous prediction models in the literature

During the last decade, many investigations have proposed models to evaluate athletes' injury risk. The first injury risk model was proposed by Gabbett and colleagues (Gabbett, T. et al. 2010), who took a unifactorial approach to identifying the risk of soft-tissue injury, more specifically muscle injury. The risk was estimated from the training load received by the player during the competitive season, for example the player's rating of perceived exertion during training and competition sessions.

These studies sought a relationship between the acute workload (understood as the average training load that the player had endured over the previous week) and the probability of soft-tissue injury (Hulin, B. et al. 2014; Gabbett, T. et al. 2016).

Essentially, these studies suggested that large increases in acute load relative to chronic load (understood as the average training load of the previous four weeks) were associated with more injuries. In particular, the authors showed that players with an acute:chronic workload ratio (ACWR) above 1.6:1 are more likely to be injured than those with a ratio of 1:1. In other words, a weekly load that reaches 160% of the average weekly load of the previous four weeks implies a higher risk of injury than a weekly load that stays at 100% of that average. These ACWR values gave better insight into injury risk than total absolute training load.
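As a worked example of the ratio described above, here is a minimal Python sketch. It assumes weekly loads in arbitrary units and an "uncoupled" chronic load (the four weeks before the current one), which is one reading of the definition in the text; the function name `acwr` and the sample numbers are hypothetical and not taken from the cited studies.

```python
from statistics import mean


def acwr(weekly_loads):
    """Acute:chronic workload ratio for the most recent week.

    weekly_loads: weekly training loads (arbitrary units), oldest first.
    Acute load   = load of the most recent week.
    Chronic load = mean load of the four weeks before it (assumption).
    """
    if len(weekly_loads) < 5:
        raise ValueError("need at least 5 weeks of data")
    acute = weekly_loads[-1]
    chronic = mean(weekly_loads[-5:-1])
    if chronic == 0:
        raise ValueError("chronic load is zero; the ratio is undefined")
    return acute / chronic


# Hypothetical players: one with a steady load, one with a sudden spike.
steady = [2000, 2000, 2000, 2000, 2000]
spike = [2000, 2000, 2000, 2000, 3200]

print(acwr(steady))  # 1.0
print(acwr(spike))   # 1.6
```

The second player's last week is 160% of the preceding four-week average, which is the threshold the text associates with a higher injury risk.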

On the other hand, a previous study (Rossi, A. et al. 2018) tested the accuracy of ACWR-based prediction. Its results agree with the criticism raised by other authors such as Impellizzeri, F. et al. (2020): the ratio shows poor predictive performance when applied to real scenarios. In view of these results, and since players' health is affected by multiple factors related to their response to the training stimulus, reducing the problem to a single variable does not seem to give a picture of the situation that is specific enough to support predictive decisions.

Guide for the generation of predictive equations of injury events

For this reason, the authors of this review propose combining multiple sources of information in a machine learning process, so that an artificial intelligence can evaluate the factors that make it possible to predict injuries. As shown in Figure 1, psychophysiological factors, training load, injury records and statistical models are taken into account to build injury prediction equations.

Figure 1. Construction of a guide for an injury forecasting model. Adapted from Rossi et al. (2022)

The variables shown in Figure 1 as input values are data that characterize the athlete's condition. The label values, in turn, are the dependent variables (the recorded injuries) against which a forecasting model can be trained.

Thermography as an input variable in forecasting models

Among the input variables, thermography has demonstrated its validity for injury prevention thanks to its ability to evaluate the physiological state of the body's tissues and the athlete's state of fatigue (Fernández-Cuevas, I. et al. 2017). After exertion, the athlete experiences a change in blood perfusion that affects skin temperature. These changes can be included in both of the two large groups of variables (psychophysiological evaluations and training load), since thermographic metrics offer multiple possibilities for analysis.

In Figure 2, we can see two of these metrics. The thermal asymmetry metric evaluates the state of the subject relative to an initial moment, detecting potential physiological indicators of overuse injury risk (Gómez-Carmona et al. 2020). The coefficient of variation allows monitoring the global temperature variation and, through body temperature, evaluating the degree of fatigue to which the player is being subjected (Thorpe, 2021).
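The two metrics mentioned above can be illustrated with a short Python sketch. The article does not give the exact formulas used by the ThermoHuman software, so this assumes the textbook definitions: asymmetry as the signed temperature difference between paired left and right regions, and coefficient of variation as standard deviation divided by mean. The function names, region and temperature values are hypothetical.

```python
from statistics import mean, pstdev


def thermal_asymmetry(left_temp_c, right_temp_c):
    """Signed difference in mean skin temperature between a pair of
    contralateral regions, in degrees Celsius (assumed definition)."""
    return left_temp_c - right_temp_c


def coefficient_of_variation(temps_c):
    """Population standard deviation divided by the mean, as a percentage
    (assumed definition)."""
    return 100 * pstdev(temps_c) / mean(temps_c)


# Hypothetical mean skin temperatures of the left and right hamstring.
asymmetry = thermal_asymmetry(31.8, 31.2)
print(round(asymmetry, 2))  # 0.6

# Hypothetical readings of one region across four measurement days.
readings = [31.0, 31.5, 32.0, 31.5]
cv = coefficient_of_variation(readings)
print(round(cv, 2))  # 1.12
```

Values like these could then serve as input variables for a forecasting model alongside the training load data.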

Figure 2. ThermoHuman © software main metrics

In addition, the authors highlight, among other benefits of thermography, its non-invasive use, its speed in taking images and its daily application to evaluate the physiological state of the athletes.


The authors provide a guide for correctly building and evaluating sports injury forecasting models. Some aspects stand out: how to select the models that process the data; how to train, validate and test the predictive models correctly; how to extract information from the machine learning models; and how to evaluate the model's predictive performance.
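The train, validate, test and evaluate steps listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure from the review: it assumes a time-ordered split (so the model is never evaluated on sessions that precede its training data) and precision, recall and F1 as metrics (a common choice when injuries are rare and plain accuracy is misleading). The function names, split fractions and labels are hypothetical.

```python
def chronological_split(samples, train_frac=0.6, val_frac=0.2):
    """Split time-ordered samples into train, validation and test sets
    without shuffling; the remainder after train and validation is test."""
    n = len(samples)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (1 = injury)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Ten hypothetical sessions, already sorted by date.
sessions = list(range(10))
train, val, test = chronological_split(sessions)
print(len(train), len(val), len(test))  # 6 2 2

# Hypothetical true injury labels and model predictions.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_pred = [0, 1, 1, 0, 0, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.67 0.67 0.67
```

In this sketch, the validation set would be used to tune the model and the test set would be touched only once, for the final performance estimate.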

If you want to know more


Rossi, A., Pappalardo, L., & Cintia, P. (2022). A Narrative Review for a Machine Learning Application in Sports: An Example Based on Injury Forecasting in Soccer. Sports, 10(1), 5.

Rossi, A.; Pappalardo, L.; Cintia, P.; Iaia, F.; Fernàndez, J.; Medina, D. Effective injury forecasting in soccer with GPS training data and machine learning. PLoS ONE 2018.

Eliakim E, Morgulev E, Lidor R, et al. Estimation of injury costs: financial damage of English Premier League teams’ underachievement due to injuries. BMJ Open Sport & Exercise Medicine 2020

Gabbett, T. The development and application of an injury prediction model for noncontact, soft-tissue injuries in elite collision sport athletes. J. Strength Cond. Res. 2010.

Hulin, B.; Gabbett, T.; Blanch, P.; Chapman, P.; Bailey, D.; Orchard, J. Spikes in acute workload are associated with increased injury risk in elite cricket fast bowlers. Br. J. Sports Med. 2014.

Gabbett, T. The training-injury prevention paradox: Should athletes be training smarter and harder? Br. J. Sports Med. 2016, 50, 273–280

Impellizzeri, F.; Tenan, M.; Kempton, T.; Novak, A.; Coutts, A. Acute:Chronic Workload Ratio: Conceptual Issues and Fundamental Pitfalls. Int. J. Sports Physiol. Perform. 2020.

Fernández-Cuevas, I.; Arnáiz Lastras, J.; Escamilla Galindo, V.; Gómez Carmona, P. Thermography for the Detection of Injury in Sports Medicine. In Application of Infrared Thermography in Sports Science; Priego Quesada, J., Ed.; Biological and Medical Physics Biomedical Engineering; Springer: New York, NY, USA, 2017.

Gómez-Carmona, P., Fernández-Cuevas, I., Sillero-Quintana, M., Arnaiz-Lastras, J., & Navandar, A. (2020). Infrared thermography protocol on reducing the incidence of soccer injuries. Journal of sport rehabilitation, 29(8), 1222-1227.

Thorpe, R. T. (2021). Post-exercise recovery: Cooling and heating, a periodized approach. Frontiers in Sports and Active Living, 236.
