ThermoHuman

Machine Learning Application in Sports: An Injury Forecasting in Soccer

Victor Escamilla

1/20/2022

We start the year with one of the first articles published in 2022, which analyzes the role of infrared thermography in injury prevention in sports. The group of Alessio Rossi and collaborators (2021) has previously published very interesting articles on data analysis aimed at forecasting injuries from GPS data (A. Rossi, et al. 2018).

In this case, the researchers conducted a narrative review to provide a guide describing how to correctly train, validate and test machine learning models that predict events in sports science.

The authors highlighted the importance of these prediction models because of the economic savings they entail. The impact of injuries on the sports industry has been widely studied: they affect both the performance of the team and the economic status of the club. Recent data show that an English Premier League team loses around £45m per season because of injury-related performance decline (Eliakim, E. et al 2020).

Previous prediction models in the literature

During the last decade, many studies have proposed models to evaluate athletes' injury risk. The first injury risk model was proposed by Gabbett and colleagues in 2010 (Gabbett, T. et al. 2010), who took a unifactorial approach to identifying the risk of soft tissue injury, more specifically muscle injury. The risk was estimated from the training load received by the player during the competitive season, for example the effort perceived by the player during training and competition sessions.

Subsequent studies were developed to find a relationship between the acute workload (understood as the average training load that the player had endured the previous week) and the risk of soft tissue injury (Hulin, B. et al. 2014; Gabbett, T. et al. 2016).

Essentially, these studies suggested that large increases in acute load relative to chronic load (understood as the average training load of the previous four weeks) were associated with more injuries. In particular, the authors showed that players with an acute:chronic workload ratio (ACWR) above 1.6:1 are more likely to be injured than those with a ratio of 1:1. In other words, a weekly load equal to 160% of the average weekly load of the previous four weeks (that is, 60% higher) implies a higher risk of injury than a weekly load kept at 100% of that average. These ACWR values gave better insight into injury risk than the total absolute training load.
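To make the arithmetic of the ratio concrete, here is a minimal sketch in Python. It is not the code used in the cited studies. The function name `acwr`, its parameters and the example loads are illustrative, and it assumes that the chronic window is the last four weeks including the most recent one; some formulations exclude the acute week from the chronic average.

```python
from statistics import mean


def acwr(weekly_loads, chronic_weeks=4):
    """Acute:chronic workload ratio for the most recent week.

    weekly_loads: weekly training loads, oldest first (arbitrary units).
    Assumption: the chronic load is the mean of the last `chronic_weeks`
    weeks, including the most recent (acute) week.
    """
    if len(weekly_loads) < chronic_weeks:
        raise ValueError("need at least `chronic_weeks` weeks of load data")
    acute = weekly_loads[-1]
    chronic = mean(weekly_loads[-chronic_weeks:])
    if chronic == 0:
        raise ValueError("chronic load is zero, so the ratio is undefined")
    return acute / chronic


# Hypothetical weekly loads, oldest first.
steady = [400, 400, 400, 400]  # load kept constant
spike = [400, 400, 400, 800]   # sudden increase in the last week

print(acwr(steady))  # 1.0
print(acwr(spike))   # 1.6
```

With these made-up numbers, the steady player sits at a ratio of 1:1, while the player whose last week doubles reaches 1.6:1, the level the text above associates with a higher injury risk.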

On the other hand, a review (Rossi, A. et al. 2018) tested the predictive accuracy of the ACWR. Its results agree with the criticism raised by other authors, such as Impellizzeri, F et al. (2020), who show that the ratio has poor predictive performance when applied to real scenarios. In view of these results, and since the health of the players is affected by multiple factors related to their responses to the training stimulus, it seems that reducing the problem to a single variable does not give a view of the situation that is specific enough to support predictive decisions.

Guide for the generation of predictive equations of injury events

For this reason, the authors of this review propose combining multiple sources of information in a machine learning process, so that an artificial intelligence is capable of evaluating the factors that make it possible to predict injuries. As shown in Figure 1, psychophysiological factors, training load, injury records and mathematical models of statistical analysis are taken into account to establish injury prediction equations.

Figure 1. Construction of a guide for an injury forecasting model. Adapted from Rossi et al. (2022)

The variables shown in Figure 1 as input values are data that describe the athlete's condition. The label values, in turn, are the dependent variables (the recorded injuries) against which a forecasting model can be built.

Thermography as an input variable in forecasting models

Among the input variables, thermography has demonstrated its validity for injury prevention thanks to its ability to evaluate the physiological state of the tissues of the human body and the athlete's state of fatigue (Fernandez-Cuevas, I et al 2017). After exertion, the athlete experiences a change in blood perfusion that affects skin temperature. These changes can be included within either of the two large groups of input variables (psychophysiological evaluations and training load), since thermography metrics offer multiple possibilities of analysis.

In Figure 2, we can see that the thermal asymmetry metric evaluates the state of the subject from an initial moment, detecting potential physiological indicators of overuse injury risk (Gomez-Carmona et al. 2021), while the coefficient of variation allows monitoring the global temperature variation and evaluating, through body temperature, the degree of fatigue to which the player is being subjected (Thorpe et al. 2021).

Figure 2. ThermoHuman © software main metrics
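As a rough illustration of how two metrics of this kind can be computed, the sketch below uses generic definitions and is not the ThermoHuman software's implementation. It assumes that thermal asymmetry is the left-minus-right difference in mean skin temperature for each body region, and that the coefficient of variation is the sample standard deviation divided by the mean of a series of temperature readings. The function names, region names and temperatures are all hypothetical.

```python
from statistics import mean, stdev


def thermal_asymmetry(left_temps, right_temps):
    """Left-minus-right difference in mean skin temperature (deg C) per region.

    Both arguments map a region name to its mean temperature on that side.
    """
    return {region: left_temps[region] - right_temps[region]
            for region in left_temps}


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100 * stdev(values) / mean(values)


# Hypothetical mean skin temperatures per region, in degrees Celsius.
left = {"hamstring": 31.5, "quadriceps": 32.0}
right = {"hamstring": 31.0, "quadriceps": 32.0}
print(thermal_asymmetry(left, right))

# Hypothetical average skin temperature of one player on three assessments.
print(coefficient_of_variation([30.0, 32.0, 34.0]))
```

In a forecasting model, values like these would simply be additional input columns alongside the training load and the psychophysiological evaluations.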

In addition, the authors highlight, among other benefits of thermography, its non-invasive use, its speed in taking images and its daily application to evaluate the physiological state of the athletes.

Conclusions

The authors provide a guide that allows the correct construction and evaluation of sports injury forecasting models. Within it, some aspects stand out: how to select the models to process the data; how to train, validate and test the predictive models correctly; how to extract information from the machine learning models; and how to evaluate the correct prediction of the model.
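The train, validate and test idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure from the review: it assumes the samples are ordered in time and splits them chronologically so that the model is always evaluated on later data than it was trained on, and it uses precision and recall as example evaluation metrics. The function names, split fractions and labels are hypothetical.

```python
def chronological_split(samples, train_frac=0.6, val_frac=0.2):
    """Split time-ordered samples into train, validation and test parts.

    The earliest samples go to training, the next ones to validation and
    the most recent ones to testing, with no shuffling.
    """
    n = len(samples)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = injury, 0 = no injury)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Ten hypothetical daily samples, oldest first.
train, val, test = chronological_split(list(range(10)))
print(len(train), len(val), len(test))  # 6 2 2

# Hypothetical true labels and model predictions on a held-out set.
print(precision_recall([1, 0, 1, 0], [1, 1, 0, 0]))  # (0.5, 0.5)
```

The validation part would be used to choose between candidate models and their settings, and the test part only once, to report how the chosen model performs on data it has never seen.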