Podcast 14: Alessio Rossi, sports data analyst (University of Pisa)
In the fourteenth episode of the ThermoHuman podcast, “Thermography: Science, Health and Sport” (podcast in English), we introduce Alessio Rossi, a sports scientist from Italy who specializes in injury prevention models.
Alessio Rossi and his collaborators have previously published very interesting articles on data analysis aimed at forecasting injuries, and so preventing them, using GPS data (A. Rossi, et al. 2018).
⚙️ In building the injury forecasting model, the authors of this article bring together multiple sources of information in a machine learning screening process, with the aim of establishing a model for sport.
One of these sources could be thermography, which has demonstrated its validity for injury prevention thanks to its ability to assess an athlete's state of fatigue (Fernandez-Cuevas, I et al 2017).
We spoke with him to go deeper into the application of thermography to injury prevention.
Alessio Rossi offers an interesting account during our podcast of how thermography can be applied in new injury prevention models. He brings an expert perspective on thermography applied to machine learning and to decision making for workload management in high performance sport.
One of his most interesting statements concerns how thermography could be included as a new variable in the calculations that support injury prevention, on a par with GPS metrics or perceived exertion scales.
«So, I think that it could be very interesting information (thermography). I think, probably as new data, as well as the GPS data…»Alessio Rossi
TRANSCRIPT (in English)
I: Hello everyone, thank you very much for listening to this small podcast focusing on new articles and new research on sports, and I’m very pleased to welcome Alessio Rossi, thank you, buon giorno and good morning Alessio.
A: Thank you, Ismael.
I: A pleasure to have you here, and let me introduce Alessio. Alessio is a sports scientist and an expert on big data. We are pleased to have you here because you have recently published a very interesting article about injury forecasting in soccer. He works at the University of Pisa in Italy on different projects, mainly one of the Horizon 2020 projects, which also focuses on big data, and he has not only recently published this article but has previously published other very interesting works on big data in soccer, mainly on the capacity of the different technologies and data we can gather, for example GPS and other technologies, for predicting injuries, which is one of the main problems. So, Alessio, thank you very much for being here.
A: Thank you, Ismael, for your invitation. It's a pleasure for me to speak on your podcast about my work, and thank you for your interest in it.
I: You are welcome, Alessio. So, let's go through the topic. My first question is quite simple: could you please sum up your last work?
SUMMARY OF HIS LAST WORK
A: Yes. This last work is focused on providing some examples and describing the limitations and the strengths of applying machine learning to sports. Looking at the literature, we detected some mistakes and some confusion about applying machine learning to sports, so we decided to provide this handbook, this narrative review, to allow all researchers in sports to apply machine learning correctly. In this case we provide an example about injury prediction, because it is one of our main topics, and we decided to describe every step in the application of machine learning: starting from the data, which kind of data we can use and how to preprocess it in order to clean it and create historical time series, and we also provide the limitations and the strengths of these preprocessing approaches. Then we describe the models that were previously used on this topic. One main problem is that no one, or only a few studies, compare their results with other algorithms from the literature, or with a baseline model. A baseline model is extremely important in machine learning: we create a fake algorithm, for example a dummy classifier, that applies very easy rules. If we can achieve better performance than these easy rules, we have learned something from the data; otherwise we have not detected anything. We can have good precision and good accuracy from our algorithm, but the same accuracy may be obtained with very easy rules, for example giving the same prediction every time, or giving predictions in accordance with the distribution of the examples. For this reason it is extremely important to understand whether our models really learn from the data. Then we describe how we can validate our process.
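A minimal Python sketch of the baseline comparison described in this answer. The labels and the 95/5 class split are made up for illustration, and the helper names (`most_frequent_baseline`, `stratified_baseline`) are hypothetical, not taken from the paper. It shows how a rule that always predicts "no injury" can reach high accuracy on unbalanced data while detecting no injuries at all.

```python
import random
from collections import Counter


def most_frequent_baseline(train_labels):
    """Return a predictor that always outputs the majority training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: majority


def stratified_baseline(train_labels, seed=0):
    """Return a predictor that samples labels following the training distribution."""
    rng = random.Random(seed)
    labels = list(train_labels)
    return lambda _features: rng.choice(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive cases that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical labels: 1 = injury in the following days, 0 = no injury.
train_labels = [0] * 95 + [1] * 5
test_labels = [0] * 19 + [1] * 1

always_same = most_frequent_baseline(train_labels)
same_preds = [always_same(None) for _ in test_labels]
baseline_accuracy = accuracy(test_labels, same_preds)
baseline_recall = recall(test_labels, same_preds)

by_distribution = stratified_baseline(train_labels)
stratified_preds = [by_distribution(None) for _ in test_labels]

# High accuracy, zero injuries detected: a real model has to beat this.
print(baseline_accuracy, baseline_recall)
```

A trained model is only informative if it clearly beats both baselines on the same test data. Libraries such as scikit-learn ship an equivalent `DummyClassifier` (strategies `"most_frequent"` and `"stratified"`), which could replace the hand-written helpers here.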
So, machine learning starts from training our model on one part of the data set and then testing it on a different part, because if we use the same data set for training and testing we can suffer from an overfitting problem: we can get a very good prediction, but it is not real, because we test our algorithm on the same examples that it has already seen. Then, finally, we describe other processes to improve the ability of machine learning to learn from the data, for example the oversampling approach if the data set is unbalanced, the feature selection approach, and the tuning of the algorithms' hyperparameters, all of which allow us to increase prediction ability. And finally, the way to validate the results, so the metrics that we can use to understand whether our models are good at predicting. The very last important thing that we describe is that it is important to be able to explain the decision making of the algorithm. We cannot just say to a sports scientist that a player will get injured in the next few days; it is important to say why it will happen. So creating an interpretable model, even if it is probably less accurate, is better in our field than having a black box that predicts every injury correctly, because we have to say something more to athletic trainers and coaches about the status of the players, so that they can modify the training workload and the training schedule to reduce the risk of injuries in accordance with this prediction and maximize the effect of the training.
I: Very, very interesting, and I know how hard it is to sum up a work of probably months or even years in a couple of minutes, so thank you very much, Alessio. I have other questions, obviously, because this kind of paper, this kind of work, brings more and more insight into a hot topic like machine learning, right? My question is this: sports scientists in general, and professionals in sport, mainly in high performance (we are speaking now about football, soccer, right?), are obviously facing the problem and the challenge of injuries, and one of the main rules at the top of our minds has been managing the loads, based on the publication by Gabbett some years ago, which gave a lot of importance to the role of managing the workload. That is why GPS, for example, is one of the main technologies that has been used. My question is, for all those, and I include myself, who have Gabbett's proposal and the management of workloads at the top of their minds, what would be your criticism of this model, based on the publication you have made?
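The hold-out split and the oversampling step mentioned in this answer can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the rows are invented, the function names are hypothetical, and random duplication is only one of several oversampling methods. The sketch oversamples the training part only, a common precaution (not stated in the interview) so that duplicated rows cannot leak into the test set.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def oversample_minority(rows, seed=0):
    """Duplicate random rows of the smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append((features, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Hypothetical rows: ((session_id, training_load), injury_label), 10% injuries.
rows = [((i, 300 + i), 1 if i % 10 == 0 else 0) for i in range(100)]

train_rows, test_rows = train_test_split(rows)
balanced_train = oversample_minority(train_rows)

# The model would be fitted on balanced_train and scored on the untouched test_rows.
print(len(train_rows), len(test_rows), len(balanced_train))
```

Scoring on `test_rows`, which the model never saw and which keeps the original class balance, is what guards against the overly optimistic results described above.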
COMPARISON WITH THE GABBETT MODEL
A: Okay. The Gabbett model is an interesting approach because it is a modern invention, it is easy to understand, and it is very applicable every day because it is easy to use. But a big limitation is that the model is mono-dimensional, so you can look at only one feature at a time. A human being is not mono-dimensional, and the effort, the workload that you can give to your athletes, is multidimensional, so looking at only one feature at a time is not enough for me to have a complete overview of the status of the players and to understand the relationship between features. Using machine learning, or using multidimensional approaches to analyze the workload, is for me a better approach. It is more complex, yes, but you can have a complete overview of your players. The more data you have, the more accurate the evaluation of your players. Machine learning can help to find patterns in multidimensional data that are related to each other, and a change in one feature may not be something you can see, because other features probably do not change. So we can look at a multidimensional change in the training workloads to better understand the effect of the training. The Gabbett approach is an old approach but could be useful if you do not have, say, a GPS, because it can give you an index of the status of the players or of the workload performed in the past, but we cannot limit ourselves to these metrics alone to fully understand the effect of the training workload.
«Using machine learning can help to find a pattern into the multidimensional data […]»Alessio Rossi
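To make the "one feature at a time" point concrete, here is a minimal sketch that assumes "the Gabbett model" refers to the acute:chronic workload ratio. The 7-day and 28-day windows are the commonly used defaults, taken here as an assumption, and the feature names and load values are invented. Each ratio is computed from a single metric in isolation, so no interaction between metrics is captured.

```python
def acute_chronic_ratio(daily_loads, acute_days=7, chronic_days=28):
    """Ratio of the mean load over the last week to the mean over the last four weeks."""
    if len(daily_loads) < chronic_days:
        raise ValueError("need at least chronic_days of data")
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    if chronic == 0:
        raise ValueError("chronic load is zero, ratio is undefined")
    return acute / chronic


# Hypothetical daily values for two GPS metrics over the last 28 days.
loads = {
    "total_distance_m": [8000] * 21 + [12000] * 7,  # load spike in the last week
    "decelerations": [60] * 28,                     # unchanged
}

# One independent number per metric: the mono-dimensional view.
ratios = {name: acute_chronic_ratio(values) for name, values in loads.items()}
print(ratios)
```

A multidimensional model would instead take all such metrics as joint inputs, which is the contrast drawn in the answer above.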
I: Very interesting. One question was about the critique of what is currently done; the other question is about the always complicated step of transferring from theory to practice, right? What you have published is a narrative article proposing a model with, as you have described, a lot of different steps. My question is: what do you think is necessary, in time and in resources, given what you have been explaining and what other experts, authors and companies working with machine learning in soccer are doing, to bring this theory of machine learning to the reality of the clubs nowadays?
FROM THEORY TO PRACTICE OF MACHINE LEARNING
A: So, in this paper we described the way to apply machine learning, and in our previous paper in PLOS ONE in 2018 we provided an example of applying machine learning to injury prediction. I think, as you said before, the transferability depends on the capacity to explain your prediction. As I said before, if you can say accurately that one player will get injured, and you can also explain the reason why this player will get injured, you provide insight that could be useful for soccer coaches to understand the status of the players and then probably modify their training schedule in order to reduce the risk of injuries or improve the effect of the training. Injury prediction is only one example. We can also use it, for example, to predict the recovery status of the players: understanding day by day which features affect your effort and your recovery status could help players and coaches to create a better training program, to reduce the players' problems, and to provide stimuli that produce the results we want from our training. So, for me, the transferability depends on the ability to provide insight about our prediction.
«[…]applying machine learning on injury prediction […], useful for soccer training coaches to understand the status of the player and then probably modify their training scheduling in order to reduce the risk of injuries or improve the effect of the training […] or to predict the recovery status of the players […]»Alessio Rossi
I: And besides that, do you think specific profiles, like sports scientists, or specific technology are necessary inside the club, besides the transferability of the decisions that you mentioned?
A: Yeah, I think that to start, if football clubs want to create their own models, you need some experts in computer science, machine learning or big data to create these models, but there are plenty of courses online that give you the basics for programming them. Then some expert inside the soccer club is required to continue managing the algorithm and improving the model, inserting new data and creating new metrics that could be more useful for your algorithm to detect the output that you want. But there are also companies, as you said, that already use machine learning for prediction, so you can use their output, and you do not need to be a great expert in machine learning to understand the results. A good company can provide you with insight that you can understand, that all sports scientists can understand and use in their field, in their area of expertise, for what they need. So it depends on the needs of the football club.
I: And the last question at this point is: in your opinion, how far are we today from seeing most football clubs, for example, using machine learning for, say, injury prediction?
A: I think there are already teams that try to do that, creating machine learning models to predict injuries and for other analyses. For example, one of the topics is scouting using machine learning, and scouting is a big topic for soccer clubs that want to develop and to detect the best players for their team. So data science is now a hot topic for soccer clubs. For example, in England, in the UK, all the Premier League teams are looking for a data scientist for their own team. It is probably a nation where there is a lot of interest in this topic in soccer, in trying to improve performance by using machine learning. In other sports, for example in the NBA or the NFL, a lot of data scientists have been working on this data for a long time, but soccer is now at the moment when clubs are trying to understand the usefulness of machine learning and data science for their development.
I: It's very interesting. And finally, this is my last question, Alessio. Your model has a wide scope and takes a lot of different factors into account; one of them is thermography, and obviously our expertise is in that technology. My question is, which role do you think thermography plays in a model of injury prediction?
ROLE OF THERMOGRAPHY IN INJURY PREDICTION
A: OK. We all know that thermography is widely used to understand injuries, to identify the region of your body where you could have an injury or a muscular problem, so I think it could be very interesting information to add, probably as new data on top of the GPS data and the wellness index, that could help to increase the ability to detect injuries, because I think, based on the literature that I know, it is a good index of the muscular status of players. I have never tried, so I don't know, but I think that by adding this information to the input data we can improve the ability to detect injuries, or also other elements, for example the recovery status or the readiness of the players. So it probably needs more research on this topic, trying to insert this new data into models and understanding which are the best features, the best characteristics, that allow us to detect the status of the players.
I: And based on your work and your previous experience, among all those input technologies and data, which is in your opinion the most important when predicting injuries?
A: OK. Interesting question, because we found that the importance of the features changes team by team and player by player, and also changes during the season. We can detect, for example, that deceleration is very important for detecting injury in the first part of the season, when you are probably doing the first part of the preparation; then its importance decreases and other features increase in importance. So it depends on the players and the demands of the championship: if you have a lot of matches during a period, some features are probably more important than others, and that probably changes with recovery status. Depending on the kind of training you do and your physiological status, the importance of the features changes. The good thing about machine learning is that models learn from their own data, so every team has different models, different rules to predict injuries or other outputs, depending on the characteristics of the coach, the characteristics of the players and the characteristics of the championship. So we cannot generalize the rules for detecting injuries, but we can personalize, for each team, and for each player if possible, the rules that detect a high risk of injury.
«The good of machine learning is that they learn from their own data»Alessio Rossi
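As a rough illustration of how the most informative feature can differ between parts of the season, the sketch below ranks features separately for two phases. Everything in it is an assumption made for the example: the rows are invented, the values are treated as already standardized (z-scores), `total_distance` is a hypothetical second feature, and the gap between group means is only a crude stand-in for the model-based feature importance a real study would use.

```python
from statistics import mean


def mean_gap_by_feature(rows):
    """Absolute gap, per feature, between its mean in injury rows and in non-injury rows."""
    injured = [r for r in rows if r["injury"] == 1]
    healthy = [r for r in rows if r["injury"] == 0]
    features = [name for name in rows[0] if name != "injury"]
    return {
        name: abs(mean(r[name] for r in injured) - mean(r[name] for r in healthy))
        for name in features
    }


# Hypothetical standardized sessions, grouped by season phase.
sessions = {
    "preseason": [
        {"decelerations": 1.2, "total_distance": 0.1, "injury": 1},
        {"decelerations": 0.9, "total_distance": -0.2, "injury": 1},
        {"decelerations": -0.5, "total_distance": 0.0, "injury": 0},
        {"decelerations": -0.7, "total_distance": 0.1, "injury": 0},
    ],
    "midseason": [
        {"decelerations": 0.1, "total_distance": 1.1, "injury": 1},
        {"decelerations": -0.1, "total_distance": 0.8, "injury": 1},
        {"decelerations": 0.0, "total_distance": -0.6, "injury": 0},
        {"decelerations": 0.2, "total_distance": -0.4, "injury": 0},
    ],
}

# Rank features independently within each phase and keep the top one.
top_feature = {}
for phase, rows in sessions.items():
    gaps = mean_gap_by_feature(rows)
    top_feature[phase] = max(gaps, key=gaps.get)

print(top_feature)
```

Repeating the same ranking per team or per player, rather than per phase, would give the personalized rules described in the answer.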
I: Very, very interesting, Alessio. Thank you very much, grazie mille. It's always very interesting to hear an explanation of the topic and of the paper directly from the author, so thank you very much once again.
A: Thank you, it’s a pleasure for me to explain my work. Thank you and goodbye to everyone.
I: Thank you, Ciao.