The Prediction Model

The long-awaited prediction model is here. It is a multiple linear regression model that uses over 20 features to predict the lap time of a Formula 1 car, and it implements a machine learning algorithm to improve prediction accuracy. For future developments regarding the model, check out the plans page.

The goal of this model is to let you experiment with race strategies and predict lap times for future races. Once you get your prediction, you can compare both your strategy and the associated lap times to the actual race strategies used by the Formula 1 teams. The model is currently written in Python, and I am working hard to integrate it into the website and make it user friendly. Since that isn't ready yet, I have come up with an interim solution: you fill out a spreadsheet and upload it, and I run the prediction model and return your results.

The template for the race strategy prediction is available via the button below.

Here are the guidelines for filling out the prediction model template.

  • For the tyre column, there are four options: S (soft), M (medium), H (hard), and P (pit stop). Make sure to follow the F1 rule of using at least two different compounds in a race.
  • The second column you control is the condition column, which records whether the tyre is new or used. For this column, enter N (new) or U (used). I recommend assuming the tyres are new to keep it simple, but the strategy is ultimately up to you.
  • The model is case sensitive, so please make sure every entry in the file is capitalized so that it is compatible with the model.
  • Please let me know which team you want the strategy for. Each team is a feature in the model, so the predictions will differ from team to team.
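
If you want to check your entries before uploading, the guidelines above can be sketched as a small Python check. This is only an illustration, not part of the model: the function name `validate_strategy`, the one-row-per-lap layout, and the choice to skip the condition check on pit stop rows are all my own assumptions.

```python
VALID_TYRES = {"S", "M", "H", "P"}   # soft, medium, hard, pit stop
VALID_CONDITIONS = {"N", "U"}        # new, used


def validate_strategy(rows):
    """Return a list of problems found in a strategy.

    `rows` is a list of (tyre, condition) pairs, assumed to be one per lap
    in race order. An empty result means every check passed.
    """
    problems = []
    compounds = set()
    for lap, (tyre, condition) in enumerate(rows, start=1):
        # Entries are case sensitive, so "s" is rejected even though "S" is valid.
        if tyre not in VALID_TYRES:
            problems.append(f"row {lap}: tyre {tyre!r} is not one of S, M, H, P")
            continue
        if tyre == "P":
            # Assumption: a pit stop row does not need a tyre condition.
            continue
        compounds.add(tyre)
        if condition not in VALID_CONDITIONS:
            problems.append(f"row {lap}: condition {condition!r} is not N or U")
    # The two-compound rule from the first guideline.
    if len(compounds) < 2:
        problems.append("strategy must use at least two different compounds")
    return problems


# A short soft-to-medium strategy with one pit stop.
print(validate_strategy([("S", "N"), ("S", "N"), ("P", ""), ("M", "N")]))
```

An empty list means the rows follow the guidelines; anything else lists what to fix before you submit.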

Once you have filled out the template, upload the spreadsheet to Google Drive or a similar sharing platform, enable public sharing so that we can view the spreadsheet, and then fill out the form below. Once we receive your response, we will run your strategy through the model to produce predicted lap times for the next race. We will then email the predicted lap times to the address included in your form submission.

In terms of the limitations of the model, a clear one is predicting a strategy when a safety car occurs, because we simply have no way of predicting when a safety car will be deployed. We can estimate the likelihood that a safety car occurs during a race from previous races, but we cannot predict at which point in the race it will come out. The model can predict the effect of a safety car during the race, but this ability is quite limited. Another major limitation is predicting lap times for a rainy race. The current model is trained only on data from previous races this season, and none of them included rain, so we have no reference for wet conditions and will only be able to compare predictions after data from a wet session is included in the model.

In terms of the methodology, we will add a separate page that explains the model fully, but here is a short rundown. The model is a multiple linear regression model with over 20 features, trained on over 3000 data points from previous races. It uses a machine learning algorithm (gradient boosting, for anyone curious) to increase prediction accuracy. The model is trained and tested on data from previous races and will become better at predicting lap times as the season progresses. Only data from the 2024 F1 season is used, because the performance of F1 cars differs too much from season to season, which makes race data from last season less applicable for predicting lap times.
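
For anyone curious about what gradient boosting does in general, here is a toy sketch in plain Python. It is not the actual model: it uses a single hypothetical feature (tyre age in laps) instead of over 20, the numbers are made up for illustration, and the function names `fit_stump`, `fit_boosted`, and `predict` are my own. The idea is to start from the average lap time and then repeatedly fit a very simple model to the remaining errors, adding a small fraction of each correction.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (a "stump") to the residuals.

    Returns (threshold, left_mean, right_mean) for the split with the
    lowest squared error.
    """
    overall = sum(residuals) / len(residuals)
    best = (float("inf"), xs[0], overall, overall)  # fallback if no split exists
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction is the mean lap time
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Move each prediction a small step towards the stump's correction.
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled correction from every stump."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left_mean if x <= threshold else right_mean)
        for threshold, left_mean, right_mean in stumps
    )


# Made-up numbers for illustration only: tyre age (laps) vs. lap time (seconds).
tyre_age = [1, 2, 3, 4, 5, 6]
lap_time = [90.0, 90.2, 90.5, 90.9, 91.4, 92.0]
model = fit_boosted(tyre_age, lap_time)
print(round(predict(model, 4), 2))
```

The small learning rate is the main design choice here: each round only partly corrects the remaining error, so no single stump dominates the final prediction.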

For those interested in the data, I have included the MSE, MAE, and R-squared values from the model below:
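
For anyone unfamiliar with these metrics, the following sketch shows how each one is calculated from actual and predicted lap times. It is a generic illustration rather than the code used for the model, and the function name `regression_metrics` is my own.

```python
def regression_metrics(actual, predicted):
    """Compute MSE, MAE, and R-squared for two equal-length lists of lap times."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and the same length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    # MSE: average squared error, which penalises large misses heavily.
    mse = sum(e * e for e in errors) / n
    # MAE: average absolute error, in the same units as the lap times (seconds).
    mae = sum(abs(e) for e in errors) / n

    # R-squared: share of the variance in actual lap times explained by the model.
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum(e * e for e in errors)
    r2 = 1 - ss_residual / ss_total if ss_total else float("nan")

    return {"mse": mse, "mae": mae, "r2": r2}
```

A lower MSE and MAE mean the predictions are closer to the real lap times, while an R-squared closer to 1 means the model explains more of the lap-to-lap variation.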

To wrap everything up, I hope to see many race prediction submissions, and we will work as quickly as we can to return predicted lap times. We will continue to develop the model as the season progresses and hope to decrease the variability. For those interested in data science or who just want to work on the project, consider applying on the Join The Team page.