{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Formula One Project: Modeling\n", "\n", "DUE: December 4th, 2024 (Wed)  \n", "Name(s): Sean O'Connor, Connor Coles  \n", "Class: CSCI 349 - Intro to Data Mining  \n", "Semester: Fall 2024  \n", "Instructor: Brian King " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Assignment Description\n", "\n", "Copy over the important cells from the previous step that read in and cleaned your data to this new notebook file. You do not need to copy over all your EDA and plots describing your data, only the code that prepares your data for modeling. This notebook is about exploring the development of predictive models. Some preliminary work applying modeling techniques should be completed.\n", "Be sure to commit and push all supporting code that you've completed in this file. Include in this notebook a summary cell at the top that details your accomplishments, challenges, and what you expect to accomplish for your final steps. 
Be sure to update your readme.md in your repository.\n", "\n", "## Progress Summary\n", "\n", "### Accomplishments So Far\n", "- Successfully loaded and preprocessed Formula 1 race data from 2021-2024\n", "- Created comprehensive feature engineering pipeline including weather and track conditions\n", "- Implemented initial modeling with Random Forest, XGBoost, and Gradient Boosting\n", "- Achieved best performance on Belgian GP (R² = 0.775) and Mexico City GP (R² = 0.505)\n", "\n", "### Challenges Faced\n", "- High variability in model performance across different tracks\n", "- British GP proving particularly difficult to predict (best R² = 0.047)\n", "- Complex interactions between weather variables and lap times\n", "- Limited data availability for some races/conditions\n", "\n", "### Next Steps\n", "- Implement hyperparameter tuning using GridSearchCV\n", "- Explore additional feature engineering possibilities\n", "- Test neural network approaches for complex weather-performance relationships\n", "- Create ensemble model combining best performers for each track\n", "- Prepare final visualizations and analysis for report\n", "\n", "## Data Preparation and Feature Engineering" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Importing Libraries\n", "import logging\n", "import os\n", "import warnings\n", "\n", "import fastf1\n", "import fastf1.plotting\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "import xgboost as xgb\n", "from catboost import CatBoostRegressor\n", "from fastf1.ergast.structure import FastestLap\n", "from lightgbm import LGBMRegressor\n", "from sklearn.compose import ColumnTransformer\n", "from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor\n", "from sklearn.impute import SimpleImputer\n", "from sklearn.linear_model import Lasso, LinearRegression, Ridge\n", "from sklearn.metrics import (make_scorer, 
mean_absolute_error,\n", " mean_squared_error, r2_score)\n", "from sklearn.model_selection import (cross_val_score, cross_validate,\n", " train_test_split)\n", "from sklearn.pipeline import Pipeline, make_pipeline\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.svm import SVR\n", "from sklearn.tree import DecisionTreeRegressor\n", "from xgboost import XGBRegressor" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " RoundNumber Country Location \\\n", "0 0 Bahrain Bahrain \n", "1 1 Bahrain Sakhir \n", "2 2 Italy Imola \n", "3 3 Portugal Portimão \n", "4 4 Spain Barcelona \n", "5 5 Monaco Monte Carlo \n", "6 6 Azerbaijan Baku \n", "7 7 France Le Castellet \n", "8 8 Austria Spielberg \n", "9 9 Austria Spielberg \n", "10 10 Great Britain Silverstone \n", "11 11 Hungary Budapest \n", "12 12 Belgium Spa-Francorchamps \n", "13 13 Netherlands Zandvoort \n", "14 14 Italy Monza \n", "15 15 Russia Sochi \n", "16 16 Turkey Istanbul \n", "17 17 United States Austin \n", "18 18 Mexico Mexico City \n", "19 19 Brazil São Paulo \n", "20 20 Qatar Lusail \n", "21 21 Saudi Arabia Jeddah \n", "22 22 Abu Dhabi Yas Island \n", "\n", " OfficialEventName EventDate \\\n", "0 FORMULA 1 ARAMCO PRE-SEASON TESTING 2021 2021-03-14 \n", "1 FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 2021-03-28 \n", "2 FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITAL... 
2021-04-18 \n", "3 FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 2021-05-02 \n", "4 FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 2021-05-09 \n", "5 FORMULA 1 GRAND PRIX DE MONACO 2021 2021-05-23 \n", "6 FORMULA 1 AZERBAIJAN GRAND PRIX 2021 2021-06-06 \n", "7 FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 2021-06-20 \n", "8 FORMULA 1 BWT GROSSER PREIS DER STEIERMARK 2021 2021-06-27 \n", "9 FORMULA 1 BWT GROSSER PREIS VON ÖSTERREICH 2021 2021-07-04 \n", "10 FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 2021-07-18 \n", "11 FORMULA 1 ROLEX MAGYAR NAGYDÍJ 2021 2021-08-01 \n", "12 FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 2021-08-29 \n", "13 FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 2021-09-05 \n", "14 FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 2021-09-12 \n", "15 FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 2021-09-26 \n", "16 FORMULA 1 ROLEX TURKISH GRAND PRIX 2021 2021-10-10 \n", "17 FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 2021-10-24 \n", "18 FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 2021-11-07 \n", "19 FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO ... 2021-11-14 \n", "20 FORMULA 1 OOREDOO QATAR GRAND PRIX 2021 2021-11-21 \n", "21 FORMULA 1 STC SAUDI ARABIAN GRAND PRIX 2021 2021-12-05 \n", "22 FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX ... 
2021-12-12 \n", "\n", " EventName EventFormat Session1 \\\n", "0 Pre-Season Test testing Practice 1 \n", "1 Bahrain Grand Prix conventional Practice 1 \n", "2 Emilia Romagna Grand Prix conventional Practice 1 \n", "3 Portuguese Grand Prix conventional Practice 1 \n", "4 Spanish Grand Prix conventional Practice 1 \n", "5 Monaco Grand Prix conventional Practice 1 \n", "6 Azerbaijan Grand Prix conventional Practice 1 \n", "7 French Grand Prix conventional Practice 1 \n", "8 Styrian Grand Prix conventional Practice 1 \n", "9 Austrian Grand Prix conventional Practice 1 \n", "10 British Grand Prix sprint Practice 1 \n", "11 Hungarian Grand Prix conventional Practice 1 \n", "12 Belgian Grand Prix conventional Practice 1 \n", "13 Dutch Grand Prix conventional Practice 1 \n", "14 Italian Grand Prix sprint Practice 1 \n", "15 Russian Grand Prix conventional Practice 1 \n", "16 Turkish Grand Prix conventional Practice 1 \n", "17 United States Grand Prix conventional Practice 1 \n", "18 Mexico City Grand Prix conventional Practice 1 \n", "19 São Paulo Grand Prix sprint Practice 1 \n", "20 Qatar Grand Prix conventional Practice 1 \n", "21 Saudi Arabian Grand Prix conventional Practice 1 \n", "22 Abu Dhabi Grand Prix conventional Practice 1 \n", "\n", " Session1Date Session1DateUtc ... Session3 \\\n", "0 2021-03-12 10:00:00+03:00 2021-03-12 07:00:00 ... Practice 3 \n", "1 2021-03-26 14:30:00+03:00 2021-03-26 11:30:00 ... Practice 3 \n", "2 2021-04-16 11:00:00+02:00 2021-04-16 09:00:00 ... Practice 3 \n", "3 2021-04-30 11:30:00+01:00 2021-04-30 10:30:00 ... Practice 3 \n", "4 2021-05-07 11:30:00+02:00 2021-05-07 09:30:00 ... Practice 3 \n", "5 2021-05-20 11:30:00+02:00 2021-05-20 09:30:00 ... Practice 3 \n", "6 2021-06-04 12:30:00+04:00 2021-06-04 08:30:00 ... Practice 3 \n", "7 2021-06-18 11:30:00+02:00 2021-06-18 09:30:00 ... Practice 3 \n", "8 2021-06-25 11:30:00+02:00 2021-06-25 09:30:00 ... Practice 3 \n", "9 2021-07-02 11:30:00+02:00 2021-07-02 09:30:00 ... 
Practice 3 \n", "10 2021-07-16 14:30:00+01:00 2021-07-16 13:30:00 ... Practice 2 \n", "11 2021-07-30 11:30:00+02:00 2021-07-30 09:30:00 ... Practice 3 \n", "12 2021-08-27 11:30:00+02:00 2021-08-27 09:30:00 ... Practice 3 \n", "13 2021-09-03 11:30:00+02:00 2021-09-03 09:30:00 ... Practice 3 \n", "14 2021-09-10 14:30:00+02:00 2021-09-10 12:30:00 ... Practice 2 \n", "15 2021-09-24 11:30:00+03:00 2021-09-24 08:30:00 ... Practice 3 \n", "16 2021-10-08 11:30:00+03:00 2021-10-08 08:30:00 ... Practice 3 \n", "17 2021-10-22 11:30:00-05:00 2021-10-22 16:30:00 ... Practice 3 \n", "18 2021-11-05 11:30:00-06:00 2021-11-05 17:30:00 ... Practice 3 \n", "19 2021-11-12 12:30:00-03:00 2021-11-12 15:30:00 ... Practice 2 \n", "20 2021-11-19 13:30:00+03:00 2021-11-19 10:30:00 ... Practice 3 \n", "21 2021-12-03 16:30:00+03:00 2021-12-03 13:30:00 ... Practice 3 \n", "22 2021-12-10 13:30:00+04:00 2021-12-10 09:30:00 ... Practice 3 \n", "\n", " Session3Date Session3DateUtc Session4 \\\n", "0 2021-03-14 10:00:00+03:00 2021-03-14 07:00:00 None \n", "1 2021-03-27 15:00:00+03:00 2021-03-27 12:00:00 Qualifying \n", "2 2021-04-17 11:00:00+02:00 2021-04-17 09:00:00 Qualifying \n", "3 2021-05-01 12:00:00+01:00 2021-05-01 11:00:00 Qualifying \n", "4 2021-05-08 12:00:00+02:00 2021-05-08 10:00:00 Qualifying \n", "5 2021-05-22 12:00:00+02:00 2021-05-22 10:00:00 Qualifying \n", "6 2021-06-05 13:00:00+04:00 2021-06-05 09:00:00 Qualifying \n", "7 2021-06-19 12:00:00+02:00 2021-06-19 10:00:00 Qualifying \n", "8 2021-06-26 12:00:00+02:00 2021-06-26 10:00:00 Qualifying \n", "9 2021-07-03 12:00:00+02:00 2021-07-03 10:00:00 Qualifying \n", "10 2021-07-17 12:00:00+01:00 2021-07-17 11:00:00 Sprint \n", "11 2021-07-31 12:00:00+02:00 2021-07-31 10:00:00 Qualifying \n", "12 2021-08-28 12:00:00+02:00 2021-08-28 10:00:00 Qualifying \n", "13 2021-09-04 12:00:00+02:00 2021-09-04 10:00:00 Qualifying \n", "14 2021-09-11 12:00:00+02:00 2021-09-11 10:00:00 Sprint \n", "15 2021-09-25 12:00:00+03:00 2021-09-25 09:00:00 
Qualifying \n", "16 2021-10-09 12:00:00+03:00 2021-10-09 09:00:00 Qualifying \n", "17 2021-10-23 13:00:00-05:00 2021-10-23 18:00:00 Qualifying \n", "18 2021-11-06 11:00:00-06:00 2021-11-06 17:00:00 Qualifying \n", "19 2021-11-13 12:00:00-03:00 2021-11-13 15:00:00 Sprint \n", "20 2021-11-20 14:00:00+03:00 2021-11-20 11:00:00 Qualifying \n", "21 2021-12-04 17:00:00+03:00 2021-12-04 14:00:00 Qualifying \n", "22 2021-12-11 14:00:00+04:00 2021-12-11 10:00:00 Qualifying \n", "\n", " Session4Date Session4DateUtc Session5 \\\n", "0 NaT NaT None \n", "1 2021-03-27 18:00:00+03:00 2021-03-27 15:00:00 Race \n", "2 2021-04-17 14:00:00+02:00 2021-04-17 12:00:00 Race \n", "3 2021-05-01 15:00:00+01:00 2021-05-01 14:00:00 Race \n", "4 2021-05-08 15:00:00+02:00 2021-05-08 13:00:00 Race \n", "5 2021-05-22 15:00:00+02:00 2021-05-22 13:00:00 Race \n", "6 2021-06-05 16:00:00+04:00 2021-06-05 12:00:00 Race \n", "7 2021-06-19 15:00:00+02:00 2021-06-19 13:00:00 Race \n", "8 2021-06-26 15:00:00+02:00 2021-06-26 13:00:00 Race \n", "9 2021-07-03 15:00:00+02:00 2021-07-03 13:00:00 Race \n", "10 2021-07-17 16:30:00+01:00 2021-07-17 15:30:00 Race \n", "11 2021-07-31 15:00:00+02:00 2021-07-31 13:00:00 Race \n", "12 2021-08-28 15:00:00+02:00 2021-08-28 13:00:00 Race \n", "13 2021-09-04 15:00:00+02:00 2021-09-04 13:00:00 Race \n", "14 2021-09-11 16:30:00+02:00 2021-09-11 14:30:00 Race \n", "15 2021-09-25 15:00:00+03:00 2021-09-25 12:00:00 Race \n", "16 2021-10-09 15:00:00+03:00 2021-10-09 12:00:00 Race \n", "17 2021-10-23 16:00:00-05:00 2021-10-23 21:00:00 Race \n", "18 2021-11-06 14:00:00-06:00 2021-11-06 20:00:00 Race \n", "19 2021-11-13 16:30:00-03:00 2021-11-13 19:30:00 Race \n", "20 2021-11-20 17:00:00+03:00 2021-11-20 14:00:00 Race \n", "21 2021-12-04 20:00:00+03:00 2021-12-04 17:00:00 Race \n", "22 2021-12-11 17:00:00+04:00 2021-12-11 13:00:00 Race \n", "\n", " Session5Date Session5DateUtc F1ApiSupport \n", "0 NaT NaT False \n", "1 2021-03-28 18:00:00+03:00 2021-03-28 15:00:00 True \n", "2 
2021-04-18 15:00:00+02:00 2021-04-18 13:00:00 True \n", "3 2021-05-02 15:00:00+01:00 2021-05-02 14:00:00 True \n", "4 2021-05-09 15:00:00+02:00 2021-05-09 13:00:00 True \n", "5 2021-05-23 15:00:00+02:00 2021-05-23 13:00:00 True \n", "6 2021-06-06 16:00:00+04:00 2021-06-06 12:00:00 True \n", "7 2021-06-20 15:00:00+02:00 2021-06-20 13:00:00 True \n", "8 2021-06-27 15:00:00+02:00 2021-06-27 13:00:00 True \n", "9 2021-07-04 15:00:00+02:00 2021-07-04 13:00:00 True \n", "10 2021-07-18 15:00:00+01:00 2021-07-18 14:00:00 True \n", "11 2021-08-01 15:00:00+02:00 2021-08-01 13:00:00 True \n", "12 2021-08-29 15:00:00+02:00 2021-08-29 13:00:00 True \n", "13 2021-09-05 15:00:00+02:00 2021-09-05 13:00:00 True \n", "14 2021-09-12 15:00:00+02:00 2021-09-12 13:00:00 True \n", "15 2021-09-26 15:00:00+03:00 2021-09-26 12:00:00 True \n", "16 2021-10-10 15:00:00+03:00 2021-10-10 12:00:00 True \n", "17 2021-10-24 14:00:00-05:00 2021-10-24 19:00:00 True \n", "18 2021-11-07 13:00:00-06:00 2021-11-07 19:00:00 True \n", "19 2021-11-14 14:00:00-03:00 2021-11-14 17:00:00 True \n", "20 2021-11-21 17:00:00+03:00 2021-11-21 14:00:00 True \n", "21 2021-12-05 20:30:00+03:00 2021-12-05 17:30:00 True \n", "22 2021-12-12 17:00:00+04:00 2021-12-12 13:00:00 True \n", "\n", "[23 rows x 23 columns]\n", " RoundNumber Country Location \\\n", "0 0 Spain Spain \n", "1 0 Bahrain Bahrain \n", "2 1 Bahrain Sakhir \n", "3 2 Saudi Arabia Jeddah \n", "4 3 Australia Melbourne \n", "5 4 Italy Imola \n", "6 5 United States Miami \n", "7 6 Spain Barcelona \n", "8 7 Monaco Monaco \n", "9 8 Azerbaijan Baku \n", "10 9 Canada Montréal \n", "11 10 Great Britain Silverstone \n", "12 11 Austria Spielberg \n", "13 12 France Le Castellet \n", "14 13 Hungary Budapest \n", "15 14 Belgium Spa-Francorchamps \n", "16 15 Netherlands Zandvoort \n", "17 16 Italy Monza \n", "18 17 Singapore Marina Bay \n", "19 18 Japan Suzuka \n", "20 19 United States Austin \n", "21 20 Mexico Mexico City \n", "22 21 Brazil São Paulo \n", "23 22 Abu 
Dhabi Yas Island \n", "\n", " OfficialEventName EventDate \\\n", "0 FORMULA 1 PRE-SEASON TRACK SESSION 2022 2022-02-25 \n", "1 FORMULA 1 ARAMCO PRE-SEASON TESTING 2022 2022-03-12 \n", "2 FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2022 2022-03-20 \n", "3 FORMULA 1 STC SAUDI ARABIAN GRAND PRIX 2022 2022-03-27 \n", "4 FORMULA 1 HEINEKEN AUSTRALIAN GRAND PRIX 2022 2022-04-10 \n", "5 FORMULA 1 ROLEX GRAN PREMIO DEL MADE IN ITALY ... 2022-04-24 \n", "6 FORMULA 1 CRYPTO.COM MIAMI GRAND PRIX 2022 2022-05-08 \n", "7 FORMULA 1 PIRELLI GRAN PREMIO DE ESPAÑA 2022 2022-05-22 \n", "8 FORMULA 1 GRAND PRIX DE MONACO 2022 2022-05-29 \n", "9 FORMULA 1 AZERBAIJAN GRAND PRIX 2022 2022-06-12 \n", "10 FORMULA 1 AWS GRAND PRIX DU CANADA 2022 2022-06-19 \n", "11 FORMULA 1 LENOVO BRITISH GRAND PRIX 2022 2022-07-03 \n", "12 FORMULA 1 ROLEX GROSSER PREIS VON ÖSTERREICH 2022 2022-07-10 \n", "13 FORMULA 1 LENOVO GRAND PRIX DE FRANCE 2022 2022-07-24 \n", "14 FORMULA 1 ARAMCO MAGYAR NAGYDÍJ 2022 2022-07-31 \n", "15 FORMULA 1 ROLEX BELGIAN GRAND PRIX 2022 2022-08-28 \n", "16 FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2022 2022-09-04 \n", "17 FORMULA 1 PIRELLI GRAN PREMIO D’ITALIA 2022 2022-09-11 \n", "18 FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND P... 2022-10-02 \n", "19 FORMULA 1 HONDA JAPANESE GRAND PRIX 2022 2022-10-09 \n", "20 FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2022 2022-10-23 \n", "21 FORMULA 1 HEINEKEN GRAN PREMIO DE LA CIUDAD DE... 2022-10-30 \n", "22 FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO ... 2022-11-13 \n", "23 FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX ... 
2022-11-20 \n", "\n", " EventName EventFormat Session1 \\\n", "0 Pre-Season Track Session testing Practice 1 \n", "1 Pre-Season Test testing Practice 1 \n", "2 Bahrain Grand Prix conventional Practice 1 \n", "3 Saudi Arabian Grand Prix conventional Practice 1 \n", "4 Australian Grand Prix conventional Practice 1 \n", "5 Emilia Romagna Grand Prix sprint Practice 1 \n", "6 Miami Grand Prix conventional Practice 1 \n", "7 Spanish Grand Prix conventional Practice 1 \n", "8 Monaco Grand Prix conventional Practice 1 \n", "9 Azerbaijan Grand Prix conventional Practice 1 \n", "10 Canadian Grand Prix conventional Practice 1 \n", "11 British Grand Prix conventional Practice 1 \n", "12 Austrian Grand Prix sprint Practice 1 \n", "13 French Grand Prix conventional Practice 1 \n", "14 Hungarian Grand Prix conventional Practice 1 \n", "15 Belgian Grand Prix conventional Practice 1 \n", "16 Dutch Grand Prix conventional Practice 1 \n", "17 Italian Grand Prix conventional Practice 1 \n", "18 Singapore Grand Prix conventional Practice 1 \n", "19 Japanese Grand Prix conventional Practice 1 \n", "20 United States Grand Prix conventional Practice 1 \n", "21 Mexico City Grand Prix conventional Practice 1 \n", "22 São Paulo Grand Prix sprint Practice 1 \n", "23 Abu Dhabi Grand Prix conventional Practice 1 \n", "\n", " Session1Date Session1DateUtc ... Session3 \\\n", "0 2022-02-23 09:00:00+01:00 2022-02-23 08:00:00 ... Practice 3 \n", "1 2022-03-10 10:00:00+03:00 2022-03-10 07:00:00 ... Practice 3 \n", "2 2022-03-18 15:00:00+03:00 2022-03-18 12:00:00 ... Practice 3 \n", "3 2022-03-25 17:00:00+03:00 2022-03-25 14:00:00 ... Practice 3 \n", "4 2022-04-08 13:00:00+10:00 2022-04-08 03:00:00 ... Practice 3 \n", "5 2022-04-22 13:30:00+02:00 2022-04-22 11:30:00 ... Practice 2 \n", "6 2022-05-06 14:30:00-04:00 2022-05-06 18:30:00 ... Practice 3 \n", "7 2022-05-20 14:00:00+02:00 2022-05-20 12:00:00 ... Practice 3 \n", "8 2022-05-27 14:00:00+02:00 2022-05-27 12:00:00 ... 
Practice 3 \n", "9 2022-06-10 15:00:00+04:00 2022-06-10 11:00:00 ... Practice 3 \n", "10 2022-06-17 14:00:00-04:00 2022-06-17 18:00:00 ... Practice 3 \n", "11 2022-07-01 13:00:00+01:00 2022-07-01 12:00:00 ... Practice 3 \n", "12 2022-07-08 13:30:00+02:00 2022-07-08 11:30:00 ... Practice 2 \n", "13 2022-07-22 14:00:00+02:00 2022-07-22 12:00:00 ... Practice 3 \n", "14 2022-07-29 14:00:00+02:00 2022-07-29 12:00:00 ... Practice 3 \n", "15 2022-08-26 14:00:00+02:00 2022-08-26 12:00:00 ... Practice 3 \n", "16 2022-09-02 12:30:00+02:00 2022-09-02 10:30:00 ... Practice 3 \n", "17 2022-09-09 14:00:00+02:00 2022-09-09 12:00:00 ... Practice 3 \n", "18 2022-09-30 18:00:00+08:00 2022-09-30 10:00:00 ... Practice 3 \n", "19 2022-10-07 12:00:00+09:00 2022-10-07 03:00:00 ... Practice 3 \n", "20 2022-10-21 14:00:00-05:00 2022-10-21 19:00:00 ... Practice 3 \n", "21 2022-10-28 13:00:00-06:00 2022-10-28 19:00:00 ... Practice 3 \n", "22 2022-11-11 12:30:00-03:00 2022-11-11 15:30:00 ... Practice 2 \n", "23 2022-11-18 14:00:00+04:00 2022-11-18 10:00:00 ... 
Practice 3 \n", "\n", " Session3Date Session3DateUtc Session4 \\\n", "0 2022-02-25 09:00:00+01:00 2022-02-25 08:00:00 None \n", "1 2022-03-12 10:00:00+03:00 2022-03-12 07:00:00 None \n", "2 2022-03-19 15:00:00+03:00 2022-03-19 12:00:00 Qualifying \n", "3 2022-03-26 17:00:00+03:00 2022-03-26 14:00:00 Qualifying \n", "4 2022-04-09 13:00:00+10:00 2022-04-09 03:00:00 Qualifying \n", "5 2022-04-23 12:30:00+02:00 2022-04-23 10:30:00 Sprint \n", "6 2022-05-07 13:00:00-04:00 2022-05-07 17:00:00 Qualifying \n", "7 2022-05-21 13:00:00+02:00 2022-05-21 11:00:00 Qualifying \n", "8 2022-05-28 13:00:00+02:00 2022-05-28 11:00:00 Qualifying \n", "9 2022-06-11 15:00:00+04:00 2022-06-11 11:00:00 Qualifying \n", "10 2022-06-18 13:00:00-04:00 2022-06-18 17:00:00 Qualifying \n", "11 2022-07-02 12:00:00+01:00 2022-07-02 11:00:00 Qualifying \n", "12 2022-07-09 12:30:00+02:00 2022-07-09 10:30:00 Sprint \n", "13 2022-07-23 13:00:00+02:00 2022-07-23 11:00:00 Qualifying \n", "14 2022-07-30 13:00:00+02:00 2022-07-30 11:00:00 Qualifying \n", "15 2022-08-27 13:00:00+02:00 2022-08-27 11:00:00 Qualifying \n", "16 2022-09-03 12:00:00+02:00 2022-09-03 10:00:00 Qualifying \n", "17 2022-09-10 13:00:00+02:00 2022-09-10 11:00:00 Qualifying \n", "18 2022-10-01 18:00:00+08:00 2022-10-01 10:00:00 Qualifying \n", "19 2022-10-08 12:00:00+09:00 2022-10-08 03:00:00 Qualifying \n", "20 2022-10-22 14:00:00-05:00 2022-10-22 19:00:00 Qualifying \n", "21 2022-10-29 12:00:00-06:00 2022-10-29 18:00:00 Qualifying \n", "22 2022-11-12 12:30:00-03:00 2022-11-12 15:30:00 Sprint \n", "23 2022-11-19 14:30:00+04:00 2022-11-19 10:30:00 Qualifying \n", "\n", " Session4Date Session4DateUtc Session5 \\\n", "0 NaT NaT None \n", "1 NaT NaT None \n", "2 2022-03-19 18:00:00+03:00 2022-03-19 15:00:00 Race \n", "3 2022-03-26 20:00:00+03:00 2022-03-26 17:00:00 Race \n", "4 2022-04-09 16:00:00+10:00 2022-04-09 06:00:00 Race \n", "5 2022-04-23 16:30:00+02:00 2022-04-23 14:30:00 Race \n", "6 2022-05-07 16:00:00-04:00 2022-05-07 20:00:00 
Race \n", "7 2022-05-21 16:00:00+02:00 2022-05-21 14:00:00 Race \n", "8 2022-05-28 16:00:00+02:00 2022-05-28 14:00:00 Race \n", "9 2022-06-11 18:00:00+04:00 2022-06-11 14:00:00 Race \n", "10 2022-06-18 16:00:00-04:00 2022-06-18 20:00:00 Race \n", "11 2022-07-02 15:00:00+01:00 2022-07-02 14:00:00 Race \n", "12 2022-07-09 16:30:00+02:00 2022-07-09 14:30:00 Race \n", "13 2022-07-23 16:00:00+02:00 2022-07-23 14:00:00 Race \n", "14 2022-07-30 16:00:00+02:00 2022-07-30 14:00:00 Race \n", "15 2022-08-27 16:00:00+02:00 2022-08-27 14:00:00 Race \n", "16 2022-09-03 15:00:00+02:00 2022-09-03 13:00:00 Race \n", "17 2022-09-10 16:00:00+02:00 2022-09-10 14:00:00 Race \n", "18 2022-10-01 21:00:00+08:00 2022-10-01 13:00:00 Race \n", "19 2022-10-08 15:00:00+09:00 2022-10-08 06:00:00 Race \n", "20 2022-10-22 17:00:00-05:00 2022-10-22 22:00:00 Race \n", "21 2022-10-29 15:00:00-06:00 2022-10-29 21:00:00 Race \n", "22 2022-11-12 16:30:00-03:00 2022-11-12 19:30:00 Race \n", "23 2022-11-19 18:00:00+04:00 2022-11-19 14:00:00 Race \n", "\n", " Session5Date Session5DateUtc F1ApiSupport \n", "0 NaT NaT False \n", "1 NaT NaT True \n", "2 2022-03-20 18:00:00+03:00 2022-03-20 15:00:00 True \n", "3 2022-03-27 20:00:00+03:00 2022-03-27 17:00:00 True \n", "4 2022-04-10 15:00:00+10:00 2022-04-10 05:00:00 True \n", "5 2022-04-24 15:00:00+02:00 2022-04-24 13:00:00 True \n", "6 2022-05-08 15:30:00-04:00 2022-05-08 19:30:00 True \n", "7 2022-05-22 15:00:00+02:00 2022-05-22 13:00:00 True \n", "8 2022-05-29 15:00:00+02:00 2022-05-29 13:00:00 True \n", "9 2022-06-12 15:00:00+04:00 2022-06-12 11:00:00 True \n", "10 2022-06-19 14:00:00-04:00 2022-06-19 18:00:00 True \n", "11 2022-07-03 15:00:00+01:00 2022-07-03 14:00:00 True \n", "12 2022-07-10 15:00:00+02:00 2022-07-10 13:00:00 True \n", "13 2022-07-24 15:00:00+02:00 2022-07-24 13:00:00 True \n", "14 2022-07-31 15:00:00+02:00 2022-07-31 13:00:00 True \n", "15 2022-08-28 15:00:00+02:00 2022-08-28 13:00:00 True \n", "16 2022-09-04 15:00:00+02:00 2022-09-04 
13:00:00 True \n", "17 2022-09-11 15:00:00+02:00 2022-09-11 13:00:00 True \n", "18 2022-10-02 20:00:00+08:00 2022-10-02 12:00:00 True \n", "19 2022-10-09 14:00:00+09:00 2022-10-09 05:00:00 True \n", "20 2022-10-23 14:00:00-05:00 2022-10-23 19:00:00 True \n", "21 2022-10-30 14:00:00-06:00 2022-10-30 20:00:00 True \n", "22 2022-11-13 15:00:00-03:00 2022-11-13 18:00:00 True \n", "23 2022-11-20 17:00:00+04:00 2022-11-20 13:00:00 True \n", "\n", "[24 rows x 23 columns]\n", " RoundNumber Country Location \\\n", "0 0 Bahrain Sakhir \n", "1 1 Bahrain Sakhir \n", "2 2 Saudi Arabia Jeddah \n", "3 3 Australia Melbourne \n", "4 4 Azerbaijan Baku \n", "5 5 United States Miami \n", "6 6 Monaco Monaco \n", "7 7 Spain Barcelona \n", "8 8 Canada Montréal \n", "9 9 Austria Spielberg \n", "10 10 Great Britain Silverstone \n", "11 11 Hungary Budapest \n", "12 12 Belgium Spa-Francorchamps \n", "13 13 Netherlands Zandvoort \n", "14 14 Italy Monza \n", "15 15 Singapore Marina Bay \n", "16 16 Japan Suzuka \n", "17 17 Qatar Lusail \n", "18 18 United States Austin \n", "19 19 Mexico Mexico City \n", "20 20 Brazil São Paulo \n", "21 21 United States Las Vegas \n", "22 22 Abu Dhabi Yas Island \n", "\n", " OfficialEventName EventDate \\\n", "0 FORMULA 1 ARAMCO PRE-SEASON TESTING 2023 2023-02-25 \n", "1 FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2023 2023-03-05 \n", "2 FORMULA 1 STC SAUDI ARABIAN GRAND PRIX 2023 2023-03-19 \n", "3 FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2023 2023-04-02 \n", "4 FORMULA 1 AZERBAIJAN GRAND PRIX 2023 2023-04-30 \n", "5 FORMULA 1 CRYPTO.COM MIAMI GRAND PRIX 2023 2023-05-07 \n", "6 FORMULA 1 GRAND PRIX DE MONACO 2023 2023-05-28 \n", "7 FORMULA 1 AWS GRAN PREMIO DE ESPAÑA 2023 2023-06-04 \n", "8 FORMULA 1 PIRELLI GRAND PRIX DU CANADA 2023 2023-06-18 \n", "9 FORMULA 1 ROLEX GROSSER PREIS VON ÖSTERREICH 2023 2023-07-02 \n", "10 FORMULA 1 ARAMCO BRITISH GRAND PRIX 2023 2023-07-09 \n", "11 FORMULA 1 QATAR AIRWAYS HUNGARIAN GRAND PRIX 2023 2023-07-23 \n", "12 FORMULA 1 MSC 
CRUISES BELGIAN GRAND PRIX 2023 2023-07-30 \n", "13 FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2023 2023-08-27 \n", "14 FORMULA 1 PIRELLI GRAN PREMIO D’ITALIA 2023 2023-09-03 \n", "15 FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND P... 2023-09-17 \n", "16 FORMULA 1 LENOVO JAPANESE GRAND PRIX 2023 2023-09-24 \n", "17 FORMULA 1 QATAR AIRWAYS QATAR GRAND PRIX 2023 2023-10-08 \n", "18 FORMULA 1 LENOVO UNITED STATES GRAND PRIX 2023 2023-10-22 \n", "19 FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2023 2023-10-29 \n", "20 FORMULA 1 ROLEX GRANDE PRÊMIO DE SÃO PAULO 2023 2023-11-05 \n", "21 FORMULA 1 HEINEKEN SILVER LAS VEGAS GRAND PRIX... 2023-11-18 \n", "22 FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX ... 2023-11-26 \n", "\n", " EventName EventFormat Session1 \\\n", "0 Pre-Season Testing testing Practice 1 \n", "1 Bahrain Grand Prix conventional Practice 1 \n", "2 Saudi Arabian Grand Prix conventional Practice 1 \n", "3 Australian Grand Prix conventional Practice 1 \n", "4 Azerbaijan Grand Prix sprint_shootout Practice 1 \n", "5 Miami Grand Prix conventional Practice 1 \n", "6 Monaco Grand Prix conventional Practice 1 \n", "7 Spanish Grand Prix conventional Practice 1 \n", "8 Canadian Grand Prix conventional Practice 1 \n", "9 Austrian Grand Prix sprint_shootout Practice 1 \n", "10 British Grand Prix conventional Practice 1 \n", "11 Hungarian Grand Prix conventional Practice 1 \n", "12 Belgian Grand Prix sprint_shootout Practice 1 \n", "13 Dutch Grand Prix conventional Practice 1 \n", "14 Italian Grand Prix conventional Practice 1 \n", "15 Singapore Grand Prix conventional Practice 1 \n", "16 Japanese Grand Prix conventional Practice 1 \n", "17 Qatar Grand Prix sprint_shootout Practice 1 \n", "18 United States Grand Prix sprint_shootout Practice 1 \n", "19 Mexico City Grand Prix conventional Practice 1 \n", "20 São Paulo Grand Prix sprint_shootout Practice 1 \n", "21 Las Vegas Grand Prix conventional Practice 1 \n", "22 Abu Dhabi Grand Prix conventional Practice 1 \n", "\n", " 
Session1Date Session1DateUtc ... Session3 \\\n", "0 2023-02-23 10:00:00+03:00 2023-02-23 07:00:00 ... Practice 3 \n", "1 2023-03-03 14:30:00+03:00 2023-03-03 11:30:00 ... Practice 3 \n", "2 2023-03-17 16:30:00+03:00 2023-03-17 13:30:00 ... Practice 3 \n", "3 2023-03-31 12:30:00+10:00 2023-03-31 02:30:00 ... Practice 3 \n", "4 2023-04-28 13:30:00+04:00 2023-04-28 09:30:00 ... Sprint Shootout \n", "5 2023-05-05 14:00:00-04:00 2023-05-05 18:00:00 ... Practice 3 \n", "6 2023-05-26 13:30:00+02:00 2023-05-26 11:30:00 ... Practice 3 \n", "7 2023-06-02 13:30:00+02:00 2023-06-02 11:30:00 ... Practice 3 \n", "8 2023-06-16 13:30:00-04:00 2023-06-16 17:30:00 ... Practice 3 \n", "9 2023-06-30 13:30:00+02:00 2023-06-30 11:30:00 ... Sprint Shootout \n", "10 2023-07-07 12:30:00+01:00 2023-07-07 11:30:00 ... Practice 3 \n", "11 2023-07-21 13:30:00+02:00 2023-07-21 11:30:00 ... Practice 3 \n", "12 2023-07-28 13:30:00+02:00 2023-07-28 11:30:00 ... Sprint Shootout \n", "13 2023-08-25 12:30:00+02:00 2023-08-25 10:30:00 ... Practice 3 \n", "14 2023-09-01 13:30:00+02:00 2023-09-01 11:30:00 ... Practice 3 \n", "15 2023-09-15 17:30:00+08:00 2023-09-15 09:30:00 ... Practice 3 \n", "16 2023-09-22 11:30:00+09:00 2023-09-22 02:30:00 ... Practice 3 \n", "17 2023-10-06 16:30:00+03:00 2023-10-06 13:30:00 ... Sprint Shootout \n", "18 2023-10-20 12:30:00-05:00 2023-10-20 17:30:00 ... Sprint Shootout \n", "19 2023-10-27 12:30:00-06:00 2023-10-27 18:30:00 ... Practice 3 \n", "20 2023-11-03 11:30:00-03:00 2023-11-03 14:30:00 ... Sprint Shootout \n", "21 2023-11-16 20:30:00-08:00 2023-11-17 04:30:00 ... Practice 3 \n", "22 2023-11-24 13:30:00+04:00 2023-11-24 09:30:00 ... 
Practice 3 \n", "\n", " Session3Date Session3DateUtc Session4 \\\n", "0 2023-02-25 10:00:00+03:00 2023-02-25 07:00:00 None \n", "1 2023-03-04 14:30:00+03:00 2023-03-04 11:30:00 Qualifying \n", "2 2023-03-18 16:30:00+03:00 2023-03-18 13:30:00 Qualifying \n", "3 2023-04-01 12:30:00+10:00 2023-04-01 02:30:00 Qualifying \n", "4 2023-04-29 12:30:00+04:00 2023-04-29 08:30:00 Sprint \n", "5 2023-05-06 12:30:00-04:00 2023-05-06 16:30:00 Qualifying \n", "6 2023-05-27 12:30:00+02:00 2023-05-27 10:30:00 Qualifying \n", "7 2023-06-03 12:30:00+02:00 2023-06-03 10:30:00 Qualifying \n", "8 2023-06-17 12:30:00-04:00 2023-06-17 16:30:00 Qualifying \n", "9 2023-07-01 12:00:00+02:00 2023-07-01 10:00:00 Sprint \n", "10 2023-07-08 11:30:00+01:00 2023-07-08 10:30:00 Qualifying \n", "11 2023-07-22 12:30:00+02:00 2023-07-22 10:30:00 Qualifying \n", "12 2023-07-29 12:00:00+02:00 2023-07-29 10:00:00 Sprint \n", "13 2023-08-26 11:30:00+02:00 2023-08-26 09:30:00 Qualifying \n", "14 2023-09-02 12:30:00+02:00 2023-09-02 10:30:00 Qualifying \n", "15 2023-09-16 17:30:00+08:00 2023-09-16 09:30:00 Qualifying \n", "16 2023-09-23 11:30:00+09:00 2023-09-23 02:30:00 Qualifying \n", "17 2023-10-07 16:20:00+03:00 2023-10-07 13:20:00 Sprint \n", "18 2023-10-21 12:30:00-05:00 2023-10-21 17:30:00 Sprint \n", "19 2023-10-28 11:30:00-06:00 2023-10-28 17:30:00 Qualifying \n", "20 2023-11-04 11:00:00-03:00 2023-11-04 14:00:00 Sprint \n", "21 2023-11-17 20:30:00-08:00 2023-11-18 04:30:00 Qualifying \n", "22 2023-11-25 14:30:00+04:00 2023-11-25 10:30:00 Qualifying \n", "\n", " Session4Date Session4DateUtc Session5 \\\n", "0 NaT NaT None \n", "1 2023-03-04 18:00:00+03:00 2023-03-04 15:00:00 Race \n", "2 2023-03-18 20:00:00+03:00 2023-03-18 17:00:00 Race \n", "3 2023-04-01 16:00:00+10:00 2023-04-01 06:00:00 Race \n", "4 2023-04-29 17:30:00+04:00 2023-04-29 13:30:00 Race \n", "5 2023-05-06 16:00:00-04:00 2023-05-06 20:00:00 Race \n", "6 2023-05-27 16:00:00+02:00 2023-05-27 14:00:00 Race \n", "7 2023-06-03 
16:00:00+02:00 2023-06-03 14:00:00 Race \n", "8 2023-06-17 16:00:00-04:00 2023-06-17 20:00:00 Race \n", "9 2023-07-01 16:30:00+02:00 2023-07-01 14:30:00 Race \n", "10 2023-07-08 15:00:00+01:00 2023-07-08 14:00:00 Race \n", "11 2023-07-22 16:00:00+02:00 2023-07-22 14:00:00 Race \n", "12 2023-07-29 17:05:00+02:00 2023-07-29 15:05:00 Race \n", "13 2023-08-26 15:00:00+02:00 2023-08-26 13:00:00 Race \n", "14 2023-09-02 16:00:00+02:00 2023-09-02 14:00:00 Race \n", "15 2023-09-16 21:00:00+08:00 2023-09-16 13:00:00 Race \n", "16 2023-09-23 15:00:00+09:00 2023-09-23 06:00:00 Race \n", "17 2023-10-07 20:30:00+03:00 2023-10-07 17:30:00 Race \n", "18 2023-10-21 17:00:00-05:00 2023-10-21 22:00:00 Race \n", "19 2023-10-28 15:00:00-06:00 2023-10-28 21:00:00 Race \n", "20 2023-11-04 15:30:00-03:00 2023-11-04 18:30:00 Race \n", "21 2023-11-18 00:00:00-08:00 2023-11-18 08:00:00 Race \n", "22 2023-11-25 18:00:00+04:00 2023-11-25 14:00:00 Race \n", "\n", " Session5Date Session5DateUtc F1ApiSupport \n", "0 NaT NaT True \n", "1 2023-03-05 18:00:00+03:00 2023-03-05 15:00:00 True \n", "2 2023-03-19 20:00:00+03:00 2023-03-19 17:00:00 True \n", "3 2023-04-02 15:00:00+10:00 2023-04-02 05:00:00 True \n", "4 2023-04-30 15:00:00+04:00 2023-04-30 11:00:00 True \n", "5 2023-05-07 15:30:00-04:00 2023-05-07 19:30:00 True \n", "6 2023-05-28 15:00:00+02:00 2023-05-28 13:00:00 True \n", "7 2023-06-04 15:00:00+02:00 2023-06-04 13:00:00 True \n", "8 2023-06-18 14:00:00-04:00 2023-06-18 18:00:00 True \n", "9 2023-07-02 15:00:00+02:00 2023-07-02 13:00:00 True \n", "10 2023-07-09 15:00:00+01:00 2023-07-09 14:00:00 True \n", "11 2023-07-23 15:00:00+02:00 2023-07-23 13:00:00 True \n", "12 2023-07-30 15:00:00+02:00 2023-07-30 13:00:00 True \n", "13 2023-08-27 15:00:00+02:00 2023-08-27 13:00:00 True \n", "14 2023-09-03 15:00:00+02:00 2023-09-03 13:00:00 True \n", "15 2023-09-17 20:00:00+08:00 2023-09-17 12:00:00 True \n", "16 2023-09-24 14:00:00+09:00 2023-09-24 05:00:00 True \n", "17 2023-10-08 20:00:00+03:00 
2023-10-08 17:00:00 True \n", "18 2023-10-22 14:00:00-05:00 2023-10-22 19:00:00 True \n", "19 2023-10-29 14:00:00-06:00 2023-10-29 20:00:00 True \n", "20 2023-11-05 14:00:00-03:00 2023-11-05 17:00:00 True \n", "21 2023-11-18 22:00:00-08:00 2023-11-19 06:00:00 True \n", "22 2023-11-26 17:00:00+04:00 2023-11-26 13:00:00 True \n", "\n", "[23 rows x 23 columns]\n", " RoundNumber Country Location \\\n", "0 0 Bahrain Sakhir \n", "1 1 Bahrain Sakhir \n", "2 2 Saudi Arabia Jeddah \n", "3 3 Australia Melbourne \n", "4 4 Japan Suzuka \n", "5 5 China Shanghai \n", "6 6 United States Miami \n", "7 7 Italy Imola \n", "8 8 Monaco Monaco \n", "9 9 Canada Montréal \n", "10 10 Spain Barcelona \n", "11 11 Austria Spielberg \n", "12 12 United Kingdom Silverstone \n", "13 13 Hungary Budapest \n", "14 14 Belgium Spa-Francorchamps \n", "15 15 Netherlands Zandvoort \n", "16 16 Italy Monza \n", "17 17 Azerbaijan Baku \n", "18 18 Singapore Marina Bay \n", "19 19 United States Austin \n", "20 20 Mexico Mexico City \n", "21 21 Brazil São Paulo \n", "22 22 United States Las Vegas \n", "23 23 Qatar Lusail \n", "24 24 United Arab Emirates Yas Island \n", "\n", " OfficialEventName EventDate \\\n", "0 FORMULA 1 ARAMCO PRE-SEASON TESTING 2024 2024-02-23 \n", "1 FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2024 2024-03-02 \n", "2 FORMULA 1 STC SAUDI ARABIAN GRAND PRIX 2024 2024-03-09 \n", "3 FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2024 2024-03-24 \n", "4 FORMULA 1 MSC CRUISES JAPANESE GRAND PRIX 2024 2024-04-07 \n", "5 FORMULA 1 LENOVO CHINESE GRAND PRIX 2024 2024-04-21 \n", "6 FORMULA 1 CRYPTO.COM MIAMI GRAND PRIX 2024 2024-05-05 \n", "7 FORMULA 1 MSC CRUISES GRAN PREMIO DEL MADE IN ... 
2024-05-19 \n", "8 FORMULA 1 GRAND PRIX DE MONACO 2024 2024-05-26 \n", "9 FORMULA 1 AWS GRAND PRIX DU CANADA 2024 2024-06-09 \n", "10 FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2024 2024-06-23 \n", "11 FORMULA 1 QATAR AIRWAYS AUSTRIAN GRAND PRIX 2024 2024-06-30 \n", "12 FORMULA 1 QATAR AIRWAYS BRITISH GRAND PRIX 2024 2024-07-07 \n", "13 FORMULA 1 HUNGARIAN GRAND PRIX 2024 2024-07-21 \n", "14 FORMULA 1 ROLEX BELGIAN GRAND PRIX 2024 2024-07-28 \n", "15 FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2024 2024-08-25 \n", "16 FORMULA 1 PIRELLI GRAN PREMIO D’ITALIA 2024 2024-09-01 \n", "17 FORMULA 1 QATAR AIRWAYS AZERBAIJAN GRAND PRIX ... 2024-09-15 \n", "18 FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND P... 2024-09-22 \n", "19 FORMULA 1 PIRELLI UNITED STATES GRAND PRIX 2024 2024-10-20 \n", "20 FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2024 2024-10-27 \n", "21 FORMULA 1 LENOVO GRANDE PRÊMIO DE SÃO PAULO 2024 2024-11-03 \n", "22 FORMULA 1 HEINEKEN SILVER LAS VEGAS GRAND PRIX... 2024-11-23 \n", "23 FORMULA 1 QATAR AIRWAYS QATAR GRAND PRIX 2024 2024-12-01 \n", "24 FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX ... 
2024-12-08 \n", "\n", " EventName EventFormat Session1 \\\n", "0 Pre-Season Testing testing Practice 1 \n", "1 Bahrain Grand Prix conventional Practice 1 \n", "2 Saudi Arabian Grand Prix conventional Practice 1 \n", "3 Australian Grand Prix conventional Practice 1 \n", "4 Japanese Grand Prix conventional Practice 1 \n", "5 Chinese Grand Prix sprint_qualifying Practice 1 \n", "6 Miami Grand Prix sprint_qualifying Practice 1 \n", "7 Emilia Romagna Grand Prix conventional Practice 1 \n", "8 Monaco Grand Prix conventional Practice 1 \n", "9 Canadian Grand Prix conventional Practice 1 \n", "10 Spanish Grand Prix conventional Practice 1 \n", "11 Austrian Grand Prix sprint_qualifying Practice 1 \n", "12 British Grand Prix conventional Practice 1 \n", "13 Hungarian Grand Prix conventional Practice 1 \n", "14 Belgian Grand Prix conventional Practice 1 \n", "15 Dutch Grand Prix conventional Practice 1 \n", "16 Italian Grand Prix conventional Practice 1 \n", "17 Azerbaijan Grand Prix conventional Practice 1 \n", "18 Singapore Grand Prix conventional Practice 1 \n", "19 United States Grand Prix sprint_qualifying Practice 1 \n", "20 Mexico City Grand Prix conventional Practice 1 \n", "21 São Paulo Grand Prix sprint_qualifying Practice 1 \n", "22 Las Vegas Grand Prix conventional Practice 1 \n", "23 Qatar Grand Prix sprint_qualifying Practice 1 \n", "24 Abu Dhabi Grand Prix conventional Practice 1 \n", "\n", " Session1Date Session1DateUtc ... Session3 \\\n", "0 2024-02-21 10:00:00+03:00 2024-02-21 07:00:00 ... Practice 3 \n", "1 2024-02-29 14:30:00+03:00 2024-02-29 11:30:00 ... Practice 3 \n", "2 2024-03-07 16:30:00+03:00 2024-03-07 13:30:00 ... Practice 3 \n", "3 2024-03-22 12:30:00+11:00 2024-03-22 01:30:00 ... Practice 3 \n", "4 2024-04-05 11:30:00+09:00 2024-04-05 02:30:00 ... Practice 3 \n", "5 2024-04-19 11:30:00+08:00 2024-04-19 03:30:00 ... Sprint \n", "6 2024-05-03 12:30:00-04:00 2024-05-03 16:30:00 ... Sprint \n", "7 2024-05-17 13:30:00+02:00 2024-05-17 11:30:00 ... 
Practice 3 \n", "8 2024-05-24 13:30:00+02:00 2024-05-24 11:30:00 ... Practice 3 \n", "9 2024-06-07 13:30:00-04:00 2024-06-07 17:30:00 ... Practice 3 \n", "10 2024-06-21 13:30:00+02:00 2024-06-21 11:30:00 ... Practice 3 \n", "11 2024-06-28 12:30:00+02:00 2024-06-28 10:30:00 ... Sprint \n", "12 2024-07-05 12:30:00+01:00 2024-07-05 11:30:00 ... Practice 3 \n", "13 2024-07-19 13:30:00+02:00 2024-07-19 11:30:00 ... Practice 3 \n", "14 2024-07-26 13:30:00+02:00 2024-07-26 11:30:00 ... Practice 3 \n", "15 2024-08-23 12:30:00+02:00 2024-08-23 10:30:00 ... Practice 3 \n", "16 2024-08-30 13:30:00+02:00 2024-08-30 11:30:00 ... Practice 3 \n", "17 2024-09-13 13:30:00+04:00 2024-09-13 09:30:00 ... Practice 3 \n", "18 2024-09-20 17:30:00+08:00 2024-09-20 09:30:00 ... Practice 3 \n", "19 2024-10-18 12:30:00-05:00 2024-10-18 17:30:00 ... Sprint \n", "20 2024-10-25 12:30:00-06:00 2024-10-25 18:30:00 ... Practice 3 \n", "21 2024-11-01 11:30:00-03:00 2024-11-01 14:30:00 ... Sprint \n", "22 2024-11-21 18:30:00-08:00 2024-11-22 02:30:00 ... Practice 3 \n", "23 2024-11-29 16:30:00+03:00 2024-11-29 13:30:00 ... Sprint \n", "24 2024-12-06 13:30:00+04:00 2024-12-06 09:30:00 ... 
Practice 3 \n", "\n", " Session3Date Session3DateUtc Session4 \\\n", "0 2024-02-23 10:00:00+03:00 2024-02-23 07:00:00 None \n", "1 2024-03-01 15:30:00+03:00 2024-03-01 12:30:00 Qualifying \n", "2 2024-03-08 16:30:00+03:00 2024-03-08 13:30:00 Qualifying \n", "3 2024-03-23 12:30:00+11:00 2024-03-23 01:30:00 Qualifying \n", "4 2024-04-06 11:30:00+09:00 2024-04-06 02:30:00 Qualifying \n", "5 2024-04-20 11:00:00+08:00 2024-04-20 03:00:00 Qualifying \n", "6 2024-05-04 12:00:00-04:00 2024-05-04 16:00:00 Qualifying \n", "7 2024-05-18 12:30:00+02:00 2024-05-18 10:30:00 Qualifying \n", "8 2024-05-25 12:30:00+02:00 2024-05-25 10:30:00 Qualifying \n", "9 2024-06-08 12:30:00-04:00 2024-06-08 16:30:00 Qualifying \n", "10 2024-06-22 12:30:00+02:00 2024-06-22 10:30:00 Qualifying \n", "11 2024-06-29 12:00:00+02:00 2024-06-29 10:00:00 Qualifying \n", "12 2024-07-06 11:30:00+01:00 2024-07-06 10:30:00 Qualifying \n", "13 2024-07-20 12:30:00+02:00 2024-07-20 10:30:00 Qualifying \n", "14 2024-07-27 12:30:00+02:00 2024-07-27 10:30:00 Qualifying \n", "15 2024-08-24 11:30:00+02:00 2024-08-24 09:30:00 Qualifying \n", "16 2024-08-31 12:30:00+02:00 2024-08-31 10:30:00 Qualifying \n", "17 2024-09-14 12:30:00+04:00 2024-09-14 08:30:00 Qualifying \n", "18 2024-09-21 17:30:00+08:00 2024-09-21 09:30:00 Qualifying \n", "19 2024-10-19 13:00:00-05:00 2024-10-19 18:00:00 Qualifying \n", "20 2024-10-26 11:30:00-06:00 2024-10-26 17:30:00 Qualifying \n", "21 2024-11-02 11:00:00-03:00 2024-11-02 14:00:00 Qualifying \n", "22 2024-11-22 18:30:00-08:00 2024-11-23 02:30:00 Qualifying \n", "23 2024-11-30 17:00:00+03:00 2024-11-30 14:00:00 Qualifying \n", "24 2024-12-07 14:30:00+04:00 2024-12-07 10:30:00 Qualifying \n", "\n", " Session4Date Session4DateUtc Session5 \\\n", "0 NaT NaT None \n", "1 2024-03-01 19:00:00+03:00 2024-03-01 16:00:00 Race \n", "2 2024-03-08 20:00:00+03:00 2024-03-08 17:00:00 Race \n", "3 2024-03-23 16:00:00+11:00 2024-03-23 05:00:00 Race \n", "4 2024-04-06 15:00:00+09:00 2024-04-06 
06:00:00 Race \n", "5 2024-04-20 15:00:00+08:00 2024-04-20 07:00:00 Race \n", "6 2024-05-04 16:00:00-04:00 2024-05-04 20:00:00 Race \n", "7 2024-05-18 16:00:00+02:00 2024-05-18 14:00:00 Race \n", "8 2024-05-25 16:00:00+02:00 2024-05-25 14:00:00 Race \n", "9 2024-06-08 16:00:00-04:00 2024-06-08 20:00:00 Race \n", "10 2024-06-22 16:00:00+02:00 2024-06-22 14:00:00 Race \n", "11 2024-06-29 16:00:00+02:00 2024-06-29 14:00:00 Race \n", "12 2024-07-06 15:00:00+01:00 2024-07-06 14:00:00 Race \n", "13 2024-07-20 16:00:00+02:00 2024-07-20 14:00:00 Race \n", "14 2024-07-27 16:00:00+02:00 2024-07-27 14:00:00 Race \n", "15 2024-08-24 15:00:00+02:00 2024-08-24 13:00:00 Race \n", "16 2024-08-31 16:00:00+02:00 2024-08-31 14:00:00 Race \n", "17 2024-09-14 16:00:00+04:00 2024-09-14 12:00:00 Race \n", "18 2024-09-21 21:00:00+08:00 2024-09-21 13:00:00 Race \n", "19 2024-10-19 17:00:00-05:00 2024-10-19 22:00:00 Race \n", "20 2024-10-26 15:00:00-06:00 2024-10-26 21:00:00 Race \n", "21 2024-11-03 07:30:00-03:00 2024-11-03 10:30:00 Race \n", "22 2024-11-22 22:00:00-08:00 2024-11-23 06:00:00 Race \n", "23 2024-11-30 21:00:00+03:00 2024-11-30 18:00:00 Race \n", "24 2024-12-07 18:00:00+04:00 2024-12-07 14:00:00 Race \n", "\n", " Session5Date Session5DateUtc F1ApiSupport \n", "0 NaT NaT True \n", "1 2024-03-02 18:00:00+03:00 2024-03-02 15:00:00 True \n", "2 2024-03-09 20:00:00+03:00 2024-03-09 17:00:00 True \n", "3 2024-03-24 15:00:00+11:00 2024-03-24 04:00:00 True \n", "4 2024-04-07 14:00:00+09:00 2024-04-07 05:00:00 True \n", "5 2024-04-21 15:00:00+08:00 2024-04-21 07:00:00 True \n", "6 2024-05-05 16:00:00-04:00 2024-05-05 20:00:00 True \n", "7 2024-05-19 15:00:00+02:00 2024-05-19 13:00:00 True \n", "8 2024-05-26 15:00:00+02:00 2024-05-26 13:00:00 True \n", "9 2024-06-09 14:00:00-04:00 2024-06-09 18:00:00 True \n", "10 2024-06-23 15:00:00+02:00 2024-06-23 13:00:00 True \n", "11 2024-06-30 15:00:00+02:00 2024-06-30 13:00:00 True \n", "12 2024-07-07 15:00:00+01:00 2024-07-07 14:00:00 True 
\n",
"13 2024-07-21 15:00:00+02:00 2024-07-21 13:00:00 True \n",
"14 2024-07-28 15:00:00+02:00 2024-07-28 13:00:00 True \n",
"15 2024-08-25 15:00:00+02:00 2024-08-25 13:00:00 True \n",
"16 2024-09-01 15:00:00+02:00 2024-09-01 13:00:00 True \n",
"17 2024-09-15 15:00:00+04:00 2024-09-15 11:00:00 True \n",
"18 2024-09-22 20:00:00+08:00 2024-09-22 12:00:00 True \n",
"19 2024-10-20 14:00:00-05:00 2024-10-20 19:00:00 True \n",
"20 2024-10-27 14:00:00-06:00 2024-10-27 20:00:00 True \n",
"21 2024-11-03 12:30:00-03:00 2024-11-03 15:30:00 True \n",
"22 2024-11-23 22:00:00-08:00 2024-11-24 06:00:00 True \n",
"23 2024-12-01 19:00:00+03:00 2024-12-01 16:00:00 True \n",
"24 2024-12-08 17:00:00+04:00 2024-12-08 13:00:00 True \n",
"\n",
"[25 rows x 23 columns]\n",
"{'Session2DateUtc', 'Session2Date', 'OfficialEventName', 'Session2', 'EventName', 'RoundNumber', 'Session5DateUtc', 'Session1DateUtc', 'Session4DateUtc', 'Session4Date', 'Session3Date', 'F1ApiSupport', 'Country', 'Session5', 'EventFormat', 'Session3DateUtc', 'Session4', 'Session1', 'Session1Date', 'Session3', 'EventDate', 'Location', 'Session5Date'}\n" ] } ], "source": [
"# FastF1 general setup\n",
"cache_dir = '../data/cache'\n",
"if not os.path.exists(cache_dir):\n",
"    os.makedirs(cache_dir)\n",
"\n",
"fastf1.Cache.enable_cache(cache_dir)\n",
"fastf1.plotting.setup_mpl(misc_mpl_mods=False, color_scheme=None)\n",
"logging.disable(logging.INFO)\n",
"warnings.filterwarnings('ignore', category=UserWarning)\n",
"\n",
"# Set up plot style\n",
"# print style.available to check available styles\n",
"plt.style.use('seaborn-v0_8-whitegrid')\n",
"\n",
"# LIST ALL EVENTS\n",
"print(fastf1.get_event_schedule(2021))\n",
"print(fastf1.get_event_schedule(2022))\n",
"print(fastf1.get_event_schedule(2023))\n",
"print(fastf1.get_event_schedule(2024))\n",
"\n",
"# Find events common to all four seasons.\n",
"# set(schedule) iterates over the DataFrame's column names, so compare the EventName column instead.\n",
"common_events = set.intersection(*(set(fastf1.get_event_schedule(y)['EventName']) for y in (2021, 2022, 2023, 2024)))\n",
"print(common_events)\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [
"# Define years, sessions, and events of interest\n",
"years = [2021, 2022, 2023, 2024]\n",
"sessions = ['Race']\n",
"events = ['Bahrain Grand Prix', 'Saudi Arabian Grand Prix', 'Dutch Grand Prix', 'Italian Grand Prix', 'Austrian Grand Prix', 'Hungarian Grand Prix', 'British Grand Prix', 'Belgian Grand Prix', 'United States Grand Prix', 'Mexico City Grand Prix', 'Sao Paulo Grand Prix']" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"Processing 2021 Bahrain Grand Prix - Race\n",
"Processing 2021 Saudi Arabian Grand Prix - Race\n",
"Processing 2021 Dutch Grand Prix - Race\n",
"Processing 2021 Italian Grand Prix - Race\n",
"Processing 2021 Austrian Grand Prix - Race\n",
"Processing 2021 Hungarian Grand Prix - Race\n",
"Processing 2021 British Grand Prix - Race\n",
"Processing 2021 Belgian Grand Prix - Race\n",
"Processing 2021 United States Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [
"core WARNING \tDriver 7: Lap timing integrity check failed for 1 lap(s)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [
"Processing 2021 Mexico City Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [
"events WARNING \tCorrecting user input 'Sao Paulo Grand Prix' to 'São Paulo Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [
"Processing 2021 Sao Paulo Grand Prix - Race\n",
"Processing 2022 Bahrain Grand Prix - Race\n",
"Processing 2022 Saudi Arabian Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [
"core WARNING \tNo lap data for driver 22\n",
"core WARNING \tNo lap data for driver 47\n",
"core WARNING \tFailed to perform lap accuracy check - all laps marked as inaccurate (driver 22)\n",
"core WARNING \tFailed 
to perform lap accuracy check - all laps marked as inaccurate (driver 47)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2022 Dutch Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "_api WARNING \tDriver 241: Position data is incomplete!\n", "_api WARNING \tDriver 242: Position data is incomplete!\n", "_api WARNING \tDriver 243: Position data is incomplete!\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2022 Italian Grand Prix - Race\n", "Processing 2022 Austrian Grand Prix - Race\n", "Processing 2022 Hungarian Grand Prix - Race\n", "Processing 2022 British Grand Prix - Race\n", "Processing 2022 Belgian Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "events WARNING \tCorrecting user input 'United States Grand Prix' to 'United States Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2022 United States Grand Prix - Race\n", "Processing 2022 Mexico City Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "events WARNING \tCorrecting user input 'Sao Paulo Grand Prix' to 'São Paulo Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2022 Sao Paulo Grand Prix - Race\n", "Processing 2023 Bahrain Grand Prix - Race\n", "Processing 2023 Saudi Arabian Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "_api WARNING \tDriver 241: Position data is incomplete!\n", "_api WARNING \tDriver 242: Position data is incomplete!\n", "_api WARNING \tDriver 243: Position data is incomplete!\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2023 Dutch Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "_api WARNING \tDriver 241: Position data is incomplete!\n", "_api WARNING \tDriver 242: Position data is incomplete!\n", "_api WARNING \tDriver 243: Position data is incomplete!\n" ] }, { 
"name": "stdout", "output_type": "stream", "text": [ "Processing 2023 Italian Grand Prix - Race\n", "Processing 2023 Austrian Grand Prix - Race\n", "Processing 2023 Hungarian Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "_api WARNING \tSkipping lap alignment (no suitable lap)!\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2023 British Grand Prix - Race\n", "Processing 2023 Belgian Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "events WARNING \tCorrecting user input 'United States Grand Prix' to 'United States Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2023 United States Grand Prix - Race\n", "Processing 2023 Mexico City Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "events WARNING \tCorrecting user input 'Sao Paulo Grand Prix' to 'São Paulo Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2023 Sao Paulo Grand Prix - Race\n", "Processing 2024 Bahrain Grand Prix - Race\n", "Processing 2024 Saudi Arabian Grand Prix - Race\n", "Processing 2024 Dutch Grand Prix - Race\n", "Processing 2024 Italian Grand Prix - Race\n", "Processing 2024 Austrian Grand Prix - Race\n", "Processing 2024 Hungarian Grand Prix - Race\n", "Processing 2024 British Grand Prix - Race\n", "Processing 2024 Belgian Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "events WARNING \tCorrecting user input 'United States Grand Prix' to 'United States Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2024 United States Grand Prix - Race\n", "Processing 2024 Mexico City Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "events WARNING \tCorrecting user input 'Sao Paulo Grand Prix' to 'São Paulo Grand Prix'\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Processing 2024 Sao Paulo 
Grand Prix - Race\n" ] }, { "name": "stderr", "output_type": "stream", "text": [
"core WARNING \tNo lap data for driver 23\n",
"core WARNING \tFailed to perform lap accuracy check - all laps marked as inaccurate (driver 23)\n" ] } ], "source": [
"# Get data from FastF1 API\n",
"\n",
"# Data containers\n",
"weather_data_list = []\n",
"lap_data_list = []\n",
"\n",
"# Loop through years and sessions\n",
"for year in years:\n",
"    for event_name in events:\n",
"        for session_name in sessions:\n",
"            try:\n",
"                print(f\"Processing {year} {event_name} - {session_name}\")\n",
"\n",
"                # Load the session\n",
"                session = fastf1.get_session(year, event_name, session_name, backend='fastf1')\n",
"                session.load()\n",
"\n",
"                # Process weather data\n",
"                weather_data = session.weather_data\n",
"                if weather_data is not None:\n",
"                    weather_df = pd.DataFrame(weather_data)\n",
"                    # Add context columns\n",
"                    weather_df['Year'] = year\n",
"                    weather_df['Event'] = event_name\n",
"                    weather_df['Session'] = session_name\n",
"                    weather_data_list.append(weather_df)\n",
"\n",
"                # Process lap data\n",
"                lap_data = session.laps\n",
"                if lap_data is not None:\n",
"                    lap_df = pd.DataFrame(lap_data)\n",
"                    # Add context columns\n",
"                    lap_df['Year'] = year\n",
"                    lap_df['Event'] = event_name\n",
"                    lap_df['Session'] = session_name\n",
"                    # Fall back to session.results if driver abbreviations are missing\n",
"                    if 'Driver' not in lap_df.columns:\n",
"                        lap_df['Driver'] = lap_df['DriverNumber'].map(session.results.set_index('DriverNumber')['Abbreviation'])\n",
"                    # Fall back to session.results if team names are missing\n",
"                    if 'Team' not in lap_df.columns:\n",
"                        lap_df['Team'] = lap_df['Driver'].map(session.results.set_index('Abbreviation')['TeamName'])\n",
"                    lap_data_list.append(lap_df)\n",
"\n",
"            except Exception as e:\n",
"                print(f\"Error with {event_name} {session_name} ({year}): {e}\")\n",
"\n",
"# Combine data into DataFrames\n",
"if weather_data_list:\n",
"    weather_data_combined = pd.concat(weather_data_list, ignore_index=True)\n",
"    # Ensure consistent column ordering\n",
"    weather_cols = ['Time', 'Year', 'Event', 'Session',\n",
"                    'AirTemp', 'Humidity', 'Pressure', 'Rainfall',\n",
"                    'TrackTemp', 'WindDirection', 'WindSpeed']\n",
"    weather_data_combined = weather_data_combined[weather_cols]\n",
"\n",
"if lap_data_list:\n",
"    lap_data_combined = pd.concat(lap_data_list, ignore_index=True)\n",
"    # Ensure consistent column ordering\n",
"    lap_cols = ['Time', 'Year', 'Event', 'Session',\n",
"                'Driver', 'Team', 'LapNumber', 'LapTime',\n",
"                'Sector1Time', 'Sector2Time', 'Sector3Time',\n",
"                'Compound', 'TyreLife', 'FreshTyre',\n",
"                'SpeedI1', 'SpeedI2', 'SpeedFL', 'SpeedST']\n",
"    # Only include columns that exist\n",
"    existing_cols = [col for col in lap_cols if col in lap_data_combined.columns]\n",
"    lap_data_combined = lap_data_combined[existing_cols]\n",
"\n",
"# Time conversion\n",
"# Function to convert timedelta to datetime\n",
"def convert_timedelta_to_datetime(df, base_date='2021-01-01'):\n",
"    if 'Time' in df.columns:\n",
"        # Create a base datetime and add the timedelta\n",
"        base = pd.Timestamp(base_date)\n",
"        if df['Time'].dtype == 'timedelta64[ns]':\n",
"            df['Time'] = base + df['Time']\n",
"    return df\n",
"\n",
"# Apply conversion to both dataframes\n",
"weather_data_combined = convert_timedelta_to_datetime(weather_data_combined)\n",
"lap_data_combined = convert_timedelta_to_datetime(lap_data_combined)\n",
"\n",
"# Remove missing values\n",
"weather_data_combined = weather_data_combined.dropna()\n",
"lap_data_combined = lap_data_combined.dropna()\n",
"\n",
"# Create a new column for lap time in seconds\n",
"lap_data_combined['LapTime_seconds'] = lap_data_combined['LapTime'].dt.total_seconds()\n",
"\n",
"# Merge the data\n",
"merged_data = pd.merge_asof(\n",
"    lap_data_combined.sort_values('Time'),\n",
"    weather_data_combined.sort_values('Time'),\n",
"    on='Time',\n",
"    by=['Event', 'Year'],  # Match within same event and year\n",
"    direction='nearest',\n",
"    tolerance=pd.Timedelta('1 min')  # Allow matching within 1 minute\n",
")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [
"def engineer_features(df):\n",
"    \"\"\"\n",
"    Engineer features for F1 lap time prediction with enhanced track-specific optimizations.\n",
"\n",
"    Parameters:\n",
"        df (pandas.DataFrame): Input dataframe containing raw F1 session data\n",
"    Returns:\n",
"        pandas.DataFrame: DataFrame with engineered features\n",
"    \"\"\"\n",
"    # Basic weather and track condition features\n",
"    df['GripCondition'] = df.apply(lambda x:\n",
"        x['TrackTemp'] * (1 - x['Humidity']/150) * (1 - abs(x['WindSpeed'])/50) if 'British' in x['Event']\n",
"        else x['TrackTemp'] * (1 - x['Humidity']/100), axis=1)\n",
"\n",
"    df['TempDelta'] = df['TrackTemp'] - df['AirTemp']\n",
"\n",
"    # Enhanced tire degradation with weather impact\n",
"    df['TyreDeg'] = df.apply(lambda x:\n",
"        np.exp(-0.025 * x['TyreLife']) * (1 - x['Humidity']/200) if 'British' in x['Event']\n",
"        else np.exp(-0.025 * x['TyreLife']) if 'Belgian' in x['Event']\n",
"        else np.exp(-0.015 * x['TyreLife']), axis=1)\n",
"\n",
"    # Track evolution with enhanced weather adjustment\n",
"    df['TrackEvolution'] = df.apply(lambda x:\n",
"        (1 - np.exp(-0.15 * x['LapNumber'])) * (1 - x['Humidity']/250) * (1 - abs(x['WindSpeed'])/40) if 'British' in x['Event']\n",
"        else (1 - np.exp(-0.15 * x['LapNumber'])) if 'United States' in x['Event']\n",
"        else 1 - np.exp(-0.1 * x['LapNumber']), axis=1)\n",
"\n",
"    # Temperature interactions\n",
"    df['TempInteraction'] = df['TrackTemp'] * df['AirTemp']\n",
"    df['TempInteractionSquared'] = df['TempInteraction'] ** 2\n",
"\n",
"    # Enhanced weather complexity\n",
"    df['WeatherComplexity'] = df.apply(lambda x:\n",
"        (x['WindSpeed'] * 0.4 + abs(x['TempDelta']) * 0.4 + x['Humidity'] * 0.2) / 100.0 if 'British' in x['Event']\n",
"        else (x['WindSpeed'] * 0.3 + abs(x['TempDelta']) * 0.4 + x['Humidity'] * 0.3) / 100.0 if 'Belgian' in x['Event']\n",
"        else (x['WindSpeed'] * 0.2 + abs(x['TempDelta']) * 0.5 + x['Humidity'] * 0.3) / 100.0,\n",
"        axis=1)\n",
"\n",
"    # Track-specific features\n",
"    df['DesertEffect'] = np.where(\n",
"        df['Event'].str.contains('Bahrain'),\n",
"        df['WindSpeed'] * df['Humidity'] * df['TempInteraction'] / 10000,\n",
"        0\n",
"    )\n",
"\n",
"    # Enhanced wet weather effect\n",
"    df['WetWeatherEffect'] = df.apply(lambda x:\n",
"        (x['Humidity'] * x['WindSpeed'] * abs(x['TempDelta'])) / 800 if 'British' in x['Event']\n",
"        else (x['Humidity'] * x['WindSpeed'] * abs(x['TempDelta'])) / 1000 if 'Belgian' in x['Event']\n",
"        else 0, axis=1)\n",
"\n",
"    df['AltitudeEffect'] = np.where(\n",
"        df['Event'].str.contains('Mexico City'),\n",
"        df['AirTemp'] * (1 - df['Humidity']/200) * df['WindSpeed'] / 10,\n",
"        0\n",
"    )\n",
"\n",
"    df['WeatherStability'] = df.apply(lambda x:\n",
"        1 - (abs(x['WindSpeed']) + abs(x['TempDelta']) + x['Humidity'])/300 if 'British' in x['Event']\n",
"        else 1, axis=1)\n",
"\n",
"    df['TrackCondition'] = df.apply(lambda x:\n",
"        (x['TrackTemp'] * x['WeatherStability'] * (1 - x['WetWeatherEffect'])) if 'British' in x['Event']\n",
"        else x['TrackTemp'], axis=1)\n",
"\n",
"    # Rolling averages for weather stability (3-row window within each event, in the DataFrame's row order)\n",
"    df['WindSpeed_Rolling'] = df.groupby('Event')['WindSpeed'].transform(lambda x: x.rolling(3, min_periods=1).mean())\n",
"    df['Humidity_Rolling'] = df.groupby('Event')['Humidity'].transform(lambda x: x.rolling(3, min_periods=1).mean())\n",
"    df['TrackTemp_Rolling'] = df.groupby('Event')['TrackTemp'].transform(lambda x: x.rolling(3, min_periods=1).mean())\n",
"\n",
"    # Weather change indicators\n",
"    df['WeatherChangeRate'] = df.apply(lambda x:\n",
"        abs(x['WindSpeed'] - x['WindSpeed_Rolling']) +\n",
"        abs(x['Humidity'] - x['Humidity_Rolling']) +\n",
"        abs(x['TrackTemp'] - x['TrackTemp_Rolling']) if 'British' in x['Event']\n",
"        else 0, axis=1)\n",
"\n",
"    return df" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [
"def prepare_modeling_data(df):\n",
"    \"\"\"\n",
"    Prepare data for modeling with optimized track-specific configurations.\n",
"    \"\"\"\n",
"    data = engineer_features(df)\n",
"    track_results = {}\n",
"\n",
"    base_features = [\n",
"        'TrackTemp', 'AirTemp', 'Humidity', 'WindSpeed',\n",
"        'TyreLife', 'TyreDeg', 'TempDelta', 'GripCondition',\n",
"        'TrackEvolution', 'TempInteraction', 'TempInteractionSquared',\n",
"        'WeatherComplexity', 'DesertEffect', 'WetWeatherEffect', 'AltitudeEffect',\n",
"        'WeatherStability', 'TrackCondition', 'WeatherChangeRate',\n",
"        'WindSpeed_Rolling', 'Humidity_Rolling', 'TrackTemp_Rolling'\n",
"    ]\n",
"\n",
"    track_configs = {\n",
"        'default': {\n",
"            'n_estimators': 300,\n",
"            'max_depth': 7,\n",
"            'learning_rate': 0.005,\n",
"            'min_child_samples': 25,\n",
"            'subsample': 0.8,\n",
"            'colsample_bytree': 0.8,\n",
"            'reg_alpha': 0.2,\n",
"            'reg_lambda': 1.5,\n",
"            'num_leaves': 35,\n",
"            'feature_fraction': 0.8,\n",
"            'bagging_fraction': 0.8,\n",
"            'bagging_freq': 5\n",
"        },\n",
"        # 'British': {\n",
"        #     'n_estimators': 500,\n",
"        #     'max_depth': 8,\n",
"        #     'learning_rate': 0.002,\n",
"        #     'min_child_samples': 30,\n",
"        #     'subsample': 0.75,\n",
"        #     'colsample_bytree': 0.75,\n",
"        #     'reg_alpha': 0.3,\n",
"        #     'reg_lambda': 2.0,\n",
"        #     'num_leaves': 30,\n",
"        #     'feature_fraction': 0.7,\n",
"        #     'bagging_fraction': 0.7,\n",
"        #     'bagging_freq': 7\n",
"        # },\n",
"        # 'Bahrain': {\n",
"        #     'n_estimators': 400,\n",
"        #     'max_depth': 8,\n",
"        #     'learning_rate': 0.003,\n",
"        #     'min_child_samples': 25,\n",
"        #     'subsample': 0.85,\n",
"        #     'colsample_bytree': 0.85,\n",
"        #     'reg_alpha': 0.2,\n",
"        #     'reg_lambda': 1.5,\n",
"        #     'num_leaves': 40,\n",
"        #     'feature_fraction': 0.8,\n",
"        #     'bagging_fraction': 0.8,\n",
"        #     'bagging_freq': 5\n",
"        # },\n",
"        # 'Belgian': {\n",
"        #     'n_estimators': 350,\n",
"        #     'max_depth': 7,\n",
"        #     'learning_rate': 0.004,\n",
"        #     'min_child_samples': 20,\n",
"        #     'subsample': 0.8,\n",
"        #     'colsample_bytree': 0.8,\n",
"        #     'reg_alpha': 0.15,\n",
"        #     'reg_lambda': 1.2,\n",
"        #     'num_leaves': 35,\n",
"        #     'feature_fraction': 0.85,\n",
"        #     'bagging_fraction': 0.85,\n",
"        #     'bagging_freq': 4\n",
"        # },\n",
"        # 'Mexico': {\n",
"        #     'n_estimators': 400,\n",
"        #     'max_depth': 8,\n",
"        #     'learning_rate': 0.003,\n",
"        #     'min_child_samples': 25,\n",
"        #     'subsample': 0.8,\n",
"        #     'colsample_bytree': 0.8,\n",
"        #     'reg_alpha': 0.25,\n",
"        #     'reg_lambda': 1.8,\n",
"        #     'num_leaves': 45,\n",
"        #     'feature_fraction': 0.75,\n",
"        #     'bagging_fraction': 0.75,\n",
"        #     'bagging_freq': 6\n",
"        # },\n",
"        # 'United': {\n",
"        #     'n_estimators': 350,\n",
"        #     'max_depth': 7,\n",
"        #     'learning_rate': 0.004,\n",
"        #     'min_child_samples': 20,\n",
"        #     'subsample': 0.8,\n",
"        #     'colsample_bytree': 0.8,\n",
"        #     'reg_alpha': 0.2,\n",
"        #     'reg_lambda': 1.5,\n",
"        #     'num_leaves': 38,\n",
"        #     'feature_fraction': 0.8,\n",
"        #     'bagging_fraction': 0.8,\n",
"        #     'bagging_freq': 5\n",
"        # }\n",
"    }\n",
"\n",
"    for event in df['Event'].unique():\n",
"        event_data = data[data['Event'] == event].copy()\n",
"        config = track_configs.get(event.split()[0], track_configs['default'])\n",
"\n",
"        X = event_data[base_features]\n",
"        y = event_data['LapTime_seconds']\n",
"\n",
"        mask = ~y.isna()\n",
"        X = X[mask]\n",
"        y = y[mask]\n",
"\n",
"        X_train, X_test, y_train, y_test = train_test_split(\n",
"            X, y, test_size=0.2, random_state=42\n",
"        )\n",
"\n",
"        preprocessor = Pipeline([\n",
"            ('imputer', SimpleImputer(strategy='median')),\n",
"            ('scaler', StandardScaler())\n",
"        ])\n",
"\n",
"        X_train_processed = preprocessor.fit_transform(X_train)\n",
"        X_test_processed = preprocessor.transform(X_test)\n",
"\n",
"        models = {\n",
"            'Random Forest': RandomForestRegressor(\n",
"                n_estimators=config['n_estimators'],\n",
"                max_depth=config['max_depth'],\n",
"                min_samples_leaf=config['min_child_samples'],\n",
"                max_features='sqrt',\n",
"                random_state=42\n",
"            ),\n",
"            'XGBoost': XGBRegressor(\n",
"                n_estimators=config['n_estimators'],\n",
"                max_depth=config['max_depth'],\n",
"                learning_rate=config['learning_rate'],\n",
"                min_child_weight=config['min_child_samples'],\n",
"                subsample=config['subsample'],\n",
"                colsample_bytree=config['colsample_bytree'],\n",
"                reg_alpha=config['reg_alpha'],\n",
"                reg_lambda=config['reg_lambda'],\n",
"                random_state=42\n",
"            ),\n",
"            'LightGBM': LGBMRegressor(\n",
"                n_estimators=config['n_estimators'],\n",
"                max_depth=config['max_depth'],\n",
"                learning_rate=config['learning_rate'],\n",
"                min_child_samples=config['min_child_samples'],\n",
"                subsample=config.get('bagging_fraction', 0.8),\n",
"                colsample_bytree=config.get('feature_fraction', 0.8),\n",
"                num_leaves=config.get('num_leaves', 31),\n",
"                bagging_freq=config.get('bagging_freq', 5),\n",
"                reg_alpha=config['reg_alpha'],\n",
"                reg_lambda=config['reg_lambda'],\n",
"                random_state=42,\n",
"                verbose=-1,\n",
"                min_data_in_leaf=1\n",
"            ),\n",
"            'Gradient Boosting': GradientBoostingRegressor(\n",
"                n_estimators=config['n_estimators'],\n",
"                max_depth=config['max_depth'],\n",
"                learning_rate=config['learning_rate'],\n",
"                min_samples_leaf=config['min_child_samples'],\n",
"                subsample=config['subsample'],\n",
"                max_features=config['colsample_bytree'],\n",
"                random_state=42\n",
"            )\n",
"        }\n",
"\n",
"        track_results[event] = {}\n",
"\n",
"        for name, model in models.items():\n",
"            model.fit(X_train_processed, y_train)\n",
"            y_pred = model.predict(X_test_processed)\n",
"\n",
"            mse = mean_squared_error(y_test, y_pred)\n",
"            rmse = np.sqrt(mse)\n",
"            r2 = r2_score(y_test, y_pred)\n",
"            mae = mean_absolute_error(y_test, y_pred)\n",
"\n",
"            track_results[event][name] = {\n",
"                'rmse': rmse,\n",
"                'r2': r2,\n",
"                'mae': mae\n",
"            }\n",
"\n",
"    return track_results" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [
"def plot_model_performance(track_results):\n",
"    \"\"\"\n",
"    Plot performance metrics for all models across different tracks.\n",
"    \"\"\"\n",
"    comparison_data = []\n",
"\n",
"    # Prepare data for plotting\n",
"    for track, models in track_results.items():\n",
"        for model_name, metrics in models.items():\n",
"            comparison_data.append({\n",
"                'Track': track.replace(' Grand Prix', ''),\n",
"                'Model': model_name,\n",
"                'RMSE': metrics['rmse'],\n",
"                'R²': metrics['r2'],\n",
"                'MAE': metrics['mae']\n",
"            })\n",
"\n",
"    comparison_df = pd.DataFrame(comparison_data)\n",
"\n",
"    fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(15, 18))\n",
"\n",
"    sns.barplot(data=comparison_df, x='Track', y='RMSE', hue='Model', ax=ax1)\n",
"    ax1.set_title('Root Mean Square Error by Track and Model')\n",
"    ax1.set_xticklabels(ax1.get_xticklabels(), rotation=45)\n",
"\n",
"    sns.barplot(data=comparison_df, x='Track', y='R²', hue='Model', ax=ax2)\n",
"    ax2.set_title('R² Score by Track and Model')\n",
"    ax2.set_xticklabels(ax2.get_xticklabels(), rotation=45)\n",
"\n",
"    sns.barplot(data=comparison_df, x='Track', y='MAE', hue='Model', ax=ax3)\n",
"    ax3.set_title('Mean Absolute Error by Track and Model')\n",
"    ax3.set_xticklabels(ax3.get_xticklabels(), rotation=45)\n",
"\n",
"    plt.tight_layout()\n",
"    plt.show()\n",
"\n",
"    print(\"\\nAverage Metrics Across All Tracks:\")\n",
"    mean_metrics = comparison_df.groupby('Model').agg({\n",
"        'RMSE': 'mean',\n",
"        'R²': 'mean',\n",
"        'MAE': 'mean'\n",
"    })\n",
"    print(mean_metrics.round(3))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [
"\n",
"Average Metrics Across All Tracks:\n",
" RMSE R² MAE\n",
"Model \n",
"Gradient Boosting 4.259 0.726 2.083\n",
"LightGBM 3.663 0.806 1.839\n",
"Random Forest 4.915 0.644 2.396\n",
"XGBoost 4.333 0.717 2.122\n" ] } ], "source": [
"# Execute modeling pipeline\n",
"track_results = prepare_modeling_data(merged_data)\n",
"\n",
"# Visualize results\n",
"plot_model_performance(track_results)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [
"## Key Findings\n",
"\n",
"1. **Track-Specific Performance**:\n",
" - Best performance achieved on Belgian GP with Random Forest (R² = 0.775)\n",
" - Most challenging predictions for British GP (best R² = 0.047)\n",
" - Weather conditions appear to have the strongest influence at the Belgian GP\n",
"\n",
"2. **Model Comparison**:\n",
" - LightGBM has the best average metrics across tracks (RMSE 3.663, R² 0.806, MAE 1.839)\n",
" - Random Forest has the weakest average metrics (RMSE 4.915, R² 0.644, MAE 2.396)\n",
" - XGBoost shows high variance in performance\n",
" - Gradient Boosting provides the most stable results\n",
"\n",
"3. **Important Features**:\n",
" - Track temperature and air temperature interaction\n",
" - Track evolution throughout the race\n",
" - Weather complexity score\n",
" - Tire degradation metrics\n",
"\n",
"## Next Steps for Improvement\n",
"\n",
"1. **Feature Engineering**:\n",
" - Create more sophisticated tire degradation models\n",
" - Incorporate historical track performance data\n",
" - Develop track-specific feature sets\n",
"\n",
"2. **Model Optimization**:\n",
" - Implement GridSearchCV for hyperparameter tuning\n",
" - Test neural network approaches\n",
" - Create track-specific model ensembles\n",
"\n",
"3. 
**Analysis Refinement**:\n", " - Investigate poor performance on British GP\n", " - Analyze weather condition thresholds\n", " - Study interaction effects between features" ] } ], "metadata": { "kernelspec": { "display_name": "csci349", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 2 }