mirror of
https://github.com/soconnor0919/f1-race-prediction.git
synced 2026-02-05 08:16:36 -05:00
1577 lines
196 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Formula One Project: Data Preparation and EDA\n",
|
||
"\n",
|
"DUE: November 22nd, 2024 (Fri)  \n",
"Name(s): Sean O'Connor, Connor Coles  \n",
"Class: CSCI 349 - Intro to Data Mining  \n",
"Semester: Fall 2024  \n",
"Instructor: Brian King"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Assignment Description\n",
|
||
"Create your first notebook file, DataPrep_EDA.ipynb. Use both markdown and code cells to convey the following:\n",
|
||
"- What problem are you working on? Summarize in a single cell.\n",
|
||
"- What data are you using to understand the problem? Describe the data in a very general sense. Where did it come from? You should understand what every observation in the data represents, and what each variable represents.\n",
|
||
"- Remember that the key to achieving good machine learning outcomes is understanding how each real-world entity in your data will be represented as a fixed length vector of attributes in your dataset! Preprocessing your data will be a big part of this challenge. If you do not expect to spend quality time cleaning and prepping your data, you will not get good results. Once you have established how each data object is represented in a form ready for a data mining algorithm, and the data are clean, you will have a substantial part of your battle toward modeling solved.\n",
|
||
"- Strive to generate good summary statistics, show what the data looks like, and include good EDA and visualizations with boxplots, barcharts, density plots for key variables, or whatever other plots you want that are specific to your data and problem to help the reader understand basic distributions of important variables. Visualizations can help you convey general info about your data and are extremely helpful.\n",
|
||
"- In your final cells, discuss the modeling methods you expect to use. Start by clearly explaining if this is a classification, regression, clustering, or association rule mining problem? Justify. You have much of the framework to apply most algorithms, even those beyond what we covered in class. Feel free to explore different methods if you have good justification for doing so. If there are any papers of significance that have been published with these data, then discuss the ones most interesting/relevant to the team.\n",
|
||
"- Finally, what is your overarching aim with this project? What are you hoping to learn? Or, what hypothesis are you using the data to confirm or disprove? What challenges do you foresee on this project? Discuss your concerns. How will you get your work done? Give a reasonable list of milestones to reach to arrive at the final deadline for the project."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Problem Summary\n",
|
||
"We are conducting a data mining project focused on analyzing driver performance in Formula One. Our goal is to correlate driver performance with track and weather conditions, and to predict future race results using these correlations. We will apply various data mining techniques learned throughout the course to extract meaningful insights from the dataset."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T02:14:34.811557Z",
|
||
"start_time": "2024-11-20T02:14:34.804489Z"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Importing Libraries\n",
|
||
"import pandas as pd\n",
|
||
"import numpy as np\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"import seaborn as sns\n",
|
||
"import os\n",
|
||
"\n",
|
||
"from fastf1.ergast.structure import FastestLap"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T02:14:38.532179Z",
|
||
"start_time": "2024-11-20T02:14:36.799495Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"req WARNING \tDEFAULT CACHE ENABLED! (188.1 MB) /Users/soconnor/Library/Caches/fastf1\n",
|
||
"core INFO \tLoading data for Italian Grand Prix - Qualifying [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['16', '44', '77', '5', '3', '27', '55', '23', '18', '7', '99', '20', '26', '4', '10', '8', '11', '63', '88', '33']\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"image/png": "",
|
||
"text/plain": [
|
||
"<Figure size 640x480 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"# FastF1 Example\n",
|
||
"import fastf1\n",
|
||
"import fastf1.plotting\n",
|
||
"\n",
|
||
"fastf1.plotting.setup_mpl(misc_mpl_mods=False, color_scheme=None)\n",
|
||
"\n",
|
||
"session = fastf1.get_session(2019, 'Monza', 'Q')\n",
|
||
"session.load()\n",
|
||
"\n",
|
||
"fast_leclerc = session.laps.pick_drivers('LEC').pick_fastest()\n",
|
||
"lec_car_data = fast_leclerc.get_car_data()\n",
|
||
"t = lec_car_data['Time']\n",
|
||
"vCar = lec_car_data['Speed']\n",
|
||
"\n",
|
||
"# The rest is just plotting\n",
|
||
"fig, ax = plt.subplots()\n",
|
||
"ax.plot(t, vCar, label='Fast')\n",
|
||
"ax.set_xlabel('Time')\n",
|
"ax.set_ylabel('Speed [km/h]')\n",
|
"ax.set_title('Leclerc fastest lap, 2019 Monza qualifying')\n",
|
||
"ax.legend()\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T03:31:25.964278Z",
|
||
"start_time": "2024-11-20T03:29:39.724591Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"events WARNING \tCorrecting user input 'Bahrain' to 'Bahrain Grand Prix'\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Qualifying [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['44', '77', '33', '23', '11', '3', '31', '10', '4', '26', '5', '16', '18', '63', '55', '99', '7', '20', '8', '6']\n",
|
||
"events WARNING \tCorrecting user input 'Bahrain' to 'Bahrain Grand Prix'\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Race [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"Request for URL https://ergast.com/api/f1/2020/15/results.json failed; using cached response\n",
|
||
"Traceback (most recent call last):\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/connectionpool.py\", line 536, in _make_request\n",
|
||
" response = conn.getresponse()\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/connection.py\", line 507, in getresponse\n",
|
||
" httplib_response = super().getresponse()\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/http/client.py\", line 1375, in getresponse\n",
|
||
" response.begin()\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/http/client.py\", line 318, in begin\n",
|
||
" version, status, reason = self._read_status()\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/http/client.py\", line 279, in _read_status\n",
|
||
" line = str(self.fp.readline(_MAXLINE + 1), \"iso-8859-1\")\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/socket.py\", line 705, in readinto\n",
|
||
" return self._sock.recv_into(b)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/ssl.py\", line 1307, in recv_into\n",
|
||
" return self.read(nbytes, buffer)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/ssl.py\", line 1163, in read\n",
|
||
" return self._sslobj.read(len, buffer)\n",
|
||
"TimeoutError: The read operation timed out\n",
|
||
"\n",
|
||
"The above exception was the direct cause of the following exception:\n",
|
||
"\n",
|
||
"Traceback (most recent call last):\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/requests/adapters.py\", line 667, in send\n",
|
||
" resp = conn.urlopen(\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/connectionpool.py\", line 843, in urlopen\n",
|
||
" retries = retries.increment(\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/util/retry.py\", line 474, in increment\n",
|
||
" raise reraise(type(error), error, _stacktrace)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/util/util.py\", line 39, in reraise\n",
|
||
" raise value\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/connectionpool.py\", line 789, in urlopen\n",
|
||
" response = self._make_request(\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/connectionpool.py\", line 538, in _make_request\n",
|
||
" self._raise_timeout(err=e, url=url, timeout_value=read_timeout)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/urllib3/connectionpool.py\", line 369, in _raise_timeout\n",
|
||
" raise ReadTimeoutError(\n",
|
||
"urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='ergast.com', port=443): Read timed out. (read timeout=5.0)\n",
|
||
"\n",
|
||
"During handling of the above exception, another exception occurred:\n",
|
||
"\n",
|
||
"Traceback (most recent call last):\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/requests_cache/session.py\", line 286, in _resend\n",
|
||
" response = self._send_and_cache(request, actions, cached_response, **kwargs)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/requests_cache/session.py\", line 254, in _send_and_cache\n",
|
||
" response = super().send(request, **kwargs)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/fastf1/req.py\", line 136, in send\n",
|
||
" return super().send(request, **kwargs)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/requests/sessions.py\", line 703, in send\n",
|
||
" r = adapter.send(request, **kwargs)\n",
|
||
" File \"/opt/homebrew/Caskroom/miniconda/base/envs/csci349/lib/python3.10/site-packages/requests/adapters.py\", line 713, in send\n",
|
||
" raise ReadTimeout(e, request=request)\n",
|
||
"requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='ergast.com', port=443): Read timed out. (read timeout=5.0)\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for lap_count\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['44', '33', '23', '4', '55', '10', '3', '77', '31', '16', '26', '63', '5', '6', '7', '99', '20', '11', '18', '8']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Qualifying [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['33', '44', '77', '16', '10', '3', '4', '55', '14', '18', '11', '99', '22', '7', '63', '31', '6', '5', '47', '9']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Race [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for lap_count\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['44', '33', '77', '4', '11', '16', '3', '55', '22', '18', '7', '99', '31', '63', '5', '47', '10', '6', '14', '9']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Qualifying [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['16', '1', '55', '11', '44', '77', '20', '14', '63', '10', '31', '47', '4', '23', '24', '22', '27', '3', '18', '6']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Race [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for lap_count\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['16', '55', '44', '63', '20', '77', '31', '22', '14', '24', '47', '18', '23', '3', '4', '6', '27', '11', '1', '10']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Qualifying [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['1', '11', '16', '55', '14', '63', '44', '18', '31', '27', '4', '77', '24', '22', '23', '2', '20', '81', '21', '10']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Race [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for lap_count\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['1', '11', '14', '55', '44', '18', '63', '77', '10', '23', '22', '2', '20', '21', '27', '24', '4', '31', '16', '81']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Qualifying [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['1', '16', '63', '55', '11', '14', '4', '81', '44', '27', '22', '18', '23', '3', '20', '77', '24', '2', '31', '10']\n",
|
||
"core INFO \tLoading data for Bahrain Grand Prix - Race [v3.4.4]\n",
|
||
"req INFO \tUsing cached data for session_info\n",
|
||
"req INFO \tUsing cached data for driver_info\n",
|
||
"req INFO \tUsing cached data for session_status_data\n",
|
||
"req INFO \tUsing cached data for lap_count\n",
|
||
"req INFO \tUsing cached data for track_status_data\n",
|
||
"req INFO \tUsing cached data for _extended_timing_data\n",
|
||
"req INFO \tUsing cached data for timing_app_data\n",
|
||
"core INFO \tProcessing timing data...\n",
|
||
"req INFO \tUsing cached data for car_data\n",
|
||
"req INFO \tUsing cached data for position_data\n",
|
||
"req INFO \tUsing cached data for weather_data\n",
|
||
"req INFO \tUsing cached data for race_control_messages\n",
|
||
"core INFO \tFinished loading data for 20 drivers: ['1', '11', '55', '16', '63', '4', '44', '81', '14', '18', '24', '20', '3', '22', '23', '27', '31', '10', '77', '2']\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Weather Data:\n",
|
||
" Time AirTemp Humidity Pressure Rainfall TrackTemp \\\n",
|
||
"0 0 days 00:00:33.157000 26.9 52.6 1015.9 False 28.7 \n",
|
||
"1 0 days 00:01:33.168000 26.9 52.7 1016.0 False 28.6 \n",
|
||
"2 0 days 00:02:33.172000 26.8 52.8 1015.9 False 28.5 \n",
|
||
"3 0 days 00:03:33.168000 26.8 53.0 1015.9 False 28.5 \n",
|
||
"4 0 days 00:04:33.155000 26.7 53.2 1016.0 False 28.5 \n",
|
||
"\n",
|
||
" WindDirection WindSpeed Year Session \n",
|
||
"0 305 0.6 2020 Q \n",
|
||
"1 40 0.8 2020 Q \n",
|
||
"2 341 0.8 2020 Q \n",
|
||
"3 295 0.4 2020 Q \n",
|
||
"4 347 0.5 2020 Q \n",
|
||
"Lap Data:\n",
|
||
" Time Driver DriverNumber LapTime \\\n",
|
||
"0 0 days 00:23:28.426000 HAM 44 NaT \n",
|
||
"1 0 days 00:24:56.769000 HAM 44 0 days 00:01:28.343000 \n",
|
||
"2 0 days 00:26:46.183000 HAM 44 0 days 00:01:49.414000 \n",
|
||
"3 0 days 00:32:41.745000 HAM 44 NaT \n",
|
||
"4 0 days 00:34:21.973000 HAM 44 0 days 00:01:40.228000 \n",
|
||
"\n",
|
||
" LapNumber Stint PitOutTime PitInTime \\\n",
|
||
"0 1.0 1.0 0 days 00:21:22.161000 NaT \n",
|
||
"1 2.0 1.0 NaT NaT \n",
|
||
"2 3.0 1.0 NaT 0 days 00:26:44.401000 \n",
|
||
"3 4.0 2.0 0 days 00:30:17.211000 NaT \n",
|
||
"4 5.0 2.0 NaT 0 days 00:34:20.228000 \n",
|
||
"\n",
|
||
" Sector1Time Sector2Time ... LapStartTime \\\n",
|
||
"0 NaT 0 days 00:00:57.104000 ... 0 days 00:21:22.161000 \n",
|
||
"1 0 days 00:00:28.083000 0 days 00:00:38.020000 ... 0 days 00:23:28.426000 \n",
|
||
"2 0 days 00:00:34.081000 0 days 00:00:45.383000 ... 0 days 00:24:56.769000 \n",
|
||
"3 NaT 0 days 00:01:06.133000 ... 0 days 00:26:46.183000 \n",
|
||
"4 0 days 00:00:28.239000 0 days 00:00:45.630000 ... 0 days 00:32:41.745000 \n",
|
||
"\n",
|
||
" LapStartDate TrackStatus Position Deleted DeletedReason \\\n",
|
||
"0 2020-11-28 14:06:22.193 1 NaN False \n",
|
||
"1 2020-11-28 14:08:28.458 1 NaN False \n",
|
||
"2 2020-11-28 14:09:56.801 1 NaN False \n",
|
||
"3 2020-11-28 14:11:46.215 1 NaN False \n",
|
||
"4 2020-11-28 14:17:41.777 1 NaN False \n",
|
||
"\n",
|
||
" FastF1Generated IsAccurate Year Session \n",
|
||
"0 False False 2020 Q \n",
|
||
"1 False True 2020 Q \n",
|
||
"2 False False 2020 Q \n",
|
||
"3 False False 2020 Q \n",
|
||
"4 False False 2020 Q \n",
|
||
"\n",
|
||
"[5 rows x 33 columns]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
"# Define the cache directory and create it if it does not exist\n",
"cache_dir = '../data/cache'\n",
"os.makedirs(cache_dir, exist_ok=True)\n",
"\n",
"fastf1.Cache.enable_cache(cache_dir)\n",
"\n",
"# Years and sessions of interest\n",
"years = [2020, 2021, 2022, 2023, 2024]\n",
"sessions = ['Q', 'Race']  # Qualifying and Race sessions\n",
"event_name = 'Bahrain'  # Example event name\n",
"\n",
"# Data holders\n",
"weather_data_list = []\n",
"lap_data_list = []\n",
"\n",
"# Loop through years and sessions\n",
"for year in years:\n",
"    for session_name in sessions:\n",
"        try:\n",
"            # Load the session\n",
"            session = fastf1.get_session(year, event_name, session_name)\n",
"            session.load()\n",
"\n",
"            # Process weather data\n",
"            weather_data = session.weather_data\n",
"            weather_df = pd.DataFrame(weather_data)\n",
"            weather_df['Year'] = year\n",
"            weather_df['Session'] = session_name\n",
"            weather_data_list.append(weather_df)\n",
"\n",
"            # Process lap data\n",
"            lap_data = session.laps\n",
"            lap_df = pd.DataFrame(lap_data)\n",
"            lap_df['Year'] = year\n",
"            lap_df['Session'] = session_name\n",
"            lap_data_list.append(lap_df)\n",
"\n",
"        except Exception as e:\n",
"            print(f\"Error with {event_name} {session_name} ({year}): {e}\")\n",
"\n",
"# Combine weather and lap data into separate DataFrames\n",
"if weather_data_list:\n",
"    weather_data_combined = pd.concat(weather_data_list, ignore_index=True)\n",
"    print(\"Weather Data:\")\n",
"    print(weather_data_combined.head())\n",
"\n",
"if lap_data_list:\n",
"    lap_data_combined = pd.concat(lap_data_list, ignore_index=True)\n",
"    print(\"Lap Data:\")\n",
"    print(lap_data_combined.head())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T03:39:01.456623Z",
|
||
"start_time": "2024-11-20T03:39:01.428009Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Time</th>\n",
|
||
" <th>AirTemp</th>\n",
|
||
" <th>Humidity</th>\n",
|
||
" <th>Pressure</th>\n",
|
||
" <th>Rainfall</th>\n",
|
||
" <th>TrackTemp</th>\n",
|
||
" <th>WindDirection</th>\n",
|
||
" <th>WindSpeed</th>\n",
|
||
" <th>Year</th>\n",
|
||
" <th>Session</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>0 days 00:00:33.157000</td>\n",
|
||
" <td>26.9</td>\n",
|
||
" <td>52.6</td>\n",
|
||
" <td>1015.9</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>28.7</td>\n",
|
||
" <td>305</td>\n",
|
||
" <td>0.6</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>0 days 00:01:33.168000</td>\n",
|
||
" <td>26.9</td>\n",
|
||
" <td>52.7</td>\n",
|
||
" <td>1016.0</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>28.6</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>0.8</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>0 days 00:02:33.172000</td>\n",
|
||
" <td>26.8</td>\n",
|
||
" <td>52.8</td>\n",
|
||
" <td>1015.9</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>28.5</td>\n",
|
||
" <td>341</td>\n",
|
||
" <td>0.8</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>0 days 00:03:33.168000</td>\n",
|
||
" <td>26.8</td>\n",
|
||
" <td>53.0</td>\n",
|
||
" <td>1015.9</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>28.5</td>\n",
|
||
" <td>295</td>\n",
|
||
" <td>0.4</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>0 days 00:04:33.155000</td>\n",
|
||
" <td>26.7</td>\n",
|
||
" <td>53.2</td>\n",
|
||
" <td>1016.0</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>28.5</td>\n",
|
||
" <td>347</td>\n",
|
||
" <td>0.5</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Time AirTemp Humidity Pressure Rainfall TrackTemp \\\n",
|
||
"0 0 days 00:00:33.157000 26.9 52.6 1015.9 False 28.7 \n",
|
||
"1 0 days 00:01:33.168000 26.9 52.7 1016.0 False 28.6 \n",
|
||
"2 0 days 00:02:33.172000 26.8 52.8 1015.9 False 28.5 \n",
|
||
"3 0 days 00:03:33.168000 26.8 53.0 1015.9 False 28.5 \n",
|
||
"4 0 days 00:04:33.155000 26.7 53.2 1016.0 False 28.5 \n",
|
||
"\n",
|
||
" WindDirection WindSpeed Year Session \n",
|
||
"0 305 0.6 2020 Q \n",
|
||
"1 40 0.8 2020 Q \n",
|
||
"2 341 0.8 2020 Q \n",
|
||
"3 295 0.4 2020 Q \n",
|
||
"4 347 0.5 2020 Q "
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Display data\n",
|
||
"weather_data_combined.head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T04:00:16.003135Z",
|
||
"start_time": "2024-11-20T04:00:15.970644Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Time</th>\n",
|
||
" <th>Driver</th>\n",
|
||
" <th>DriverNumber</th>\n",
|
||
" <th>LapTime</th>\n",
|
||
" <th>LapNumber</th>\n",
|
||
" <th>Stint</th>\n",
|
||
" <th>PitOutTime</th>\n",
|
||
" <th>PitInTime</th>\n",
|
||
" <th>Sector1Time</th>\n",
|
||
" <th>Sector2Time</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>LapStartTime</th>\n",
|
||
" <th>LapStartDate</th>\n",
|
||
" <th>TrackStatus</th>\n",
|
||
" <th>Position</th>\n",
|
||
" <th>Deleted</th>\n",
|
||
" <th>DeletedReason</th>\n",
|
||
" <th>FastF1Generated</th>\n",
|
||
" <th>IsAccurate</th>\n",
|
||
" <th>Year</th>\n",
|
||
" <th>Session</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>0 days 00:23:28.426000</td>\n",
|
||
" <td>HAM</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>0 days 00:21:22.161000</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>0 days 00:00:57.104000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:21:22.161000</td>\n",
|
||
" <td>2020-11-28 14:06:22.193</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td></td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>0 days 00:24:56.769000</td>\n",
|
||
" <td>HAM</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>0 days 00:01:28.343000</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>0 days 00:00:28.083000</td>\n",
|
||
" <td>0 days 00:00:38.020000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:23:28.426000</td>\n",
|
||
" <td>2020-11-28 14:08:28.458</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td></td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>0 days 00:26:46.183000</td>\n",
|
||
" <td>HAM</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>0 days 00:01:49.414000</td>\n",
|
||
" <td>3.0</td>\n",
|
||
" <td>1.0</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>0 days 00:26:44.401000</td>\n",
|
||
" <td>0 days 00:00:34.081000</td>\n",
|
||
" <td>0 days 00:00:45.383000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:24:56.769000</td>\n",
|
||
" <td>2020-11-28 14:09:56.801</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td></td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>0 days 00:32:41.745000</td>\n",
|
||
" <td>HAM</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>4.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>0 days 00:30:17.211000</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>0 days 00:01:06.133000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:26:46.183000</td>\n",
|
||
" <td>2020-11-28 14:11:46.215</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td></td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>0 days 00:34:21.973000</td>\n",
|
||
" <td>HAM</td>\n",
|
||
" <td>44</td>\n",
|
||
" <td>0 days 00:01:40.228000</td>\n",
|
||
" <td>5.0</td>\n",
|
||
" <td>2.0</td>\n",
|
||
" <td>NaT</td>\n",
|
||
" <td>0 days 00:34:20.228000</td>\n",
|
||
" <td>0 days 00:00:28.239000</td>\n",
|
||
" <td>0 days 00:00:45.630000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:32:41.745000</td>\n",
|
||
" <td>2020-11-28 14:17:41.777</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td></td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>2020</td>\n",
|
||
" <td>Q</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>5 rows × 33 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Time Driver DriverNumber LapTime \\\n",
|
||
"0 0 days 00:23:28.426000 HAM 44 NaT \n",
|
||
"1 0 days 00:24:56.769000 HAM 44 0 days 00:01:28.343000 \n",
|
||
"2 0 days 00:26:46.183000 HAM 44 0 days 00:01:49.414000 \n",
|
||
"3 0 days 00:32:41.745000 HAM 44 NaT \n",
|
||
"4 0 days 00:34:21.973000 HAM 44 0 days 00:01:40.228000 \n",
|
||
"\n",
|
||
" LapNumber Stint PitOutTime PitInTime \\\n",
|
||
"0 1.0 1.0 0 days 00:21:22.161000 NaT \n",
|
||
"1 2.0 1.0 NaT NaT \n",
|
||
"2 3.0 1.0 NaT 0 days 00:26:44.401000 \n",
|
||
"3 4.0 2.0 0 days 00:30:17.211000 NaT \n",
|
||
"4 5.0 2.0 NaT 0 days 00:34:20.228000 \n",
|
||
"\n",
|
||
" Sector1Time Sector2Time ... LapStartTime \\\n",
|
||
"0 NaT 0 days 00:00:57.104000 ... 0 days 00:21:22.161000 \n",
|
||
"1 0 days 00:00:28.083000 0 days 00:00:38.020000 ... 0 days 00:23:28.426000 \n",
|
||
"2 0 days 00:00:34.081000 0 days 00:00:45.383000 ... 0 days 00:24:56.769000 \n",
|
||
"3 NaT 0 days 00:01:06.133000 ... 0 days 00:26:46.183000 \n",
|
||
"4 0 days 00:00:28.239000 0 days 00:00:45.630000 ... 0 days 00:32:41.745000 \n",
|
||
"\n",
|
||
" LapStartDate TrackStatus Position Deleted DeletedReason \\\n",
|
||
"0 2020-11-28 14:06:22.193 1 NaN False \n",
|
||
"1 2020-11-28 14:08:28.458 1 NaN False \n",
|
||
"2 2020-11-28 14:09:56.801 1 NaN False \n",
|
||
"3 2020-11-28 14:11:46.215 1 NaN False \n",
|
||
"4 2020-11-28 14:17:41.777 1 NaN False \n",
|
||
"\n",
|
||
" FastF1Generated IsAccurate Year Session \n",
|
||
"0 False False 2020 Q \n",
|
||
"1 False True 2020 Q \n",
|
||
"2 False False 2020 Q \n",
|
||
"3 False False 2020 Q \n",
|
||
"4 False False 2020 Q \n",
|
||
"\n",
|
||
"[5 rows x 33 columns]"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"lap_data_combined.head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T03:42:50.147348Z",
|
||
"start_time": "2024-11-20T03:42:50.070096Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 1244 entries, 0 to 1243\n",
|
||
"Data columns (total 10 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 Time 1244 non-null timedelta64[ns]\n",
|
||
" 1 AirTemp 1244 non-null float64 \n",
|
||
" 2 Humidity 1244 non-null float64 \n",
|
||
" 3 Pressure 1244 non-null float64 \n",
|
||
" 4 Rainfall 1244 non-null bool \n",
|
||
" 5 TrackTemp 1244 non-null float64 \n",
|
||
" 6 WindDirection 1244 non-null int64 \n",
|
||
" 7 WindSpeed 1244 non-null float64 \n",
|
||
" 8 Year 1244 non-null int64 \n",
|
||
" 9 Session 1244 non-null object \n",
|
||
"dtypes: bool(1), float64(5), int64(2), object(1), timedelta64[ns](1)\n",
|
||
"memory usage: 88.8+ KB\n",
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 6628 entries, 0 to 6627\n",
|
||
"Data columns (total 33 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 Time 6628 non-null timedelta64[ns]\n",
|
||
" 1 Driver 6628 non-null object \n",
|
||
" 2 DriverNumber 6628 non-null object \n",
|
||
" 3 LapTime 6058 non-null timedelta64[ns]\n",
|
||
" 4 LapNumber 6628 non-null float64 \n",
|
||
" 5 Stint 6628 non-null float64 \n",
|
||
" 6 PitOutTime 689 non-null timedelta64[ns]\n",
|
||
" 7 PitInTime 694 non-null timedelta64[ns]\n",
|
||
" 8 Sector1Time 6056 non-null timedelta64[ns]\n",
|
||
" 9 Sector2Time 6599 non-null timedelta64[ns]\n",
|
||
" 10 Sector3Time 6566 non-null timedelta64[ns]\n",
|
||
" 11 Sector1SessionTime 6048 non-null timedelta64[ns]\n",
|
||
" 12 Sector2SessionTime 6599 non-null timedelta64[ns]\n",
|
||
" 13 Sector3SessionTime 6566 non-null timedelta64[ns]\n",
|
||
" 14 SpeedI1 5394 non-null float64 \n",
|
||
" 15 SpeedI2 6601 non-null float64 \n",
|
||
" 16 SpeedFL 5926 non-null float64 \n",
|
||
" 17 SpeedST 5944 non-null float64 \n",
|
||
" 18 IsPersonalBest 6622 non-null object \n",
|
||
" 19 Compound 6628 non-null object \n",
|
||
" 20 TyreLife 6628 non-null float64 \n",
|
||
" 21 FreshTyre 6628 non-null bool \n",
|
||
" 22 Team 6628 non-null object \n",
|
||
" 23 LapStartTime 6628 non-null timedelta64[ns]\n",
|
||
" 24 LapStartDate 6622 non-null datetime64[ns] \n",
|
||
" 25 TrackStatus 6628 non-null object \n",
|
||
" 26 Position 5349 non-null float64 \n",
|
||
" 27 Deleted 6628 non-null bool \n",
|
||
" 28 DeletedReason 6622 non-null object \n",
|
||
" 29 FastF1Generated 6628 non-null bool \n",
|
||
" 30 IsAccurate 6628 non-null bool \n",
|
||
" 31 Year 6628 non-null int64 \n",
|
||
" 32 Session 6628 non-null object \n",
|
||
"dtypes: bool(4), datetime64[ns](1), float64(8), int64(1), object(8), timedelta64[ns](11)\n",
|
||
"memory usage: 1.5+ MB\n",
|
||
"Time 1244\n",
|
||
"AirTemp 107\n",
|
||
"Humidity 240\n",
|
||
"Pressure 62\n",
|
||
"Rainfall 1\n",
|
||
"TrackTemp 144\n",
|
||
"WindDirection 301\n",
|
||
"WindSpeed 31\n",
|
||
"Year 5\n",
|
||
"Session 2\n",
|
||
"dtype: int64\n",
|
||
"Time 6622\n",
|
||
"Driver 29\n",
|
||
"DriverNumber 30\n",
|
||
"LapTime 4811\n",
|
||
"LapNumber 57\n",
|
||
"Stint 7\n",
|
||
"PitOutTime 689\n",
|
||
"PitInTime 694\n",
|
||
"Sector1Time 3257\n",
|
||
"Sector2Time 4280\n",
|
||
"Sector3Time 3395\n",
|
||
"Sector1SessionTime 6043\n",
|
||
"Sector2SessionTime 6598\n",
|
||
"Sector3SessionTime 6558\n",
|
||
"SpeedI1 175\n",
|
||
"SpeedI2 204\n",
|
||
"SpeedFL 171\n",
|
||
"SpeedST 288\n",
|
||
"IsPersonalBest 2\n",
|
||
"Compound 3\n",
|
||
"TyreLife 37\n",
|
||
"FreshTyre 2\n",
|
||
"Team 15\n",
|
||
"LapStartTime 6511\n",
|
||
"LapStartDate 6509\n",
|
||
"TrackStatus 18\n",
|
||
"Position 20\n",
|
||
"Deleted 2\n",
|
||
"DeletedReason 38\n",
|
||
"FastF1Generated 2\n",
|
||
"IsAccurate 2\n",
|
||
"Year 5\n",
|
||
"Session 2\n",
|
||
"dtype: int64\n",
|
||
"Time 0\n",
|
||
"AirTemp 0\n",
|
||
"Humidity 0\n",
|
||
"Pressure 0\n",
|
||
"Rainfall 0\n",
|
||
"TrackTemp 0\n",
|
||
"WindDirection 0\n",
|
||
"WindSpeed 0\n",
|
||
"Year 0\n",
|
||
"Session 0\n",
|
||
"dtype: int64\n",
|
||
"Time 0\n",
|
||
"Driver 0\n",
|
||
"DriverNumber 0\n",
|
||
"LapTime 570\n",
|
||
"LapNumber 0\n",
|
||
"Stint 0\n",
|
||
"PitOutTime 5939\n",
|
||
"PitInTime 5934\n",
|
||
"Sector1Time 572\n",
|
||
"Sector2Time 29\n",
|
||
"Sector3Time 62\n",
|
||
"Sector1SessionTime 580\n",
|
||
"Sector2SessionTime 29\n",
|
||
"Sector3SessionTime 62\n",
|
||
"SpeedI1 1234\n",
|
||
"SpeedI2 27\n",
|
||
"SpeedFL 702\n",
|
||
"SpeedST 684\n",
|
||
"IsPersonalBest 6\n",
|
||
"Compound 0\n",
|
||
"TyreLife 0\n",
|
||
"FreshTyre 0\n",
|
||
"Team 0\n",
|
||
"LapStartTime 0\n",
|
||
"LapStartDate 6\n",
|
||
"TrackStatus 0\n",
|
||
"Position 1279\n",
|
||
"Deleted 0\n",
|
||
"DeletedReason 6\n",
|
||
"FastF1Generated 0\n",
|
||
"IsAccurate 0\n",
|
||
"Year 0\n",
|
||
"Session 0\n",
|
||
"dtype: int64\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# What does our data look like?\n",
"weather_data_combined.info()\n",
"lap_data_combined.info()\n",
"\n",
"# How many unique values does each column have?\n",
"print(weather_data_combined.nunique())\n",
"print(lap_data_combined.nunique())\n",
"\n",
"# How many missing values does each column have?\n",
"print(weather_data_combined.isnull().sum())\n",
"print(lap_data_combined.isnull().sum())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T03:38:18.767017Z",
|
||
"start_time": "2024-11-20T03:38:18.692890Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Time</th>\n",
|
||
" <th>AirTemp</th>\n",
|
||
" <th>Humidity</th>\n",
|
||
" <th>Pressure</th>\n",
|
||
" <th>TrackTemp</th>\n",
|
||
" <th>WindDirection</th>\n",
|
||
" <th>WindSpeed</th>\n",
|
||
" <th>Year</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>count</th>\n",
|
||
" <td>1244</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" <td>1244.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>0 days 01:10:54.298340032</td>\n",
|
||
" <td>23.295579</td>\n",
|
||
" <td>45.302331</td>\n",
|
||
" <td>1015.170096</td>\n",
|
||
" <td>27.604582</td>\n",
|
||
" <td>152.864148</td>\n",
|
||
" <td>0.776367</td>\n",
|
||
" <td>2021.926849</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>std</th>\n",
|
||
" <td>0 days 00:49:12.934930673</td>\n",
|
||
" <td>3.554556</td>\n",
|
||
" <td>16.844089</td>\n",
|
||
" <td>2.753137</td>\n",
|
||
" <td>3.205810</td>\n",
|
||
" <td>132.617082</td>\n",
|
||
" <td>0.498481</td>\n",
|
||
" <td>1.447483</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>0 days 00:00:14.093000</td>\n",
|
||
" <td>17.600000</td>\n",
|
||
" <td>15.000000</td>\n",
|
||
" <td>1008.400000</td>\n",
|
||
" <td>20.800000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" <td>2020.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>25%</th>\n",
|
||
" <td>0 days 00:31:15.284250</td>\n",
|
||
" <td>19.700000</td>\n",
|
||
" <td>33.000000</td>\n",
|
||
" <td>1013.700000</td>\n",
|
||
" <td>26.200000</td>\n",
|
||
" <td>15.000000</td>\n",
|
||
" <td>0.500000</td>\n",
|
||
" <td>2021.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>50%</th>\n",
|
||
" <td>0 days 01:02:24.641500</td>\n",
|
||
" <td>24.300000</td>\n",
|
||
" <td>49.000000</td>\n",
|
||
" <td>1016.100000</td>\n",
|
||
" <td>27.600000</td>\n",
|
||
" <td>161.000000</td>\n",
|
||
" <td>0.700000</td>\n",
|
||
" <td>2022.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>75%</th>\n",
|
||
" <td>0 days 01:43:50.269000</td>\n",
|
||
" <td>26.100000</td>\n",
|
||
" <td>56.125000</td>\n",
|
||
" <td>1017.000000</td>\n",
|
||
" <td>29.300000</td>\n",
|
||
" <td>294.000000</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>2023.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>0 days 03:35:39.278000</td>\n",
|
||
" <td>30.400000</td>\n",
|
||
" <td>75.500000</td>\n",
|
||
" <td>1019.300000</td>\n",
|
||
" <td>35.600000</td>\n",
|
||
" <td>359.000000</td>\n",
|
||
" <td>3.100000</td>\n",
|
||
" <td>2024.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Time AirTemp Humidity Pressure \\\n",
|
||
"count 1244 1244.000000 1244.000000 1244.000000 \n",
|
||
"mean 0 days 01:10:54.298340032 23.295579 45.302331 1015.170096 \n",
|
||
"std 0 days 00:49:12.934930673 3.554556 16.844089 2.753137 \n",
|
||
"min 0 days 00:00:14.093000 17.600000 15.000000 1008.400000 \n",
|
||
"25% 0 days 00:31:15.284250 19.700000 33.000000 1013.700000 \n",
|
||
"50% 0 days 01:02:24.641500 24.300000 49.000000 1016.100000 \n",
|
||
"75% 0 days 01:43:50.269000 26.100000 56.125000 1017.000000 \n",
|
||
"max 0 days 03:35:39.278000 30.400000 75.500000 1019.300000 \n",
|
||
"\n",
|
||
" TrackTemp WindDirection WindSpeed Year \n",
|
||
"count 1244.000000 1244.000000 1244.000000 1244.000000 \n",
|
||
"mean 27.604582 152.864148 0.776367 2021.926849 \n",
|
||
"std 3.205810 132.617082 0.498481 1.447483 \n",
|
||
"min 20.800000 0.000000 0.000000 2020.000000 \n",
|
||
"25% 26.200000 15.000000 0.500000 2021.000000 \n",
|
||
"50% 27.600000 161.000000 0.700000 2022.000000 \n",
|
||
"75% 29.300000 294.000000 1.000000 2023.000000 \n",
|
||
"max 35.600000 359.000000 3.100000 2024.000000 "
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Summary statistics for the numeric and timedelta weather columns\n",
"weather_data_combined.describe()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T03:39:18.917017Z",
|
||
"start_time": "2024-11-20T03:39:18.855681Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Time</th>\n",
|
||
" <th>LapTime</th>\n",
|
||
" <th>LapNumber</th>\n",
|
||
" <th>Stint</th>\n",
|
||
" <th>PitOutTime</th>\n",
|
||
" <th>PitInTime</th>\n",
|
||
" <th>Sector1Time</th>\n",
|
||
" <th>Sector2Time</th>\n",
|
||
" <th>Sector3Time</th>\n",
|
||
" <th>Sector1SessionTime</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>Sector3SessionTime</th>\n",
|
||
" <th>SpeedI1</th>\n",
|
||
" <th>SpeedI2</th>\n",
|
||
" <th>SpeedFL</th>\n",
|
||
" <th>SpeedST</th>\n",
|
||
" <th>TyreLife</th>\n",
|
||
" <th>LapStartTime</th>\n",
|
||
" <th>LapStartDate</th>\n",
|
||
" <th>Position</th>\n",
|
||
" <th>Year</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>count</th>\n",
|
||
" <td>6628</td>\n",
|
||
" <td>6058</td>\n",
|
||
" <td>6628.000000</td>\n",
|
||
" <td>6628.000000</td>\n",
|
||
" <td>689</td>\n",
|
||
" <td>694</td>\n",
|
||
" <td>6056</td>\n",
|
||
" <td>6599</td>\n",
|
||
" <td>6566</td>\n",
|
||
" <td>6048</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>6566</td>\n",
|
||
" <td>5394.000000</td>\n",
|
||
" <td>6601.000000</td>\n",
|
||
" <td>5926.000000</td>\n",
|
||
" <td>5944.000000</td>\n",
|
||
" <td>6628.000000</td>\n",
|
||
" <td>6628</td>\n",
|
||
" <td>6622</td>\n",
|
||
" <td>5349.000000</td>\n",
|
||
" <td>6628.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>0 days 01:42:02.192371303</td>\n",
|
||
" <td>0 days 00:01:41.117758666</td>\n",
|
||
" <td>24.366777</td>\n",
|
||
" <td>2.545866</td>\n",
|
||
" <td>0 days 01:07:15.479603773</td>\n",
|
||
" <td>0 days 01:08:51.778342939</td>\n",
|
||
" <td>0 days 00:00:32.801727873</td>\n",
|
||
" <td>0 days 00:00:44.382851038</td>\n",
|
||
" <td>0 days 00:00:25.863896740</td>\n",
|
||
" <td>0 days 01:45:48.699285218</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 01:42:19.752332013</td>\n",
|
||
" <td>221.138673</td>\n",
|
||
" <td>240.172095</td>\n",
|
||
" <td>277.101249</td>\n",
|
||
" <td>276.023890</td>\n",
|
||
" <td>8.922299</td>\n",
|
||
" <td>0 days 01:40:00.888514634</td>\n",
|
||
" <td>2022-05-19 11:52:27.328777728</td>\n",
|
||
" <td>9.980183</td>\n",
|
||
" <td>2022.045112</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>0 days 00:15:27.765000</td>\n",
|
||
" <td>0 days 00:01:27.264000</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>0 days 00:13:35.553000</td>\n",
|
||
" <td>0 days 00:18:28.415000</td>\n",
|
||
" <td>0 days 00:00:27.669000</td>\n",
|
||
" <td>0 days 00:00:37.715000</td>\n",
|
||
" <td>0 days 00:00:21.853000</td>\n",
|
||
" <td>0 days 00:15:57.525000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:15:28.005000</td>\n",
|
||
" <td>54.000000</td>\n",
|
||
" <td>44.000000</td>\n",
|
||
" <td>42.000000</td>\n",
|
||
" <td>31.000000</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>0 days 00:13:35.553000</td>\n",
|
||
" <td>2020-11-28 14:00:03.421000</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>2020.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>25%</th>\n",
|
||
" <td>0 days 01:09:17.505500</td>\n",
|
||
" <td>0 days 00:01:36.219250</td>\n",
|
||
" <td>9.000000</td>\n",
|
||
" <td>2.000000</td>\n",
|
||
" <td>0 days 00:29:59.107000</td>\n",
|
||
" <td>0 days 00:35:37.662250</td>\n",
|
||
" <td>0 days 00:00:30.732750</td>\n",
|
||
" <td>0 days 00:00:41.676500</td>\n",
|
||
" <td>0 days 00:00:23.736000</td>\n",
|
||
" <td>0 days 01:14:22.595250</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 01:09:21.996000</td>\n",
|
||
" <td>225.000000</td>\n",
|
||
" <td>241.000000</td>\n",
|
||
" <td>277.000000</td>\n",
|
||
" <td>280.000000</td>\n",
|
||
" <td>3.000000</td>\n",
|
||
" <td>0 days 01:06:55.995500</td>\n",
|
||
" <td>2021-03-28 15:21:05.414749952</td>\n",
|
||
" <td>5.000000</td>\n",
|
||
" <td>2021.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>50%</th>\n",
|
||
" <td>0 days 01:39:58.302000</td>\n",
|
||
" <td>0 days 00:01:37.865500</td>\n",
|
||
" <td>22.000000</td>\n",
|
||
" <td>2.000000</td>\n",
|
||
" <td>0 days 00:59:30.380000</td>\n",
|
||
" <td>0 days 00:58:21.241500</td>\n",
|
||
" <td>0 days 00:00:31.148000</td>\n",
|
||
" <td>0 days 00:00:42.582000</td>\n",
|
||
" <td>0 days 00:00:24.126000</td>\n",
|
||
" <td>0 days 01:44:32.815000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 01:40:39.423500</td>\n",
|
||
" <td>231.000000</td>\n",
|
||
" <td>250.000000</td>\n",
|
||
" <td>281.000000</td>\n",
|
||
" <td>295.000000</td>\n",
|
||
" <td>8.000000</td>\n",
|
||
" <td>0 days 01:38:20.161000</td>\n",
|
||
" <td>2022-03-20 15:46:26.377999872</td>\n",
|
||
" <td>10.000000</td>\n",
|
||
" <td>2022.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>75%</th>\n",
|
||
" <td>0 days 02:13:43.248500</td>\n",
|
||
" <td>0 days 00:01:40.391500</td>\n",
|
||
" <td>39.000000</td>\n",
|
||
" <td>3.000000</td>\n",
|
||
" <td>0 days 01:28:09.343000</td>\n",
|
||
" <td>0 days 01:25:50.796000</td>\n",
|
||
" <td>0 days 00:00:31.792000</td>\n",
|
||
" <td>0 days 00:00:43.779500</td>\n",
|
||
" <td>0 days 00:00:24.901750</td>\n",
|
||
" <td>0 days 02:15:42.518250</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 02:14:05.441000</td>\n",
|
||
" <td>235.000000</td>\n",
|
||
" <td>257.000000</td>\n",
|
||
" <td>284.000000</td>\n",
|
||
" <td>303.000000</td>\n",
|
||
" <td>13.000000</td>\n",
|
||
" <td>0 days 02:12:04.575250</td>\n",
|
||
" <td>2023-03-05 16:12:27.612000</td>\n",
|
||
" <td>15.000000</td>\n",
|
||
" <td>2023.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>0 days 03:33:47.428000</td>\n",
|
||
" <td>0 days 00:03:05.092000</td>\n",
|
||
" <td>57.000000</td>\n",
|
||
" <td>7.000000</td>\n",
|
||
" <td>0 days 03:28:04.389000</td>\n",
|
||
" <td>0 days 03:27:38.638000</td>\n",
|
||
" <td>0 days 00:01:39.160000</td>\n",
|
||
" <td>0 days 00:01:27.340000</td>\n",
|
||
" <td>0 days 00:01:10.478000</td>\n",
|
||
" <td>0 days 03:32:33.946000</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 03:33:47.428000</td>\n",
|
||
" <td>248.000000</td>\n",
|
||
" <td>274.000000</td>\n",
|
||
" <td>302.000000</td>\n",
|
||
" <td>333.000000</td>\n",
|
||
" <td>37.000000</td>\n",
|
||
" <td>0 days 03:32:00.121000</td>\n",
|
||
" <td>2024-03-02 16:35:23.280000</td>\n",
|
||
" <td>20.000000</td>\n",
|
||
" <td>2024.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>std</th>\n",
|
||
" <td>0 days 00:44:32.891277624</td>\n",
|
||
" <td>0 days 00:00:10.575132207</td>\n",
|
||
" <td>16.860094</td>\n",
|
||
" <td>1.155031</td>\n",
|
||
" <td>0 days 00:41:40.584927846</td>\n",
|
||
" <td>0 days 00:39:18.227030052</td>\n",
|
||
" <td>0 days 00:00:05.843587609</td>\n",
|
||
" <td>0 days 00:00:06.025195453</td>\n",
|
||
" <td>0 days 00:00:04.862155415</td>\n",
|
||
" <td>0 days 00:42:50.934311129</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0 days 00:44:32.425629872</td>\n",
|
||
" <td>28.861242</td>\n",
|
||
" <td>32.657155</td>\n",
|
||
" <td>22.133798</td>\n",
|
||
" <td>52.878471</td>\n",
|
||
" <td>6.475231</td>\n",
|
||
" <td>0 days 00:44:57.137961013</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>5.511766</td>\n",
|
||
" <td>1.411731</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>8 rows × 21 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Time LapTime LapNumber \\\n",
|
||
"count 6628 6058 6628.000000 \n",
|
||
"mean 0 days 01:42:02.192371303 0 days 00:01:41.117758666 24.366777 \n",
|
||
"min 0 days 00:15:27.765000 0 days 00:01:27.264000 1.000000 \n",
|
||
"25% 0 days 01:09:17.505500 0 days 00:01:36.219250 9.000000 \n",
|
||
"50% 0 days 01:39:58.302000 0 days 00:01:37.865500 22.000000 \n",
|
||
"75% 0 days 02:13:43.248500 0 days 00:01:40.391500 39.000000 \n",
|
||
"max 0 days 03:33:47.428000 0 days 00:03:05.092000 57.000000 \n",
|
||
"std 0 days 00:44:32.891277624 0 days 00:00:10.575132207 16.860094 \n",
|
||
"\n",
|
||
" Stint PitOutTime PitInTime \\\n",
|
||
"count 6628.000000 689 694 \n",
|
||
"mean 2.545866 0 days 01:07:15.479603773 0 days 01:08:51.778342939 \n",
|
||
"min 1.000000 0 days 00:13:35.553000 0 days 00:18:28.415000 \n",
|
||
"25% 2.000000 0 days 00:29:59.107000 0 days 00:35:37.662250 \n",
|
||
"50% 2.000000 0 days 00:59:30.380000 0 days 00:58:21.241500 \n",
|
||
"75% 3.000000 0 days 01:28:09.343000 0 days 01:25:50.796000 \n",
|
||
"max 7.000000 0 days 03:28:04.389000 0 days 03:27:38.638000 \n",
|
||
"std 1.155031 0 days 00:41:40.584927846 0 days 00:39:18.227030052 \n",
|
||
"\n",
|
||
" Sector1Time Sector2Time \\\n",
|
||
"count 6056 6599 \n",
|
||
"mean 0 days 00:00:32.801727873 0 days 00:00:44.382851038 \n",
|
||
"min 0 days 00:00:27.669000 0 days 00:00:37.715000 \n",
|
||
"25% 0 days 00:00:30.732750 0 days 00:00:41.676500 \n",
|
||
"50% 0 days 00:00:31.148000 0 days 00:00:42.582000 \n",
|
||
"75% 0 days 00:00:31.792000 0 days 00:00:43.779500 \n",
|
||
"max 0 days 00:01:39.160000 0 days 00:01:27.340000 \n",
|
||
"std 0 days 00:00:05.843587609 0 days 00:00:06.025195453 \n",
|
||
"\n",
|
||
" Sector3Time Sector1SessionTime ... \\\n",
|
||
"count 6566 6048 ... \n",
|
||
"mean 0 days 00:00:25.863896740 0 days 01:45:48.699285218 ... \n",
|
||
"min 0 days 00:00:21.853000 0 days 00:15:57.525000 ... \n",
|
||
"25% 0 days 00:00:23.736000 0 days 01:14:22.595250 ... \n",
|
||
"50% 0 days 00:00:24.126000 0 days 01:44:32.815000 ... \n",
|
||
"75% 0 days 00:00:24.901750 0 days 02:15:42.518250 ... \n",
|
||
"max 0 days 00:01:10.478000 0 days 03:32:33.946000 ... \n",
|
||
"std 0 days 00:00:04.862155415 0 days 00:42:50.934311129 ... \n",
|
||
"\n",
|
||
" Sector3SessionTime SpeedI1 SpeedI2 SpeedFL \\\n",
|
||
"count 6566 5394.000000 6601.000000 5926.000000 \n",
|
||
"mean 0 days 01:42:19.752332013 221.138673 240.172095 277.101249 \n",
|
||
"min 0 days 00:15:28.005000 54.000000 44.000000 42.000000 \n",
|
||
"25% 0 days 01:09:21.996000 225.000000 241.000000 277.000000 \n",
|
||
"50% 0 days 01:40:39.423500 231.000000 250.000000 281.000000 \n",
|
||
"75% 0 days 02:14:05.441000 235.000000 257.000000 284.000000 \n",
|
||
"max 0 days 03:33:47.428000 248.000000 274.000000 302.000000 \n",
|
||
"std 0 days 00:44:32.425629872 28.861242 32.657155 22.133798 \n",
|
||
"\n",
|
||
" SpeedST TyreLife LapStartTime \\\n",
|
||
"count 5944.000000 6628.000000 6628 \n",
|
||
"mean 276.023890 8.922299 0 days 01:40:00.888514634 \n",
|
||
"min 31.000000 1.000000 0 days 00:13:35.553000 \n",
|
||
"25% 280.000000 3.000000 0 days 01:06:55.995500 \n",
|
||
"50% 295.000000 8.000000 0 days 01:38:20.161000 \n",
|
||
"75% 303.000000 13.000000 0 days 02:12:04.575250 \n",
|
||
"max 333.000000 37.000000 0 days 03:32:00.121000 \n",
|
||
"std 52.878471 6.475231 0 days 00:44:57.137961013 \n",
|
||
"\n",
|
||
" LapStartDate Position Year \n",
|
||
"count 6622 5349.000000 6628.000000 \n",
|
||
"mean 2022-05-19 11:52:27.328777728 9.980183 2022.045112 \n",
|
||
"min 2020-11-28 14:00:03.421000 1.000000 2020.000000 \n",
|
||
"25% 2021-03-28 15:21:05.414749952 5.000000 2021.000000 \n",
|
||
"50% 2022-03-20 15:46:26.377999872 10.000000 2022.000000 \n",
|
||
"75% 2023-03-05 16:12:27.612000 15.000000 2023.000000 \n",
|
||
"max 2024-03-02 16:35:23.280000 20.000000 2024.000000 \n",
|
||
"std NaN 5.511766 1.411731 \n",
|
||
"\n",
|
||
"[8 rows x 21 columns]"
|
||
]
|
||
},
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"lap_data_combined.describe()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T03:40:23.364581Z",
|
||
"start_time": "2024-11-20T03:40:23.214723Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "",
|
||
"text/plain": [
|
||
"<Figure size 1000x600 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"# Visualizations\n",
"# Boxplot of track temperature by year\n",
"plt.figure(figsize=(10, 6))\n",
"sns.boxplot(x='Year', y='TrackTemp', data=weather_data_combined)\n",
"plt.title('Track Temperature Distribution by Year')\n",
"plt.ylabel('Track temperature (°C)')\n",
"plt.show()\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-11-20T04:00:21.344518Z",
|
||
"start_time": "2024-11-20T04:00:21.181130Z"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "",
|
||
"text/plain": [
|
||
"<Figure size 1000x600 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"# Graph of fastest lap times by year\n",
"# Position == 1 selects the leader's laps and 'Time' is the session clock,\n",
"# so neither gives the fastest lap. Take the minimum LapTime per year instead\n",
"# (NaT lap times are skipped by min()).\n",
"fastest_lap = (\n",
"    lap_data_combined.groupby('Year')['LapTime']\n",
"    .min()\n",
"    .dt.total_seconds()\n",
"    .reset_index()\n",
")\n",
"\n",
"plt.figure(figsize=(10, 6))\n",
"sns.lineplot(x='Year', y='LapTime', data=fastest_lap, marker='o')\n",
"plt.title('Fastest Lap Time by Year')\n",
"plt.ylabel('Lap time (s)')\n",
"plt.show()\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "csci349",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.10.13"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 2
|
||
}
|