Olympics 2024

Automated data extraction, analysis, and visualization of the Olympics dataset using Python, SQL Server, and Tableau.

Details

Role:

Data Analyst

Service:

Python

Tableau

SQL

Industry:

Sports

Technology

Overview

This project automates the end-to-end process of extracting and transforming Olympics data using Python and SQL Server, ensuring data integrity and accessibility. The cleaned and structured data is then visualized in Tableau, providing dynamic insights and trends across various Olympic events and athletes. This streamlined approach enhances data analysis efficiency, enabling faster and more informed decision-making.

Key Features

  • Automated Data Extraction: A Python script downloads the latest Olympics dataset from Kaggle, unzips it, and converts the CSV files to Excel format.

  • Data Storage and Analysis: The transformed data is stored in SQL Server, where crucial insights are derived using SQL queries.

  • Dynamic Dashboards: The analyzed data is visualized through interactive Tableau dashboards. Users can filter by country to view the statistics for that country.

Project Components

Data Extraction & Transformation:

  1. A Python script is scheduled via Windows Task Scheduler to run daily, ensuring the data is always up-to-date.

  2. The script downloads the dataset, unzips it, converts CSV files to Excel, and stores them in a specified directory.
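The extraction steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names are hypothetical, the dataset slug and target directory are left as parameters, and it assumes the `kaggle`, `pandas`, and `openpyxl` packages are installed and Kaggle API credentials are configured.

```python
import zipfile
from pathlib import Path


def download_dataset(dataset_slug: str, target_dir: Path) -> None:
    """Download the dataset archive from Kaggle into target_dir."""
    # Imported lazily: the kaggle package looks for API credentials on import.
    import kaggle

    target_dir.mkdir(parents=True, exist_ok=True)
    kaggle.api.dataset_download_files(dataset_slug, path=str(target_dir), unzip=False)


def extract_archives(target_dir: Path) -> list[Path]:
    """Unzip every archive in target_dir and return the CSV files found."""
    for archive in sorted(target_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target_dir)
    return sorted(target_dir.rglob("*.csv"))


def csv_to_excel(csv_path: Path) -> Path:
    """Write an .xlsx copy next to the CSV file."""
    # Imported lazily so the unzip step can run without pandas installed.
    import pandas as pd

    xlsx_path = csv_path.with_suffix(".xlsx")
    pd.read_csv(csv_path).to_excel(xlsx_path, index=False)  # needs openpyxl
    return xlsx_path


def run_pipeline(dataset_slug: str, target_dir: Path) -> list[Path]:
    """Download, unzip, and convert; returns the Excel files written."""
    download_dataset(dataset_slug, target_dir)
    return [csv_to_excel(path) for path in extract_archives(target_dir)]
```

A scheduled job could then call `run_pipeline` with the dataset's Kaggle slug and the output directory; both values are placeholders here because the source does not name them.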

Data Analysis:

  1. The Excel files are imported into SQL Server.

  2. Various SQL queries are used to join relevant tables and extract meaningful insights from the data.

  3. A few queries extract the data needed for dashboard development.
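As an illustration of the kind of join query described above, here is a small sketch. It uses Python's built-in `sqlite3` with an in-memory database as a stand-in for SQL Server, and the table names, column names, and rows are all made up for the example; the project's real schema is not shown in this write-up.

```python
import sqlite3

# Hypothetical schema and toy rows, for illustration only.
SETUP_SQL = """
CREATE TABLE athletes (athlete_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE medals (athlete_id INTEGER, medal_type TEXT);
INSERT INTO athletes VALUES
    (1, 'Athlete 1', 'Country A'),
    (2, 'Athlete 2', 'Country A'),
    (3, 'Athlete 3', 'Country B');
INSERT INTO medals VALUES
    (1, 'Gold'), (2, 'Silver'), (3, 'Gold'), (1, 'Bronze');
"""

# Join medals to athletes, then count total and gold medals per country.
MEDALS_BY_COUNTRY_SQL = """
SELECT a.country,
       COUNT(*) AS total_medals,
       SUM(CASE WHEN m.medal_type = 'Gold' THEN 1 ELSE 0 END) AS gold_medals
FROM medals AS m
JOIN athletes AS a ON a.athlete_id = m.athlete_id
GROUP BY a.country
ORDER BY total_medals DESC, a.country;
"""


def medal_counts_by_country(conn: sqlite3.Connection) -> list[tuple]:
    """Return (country, total_medals, gold_medals) rows, highest total first."""
    return conn.execute(MEDALS_BY_COUNTRY_SQL).fetchall()


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database loaded with the toy rows above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP_SQL)
    return conn
```

The same `SELECT` should carry over to SQL Server largely unchanged, since it relies only on a standard join, `GROUP BY`, and `CASE` expression.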

Data Visualization:

  1. The analyzed data is connected to Tableau, where dynamic dashboards are created to visualize key metrics and trends.

  2. I have tried to gather as many insights as possible about athletes' ages, demographics, and similar attributes.

Insights

Overall Performance

  • Top Countries: The USA leads the medal count, followed by China. There's a significant gap between these two and the rest of the top five: France, Great Britain, and Australia.

  • Medal Distribution: While the USA dominates overall, France excels in individual gold medals, and Australia in individual bronze. Great Britain has a higher percentage of medals from team events than the other top-five countries.

  • Performance Improvement: Most countries have increased their medal count compared to 2020, with France showing a particularly significant improvement, likely due to the home-field advantage.

Athlete Demographics

  • Age Variation: Athlete ages vary significantly across different sports, with basketball players being the oldest on average.

  • Gender Equality: There's a relatively equal number of male and female athletes participating, with an increase in female participation compared to previous Olympics.

  • Youngest and Oldest Athlete: The youngest athlete to earn a medal was just 14 years old, whereas the oldest was 58.
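The age figures above come down to simple arithmetic on birth dates. The sketch below shows one way such numbers could be computed; it is not the project's actual calculation. The rows are invented, the field layout is hypothetical, and using the opening day of Paris 2024 as the age cut-off is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Assumed cut-off for computing ages: opening day of Paris 2024.
REFERENCE_DATE = date(2024, 7, 26)

# Invented rows: (sport, birth_date, won_medal)
SAMPLE_ROWS = [
    ("Sport X", date(2008, 1, 15), True),
    ("Sport X", date(2000, 8, 1), False),
    ("Sport Y", date(1983, 3, 10), True),
    ("Sport Y", date(1990, 7, 26), False),
]


def age_on(birth_date: date, reference: date = REFERENCE_DATE) -> int:
    """Age in completed years on the reference date."""
    had_birthday = (reference.month, reference.day) >= (birth_date.month, birth_date.day)
    return reference.year - birth_date.year - (0 if had_birthday else 1)


def average_age_by_sport(rows) -> dict:
    """Map each sport to the mean age of its athletes."""
    ages = defaultdict(list)
    for sport, birth_date, _won_medal in rows:
        ages[sport].append(age_on(birth_date))
    return {sport: mean(values) for sport, values in ages.items()}


def medalist_age_range(rows) -> tuple:
    """Return (youngest, oldest) age among athletes who won a medal."""
    medalist_ages = [age_on(birth) for _sport, birth, won in rows if won]
    return min(medalist_ages), max(medalist_ages)
```

In the project itself these aggregates would more likely be produced by SQL queries or Tableau calculated fields; the Python version is only meant to make the logic explicit.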

Dashboard Effectiveness

  • Visual Appeal: The dashboard is visually appealing and easy to understand due to effective use of color and clear layout.

  • Data Richness: While providing valuable insights, the dashboard could be enhanced with additional data such as medal distribution by sport or athlete performance over time.

In conclusion, the dashboard offers a comprehensive overview of the Paris 2024 Olympics, highlighting key trends in medal counts, athlete demographics, and country performance.