COVID-19 Data Analysis and Visualization

An analysis project focused on understanding the COVID-19 pandemic through data analysis and visualization. This project involved gathering COVID-19 data from Our World in Data (OWID), preprocessing it in Excel, and analyzing it in MySQL. The aim was to uncover insights into the spread and impact of the virus.

Code Development and Procedures

In Excel, I cleaned and formatted the data to ensure consistency and accuracy. This involved handling missing values and normalizing data formats to prepare the data for more in-depth analysis. Using MySQL, I ran queries to extract meaningful statistics, such as the number of cases, deaths, and recoveries over time. I also analyzed the data along different dimensions, such as location and time period.

I downloaded COVID-19 data from the OWID website and created two separate Excel worksheets:

  • Covid Deaths
  • Covid Vaccinations

I converted the Excel sheets into CSV format and then imported them into MySQL using the import function in MySQL Workbench. After importing the data, I adjusted the data types to ensure they were suitable for querying.
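
The type adjustments after import can be sketched roughly as follows. This is a minimal illustration and not the exact script used in the project: it assumes the import loaded every column as text, that dates arrived in MM/DD/YYYY form, and that the count columns hold whole numbers. The column list is a small sample of the OWID fields.

```sql
-- Sketch only. Assumes all columns were imported as text; do not run the
-- NULLIF step on columns that are already numeric, because '' would be
-- compared as 0 and genuine zeros would turn into NULL.

-- MySQL Workbench enables safe updates by default, which blocks
-- UPDATE statements that have no key-based WHERE clause.
SET SQL_SAFE_UPDATES = 0;

-- Turn empty strings left by the CSV import into NULLs.
UPDATE Covid.CovidDeaths
SET total_cases  = NULLIF(total_cases, ''),
    total_deaths = NULLIF(total_deaths, '');

-- Rewrite text dates (assumed format MM/DD/YYYY) as ISO dates.
UPDATE Covid.CovidDeaths
SET date = STR_TO_DATE(date, '%m/%d/%Y');

-- Change the column types now that the stored values are convertible.
ALTER TABLE Covid.CovidDeaths
    MODIFY COLUMN date DATE,
    MODIFY COLUMN total_cases BIGINT,
    MODIFY COLUMN total_deaths BIGINT;

SET SQL_SAFE_UPDATES = 1;
```

If the CSV stores counts with a decimal point, a DECIMAL or DOUBLE column may be a safer target type than BIGINT.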

SQL queries

									
-- Total Cases vs Total Deaths
-- Share of reported cases that ended in death, per day,
-- for locations whose name contains "states"

SELECT location, date, total_cases, total_deaths, (total_deaths / total_cases) * 100 AS DeathPercentage
FROM Covid.CovidDeaths
WHERE location LIKE '%states%' AND continent IS NOT NULL
ORDER BY location, date;

-- Countries with Highest Infection Rate Compared to Population
-- Rows with a NULL continent are aggregates (e.g. World), not countries, so they are excluded

SELECT location, population, MAX(total_cases) AS HighestInfectionCount, MAX((total_cases / population) * 100) AS PercentPopulationInfected
FROM Covid.CovidDeaths
WHERE continent IS NOT NULL
GROUP BY location, population
ORDER BY PercentPopulationInfected DESC;
								

These queries helped extract insights such as the percentage of deaths among cases and the infection rate relative to population, and helped identify the regions with the highest impact. I also created views to simplify data visualization in Tableau, for example a view that tracks the percentage of the population vaccinated.
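
A view of that kind might look like the sketch below. The view name PercentPopulationVaccinated and the table name Covid.CovidVaccinations are assumptions made for illustration (the table name mirrors the Covid Vaccinations worksheet), and new_vaccinations is the daily doses column from the OWID dataset. Window functions require MySQL 8.0 or later.

```sql
-- Sketch only: view and table names are hypothetical.
CREATE OR REPLACE VIEW Covid.PercentPopulationVaccinated AS
SELECT
    dea.continent,
    dea.location,
    dea.date,
    dea.population,
    vac.new_vaccinations,
    -- Running total of doses per location, ordered by date
    SUM(vac.new_vaccinations) OVER (
        PARTITION BY dea.location
        ORDER BY dea.date
    ) AS RollingVaccinations,
    -- Running total expressed as a percentage of the population
    SUM(vac.new_vaccinations) OVER (
        PARTITION BY dea.location
        ORDER BY dea.date
    ) / dea.population * 100 AS PercentVaccinated
FROM Covid.CovidDeaths AS dea
JOIN Covid.CovidVaccinations AS vac
    ON dea.location = vac.location
   AND dea.date = vac.date
WHERE dea.continent IS NOT NULL;

-- Example usage: one country's vaccination curve
SELECT location, date, PercentVaccinated
FROM Covid.PercentPopulationVaccinated
WHERE location = 'Canada'
ORDER BY date;
```

Because new_vaccinations counts doses rather than people, this figure can exceed 100 where residents received more than one dose. Tableau can read the view like an ordinary table.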

Observations

The analysis revealed trends and patterns in the spread of COVID-19. Key findings included the identification of peak periods of infection, the effectiveness of various measures taken to curb the spread, and the impact of the virus on different regions. The visualizations built from the data give a clear picture of how the virus spread around the world. They include graphs showing the rise and fall of case numbers, maps indicating hotspots, and charts comparing the effectiveness of interventions across different regions.

The full SQL scripts and data are available on my GitHub in the Covid-Data-Exploration repository.


GitHub

Link to COVID data and preprocessing scripts, SQL Scripts and reference files.

Tableau

Link to Tableau Dashboard with several data visualization pages and filters.