COVID-19 Analysis Using Python: Pandas and Matplotlib

Dwi Gustin Nurdialit
3 min read · Sep 23, 2020
Photo by Stephen Dawson on Unsplash

Hi, I hope you are all safe.

As we know, the world has been thrown into turmoil by the COVID-19 pandemic, which has spread to most countries. Each government has handled it differently, according to its own policies. As a result, the trend of rising or falling COVID cases differs from country to country.

In this article, I want to share a look at how different countries are doing by analyzing the data with Python.

I used two data files, “countries.csv” and “covid-countries-data.csv”. You can get them from my GitHub repository.

First, we import the libraries that we want to use.

Pandas is a software library written for the Python programming language for data manipulation and analysis. In this case, we use pandas to read the data files into data frames.

Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. In this case, we use matplotlib to visualize the data.
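A minimal sketch of the import step could look like this. The aliases `pd` and `plt` are the usual community convention, and I am assuming them here.

```python
import pandas as pd               # data loading and manipulation
import matplotlib.pyplot as plt   # plotting interface used for the charts
```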

Then, we read each CSV file into a data frame and assign each data frame to its own variable.
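Here is a rough sketch of the reading step. So that it runs without the files, two small in-memory CSV strings stand in for them. Every column name except `location` and every value is made up for illustration, so check the real files for the actual headers.

```python
import io
import pandas as pd

# Toy stand-ins for the two files so the snippet runs on its own.
# Column names other than "location" and all values are assumptions.
countries_csv = io.StringIO(
    "location,population,gdp_per_capita\n"
    "Country A,1000000,5000\n"
    "Country B,2000000,12000\n"
)
covid_csv = io.StringIO(
    "location,total_cases,total_tests\n"
    "Country A,500,20000\n"
    "Country B,3000,90000\n"
)

# With the real files this would be pd.read_csv("countries.csv")
# and pd.read_csv("covid-countries-data.csv").
countries_df = pd.read_csv(countries_csv)
covid_data_df = pd.read_csv(covid_csv)

# Peek at the first rows to confirm the files were parsed as expected.
print(countries_df.head())
print(covid_data_df.head())
```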

Now, let’s merge the two data frames and compute some more metrics. We merge countries_df with covid_data_df on the location column and store the result in a variable named combined_df.
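The merge and the extra metrics might be sketched as follows. I am assuming that the metrics are the per-million figures used later in the article, and that the files carry columns such as `population`, `total_tests`, `total_cases`, and `total_deaths`. Those names and the toy values are illustrative only.

```python
import pandas as pd

# Toy data with assumed column names; only "location" comes from the article.
countries_df = pd.DataFrame({
    "location": ["Country A", "Country B", "Country C"],
    "population": [1_000_000, 2_000_000, 500_000],
})
covid_data_df = pd.DataFrame({
    "location": ["Country A", "Country B", "Country C"],
    "total_cases": [500, 3000, 100],
    "total_deaths": [10, 60, 1],
    "total_tests": [20_000, 90_000, 5_000],
})

# Inner join on the shared "location" column.
combined_df = countries_df.merge(covid_data_df, on="location")

# Scale each total by population to get a per-million figure.
for metric in ("tests", "cases", "deaths"):
    combined_df[f"{metric}_per_million"] = (
        combined_df[f"total_{metric}"] * 1e6 / combined_df["population"]
    )

print(combined_df)
```

Dividing by population makes countries of very different sizes comparable, which is why the rankings below use per-million values rather than raw totals.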

Then, we create a data frame of the 10 countries with the highest number of tests per million people and visualize it.
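One way to pick the top 10 and draw them is sketched below. The `tests_per_million` column name, the horizontal bar chart, and the placeholder values are my assumptions, and any chart type that ranks the countries would work just as well.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy data: 12 placeholder countries with made-up values.
combined_df = pd.DataFrame({
    "location": [f"Country {i}" for i in range(1, 13)],
    "tests_per_million": [i * 1000.0 for i in range(1, 13)],
})

# nlargest returns the 10 rows with the biggest values, already sorted.
highest_tests_df = combined_df.nlargest(10, "tests_per_million")

# Horizontal bars keep long country names readable.
fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(highest_tests_df["location"], highest_tests_df["tests_per_million"])
ax.invert_yaxis()  # put the highest value at the top
ax.set_xlabel("Tests per million people")
ax.set_title("10 countries with the highest number of tests per million")
fig.tight_layout()
plt.show()
```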

How about the countries with the highest number of positive cases per million people?

And the countries with the highest number of deaths per million people?
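Since both rankings follow the same pattern, a small helper can cover them. `top_countries` is a hypothetical function of mine, and the column names and values are placeholders.

```python
import pandas as pd


def top_countries(df, column, n=10):
    """Return the n rows of df with the largest values in column."""
    return df.nlargest(n, column)[["location", column]]


# Toy data: 12 placeholder countries with made-up values.
combined_df = pd.DataFrame({
    "location": [f"Country {i}" for i in range(1, 13)],
    "cases_per_million": [i * 100.0 for i in range(1, 13)],
    "deaths_per_million": [(13 - i) * 2.0 for i in range(1, 13)],
})

highest_cases_df = top_countries(combined_df, "cases_per_million")
highest_deaths_df = top_countries(combined_df, "deaths_per_million")

print(highest_cases_df)
print(highest_deaths_df)
```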

We also count the number of countries that feature in both the “highest number of tests per million” and “highest number of cases per million” lists by merging the highest-tests data frame with the highest-cases data frame on the location column.
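The overlap count can be sketched with an inner merge, which keeps only the locations present in both lists. To keep the toy data short I take the top 3 instead of the top 10, and all names and numbers are made up.

```python
import pandas as pd

# Toy data with assumed column names and made-up values.
combined_df = pd.DataFrame({
    "location": ["Country A", "Country B", "Country C",
                 "Country D", "Country E", "Country F"],
    "tests_per_million": [600, 500, 400, 300, 200, 100],
    "cases_per_million": [10, 60, 50, 20, 40, 30],
})

n = 3  # the article uses 10; 3 keeps this example small
highest_tests_df = combined_df.nlargest(n, "tests_per_million")
highest_cases_df = combined_df.nlargest(n, "cases_per_million")

# An inner merge on "location" keeps only countries found in both lists.
in_both_df = highest_tests_df[["location"]].merge(
    highest_cases_df[["location"]], on="location"
)

print(len(in_both_df), "countries appear in both lists")
```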

The next job is to count the number of countries that feature in both the “20 countries with the lowest GDP per capita” and “20 countries with the lowest number of hospital beds per thousand population” lists. We only consider countries with a population higher than 10 million while creating the lists.

20 countries with the lowest GDP per capita
20 countries with the lowest number of hospital beds per thousand population

From those data frames, we can find the number of countries that feature in both lists.
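The whole step, from the population filter to the overlap count, could be sketched like this. The column names `population`, `gdp_per_capita`, and `hospital_beds_per_thousand` are my assumptions, the values are invented, and I use the lowest 3 rather than the lowest 20 so the toy data stays small.

```python
import pandas as pd

# Toy data with assumed column names and made-up values.
countries_df = pd.DataFrame({
    "location": ["Country A", "Country B", "Country C",
                 "Country D", "Country E", "Country F"],
    "population": [50e6, 5e6, 30e6, 80e6, 20e6, 12e6],
    "gdp_per_capita": [900, 500, 1500, 700, 40000, 1200],
    "hospital_beds_per_thousand": [0.5, 0.3, 0.8, 3.0, 6.0, 0.4],
})

# Keep only countries with more than 10 million people (drops Country B).
large_df = countries_df[countries_df["population"] > 10_000_000]

n = 3  # the article uses 20; 3 keeps this example small
lowest_gdp_df = large_df.nsmallest(n, "gdp_per_capita")
lowest_beds_df = large_df.nsmallest(n, "hospital_beds_per_thousand")

# Countries that appear in both lists.
in_both_df = lowest_gdp_df[["location"]].merge(
    lowest_beds_df[["location"]], on="location"
)

print(len(in_both_df), "countries appear in both lists")
```

Filtering before ranking matters here: Country B has the lowest values in both columns, but it never reaches either list because its population is below the threshold.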

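From the data, we got 14 countries that feature in both lists.
So 14 of the 20 countries with the lowest GDP per capita are also among the 20 with the fewest hospital beds, which suggests that a country with a low GDP per capita is likely to have a low number of hospital beds as well.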

So, that is all about COVID analysis using Python. We can learn a lot by analyzing the data frames we have.
