Performing Analysis Of Meteorological Data using Python

Rohit Jagtap
Jun 29, 2021
Change in climate past 10 years

In this blog we will apply data cleaning, data visualization, and hypothesis testing to the change in the weather of Finland over the past 10 years.

The hypothesis for the analysis is: “The Apparent temperature and humidity compared monthly across 10 years of the data indicate an increase due to Global warming.”

The hypothesis means we need to find out whether the average Apparent temperature for a given month, say April, from 2006 to 2016, and the average humidity for the same period, have increased or not. This monthly analysis has to be done for all 12 months over the 10-year period. So we are basically resampling the data from hourly to monthly and then comparing the same month across the 10-year period. The analysis is supported by appropriate visualizations using the matplotlib and/or seaborn libraries.

Content

Step 1: Importing of libraries and Dataset.

Step 2: Looking at the dataset.

Step 3: Cleaning Dataset.

Step 4: Plotting of Data.

Here is the link to my entire code, if you would like to refer to it: https://colab.research.google.com/drive/1MacKo_MLx6TmVoS4XdZiAVwoaGCzKa8I?usp=sharing

Step 1: Importing of libraries and Dataset

Required Library

Here we import pandas, NumPy, and Matplotlib. Pandas is mainly used for data analysis; it supports data manipulation operations such as merging, reshaping, and selecting, as well as data cleaning and data wrangling. NumPy is an open-source numerical Python library that can be used to perform mathematical operations on arrays, such as trigonometric, statistical, and algebraic routines. matplotlib.pyplot is a collection of functions that make Matplotlib work like MATLAB. Each pyplot function makes some change to a figure: for example, it creates a figure, creates a plotting area in a figure, plots some lines in a plotting area, or decorates the plot with labels.
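The imports described above would typically look like the following. The aliases `np`, `pd`, and `plt` are the usual community conventions, assumed here rather than taken from the original notebook.

```python
# Conventional aliases (an assumption; any alias works).
import numpy as np                 # numerical routines on arrays
import pandas as pd                # tabular data loading, cleaning, resampling
import matplotlib.pyplot as plt    # MATLAB-style plotting functions
```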

Importing dataset

We import the dataset and look at what it contains. Here is the link to the original dataset: https://www.kaggle.com/muthuj7/weather-dataset
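As a rough sketch of the loading step, the snippet below reads a tiny inline sample instead of the real file so that it runs anywhere. The column names, the date format, the sample values, and the file name `weatherHistory.csv` are assumptions about the Kaggle download, not something confirmed in this post.

```python
import io

import pandas as pd

# Assumption: a three-row stand-in for the Kaggle CSV; the values are made up.
sample_csv = io.StringIO(
    "Formatted Date,Summary,Apparent Temperature (C),Humidity\n"
    "2006-04-01 00:00:00.000 +0200,Partly Cloudy,7.4,0.89\n"
    "2006-04-01 01:00:00.000 +0200,Partly Cloudy,7.2,0.86\n"
    "2006-04-01 02:00:00.000 +0200,Mostly Cloudy,9.4,0.89\n"
)

# With the real download this would be something like
# pd.read_csv("weatherHistory.csv") (file name assumed).
df = pd.read_csv(sample_csv)

print(df.shape)   # number of rows and columns
print(df.head())  # first few rows
```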

Step 2: Looking at the dataset.

Dataset

Step 3: Cleaning Dataset

In this step we prepare our data for plotting. We first drop the unwanted columns (all except temperature and humidity).
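One way to drop the unwanted columns is to select only the ones we want to keep. In this sketch the frame is synthetic and the column names are assumed. The date column is kept as well, on the assumption that it is needed later for resampling.

```python
import pandas as pd

# Synthetic stand-in for the loaded dataset (column names are assumptions).
df = pd.DataFrame(
    {
        "Formatted Date": ["2006-04-01 00:00:00.000 +0200"],
        "Summary": ["Partly Cloudy"],
        "Apparent Temperature (C)": [7.4],
        "Humidity": [0.89],
        "Wind Speed (km/h)": [14.1],
    }
)

# Keep the timestamp plus the two variables named in the hypothesis.
keep = ["Formatted Date", "Apparent Temperature (C)", "Humidity"]
df = df[keep]

print(df.columns.tolist())
```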

Removing unwanted Data

Then we check for null values in the selected columns.
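A null check of this kind can be sketched with `isnull().sum()`, which counts the missing values per column. The frame below is synthetic, with one missing value planted on purpose, and the column names are assumed.

```python
import numpy as np
import pandas as pd

# Synthetic data with one deliberately missing temperature reading.
df = pd.DataFrame(
    {
        "Apparent Temperature (C)": [7.4, np.nan, 9.4],
        "Humidity": [0.89, 0.86, 0.89],
    }
)

# Count missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)
```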

checking null values

Finally, we convert the timezone to +00:00 UTC.
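The timezone conversion can be sketched with `pd.to_datetime(..., utc=True)`, which parses strings with different UTC offsets and normalizes them all to UTC. The column name and the timestamp format below are assumptions about the dataset, and the two rows are made up.

```python
import pandas as pd

# Synthetic rows with two different local offsets (summer and winter time).
df = pd.DataFrame(
    {
        "Formatted Date": [
            "2006-04-01 00:00:00.000 +0200",
            "2006-11-01 00:00:00.000 +0100",
        ],
        "Humidity": [0.89, 0.86],
    }
)

# Parse the strings and convert every timestamp to UTC.
df["Formatted Date"] = pd.to_datetime(
    df["Formatted Date"],
    format="%Y-%m-%d %H:%M:%S.%f %z",  # assumed layout of the date strings
    utc=True,
)

# A datetime index is what resample() needs later on.
df = df.set_index("Formatted Date")
print(df.index)
```

Setting the parsed column as the index at this point is a design choice that makes the later monthly resampling a one-liner.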

Converting the timezone to +00:00 UTC

Step 4: Plotting of Data

In this final step we plot the data for the analysis.

First, we plot the whole dataset for all months.
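The hourly-to-monthly resampling and the overview plot can be sketched as follows. The data here is synthetic (a seasonal temperature curve and a constant humidity over two years), the column names are assumed, and the styling is illustrative rather than the original notebook's.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic hourly data for 2006 and 2007 (both non-leap years).
index = pd.date_range(
    "2006-01-01", periods=2 * 365 * 24, freq=pd.Timedelta(hours=1), tz="UTC"
)
day_of_year = index.dayofyear.to_numpy()
df = pd.DataFrame(
    {
        "Apparent Temperature (C)": 10 - 12 * np.cos(2 * np.pi * day_of_year / 365),
        "Humidity": np.full(len(index), 0.75),
    },
    index=index,
)

# Resample from hourly to monthly means ("MS" labels each month by its start).
monthly = df.resample("MS").mean()

# Plot both monthly series on one axis.
fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(monthly.index, monthly["Apparent Temperature (C)"], label="Apparent Temperature (C)")
ax.plot(monthly.index, monthly["Humidity"], label="Humidity")
ax.set_title("Monthly mean apparent temperature and humidity")
ax.set_xlabel("Month")
ax.legend()
```

In a notebook, `plt.show()` after the last line displays the figure.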

Graph of past 10 Years

As we can see, the temperature rises sharply and then drops sharply back to the same level, repeating each year over the 10 years, whereas there is no change in humidity over the past 10 years.

Graph for April month

As we can see, there isn’t any change in humidity over the 10 years (2006–2016) for the month of April, whereas the temperature increases sharply in 2009 and drops in 2015. For the rest of the years there isn’t any sharp change in the temperature.
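A single-month comparison like the April one above can be sketched by filtering the monthly means on the index's month number. The monthly values below are random placeholders, not the real data, and the column names are assumed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Placeholder monthly means for 2006 to 2016 (random values, not real data).
index = pd.date_range("2006-01-01", "2016-12-01", freq="MS", tz="UTC")
rng = np.random.default_rng(0)
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": rng.normal(8.0, 2.0, len(index)),
        "Humidity": rng.normal(0.75, 0.02, len(index)),
    },
    index=index,
)

# Keep only the April rows: one point per year.
april = monthly[monthly.index.month == 4]

fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(april.index, april["Apparent Temperature (C)"], marker="o", label="Apparent Temperature (C)")
ax.plot(april.index, april["Humidity"], marker="o", label="Humidity")
ax.set_title("April: monthly means by year")
ax.set_xlabel("Year")
ax.legend()
```

Changing the `4` to any other month number (1 for January through 12 for December) gives the corresponding plot for that month.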

Graph for January month
Graph for February month
Graph for March month
Graph for May month
Graph for June month
Graph for July month
Graph for August month
Graph for September month
Graph for October month
Graph for November month
Graph for December month

Comparing the analysis for all 12 months, from April to August there is only a slight change in temperature and nearly no change in humidity over the 10 years (2006–2016). From September to March, by contrast, there is a large change in temperature, but again humidity remains unchanged.

Conclusion:

In this 10-year dataset, we can see that Apparent temperature and humidity do not move together as the years go by. For every year the monthly average humidity is about the same, but the Apparent temperature differs. Global warming is affecting the earth’s temperature, so we see some uncertainty in this data.
