This project conducts a thorough analysis of weather time series data using diverse statistical and deep learning models. Each model was rigorously applied to the same weather time series data to assess and compare their forecasting accuracy. Detailed results and analyses are provided to delineate the strengths and weaknesses of each approach.

razamehar/Weather-Time-Series-Analysis-using-Statistical-Methods-and-Deep-Learning-Models

Weather Time Series Analysis using Statistical Methods and Deep Learning Models

Project Overview

This project explored various statistical methods and deep learning models for multivariate time series analysis. Techniques such as Naive Forecasting, Moving Average Forecasting, Differenced Moving Average Forecasting, and Differenced Moving Average Forecasting with Smoothing were examined. Within deep learning, Simple Neural Networks, Deep Neural Networks, Single-Layer LSTMs, Single-Layer Regularized LSTMs, Bi-Directional Regularized LSTMs, Regularized Stacked GRUs, and Convolutional Layers with Stacked GRUs and Fully Connected Layers were analyzed. Through comparison and evaluation, the most effective methodology for accurate and reliable weather predictions was sought. This involved establishing a baseline, selecting the best model using a learning rate scheduler, and comparing its performance against the baseline.

Statistical Analysis of Variables

Univariate Analysis

In this phase, individual variables are analyzed to understand their distribution and normality. Histograms show the shape of each distribution, and quantile-quantile (Q-Q) plots show how closely each variable follows a normal distribution.

Figure: Histograms
Figure: Quantile-Quantile Plots
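
The two diagnostics above can be sketched without any plotting library. This is a minimal illustration, not the project's code: the function names `histogram_counts` and `qq_points` and the sample values are hypothetical, and the Q-Q points assume a normal distribution fitted with the sample mean and standard deviation.

```python
from statistics import NormalDist, mean, stdev

def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data
    counts = [0] * bins
    for v in values:
        # The maximum value is clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

def qq_points(values):
    """Pair each theoretical normal quantile with the matching sorted sample value."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    # Plotting positions (i + 0.5) / n keep every probability strictly inside (0, 1).
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]

# Illustrative values only, not taken from the dataset.
temps = [-3.1, 0.4, 2.2, 5.0, 7.9, 9.3, 12.8, 15.1, 18.6, 21.0]
print(histogram_counts(temps, bins=4))
print(qq_points(temps)[:3])
```

If the Q-Q points lie close to the line y = x, the variable is approximately normal; systematic curvature suggests skew or heavy tails.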

Correlation Analysis

Correlation analysis uses Pearson correlation coefficients to explore relationships between variables. A correlation matrix visualized as a heatmap highlights how strongly each variable correlates with 'T (degC)', showing inter-variable relationships and dependencies.

Figure: Heatmap of Correlation Coefficients
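
As a rough sketch of what the heatmap encodes, the snippet below computes Pearson coefficients against 'T (degC)' in plain Python. The helper names are hypothetical, the column names other than 'T (degC)' are assumptions about the dataset, and the numbers are illustrative rather than project data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def correlations_with(target, columns):
    """Correlate every column with the target column, strongest first."""
    result = {
        name: pearson(columns[target], values)
        for name, values in columns.items()
        if name != target
    }
    return dict(sorted(result.items(), key=lambda kv: abs(kv[1]), reverse=True))

# Illustrative values; 'Tdew (degC)' and 'rh (%)' are assumed column names.
columns = {
    "T (degC)": [1.0, 2.0, 3.0, 4.0, 5.0],
    "Tdew (degC)": [0.5, 1.4, 2.6, 3.5, 4.4],
    "rh (%)": [90.0, 85.0, 70.0, 65.0, 50.0],
}
ranked = correlations_with("T (degC)", columns)
print(ranked)
```

Sorting by absolute value keeps strong negative correlations near the top, which matches how a heatmap is usually read.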

Data Visualization

Time series plots depict temperature variations over time, revealing both long-term trends and short-term fluctuations within seasonal cycles. The annual temperature trend analysis shows the maximum, average, and minimum temperature for each year, which helps in interpreting the climate data and identifying seasonal patterns.

Figure: Seasonality
Figure: Seasonality without Noise
Figure: Seasonality (First Season Cycle)
Figure: Temperature over the Years
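
The per-year maximum, average, and minimum can be computed with a simple group-by. The sketch below assumes the data is available as (timestamp, temperature) pairs; the function name `yearly_summary` and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def yearly_summary(records):
    """Group (timestamp, temperature) pairs by year and summarize each year."""
    by_year = defaultdict(list)
    for timestamp, temperature in records:
        by_year[timestamp.year].append(temperature)
    return {
        year: {"min": min(temps), "mean": mean(temps), "max": max(temps)}
        for year, temps in sorted(by_year.items())
    }

# Illustrative records only.
records = [
    (datetime(2009, 1, 15), -4.0),
    (datetime(2009, 7, 15), 22.0),
    (datetime(2010, 1, 15), -6.0),
    (datetime(2010, 7, 15), 24.0),
]
summary = yearly_summary(records)
print(summary)
```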

Statistical Forecast Methods

Fixed Partitioning for Statistical-Method-Based Forecasting

A fixed partitioning approach divides the temperature data into training and testing sets. Data from 2012 to 2014 are allocated for training so that the methods learn from historical data, while data from subsequent years are reserved for validation and testing, so that forecasts are assessed on data that was not used for fitting.
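
A year-based split like the one described can be sketched as follows. This is an assumption-laden illustration: `split_by_year` is a hypothetical helper, records are assumed to be (timestamp, temperature) pairs, and how the held-out years are further divided into validation and test sets is not shown.

```python
from datetime import datetime

def split_by_year(records, train_years):
    """Put records from `train_years` in the training set and later years in the hold-out set."""
    last_train_year = max(train_years)
    train = [r for r in records if r[0].year in train_years]
    holdout = [r for r in records if r[0].year > last_train_year]
    return train, holdout

# Illustrative records, one per year.
records = [(datetime(year, 6, 1), 15.0 + year % 3) for year in range(2012, 2017)]
train, holdout = split_by_year(records, train_years={2012, 2013, 2014})
print(len(train), len(holdout))
```

Splitting on year boundaries keeps whole seasonal cycles together and prevents future observations from leaking into the training set.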

Naive Forecast

Predictions are based solely on the last observed temperature, serving as a baseline for accuracy assessment.

Figure: Naive forecast
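
A naive forecast amounts to shifting the series by one step. The sketch below is illustrative only; `naive_forecast` and the `split` index marking the start of the evaluation period are hypothetical names.

```python
def naive_forecast(series, split):
    """Forecast every value from index `split` onward as the value one step earlier."""
    return series[split - 1:-1]

# Illustrative values only.
series = [10.0, 12.0, 11.0, 13.0, 14.0]
split = 3
actual = series[split:]
forecast = naive_forecast(series, split)
print(actual, forecast)
```

Because it needs no fitting, this forecast is a convenient lower bar that any other method should beat.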

Moving Average Forecasting

Average temperatures over defined window sizes are computed to smooth short-term fluctuations and highlight long-term trends.

Figure: Moving average
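
A trailing moving average forecast can be written in a few lines. This is a minimal sketch with a hypothetical function name; the window size is a free parameter and the value used here is not the project's setting.

```python
def moving_average_forecast(series, window):
    """Forecast each step as the mean of the `window` values that precede it.

    The first forecast corresponds to index `window` of the input series.
    """
    return [
        sum(series[i - window:i]) / window
        for i in range(window, len(series))
    ]

# Illustrative values only.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
forecast = moving_average_forecast(series, window=3)
print(forecast)
```

On a steadily rising series like this one, the forecast lags behind the actual values, which is the weakness that differencing is meant to address.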

Differenced Moving Average Forecast

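The series is first differenced to remove trend and seasonality, and a moving average is then applied to the differenced series. Because the moving average no longer has to track the seasonal cycle, its predictions are more accurate than those of the plain moving average.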

Figure: Differenced Moving Average
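
Seasonal differencing subtracts the value one full period earlier from each observation. The sketch below uses a toy series with a period of 4 to show the seasonal pattern cancelling out; the function name and the period are assumptions chosen for illustration, not the project's settings.

```python
def seasonal_difference(series, period):
    """Return series[t] - series[t - period] for every t >= period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

# Toy series: a linear trend (+1 per step) plus a repeating pattern of length 4.
pattern = [0, 5, 10, 5]
series = [t + pattern[t % 4] for t in range(12)]
diff = seasonal_difference(series, period=4)
print(diff)
```

The seasonal pattern disappears and only the trend's contribution over one period remains, which is a much easier target for a moving average.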

Differenced Moving Average Forecast with Trend & Seasonality Added

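The differenced moving average forecasts changes rather than temperatures. To obtain a temperature forecast, trend and seasonality are added back by adding the value observed one seasonal period earlier to each forecast difference.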

Figure: Differenced Moving Average with Trend & Seasonality Added
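
Putting the steps together, a forecast on the original scale is the value one period earlier plus the moving average of recent seasonal differences. This is a sketch under assumptions: the function names are hypothetical and the period and window are toy values.

```python
def moving_average_forecast(series, window):
    """Forecast each step as the mean of the `window` values that precede it."""
    return [sum(series[i - window:i]) / window for i in range(window, len(series))]

def diff_ma_forecast(series, period, window):
    """Forecast series[t] as series[t - period] plus the smoothed seasonal difference."""
    diff = [series[t] - series[t - period] for t in range(period, len(series))]
    diff_ma = moving_average_forecast(diff, window)
    # diff_ma[k] forecasts diff[window + k], which belongs to series[period + window + k].
    start = period + window
    return [series[start + k - period] + d for k, d in enumerate(diff_ma)]

# Toy series: linear trend plus a repeating pattern of length 4.
pattern = [0, 5, 10, 5]
series = [t + pattern[t % 4] for t in range(12)]
forecast = diff_ma_forecast(series, period=4, window=2)
print(forecast)
print(series[6:])
```

The first forecast lines up with index `period + window` of the input, so the forecast should be compared against `series[period + window:]`.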

Differenced Moving Average Forecast with Smoothing

A centered window is used to smooth the data at each step. For instance, to smooth the data point at t = 365 with a window size of 11, we compute the average of the values from t = 360 to t = 370, that is, the point itself and five points on either side.

Figure: Differenced Moving Average with Smoothing
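
Centered smoothing with an odd window can be sketched as below. The helper names are hypothetical; for a window of size 11 centered at t = 365 the window covers indices 360 through 370.

```python
def centered_window_bounds(t, window):
    """Return the first and last index (inclusive) of an odd window centered at t."""
    half = window // 2
    return t - half, t + half

def centered_moving_average(series, window):
    """Smooth a series with an odd-sized window centered on each point.

    Points closer than window // 2 to either end are dropped because the
    window would extend past the data.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centered")
    half = window // 2
    return [
        sum(series[t - half:t + half + 1]) / window
        for t in range(half, len(series) - half)
    ]

print(centered_window_bounds(365, 11))
print(centered_moving_average([1.0, 4.0, 1.0, 4.0, 1.0], window=3))
```

A centered window can only be applied to values that are already known, which makes it suitable for smoothing past observations rather than for forecasting directly.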

Deep Learning Models

Various deep learning models, including Basic Neural Network, Deep Neural Network, LSTM, Regularized LSTM, Bi-Directional LSTM, Stacked GRUs, and Convolutional layer with stacked GRUs and Fully Connected Layers are explored for temperature forecasting, each tailored to leverage sequential data characteristics for enhanced prediction accuracy.

Fixed Partitioning for Neural-Network-Based Forecasting

Temperature data are split into training, validation, and testing sets ensuring chronological order and accounting for seasonality, essential for effective model training and evaluation.
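
A chronological three-way split can be sketched by slicing on position, never shuffling. The fractions below are assumptions for illustration; the project's actual split points are not stated here, and `chronological_split` is a hypothetical helper.

```python
def chronological_split(series, train_frac, val_frac):
    """Split a series into train, validation, and test parts in time order."""
    n = len(series)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]

# Illustrative series of 20 consecutive observations.
series = list(range(20))
train, val, test = chronological_split(series, train_frac=0.5, val_frac=0.25)
print(len(train), len(val), len(test))
```

To account for seasonality, the split points would ideally fall on whole-year boundaries so that each part contains complete seasonal cycles.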

Data Preprocessing

Data normalization using MinMaxScaler ensures consistent scaling, particularly beneficial for non-normally distributed data and when training neural networks with features of different scales.
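
Min-max scaling maps each feature to the range [0, 1] using x' = (x - min) / (max - min). The sketch below reproduces that formula in plain Python rather than calling MinMaxScaler itself; the helper names are hypothetical, and fitting the minimum and maximum on the training data only is an assumption about good practice, not a statement about the project's code.

```python
def fit_min_max(train_column):
    """Learn the scaling bounds from the training data only."""
    return min(train_column), max(train_column)

def min_max_scale(column, lo, hi):
    """Apply x' = (x - lo) / (hi - lo) to every value in the column."""
    span = hi - lo
    return [(x - lo) / span for x in column]

# Illustrative values only.
train_column = [10.0, 20.0, 30.0]
test_column = [25.0, 40.0]
lo, hi = fit_min_max(train_column)
print(min_max_scale(train_column, lo, hi))
print(min_max_scale(test_column, lo, hi))
```

Values outside the training range scale to numbers below 0 or above 1, which is expected when the bounds are fitted on the training set alone.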

Sequence Generation

Sequences are generated from the input arrays using the Keras utility timeseries_dataset_from_array (available as tf.keras.utils.timeseries_dataset_from_array), which produces windowed training, validation, and test datasets with a specified sequence length.
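
The windowing idea behind sequence generation can be illustrated in plain Python. This sketch is not the Keras utility: `make_windows` is a hypothetical helper that only shows how each input window is paired with the value that follows it, without batching, striding, or shuffling.

```python
def make_windows(series, sequence_length):
    """Pair every window of `sequence_length` values with the value that follows it."""
    return [
        (series[i:i + sequence_length], series[i + sequence_length])
        for i in range(len(series) - sequence_length)
    ]

# Illustrative values only.
series = [0.1, 0.2, 0.3, 0.4, 0.5]
windows = make_windows(series, sequence_length=3)
for inputs, target in windows:
    print(inputs, "->", target)
```

A series of length n yields n - sequence_length such pairs, so longer sequences leave fewer training examples.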

Model Finalization

The two most promising models, determined by their low loss and Mean Absolute Error (MAE), were selected for further refinement. They underwent fine-tuning using a learning rate schedule to identify the optimal learning rate and were then retrained on the dataset. Among these models, the one exhibiting the best performance with the new learning rate was chosen as the final selection.

Figure: Training Loss versus Learning Rate for Model 1
Figure: Training Loss versus Learning Rate for Model 2
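
One common way to produce a loss-versus-learning-rate curve is to raise the learning rate exponentially from epoch to epoch and record the training loss at each step. The sketch below assumes that kind of sweep; the starting rate, the growth rate, the function names, and the loss values are all hypothetical and not taken from the project.

```python
def lr_sweep(start_lr, epochs, epochs_per_decade):
    """Learning rate for each epoch, growing tenfold every `epochs_per_decade` epochs."""
    return [start_lr * 10 ** (epoch / epochs_per_decade) for epoch in range(epochs)]

def pick_learning_rate(lrs, losses):
    """Return the learning rate at which the recorded training loss was lowest."""
    best = min(range(len(losses)), key=losses.__getitem__)
    return lrs[best]

# Hypothetical sweep and made-up losses, for illustration only.
lrs = lr_sweep(start_lr=1e-4, epochs=5, epochs_per_decade=2)
losses = [0.9, 0.5, 0.2, 0.4, 3.0]
chosen = pick_learning_rate(lrs, losses)
print(lrs)
print(chosen)
```

In practice a value somewhat below the loss minimum is often preferred, since the loss tends to become unstable just past it.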

Evaluation

Model performance is evaluated using the Mean Absolute Error (MAE) metric on the test dataset, comparing predictions against actual values to quantify forecasting accuracy.
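
MAE is the average of the absolute differences between actual and predicted values. A minimal sketch, with a hypothetical function name and illustrative numbers:

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values only.
actual = [13.0, 14.0, 12.0, 15.0]
predicted = [11.0, 13.0, 12.0, 18.0]
print(mean_absolute_error(actual, predicted))
```

Because MAE is expressed in the same unit as the target, a value computed on unscaled data reads directly as an average error in degrees Celsius.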

Weather Forecast

Trained models are utilized to predict future temperature values, leveraging the learned patterns and dependencies in the data to provide accurate forecasts.

Potential Improvements

  • Modify the number of units.
  • Change the dropout ratio.
  • Test different learning rates.
  • Experiment with batch sizes.
  • Add more dense layers.
  • Alter the sequence length.

Data Sources:

https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip

License:

This project is licensed under the Raza Mehar License. See the LICENSE.md file for details.

Contact:

For any questions or clarifications, please contact Raza Mehar at [[email protected]].
