This project presents an AI/ML weather forecasting solution developed for the Ethiopian Meteorology Institute during my tenure at the Ethiopian Artificial Intelligence Institute. The objective was to improve the accuracy and efficiency of weather forecasting across Ethiopia, supporting sectors such as agriculture and disaster preparedness.
Project Highlights:
- Extensive Data Utilization: Leveraged 65 years of historical weather data collected from 22 distinct stations across Ethiopia. The dataset included various meteorological parameters such as relative humidity, maximum temperature, minimum temperature, precipitation, and sunshine hours.
- Comprehensive Machine Learning Pipeline: Implemented a standard machine learning workflow, encompassing:
  - Raw Data Preparation: Converted raw data into daily time series by averaging features per day and inserting missing dates as rows of NaN values.
  - Missing Value Imputation: Filled missing values with backward and forward fill, and dropped features with a high share of missing data, such as wind speed (21.45% missing), to maintain model integrity.
  - Exploratory Data Analysis (EDA): Examined patterns, seasonality, and distributions of precipitation, minimum/maximum temperature, relative humidity, and sunshine hours. Identified high correlations among relative humidity, maximum temperature, and sunshine hours.
  - Data Preprocessing: Ran white noise checks to confirm that the temperature data was predictable, used an SVM to detect and clean outliers, and smoothed the data with moving averages.
  - Model Training and Evaluation: Developed and trained deep learning models, including Stacked LSTM, BiLSTM, and ConvLSTM, for both univariate and multivariate time series prediction. The models were evaluated with RMSE, MAE, R2 score, and Bias.
- Impactful Results: Achieved high R2 scores for temperature predictions across multiple stations (e.g., Jimma: 0.96 for min/max temperature; Robe: 0.97 for min temperature, 0.96 for max temperature). Precipitation, by contrast, behaved like a random walk and could not be predicted reliably. Forecasts were visualized geographically to provide actionable insights for specific dates.
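
The data preparation steps in the pipeline above (daily averaging, NaN rows for missing dates, dropping sparse features, backward/forward fill, moving-average smoothing) could look roughly like the following. This is a minimal sketch assuming pandas, not the project's actual code: the function name `prepare_daily_series`, the column names, the 20% missing-data threshold, and the smoothing window are all illustrative assumptions.

```python
import numpy as np
import pandas as pd


def prepare_daily_series(raw: pd.DataFrame,
                         max_missing: float = 0.2,
                         window: int = 7) -> pd.DataFrame:
    """Hypothetical helper: raw station readings -> clean daily series."""
    # Average all readings that fall on the same day. Resampling also
    # creates a row of NaN for every calendar date with no readings.
    daily = raw.resample("D").mean()

    # Drop features whose share of missing days exceeds the threshold
    # (the 0.2 default is an assumed cut-off, not a documented one).
    missing_share = daily.isna().mean()
    daily = daily.loc[:, missing_share <= max_missing]

    # Fill remaining gaps: forward fill first, then backward fill so
    # that gaps at the very start of the series are covered too.
    daily = daily.ffill().bfill()

    # Smooth with a trailing moving average (window size is assumed).
    return daily.rolling(window, min_periods=1).mean()


# Toy data: two readings on 1 Jan, no reading on 3 Jan, sparse wind speed.
index = pd.to_datetime([
    "2020-01-01 06:00", "2020-01-01 18:00", "2020-01-02 12:00",
    "2020-01-04 12:00", "2020-01-05 12:00", "2020-01-06 12:00",
])
raw = pd.DataFrame(
    {
        "tmax": [24.0, 26.0, 27.0, 23.0, 25.0, 26.0],
        "wind_speed": [3.0, np.nan, np.nan, np.nan, 2.0, np.nan],
    },
    index=index,
)

clean = prepare_daily_series(raw, window=2)
print(clean)
```

In this toy run the sparse `wind_speed` column is dropped and the missing 3 January row is filled before smoothing. In a real pipeline, the fill and smoothing choices would need checking against each station's data.
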
This project highlights my expertise in applied machine learning, time series analysis, data preprocessing, and delivering impactful AI solutions in a real-world governmental context.
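
For reference, the evaluation metrics named in the pipeline (RMSE, MAE, R2 score, and Bias) can be written out in a few lines. The sketch below uses only the Python standard library. The function name `evaluate_forecast` and the sample values are hypothetical, and Bias is assumed here to mean the mean error (predicted minus observed), which may differ from the definition used in the project.

```python
import math


def evaluate_forecast(observed, predicted):
    """Hypothetical helper returning RMSE, MAE, R2 and Bias as a dict."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Sum of squared errors, reused for RMSE and R2.
    ss_res = sum(e * e for e in errors)

    # Total variance of the observations around their mean.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
        # Assumed definition: mean error, positive when over-forecasting.
        "bias": sum(errors) / n,
    }


# Made-up temperatures in degrees Celsius, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]

scores = evaluate_forecast(observed, predicted)
print(scores)
```

Reporting Bias alongside RMSE and MAE helps separate systematic over- or under-forecasting from random error, which the squared and absolute metrics alone do not show.
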


