NASA satellite datasets are applied to a current scientific problem: long-range weather forecasting. The approach uses datasets from different places as arguments of a linear forecasting model, with average daily air temperature as the example. It has three stages: a preliminary correlation analysis to reduce computational complexity and identify dependencies, inductive modeling, and web software development (sofa.azurewebsites.net). Forecast accuracy is greater than 90 % at a half-year lead time.

This project is solving the We Love Data challenge.


Description

Since meteorologists began to study the El Niño-Southern Oscillation closely in the 1960s, very little scientific work has been devoted to designing climate models that use global datasets and their cross-impact. There are two reasons: a severe lack of global datasets, and the significant computational and structural complexity of building such models. Today, NASA satellite datasets address the first problem (this project uses average daily air temperature as an example). The second problem has no single solution; inductive modeling is proposed here.
A preliminary correlation analysis based on the Pearson product-moment correlation coefficient (PPMCC) reduces the computational complexity. Its outcome is a set of dependencies among places around the world (66 places, selected subjectively). The three top-ranked places are used as arguments of the forecasting model. The analysis showed that some places have a dominant impact on others: 50 of the 66 correlation results include the two datasets from Beijing (China) and Ulaanbaatar (Mongolia), and 40 of those have a PPMCC greater than 0.8 in absolute value. Inductive modeling produces linear models whose arguments are the datasets of the selected places, taken with different delays.
Finally, web-based software (http://sofa.azurewebsites.net/) was developed using ASP.NET, with Adobe Photoshop used to create the 2D map. It lets users drill down to a region of interest and see the dependencies described above and the long-range forecast. The long-range forecast of average daily air temperature is illustrated for Washington National Airport, USA (average error up to 6.7 %; 173-day lead time), Skopje Airport (average error up to 6.2 %; 166-day lead time), and Kiev, Ukraine (average error up to 8 %; 166-day lead time).
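
The correlation screening and the delayed linear model described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: the function names (pearson, lagged_pearson, rank_predictors, fit_linear, solve) are hypothetical, the set of delays to try is left to the caller, and ordinary least squares stands in for the inductive modeling procedure, whose details are not given here.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant series: treated as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)

def lagged_pearson(target, source, delay):
    """Correlate target[t] with source[t - delay], delay >= 0 in days."""
    if delay == 0:
        return pearson(target, source)
    return pearson(target[delay:], source[:-delay])

def rank_predictors(target, candidates, delays, top=3):
    """Keep each place's best delay, then return the `top` places by |r|.

    candidates maps a place name to its series; the result is a list of
    (place, delay, r) tuples.
    """
    best = []
    for place, series in candidates.items():
        d, r = max(((d, lagged_pearson(target, series, d)) for d in delays),
                   key=lambda pair: abs(pair[1]))
        best.append((place, d, r))
    best.sort(key=lambda item: abs(item[2]), reverse=True)
    return best[:top]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular.
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_linear(target, predictors):
    """Least-squares fit of target[t] = c0 + sum_i c_i * series_i[t - delay_i].

    predictors is a list of (series, delay) pairs; returns [c0, c1, ...].
    """
    start = max(d for _, d in predictors)
    rows = [[1.0] + [s[t - d] for s, d in predictors]
            for t in range(start, len(target))]
    ys = target[start:]
    k = len(rows[0])
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(ata, aty)
```

In this sketch, rank_predictors would be called once per target place with the other places' temperature series as candidates, and the (series, delay) pairs it returns would be passed to fit_linear. A forecast at lead time L is only possible if every chosen delay is at least L days.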
The main directions for future research are a further reduction of computational complexity and an extension of the datasets used, for instance other places as well as sunspots, the polar cap magnetic activity index, sea level, etc. A more detailed description is available at https://www.dropbox.com/s/o425552zkarydjp/Presentation.zip



Project Information

License: LGPL
Source Code/Project URL: https://github.com/Zubov/NASA.git

Resources

Presentation of Main Ideas and Results - https://www.dropbox.com/s/o425552zkarydjp/Presentation.zip
ASP.NET Website Source Code - https://www.dropbox.com/s/cqolfqhnbgn4659/Final_SOFA.zip
Correlation Analysis of Time Series Using Pearson Product-Moment Correlation Coefficient - https://www.dropbox.com/s/jjk0nn75kqsr9zn/Correlation%20Analysis.zip
Washington Inductive Modelling Example - https://www.dropbox.com/s/6ojhc6t1fbw4ddy/Washington%20Inductive%20Modelling%20Example.zip
Datasets - https://www.dropbox.com/s/wctipzg6guwhbmq/Datasets.xlsx
Correlation Analysis's Summary Table - https://www.dropbox.com/s/j6e5axwdgvc6x5j/Correlation%20Analysis%27s%20Summary%20Table.doc
App Website - http://sofa.azurewebsites.net/