Pollution level forecasting with multivariate time series


Question: Beijing and its surrounding territories have experienced chronic air pollution, in which the main pollutants are fine particulate matter. This matter is known to affect visibility, human health, and especially climate. Is there a way to predict pollution levels in Beijing?

Proposition: I use a dataset of measured air pollution concentrations (a time series) together with other (dependable) air characteristics (e.g., temperature and pressure) to build a multivariate time series model, using a Long Short-Term Memory (LSTM) network, that forecasts future air pollution levels.
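As a rough illustration of this kind of setup (a minimal sketch, not the project's actual code), the example below frames a multivariate series as sliding windows and fits a small LSTM in PyTorch to predict the next value of one column. The synthetic data, the 24-step lookback, the three features, and the names `make_windows` and `PollutionLSTM` are all assumptions made for the example.

```python
import numpy as np
import torch
from torch import nn


def make_windows(series: np.ndarray, lookback: int, target_col: int = 0):
    """Turn a (time, features) array into (samples, lookback, features)
    inputs and next-step targets taken from target_col."""
    X, y = [], []
    for t in range(len(series) - lookback):
        X.append(series[t : t + lookback])          # past `lookback` steps, all features
        y.append(series[t + lookback, target_col])  # value to predict at the next step
    return np.stack(X), np.array(y)


class PollutionLSTM(nn.Module):
    """Hypothetical model: one LSTM layer followed by a linear read-out."""

    def __init__(self, n_features: int, hidden_size: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)         # out: (batch, lookback, hidden_size)
        return self.head(out[:, -1])  # use the last time step: (batch, 1)


# Synthetic stand-in for the real data: column 0 plays the role of the
# pollution concentration, columns 1-2 of temperature and pressure.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3)).astype(np.float32)

X, y = make_windows(data, lookback=24)
X_t = torch.from_numpy(X)
y_t = torch.from_numpy(y).unsqueeze(1)

model = PollutionLSTM(n_features=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# A few full-batch training steps, just to show the loop shape.
for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X_t), y_t)
    loss.backward()
    optimizer.step()

# One-step-ahead forecast from the most recent window.
forecast = model(X_t[-1:])
```

In a real run, the features would typically be scaled first (for example with scikit-learn's `MinMaxScaler`) and the data split chronologically into training and test sets, so that the model is evaluated only on time steps it has not seen.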

End-product: Although the dataset I used is a decade out of date, I was able to predict the peaks and features of the air pollution concentration over time sufficiently well. If air pollution concentration data were collected with the same methodology up to the present day, the tools I used could easily be applied to it.

[Skills]: Sequential Data, Deep Learning, Recurrent Neural Networks, Long Short Term Memory Model, Pandas, PyTorch, scikit-learn
github
transfer





 

Fantasy book rankings and trends based on Goodreads user data


Question:

Proposition:

End-product:

[Skills]: Tableau and Data Visualization
github




 

UFO abductions? Relations between UFO sightings and missing persons


[Coming soon]

[Skills]: Geographical and Datetime Data, Statistical Inference, Pandas, GeoPandas
github