Data Cleaning and Exploration: Ames Housing Dataset

Analysis and Prediction of Home Prices

The Ames, Iowa housing dataset is a comprehensive listing of individual residential properties sold in the city from 2006 to 2010. With over 80 columns of raw, unformatted data, it demands a substantial amount of cleaning before it can be used for modeling.

Although this project is an end-to-end regression analysis of home prices, I think the most interesting insights emerge during the cleaning and exploratory analysis phases. I am hardly the first to experiment with this dataset, but preparing it for modeling involves so many decisions that two analyses are unlikely to arrive at the same result.