The housing market is booming, and so is the number of things we consider before buying a house. From the age of the house to the neighborhood and even the direction the house faces, we want to consider it all. The motivation behind this project is to visualize some of these factors and how they have changed over time, and then to train a model that predicts the price of a house from these features.
Real estate is hot right now: low interest rates over the past few years have many people looking to buy their dream house or invest for the future.
In this project, a supervised machine learning model is trained to predict house prices in Ames, Iowa (the city covered by the Kaggle dataset linked below), based on parameters such as the number of rooms, area, and year built.
I used Tableau for the exploratory analysis and ensembling techniques for the final price predictions.
The datasets used for this part of the project can be found on Kaggle:
https://www.kaggle.com/c/house-prices-advanced-regression-techniques
Tableau is a widely used tool for data analysis, visualization, and dashboard preparation. Data exploration for this project was conducted in Tableau.
It is typically perceived that houses built in earlier years had more rooms and were more spacious. The box plots below bust this myth.
Sale price and building foundation do not seem to be strongly correlated. A larger lot size, however, does come with a higher sale price, as depicted by the bigger circles.
Did a particular type of roof become old-fashioned? Or did a certain style of house come into fashion? As seen from the plot below, Split Foyer and Split Level houses became more popular around the 1960s. The Gable roof type is the most common across different years and house styles.
We are all aware that, historically, house prices have been going up. Let's instead compare the difference in prices between old and new houses and the condition they are sold in. It is interesting to note that there was a slight dip in prices for houses built and sold during the economic crisis in 2008.
The following regression models are used:
The results obtained are then plotted, and the best three models are used to create an ensemble, which is used to make the predictions.
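The selection-and-ensembling step can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes the models are ranked by RMSE on a held-out validation set and that the ensemble is a plain equal-weight average of the three best models' predictions. The model names and numbers are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def select_best(val_preds, y_val, k=3):
    """Return the names of the k models with the lowest validation RMSE."""
    ranked = sorted(val_preds, key=lambda name: rmse(y_val, val_preds[name]))
    return ranked[:k]

def average_predictions(preds, names):
    """Equal-weight average of the predictions of the chosen models (an assumption)."""
    n_rows = len(preds[names[0]])
    return [sum(preds[name][i] for name in names) / len(names) for i in range(n_rows)]

# Hypothetical validation targets and per-model predictions.
y_val = [12.0, 11.5, 12.3, 11.8]
val_preds = {
    "model_a": [12.1, 11.4, 12.2, 11.9],
    "model_b": [11.9, 11.6, 12.4, 11.7],
    "model_c": [12.3, 11.2, 12.0, 12.1],
    "model_d": [13.0, 10.5, 13.0, 11.0],
}

best = select_best(val_preds, y_val, k=3)
blended = average_predictions(val_preds, best)
print("Best models:", best)
print("Ensemble validation RMSE:", round(rmse(y_val, blended), 4))
```

In practice the same `best` list would be applied to each model's predictions on the test set. Weighted averaging or stacking are common alternatives to the equal-weight average assumed here.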
A prediction RMSE of 0.133 is obtained. This can be further improved by conducting hyperparameter tuning and using more models in the ensemble.
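For context on the scale of that number, the Kaggle competition scores submissions by RMSE between the logarithm of the predicted price and the logarithm of the actual price. Assuming the 0.133 above is measured the same way (the write-up does not say so explicitly), the metric can be sketched as below. The prices are made-up examples.

```python
import math

def rmse_log(actual, predicted):
    """RMSE between log(actual) and log(predicted) sale prices."""
    squared = [(math.log(a) - math.log(p)) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical sale prices in dollars, for illustration only.
actual = [200000.0, 150000.0, 320000.0]
predicted = [210000.0, 140000.0, 300000.0]

score = rmse_log(actual, predicted)
print("RMSE on log prices:", round(score, 4))
```

Because the error is taken on log prices, a 10% miss on a cheap house counts the same as a 10% miss on an expensive one.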