
Model evaluation

General description#

The following evaluation metrics are options that we considered. However, to judge the efficacy of our models, we settled primarily on the mean absolute percentage error (MAPE) and the median absolute percentage error (MdAPE). These metrics suit our dataset best because property prices span a wide range of values: a given percentage error represents the same proportional impact whether a property sold for $100,000 or $10,000,000. This is crucial because the resale price has a heavily right-tailed distribution (see the figure below; also available in the EDA report), with some properties transacting at very high prices. Furthermore, both the MAPE and the MdAPE are easy to interpret, as they are expressed as percentages.
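To make the two metrics concrete, here is a minimal sketch of how they can be computed with NumPy (the function names and toy prices are our own, for illustration only):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

def mdape(y_true, y_pred):
    """Median absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.median(np.abs((y_true - y_pred) / y_true)) * 100

# A $10,000 error on a $100,000 flat and a $1,000,000 error on a
# $10,000,000 one both contribute the same 10% to either metric.
prices = [100_000, 10_000_000]
preds  = [110_000, 9_000_000]
print(mape(prices, preds))   # 10.0
print(mdape(prices, preds))  # 10.0
```

The MdAPE is simply the median rather than the mean of the same per-row percentage errors, which makes it robust to a handful of badly mispredicted transactions.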

Price histogram#

Below is a histogram of resale property prices, which indicates a heavy right-tailed distribution.

price-freq-hist.jpg

Primary model evaluations#

In this section, we shall detail the performance metrics of the models that our team has created. Note that since this is only a primary stage of evaluation, we only compare the different models based on the five metrics that we have chosen, without any accompanying data visualisations.

Biased model results#

Please visit this page to view the performance metrics of our models that were trained using an arbitrary train-test split, i.e. the results of models tainted with lookahead bias. For more details on this issue and our solution, please visit this page.
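The core of the fix is to split chronologically rather than arbitrarily, so that no training row postdates any test row. A minimal sketch with hypothetical transaction years (the variable names and the 2015 cutoff are illustrative, not the project's actual configuration):

```python
import numpy as np

# Hypothetical transaction years; the real dataset has one dated row per resale.
rng = np.random.default_rng(0)
dates = np.sort(rng.integers(1990, 2021, size=1000))

def time_ordered_split(dates, cutoff_year):
    """Train on all rows strictly before cutoff_year, test on the rest,
    so the model never sees the future at training time."""
    train_idx = np.where(dates < cutoff_year)[0]
    test_idx = np.where(dates >= cutoff_year)[0]
    return train_idx, test_idx

train_idx, test_idx = time_ordered_split(dates, 2015)
assert dates[train_idx].max() < dates[test_idx].min()
```

A random split would instead scatter 2020 transactions into the training set while asking the model to "predict" 1995 prices, which is the lookahead bias described above.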

A notable result is that of the kNN regressor, where the MAPE was close to 0%, indicating a serious issue with either overfitting or the training data.

Unbiased model results#

Please visit this page to view the performance metrics of our models trained after fixing the lookahead bias.

As we can see, the best models are (in order of increasing MdAPE) rolling bagging regressor, rolling kNN regressor, and rolling XGBoost regressor. Hence we shall choose these three models for further analysis below.

Description of selected models#

Rolling bagging regressor#

The first of our models is the rolling bagging regressor. This model is an ensemble that fits a base regressor on random subsets of the original dataset before aggregating the individual predictions (either by voting or by averaging) to form a final prediction. The base regressor used was the default decision tree regressor. When used alone, however, a decision tree regressor exhibits high variance. Hence, we chose an ensemble of decision trees to reduce that variance through randomisation and aggregation of results.
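The non-rolling core of this model can be sketched with scikit-learn, whose `BaggingRegressor` uses a decision tree as its default base regressor (the synthetic data and hyperparameters below are illustrative, not the project's actual settings):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor

# Illustrative stand-in data; the actual model was fit on the resale dataset.
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)

# Each of the 50 trees sees a bootstrap sample of the rows (the default base
# regressor is a decision tree); predictions are averaged across trees,
# which reduces the variance of any single tree.
model = BaggingRegressor(n_estimators=50, random_state=0)
model.fit(X, y)
print(model.predict(X[:3]).shape)  # (3,)
```

The "rolling" variant refits this estimator on an expanding window of past transactions before predicting each period.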

Overall, the rolling bagging regressor performed the best out of the three models we chose, with an MAPE of 6.30% and a MdAPE of 4.71%.

Rolling kNN regressor#

The second selected model is the rolling kNN Regressor. This model averages the prices of the k nearest neighbours of every target to provide a prediction for the target’s price.
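The neighbour-averaging mechanism can be sketched in a few lines with scikit-learn (the single floor-area feature and the toy prices are our own illustration):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Toy feature (e.g. floor area in m²) and resale prices; illustrative only.
X = np.array([[60.0], [65.0], [70.0], [120.0], [125.0]])
y = np.array([300_000, 320_000, 340_000, 700_000, 720_000])

# The prediction is the plain average of the k=3 nearest neighbours' prices.
knn = KNeighborsRegressor(n_neighbors=3).fit(X, y)
print(knn.predict([[67.0]]))  # averages the 60, 65 and 70 m² flats: 320000.0
```

In the rolling variant, only transactions from earlier time windows are eligible as neighbours, preserving the chronological split described above.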

The rolling kNN regressor model was chosen because of its simplicity as well as its stellar results. It performed second best out of the three models chosen, with an MAPE of 6.41% and an MdAPE of 4.73%.

Rolling XGBoost regressor#

Our final model is the XGBoost regressor. This model is also an ensemble which uses decision trees as the base regressor on random subsets before aggregating individual predictions to form a final prediction. However, the key differentiator of the XGBoost regressor is that the XGBoost algorithm utilises an optimised version of gradient boosting. Boosting, in this sense, involves building base estimators sequentially with the aim of reducing the bias of the combined estimator. Gradient boosting expands on this by fitting each new learner to the gradient of the combined model's loss. XGBoost further builds on gradient boosting with L1 and L2 regularisation, which prevents overfitting and improves generalisation.

As expected, the rolling XGBoost regressor also performed superbly, with an MAPE of 6.50% and a MdAPE of 4.83%, which is slightly worse than the other two models we chose.

Extended model evaluations#

In this section, we shall examine the three models that we have selected, namely the rolling bagging regressor, the rolling kNN regressor, and the rolling XGBoost regressor, in more depth through data visualisations of the models' predictions.

All models perform poorly at the start of the period (which is in 1991) according to the MAPE and MdAPE metrics due to a lack of data from previous time intervals. In several towns, most notably Sembawang, the models perform exceptionally poorly across certain time periods due to sparsity of data in those time periods. For example, the number of data points each year for Sembawang is less than 10 from 1990 to 2001, with many years having only 2 data points. This lack of transaction data can be seen with this visualisation.
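Sparsity of this kind is straightforward to surface with a per-town yearly count. A minimal pandas sketch (the column names and toy rows are hypothetical stand-ins for the real transaction table):

```python
import pandas as pd

# Hypothetical transactions table; the real dataset has one row per resale.
df = pd.DataFrame({
    "town": ["SEMBAWANG", "SEMBAWANG", "BEDOK", "BEDOK", "BEDOK"],
    "year": [1995, 1995, 1995, 1996, 1996],
})

# Per-town yearly transaction counts; sparse towns such as Sembawang show
# up as rows dominated by small numbers and zeros.
counts = df.groupby(["town", "year"]).size().unstack(fill_value=0)
print(counts)
```

Towns whose rows contain single-digit counts for a given year are exactly those where the rolling models have too little recent history to learn from.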

It can also be noted that across all chosen models, the MAPE is highest for towns near the southern region of central Singapore, such as Bukit Merah (7.19/7.55/7.51), Queenstown (7.49/7.90/7.70) and Kallang (7.38/7.56/7.68) (MAPE in %, from left to right: rolling bagging regressor, rolling XGBoost regressor, rolling kNN regressor). This is most likely due to the smaller number of data points for towns in that region across the entire timeframe, as indicated by the size of the bubbles in this figure.

We list out the visualisations that support the above points below:

| Visualisation | Rolling bagging regressor | Rolling kNN regressor | Rolling XGBoost regressor |
| --- | --- | --- | --- |
| Per-HDB scatter mapbox of MAPEs | baggingregressor-lat-lon-fig | knnregressor-lat-lon-fig | xgboost-lat-lon-fig |
| Per-town scatter mapbox of MAPEs | baggingregressor-town-fig | knnregressor-town-fig | xgboost-town-fig |
| Per-town scatter mapbox of MdAPEs | baggingregressor-town-fig-med | knnregressor-town-fig-med | xgboost-town-fig-med |
| Per-town yearly scatter mapbox of MAPEs | baggingregressor-time-town-fig | knnregressor-time-town-fig | xgboost-time-town-fig |
| Per-town yearly line plot of MAPEs | baggingregressor-mape-time-town | knnregressor-mape-time-town | xgboost-mape-time-town |
| Per-town yearly line plot of MdAPEs | baggingregressor-mdape-time-town | knnregressor-mdape-time-town | xgboost-mdape-time-town |
| Per-town yearly boxplot of MAPEs | baggingregressor-mape-boxplot-town | knnregressor-mape-boxplot-town | xgboost-mape-boxplot-town |
| Heatmap of features (actual = target variable, pred = prediction) | baggingregressor-heatmap | knnregressor-heatmap | xgboost-heatmap |
| Per-town scatterplot of MAPEs (red line is overall model MAPE) | baggingregressor-town-mape | knnregressor-town-mape | xgboost-town-mape |