This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The service was updated to include apartment price data and predictions up to 2020. The data-fetching queries were fixed to match pxweb's new API, and minor refactoring was done. Data and sources are now in the update_2021 folder.
The model was revised thoroughly to include yearly varying coefficients for the demographic covariates from StatFi. The temporal part of the model is no longer fixed to be quadratic but is essentially nonparametric thanks to the yearly coefficients. Instead of forecasting future prices, which is difficult due to unpredictable real-world events, the focus is now on nowcasting present and past prices. The blog post on Reaktor's blog and the source code describe the updated model in more detail. Finally, the interactive visualisation was given a facelift.
Apartment price trends across Finland at the zip code level, based on open data from Tilastokeskus (Statistics Finland). See the interactive visualisation Kannattaakokauppa and the related blog post at Reaktor, as well as older posts at rOpenGov and Louhos.
Discussion on Hacker News.
Also news coverage at Talouselämä, Kauppalehti and Kaleva.
Apartment price data for the postal codes is from the Statistics Finland open data API (see Terms of Use). Postal code region names, municipalities, population data, and the map are from Statistics Finland Paavo - Open data by postal code area. The map has been simplified by removing small islands and by reducing the number of vertices in the polygons.
The data sets are accessed with the pxweb and geofi packages from rOpenGov. See the script source/get_data.R for details.
See the update_2021/source folder for the latest source code.
See the description in English on the Reaktor blog. There are three models in the source folder: base, factorial, and nominal. The first is the original model from 2015, and the second is the original with a factorial covariate instead of population density. The third, nominal, is the one described in the blog and used to obtain the latest results.
Whichever model you want to use, run the scripts in the following order, where XX (or xx) stands for the name of the model:
get_data.R
run_XX_model.R
postprocess_xx_model.R
result_analysis.R
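
The run order above can be sketched as a small shell helper. This is only an illustration under several assumptions that are not confirmed by the repository: that the placeholder in the script names is replaced by the lowercase model name (base, factorial, or nominal), that Rscript is on the PATH, and that the commands are run from the folder containing the scripts. The function names pipeline_scripts and run_pipeline are hypothetical.

```shell
#!/bin/sh
# Print the script names for one model, in the order listed above.
# "$1" is the model name (assumed to be base, factorial, or nominal).
# The exact file names are an assumption; check the source folder.
pipeline_scripts() {
  model="$1"
  printf '%s\n' \
    "get_data.R" \
    "run_${model}_model.R" \
    "postprocess_${model}_model.R" \
    "result_analysis.R"
}

# Run each script with Rscript, stopping at the first failure.
run_pipeline() {
  for script in $(pipeline_scripts "$1"); do
    echo "Running $script"
    Rscript "$script" || return 1
  done
}

# Usage (assumes the working directory is the source folder):
#   run_pipeline nominal
```

Stopping at the first failure matters here because each step presumably depends on the output of the previous one; if the scripts are normally run interactively in R, sourcing them in the same order should be equivalent.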
The latest update was quite big. One possible next step is to model the prices of apartments with different numbers of rooms separately, for example with a hierarchical model, since the data is already available from StatFi. Another obvious possibility is to study the effects of the covariates further and perhaps introduce new ones from Paavo or from other open data sources.
The codebase should also be refactored properly.