A Machine Learning Approach to House Price Indexes
The existing approaches for generating house price indexes (HPIs) are almost exclusively found in the realm of traditional statistical modeling. This paper offers a new approach using a machine learning model class – random forests – combined with a model-agnostic interpretability method – partial dependency – to derive an HPI. After providing an example of this approach, I then test the Interpretable Random Forest (IRF) against indexes derived from repeat sales and hedonic price models. Using data from the City of Seattle, this comparison suggests that the IRF is competitive with (and occasionally superior to) popular existing methods across measures of accuracy, volatility and revision at both a city-wide and a neighborhood scale.
The working paper is here
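The idea of reading a price index off a fitted model via partial dependence can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `toy_model` is a hypothetical stand-in for a fitted random forest, and the field names `sqft` and `period` are assumptions made for the example. For each time period, every home is scored as if it had sold in that period, the predictions are averaged, and the averages are rescaled so the first period equals 100.

```python
def partial_dependence_over_time(predict, homes, periods):
    """Average prediction per period, holding all other home attributes fixed."""
    pd_values = []
    for t in periods:
        # Score every home as if it had sold in period t.
        preds = [predict({**home, "period": t}) for home in homes]
        pd_values.append(sum(preds) / len(preds))
    return pd_values


def to_index(pd_values, base=100.0):
    """Rescale partial dependence values so the first period equals `base`."""
    return [base * v / pd_values[0] for v in pd_values]


def toy_model(home):
    # Hypothetical stand-in for a fitted model: a price per square foot
    # that grows 2% per period. A real application would call the
    # trained random forest's predict method here instead.
    return 150.0 * home["sqft"] * (1.0 + 0.02 * home["period"])


# Illustrative (made-up) observations.
homes = [{"sqft": 1000, "period": 0}, {"sqft": 2000, "period": 3}]
periods = [0, 1, 2]

pd_values = partial_dependence_over_time(toy_model, homes, periods)
index = to_index(pd_values)
```

Because the method only needs a `predict` function, the same two helpers would apply to any model class, which is what makes the interpretability step model-agnostic.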
Hierarchical Blending: House Price Indexes for Submarkets
Traditionally, literature on house price indexes has focused on the choice of method (repeat sales vs. hedonic vs. hybrid, etc.) or correcting for biases resulting from issues such as sample selection, aggregation, or temporal heterogeneity. While more recent research has explored the production of indexes at the sub-city or submarket level, there remains a lack of comparative analytics to determine the limits to market disaggregation in terms of generating reliable price indexes. The goals of this study are two-fold: 1) to compare index performance at different submarket levels (county, ZIP, etc.); and 2) to test if blended indexes – between market levels and between methods – offer any improvement over more traditional non-blended models.
A draft copy of the paper can be found here
Real Estate Analysis in the Information Age with K. Winson-Geideman, C. Lipscomb & N. Evangelopoulos. Routledge, ISBN-13: 9781138232907. Publisher Page : Amazon Page
An Analysis of Household Location Choice in Major U.S. Metropolitan Areas using R, in The Practice of Reproducible Research, eds. Kitzes, J., Turek, D. & Deniz, F. University of California Press, Co-Authored with Estiri, H. ISBN: 9780520294752 Website : Publisher Page
Krause, A. and G. Aschwanden. To Airbnb? Factors Impacting Short-Term Leasing Preference. Journal of Real Estate Research. Fall 2020. Access via DOI.
The growth of Airbnb and other short-term rental platforms has presented absentee owners of urban residential properties with a choice of leasing strategy: traditional long-term rental or a short-term approach, known as “Airbnb-ing.” In this paper we identify those situations – location, structure type, and property characteristics – that lead to the highest likelihood of favoring a short-term strategy over a long-term one. Additionally, we test the impacts of hosting policies, the results of which suggest that even the right property may need the right owner(s) or strategy to make short-term rental the more profitable approach.
The pre-print working paper is here
Krause, A., A. Martin, M. Fix. Uncertainty in Automated Valuation Models: Error- vs Model-Based Approaches. Journal of Property Research. Forthcoming. Access via DOI.
Point estimates from Automated Valuation Models (AVMs) represent the most likely value from a distribution of possible values. The uncertainty in the point estimate – the width of the range of possible values at a given level of confidence – is a critical piece of the AVM output, especially in collateral and transactional situations. Estimating AVM uncertainty, however, remains highly unstandardised in both terminology and methods. In this paper we present and compare two of the most common approaches to estimating AVM uncertainty – model-based and error-based prediction intervals. We also present a uniform language and framework for evaluating the calibration and efficiency of uncertainty estimates. Based on empirical tests on a large, longitudinal dataset of home sales, we show that model-based approaches outperform error-based ones in all but cases with the very highest confidence level requirements. The differences between the two methods are conditioned on model class, geographic data partitions and data filtering conditions.
The pre-print working paper is here
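To make the error-based idea concrete, the sketch below builds prediction intervals from the empirical distribution of past percentage errors and then scores them. It is an illustration under stated assumptions rather than the paper's procedure: the function names are hypothetical, "calibration" is taken here to mean the share of actual prices that fall inside their intervals, and "efficiency" is taken to mean the average interval width relative to the point estimate.

```python
import math


def error_based_halfwidth(holdout_pct_errors, confidence):
    """Empirical quantile of absolute percentage errors at `confidence`."""
    abs_errs = sorted(abs(e) for e in holdout_pct_errors)
    # Smallest rank k such that at least `confidence` of errors are <= abs_errs[k].
    # round() guards against floating-point noise before taking the ceiling.
    k = max(0, math.ceil(round(confidence * len(abs_errs), 9)) - 1)
    return abs_errs[k]


def build_intervals(point_estimates, halfwidth):
    """Symmetric percentage interval around each point estimate."""
    return [(p * (1 - halfwidth), p * (1 + halfwidth)) for p in point_estimates]


def calibration(intervals, actuals):
    """Share of actual prices that land inside their interval (assumed definition)."""
    hits = [lo <= a <= hi for (lo, hi), a in zip(intervals, actuals)]
    return sum(hits) / len(hits)


def efficiency(intervals, point_estimates):
    """Mean interval width as a fraction of the point estimate (assumed definition)."""
    widths = [(hi - lo) / p for (lo, hi), p in zip(intervals, point_estimates)]
    return sum(widths) / len(widths)


# Illustrative (made-up) holdout errors, expressed as fractions of price.
holdout_errors = [-0.10, 0.05, 0.02, -0.20, 0.08, 0.01, -0.03, 0.12, 0.04, -0.06]
halfwidth = error_based_halfwidth(holdout_errors, confidence=0.8)

estimates = [100.0, 200.0]
actuals = [95.0, 230.0]
intervals = build_intervals(estimates, halfwidth)
cal = calibration(intervals, actuals)
eff = efficiency(intervals, estimates)
```

A well-calibrated 80% interval should capture roughly 80% of actual prices on a large sample; among calibrated methods, the one with the narrower intervals is the more efficient.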
Warren-Myers, G., G. Aschwanden, F. Fuerst and A. Krause. Estimating the Potential Risks of Sea Level Rise for Public and Private Property Ownership, Occupation and Management. Risks, 6(2), 37. Access via DOI.
The estimation of future sea level rise (SLR) is a major concern for cities near coastlines and river systems. Despite this, current modelling underestimates the future risks of SLR to property. Direct risks posed to property include inundation, loss of physical property and associated economic and social costs. It is also crucial to consider the risks that emerge from scenarios after SLR. These may produce one-off or periodic events that will inflict physical, economic and social implications, and direct, indirect and consequential losses. Using a case study approach, this paper combines various forms of data to examine the implications of future SLR to further understand the potential risks. The research indicates that the financial implications for local government will be loss of rates associated with total property loss and declines in value. The challenges identified are not specific to this research. Other municipalities worldwide experience similar barriers (i.e., financial implications, coastal planning predicaments, data paucity, knowledge and capacity, and legal and political challenges). This research highlights the need for private and public stakeholders to co-develop and implement strategies to mitigate and adapt property to withstand the future challenges of climate change and SLR.
Estiri, H. and A. Krause. A Cohort Location Model of Household Sorting in US Metropolitan Regions. Urban Studies, 55(1), 71-90. Access via DOI.
In this paper we propose a household sorting model for the 50 largest US metropolitan regions and evaluate the model using 2010 Census data. To approximate residential locations for household cohorts, we specify a Cohort Location Model (CLM) built upon two principal assumptions about housing consumption and metropolitan development/land use patterns. According to our model, the expected distance from the household’s residential location to the city centre(s) increases with the age of the householder (as a proxy for changes in housing career over life span). The CLM provides a flexible housing-based explanation for household sorting patterns in US metropolitan regions. Results from our analysis on US metropolitan regions show that households headed by individuals under the age of 35 are the most common cohort in centrally located areas. We also found that households over 35 are most prevalent in peripheral locations, but their sorting was not statistically different across space.
Lipscomb, C., Youtie, J., Shapiro, P., Arora, S. & Krause, A. Evaluating the Impact of Manufacturing Extension Services on Establishment Performance. Economic Development Quarterly, 32(1), 29-43. Access via DOI
This study examines the effects of receipt of business assistance services from the Manufacturing Extension Partnership (MEP) on manufacturing establishment performance. The results generally indicate that MEP services have had positive and significant impacts on establishment productivity and sales per worker for the 2002 to 2007 period with some exceptions based on employment size, industry, and type of service provided. MEP services have also increased the probability of establishment survival for the 1997 to 2007 period. Regardless of econometric model specification, MEP clients with 1 to 19 employees have statistically significant and higher levels of labor productivity growth. The authors also observed significant productivity differences associated with MEP services by broad sector, with higher impacts over the 2002 to 2007 period in the durable goods manufacturing sector. The study further finds that establishments receiving MEP assistance are more likely to survive than those that do not receive MEP assistance.
Winson-Geideman, K., A. Krause, G. Warren-Myers, and H. Wu. Non-spatial Contagion in Real Estate Markets: The Case of Brookland Greens. Journal of Sustainable Real Estate, 9(1), 22-45. Access via DOI
We investigate contagion in real estate markets by evaluating the effects of a widely publicized landfill contamination event in one local market on the price of homes near landfills in non-impacted markets within the same metropolitan region. The impact of proximity to open, closed, and redeveloped landfills in the directly affected and contagion neighborhoods is tested at distances varying from <500 meters to 2,500 meters using the traditional hedonic pricing model. The results are mixed and relative to the current use of the landfill, as closed, capped, and redeveloped landfills show no impact. Sites that are capped yet undeveloped and sites with open fills appear to show some impact, although further research is needed to support any contagion effects.
Bitter, C. and A. Krause. The Influence of Urban Design Packages on Home Values. International Journal of Housing Market Analysis, 10(2), 184-203. Access via DOI
The purpose of this study is to examine the impact of neighborhood design templates on residential home values in King County, WA, USA. Previous research examines a number of individual design factors; this study combines these factors into typologies and tests for the impacts of the composite set of design features.
The study analyzes over 27,000 home sales with a hedonic price model to measure the impacts across three large, regional submarkets. Neighborhood design categories are developed using a cluster analysis on a set of individual neighborhood attributes. The key finding from this research is that the impact of more traditional (“urban”) design packages on home values is highly contextual. For the older and denser neighborhoods in the study area, a more traditional design results in a significantly positive impact on home values. In the new and more suburban regions of the study area, this effect is not found.
Prior work focused on valuing design attributes individually. The study argues that neighborhood design is better conceived of as a “package”, as the value of a given design element may depend on other co-located attributes. This is the first study, to the authors’ knowledge, to treat physical neighborhood design variables as a composite whole and to attempt to value their impact on home values as such.
Krause, A. and C. Lipscomb. The Data Preparation Process in Real Estate: Guidance and Review, Journal of Real Estate Practice and Education, 19(1), 15-42. Access via DOI
Very little discussion in the real estate literature or the classroom is given to acquiring, managing, cleaning, and preparing large datasets (collectively, the data preparation process). In this paper, we examine the general state of real estate data and the research on data preparation, and provide common examples of issues encountered while working with property-level data. We also examine the characteristics that make working with real estate data unusual and occasionally very difficult: it is largely un-standardized, often lacks sufficient labeling, and is spatial and temporal in nature. We conclude by examining a sample of published research from the Journal of Real Estate Research, Real Estate Economics, and the Journal of Real Estate Finance and Economics to gauge how documentation of the data preparation process in the peer-reviewed literature has changed over time.
Krause, A. Reproducible Research in Real Estate: A Review and an Example, Journal of Real Estate Practice and Education, 19(1), 69-85. Access via DOI.
The practice of reproducible research, a central component of the burgeoning “open science” movement, has been thrust into the public spotlight over the past few years. In this paper, I offer an overview of reproducibility in science, review specific concerns for the real estate field, and survey the current policy regarding reproducibility among top real estate journals. Performing research reproducibly requires a change from the status quo and represents an educational issue. Toward that end, I demonstrate reproducible research via a fully documented and freely-available example of a reproducible hedonic price analysis complete with all data, code, and results hosted online.
Sim, E., A. Krause, and K. Winson-Geideman. The Impact of Transit Oriented Development (TOD) on Residential Property Prices: The Case of Box Hill, Melbourne. Pacific Rim Property Research Journal, 21(3), 199-214. Access via DOI
Transit-oriented design (TOD) – an increase in density around transit stations – has arisen in many of Australia’s capital cities as a way to encourage mass transit ridership as well as to efficiently utilize the increase in foot and vehicle traffic that transit stations create. However, the implementation of TODs in Melbourne has faced strong opposition due to residents’ perception that the disamenities of a TOD will outweigh the benefits resulting in negative impacts on property prices. This research analyzes the relationship between proximity to a TOD and residential home prices. Results indicate that proximity to a TOD is positively related to property prices, even after controlling for neighborhood factors such as street connectivity and overall land use mix. By testing a variety of transformations of distance, we find that the benefits of TOD proximity extend approximately 1250 m from the Box Hill station. From a methodological standpoint, we find that more flexible treatments of distance variables in spatial autoregressive and spline models produce better model fit and lead to results more in line with urban economic theory.
Krause, A. Piece-by-Piece: Low-rise Redevelopment in Seattle, Journal of Property Research, 32(3), 258-278. Access via DOI
The redevelopment of land containing single-family detached dwellings into small attached or multiple-family structures is a common method of densification in existing urban areas. The potential for redevelopment of any existing home is an important consideration for housing market participants, real estate developers and public officials. Using a longitudinal data-set from the City of Seattle, this study quantifies the impact that a number of factors – policy, physical, neighbourhood and market – have on the likelihood of this form of land use conversion. Derived with a duration model, these findings suggest that the size of the existing home, the adjacent land uses and, most importantly, factors affecting the size of the potential redevelopment have the largest impact on the probability of redevelopment.
Estiri, H., A. Krause & M. Heris. “Phasic” metropolitan settlers: a phase-based model for the distribution of households in US metropolitan regions, Urban Geography, 36(5), 777-794. Access via DOI
In this article, we develop a model for explaining spatial patterns in the distribution of households across metropolitan regions in the United States. First, we use housing consumption and residential mobility theories to construct a hypothetical probability distribution function for the consumption of housing services across three phases of household life span. We then hypothesize a second probability distribution function for the offering of housing services based on the distance from city center(s) at the metropolitan scale. Intersecting the two hypothetical probability functions, we develop a phase-based model for the distribution of households in US metropolitan regions. We argue that phase one households (young adults) are more likely to reside in central city locations, whereas phase two and three households are more likely to select suburban locations, due to their respective housing consumption behaviors. We provide empirical validation of our theoretical model with the data from the 2010 US Census for 35 large metropolitan regions.
Krause, A. & C. Bitter. Spatial Econometrics, Land Values and Sustainability: Trends in Real Estate Valuation, Cities, Special Issue: Current Research on Cities, 29(S1), S19-S25. Access via DOI
In the aftermath of the recent boom and bust of US real estate, both a refinement and a deeper understanding of real estate valuation methods have become critical concerns across a number of broad urban-related academic fields. Out of this we see three major trends in the field of real estate valuation research: (1) the expansion of spatial econometrics; (2) the recognition of the differences between land values and improvement values; and (3) acknowledgment of value premiums stemming from more sustainable forms of development. This paper offers a brief summary of the latest work in these emerging areas of academic valuation research.
Krause, A., R. Throupe, J. Kilpatrick, & W. Spiess. Contaminated Properties, Trespass, and Underground Rents, Journal of Property Investment & Finance 30(3), 304-320. Access via DOI
This paper seeks to extend the literature on property damage assessment by incorporating the right of exclusion as a compensable component to damages. The paper aims to go on to illustrate methodologies to estimate this damage component as a rent. The authors develop a conceptual framework from which to examine the value of underground storage space with special reference to situations in which migrating contamination from commercial operations has invaded private real property. Specifically they view this invasion as a compensable violation of the right of exclusion. This underground storage analysis uses the three approaches common to traditional appraisal (income, sales and cost) to estimate the value of underground storage caused by migrating contamination. Conceptually the paper finds that underground storage can be easily valued with existing appraisal methods. Using contamination scenarios paired with actual market data from the South‐Eastern USA, the paper shows an example of each of the three methods for valuation. It concludes by reconciling the estimated values and supplying additional issues to consider when valuing underground storage. Contaminated properties analysis and damages have focused on the right of transfer when estimating damages to real property. Other portions of the bundle of rights also require examination. This is the first discussion of underground trespass in relation to contaminated property coupled with an empirical example to address the right of exclusion and estimated rents due for use of adjacent properties as a storage facility.
Krause, A. & M. Kummerow. An Iterative Approach to Minimizing Valuation Errors using an Automated Comparable Sales Model, Journal of Property Tax Assessment & Administration 8(2), 39-52.
This paper describes a method for automating sales comparison valuations by choosing a small sample of comparable sales from a submarket of similar properties and adjusting their prices based on differences between sale and subject property characteristics. This logic is similar to that used in a traditional sales comparison adjustment grid approach using, for example, FNMA Form 1004. Traditional appraisal methods select, adjust, and reconcile a few comparable sales. This valuation algorithm follows the same steps and, in addition, computes summary price prediction error statistics useful for evaluating and improving the valuation protocol.
Kilpatrick, J., J. Carruthers, R. Throupe & A. Krause. The Impact of Transit Corridors on Residential Property Values, Journal of Real Estate Literature 29(3), 303-320. Access via DOI
Most of the literature on transit corridors, such as superhighways and tunnels, focuses on the positive externality of transit access (e.g., interstate access, transit station) and fails to isolate the negative externality of the corridor itself. This empirical study examines two situations: one with both access benefits and negatives, and another without the access benefit. The findings reveal that proximity to the transit corridor alone without direct access conveys a negative impact on nearby housing values.
The AirBNB market in Melbourne
This research presents a ‘census’ of the current state of Airbnb in the Melbourne metropolitan region. We document and analyse trends for the 22-month time period from October 2014 through August 2016. The data that we analyse has been obtained from www.airdna.co, a data provider specializing in Airbnb data collection and analysis. The data includes information on each property – such as location, bedrooms, bathrooms, first date on the website, etc. – as well as information on the daily booking status and price of each property for each day since October 2014. A draft copy of the paper can be found here
Rental Yields in Melbourne
Rental yields (rent-to-price ratios) are an important metric in judging both the viability of a housing investment as well as the overall health of the market. In this study, we use the property level rental yield calculations made in an earlier work to analyse the variation in the Melbourne market. This study is focused on market transactions that occurred in the year 2015. The paper can be found here
Rental Yield Calculation Methods
In studies of residential real estate markets the relationship between rents and prices is an oft-used indicator of both investment potential as well as bubble identification. Most often, this relationship (referred to as rent-price ratio, price-rent ratio or rental yield) is constructed via simple metrics such as medians or means using spatially aggregated data. In such cases, inconsistencies between the types of properties that rent and sell as well as the constant quality differences between similar sales and rentals of a given property type may lead to biased measures. In this study we use a unique set of transaction-level observations – residential sales and rentals – in the Melbourne Metropolitan Area (Australia) to test for differences in the measured relationship between rents and prices using four different estimation techniques. More specifically we compare relatively simple spatial aggregation and index methods to a hedonic regression imputation method and a direct matching method. Our results show that, in general, the rent-price ratios (rental yields) suggested by the median and index methods are biased on the low side (up to 20%) when compared with direct matches of properties that have sold and rented (or vice versa) within a short time period. These biases have increased over the past five years, suggesting that the mix of rental and sold units has changed during this period. Additionally, in a test to determine which method offers the best predictive accuracy of future sales prices and rental rates, the direct match method fares the best, followed closely by hedonic imputation. The paper can be found here
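The contrast between the median method and the direct match method can be shown with a toy calculation. This is a rough sketch only: the property identifiers, prices and weekly rents below are made up for illustration, matching is done on property ID alone (the study additionally requires the sale and rental to fall within a short time period), and annualizing weekly rent with 52 weeks is an assumption.

```python
from statistics import median

WEEKS_PER_YEAR = 52  # assumption: rents are quoted weekly


def median_method_yield(sales, rentals):
    """Annualized median rent divided by median sale price (aggregated data)."""
    median_price = median(price for _, price in sales)
    median_rent = median(rent for _, rent in rentals)
    return median_rent * WEEKS_PER_YEAR / median_price


def direct_match_yield(sales, rentals):
    """Median of per-property yields for properties that both sold and rented."""
    rent_by_id = dict(rentals)
    yields = [
        rent_by_id[pid] * WEEKS_PER_YEAR / price
        for pid, price in sales
        if pid in rent_by_id
    ]
    return median(yields)


# Hypothetical observations: (property_id, sale price) and (property_id, weekly rent).
sales = [("a", 500_000), ("b", 800_000), ("c", 1_200_000)]
rentals = [("a", 500), ("b", 700), ("d", 400), ("e", 450)]

aggregate_yield = median_method_yield(sales, rentals)
matched_yield = direct_match_yield(sales, rentals)
```

In this toy data the properties that rent are cheaper than the properties that sell, so the aggregate figure understates the yield relative to the matched figure. That composition difference is the kind of bias the abstract describes; the numbers themselves carry no empirical meaning.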
by Andy Krause
andy@andykrause.com