Zillow's Forecasting Model: Zestimate

I’ll be discussing the question of whether the coronavirus will cause a crash in the housing market, specifically in the King County area, which is home to two major cities, Seattle and Bellevue. Judging from the news going around, what one would expect to happen is actually happening. Given the declining stock market and the concerns buyers and sellers have about how it will affect the housing market, there certainly appears to be a decline in buyer interest. Inventory also appears to be low, and with the governor’s stay-home order, most buying and selling activity has slowed drastically, if not shut down. There is still a little buying and selling, but on the flip side, we don’t know how bad the stock market is going to get, and we don’t know how it is going to affect the housing market. In some cases, when the stock market goes into a recession or has a rough spell, the housing market actually goes up; at other times it goes down. In the most recent downturn, the stock market and the housing market went through massive corrections at the same time, and that was mainly the real estate market’s fault, so we really don’t have any clue how the stock market and the housing market are going to turn out.

There are other unknowns. We don’t know how long the lockdown is going to last. We don’t know how long Seattle is going to be on lockdown, how long California, New York, and some of the other states are going to be on lockdown, or how many other states are going to lock down. We also don’t know how big this pandemic is going to get, so there’s a lot of uncertainty around the COVID-19 virus. There’s also a lot of uncertainty about how it’s going to affect people’s individual jobs. A lot of jobs, specifically in the Seattle area, can easily be done from home, but there are tons of jobs in other industries where one can’t work from home, and we already know people have lost jobs. Plus, a lot of people are living paycheck to paycheck, so for anyone to say that they know the future for sure is just not feasible. However, I do have a lot of hope based on the housing data I’ve analyzed from before and during the virus and shutdown.

Real estate is one of the key drivers of the economy, and the value of a home directly affects a person’s net worth, ability to be mobile, etc. So, everyone should be at least a little curious about the industry, unless they plan on moving to a trailer park community, and even there, I’m sure some kind of economy revolves around that community. Seattle was the first major housing market in the US to become a hotspot, even before New York. Yes, New York clearly overtook it, and I’m not downplaying that, but Seattle was first. Because it was first, we can start to look at its real estate data and start asking questions: if we can understand what’s happening there first, we may be able to interpret more and estimate what might happen in other hotspots.

In Seattle in the month of April, house prices dropped significantly, as did pending sales and the number of active listings on the market. The average sale price in Seattle for the month of April was $815,000, and the share of pending sales that did not go through dropped from 14% to 0%. This shows that people are either not letting the pandemic affect their home-buying plans, or that the pandemic is actually making people who planned on buying more anxious, so that they feel the need to buy a home quickly for fear of unforeseen circumstances. Either way, people who planned on buying are now committed to the idea regardless of the current pandemic. 50% of sellers were selling above asking price, which is why I concluded that there’s been a decline in buyers’ interest even though all pending sales went through. That only suggests that people who are committed to getting a house are fully committed, so for the time being there is not much that can be done to improve house prices.

Back in 2006, Zillow created an algorithm to predict the sale prices of homes. They called this estimate the Zestimate. The Zestimate was originally a valuation placed on every single home in Zillow’s database, which held about 43 million homes. To maintain this valuation, the model had to run about once a month, pushing a couple of terabytes of data through about 34,000 statistical models. The median absolute percent error back then was 14%, which is acceptable, but the new algorithm cuts the error to almost a third of that. The new algorithm covers 110 million homes, a little over double what the earlier algorithm covered, and it includes smaller cities that are harder to estimate, like Katy, Springfield, etc. Despite that, they were still able to bring the median absolute percent error down to 5%. Dr. Stan Humphries credits this improvement to collecting an enormous amount of data and getting far more sophisticated with the algorithms. They went from running 34,000 statistical models about once a month to running roughly 11 million statistical models every night.
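To make the error metric above concrete, here is a minimal sketch of how a median absolute percent error could be computed. The function name and the sample prices are made up for illustration; this is not Zillow’s code or data, just the standard definition of the metric applied to a handful of hypothetical sales.

```python
from statistics import median


def median_absolute_percent_error(estimates, sale_prices):
    """Median of |estimate - sale price| / sale price, expressed as a percentage."""
    if len(estimates) != len(sale_prices):
        raise ValueError("estimates and sale_prices must have the same length")
    errors = [abs(e - s) / s * 100 for e, s in zip(estimates, sale_prices)]
    return median(errors)


# Hypothetical estimates and actual sale prices, invented for this example.
estimates = [800_000, 525_000, 1_100_000, 450_000, 700_000]
sale_prices = [815_000, 500_000, 1_000_000, 450_000, 760_000]

mdape = median_absolute_percent_error(estimates, sale_prices)
print(f"Median absolute percent error: {mdape:.1f}%")

# The improvement quoted in the text: 14% before, 5% now.
improvement_factor = 14 / 5
print(f"Error reduced by a factor of {improvement_factor:.1f}")
```

Because the metric is a median, half of the estimates land closer to the sale price than the reported figure and half land farther away, so a 5% figure does not mean every home is valued within 5%.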

The Zestimate’s algorithm has been overhauled a few times: once in 2008, again in 2011, and most recently in 2018. In between these overhauls, they are still constantly changing bits and pieces of the framework. The most recent major change was the granularity at which the models are run. The higher accuracy was achieved mainly through finer geographic granularities, along with the vast number of models being generated. Initially, the Zestimate model only used data at the county level, except on the few occasions when the data turned out to be very sparse, in which case they would look at the state as a whole. Today, they make sure not to go above the county level when building the model. Larger counties with a lot of transactions are broken down into smaller regions, and the algorithm tries to find homogeneous sets of homes at the subcounty level on which to train the modeling framework. That modeling framework in turn contains a large number of models.
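The granularity rule described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the field names, the `MIN_SALES` threshold, and the idea of a named subcounty region are hypothetical, and the real framework finds homogeneous sets of homes in a more sophisticated way than a simple count.

```python
# Hypothetical threshold: the source does not say how many sales count as "enough".
MIN_SALES = 3


def choose_training_region(home, sales, min_sales=MIN_SALES):
    """Return (region, training_sales) for a home.

    Use the home's subcounty region when it has enough recent sales,
    otherwise fall back to the county. Never go above the county level.
    """
    in_county = [s for s in sales if s["county"] == home["county"]]
    in_subcounty = [s for s in in_county if s["subcounty"] == home["subcounty"]]
    if len(in_subcounty) >= min_sales:
        return ("subcounty", home["county"], home["subcounty"]), in_subcounty
    return ("county", home["county"]), in_county


# Made-up sales records for a fictional county.
sales = [
    {"county": "County A", "subcounty": "North", "price": 700_000},
    {"county": "County A", "subcounty": "North", "price": 720_000},
    {"county": "County A", "subcounty": "North", "price": 690_000},
    {"county": "County A", "subcounty": "South", "price": 450_000},
]

dense_region, dense_sales = choose_training_region(
    {"county": "County A", "subcounty": "North"}, sales
)
sparse_region, sparse_sales = choose_training_region(
    {"county": "County A", "subcounty": "South"}, sales
)
print(dense_region, len(dense_sales))
print(sparse_region, len(sparse_sales))
```

The home in the busy region is modeled on its own neighborhood’s sales, while the home in the sparse region borrows from the whole county rather than from the state.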

For the rest of 2020, Zillow and related websites estimate that sales will be down by 15% for the year as a whole, with larger year-over-year drops in the second quarter, which is when the pandemic is expected to affect the housing market the most. Home prices are expected to eke out a small gain because there are few sellers still on the market. Other sellers might be holding off on selling to avoid taking a loss, as about half the buyers on the market are not meeting their asking price. The limited inventory helps keep prices up, but this balance is expected to shift as the economy slows down later in the year. So, when Zillow and similar websites begin to test new models to predict home prices post-COVID-19, all of these factors would need to be taken into account. I would also advise factoring in the number of COVID-19 cases in each state or county, as well as the total number of lives lost in each region.
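As a rough idea of what that suggestion could look like in practice, here is a small sketch that attaches regional COVID-19 counts to a listing before it is fed to a price model. Everything here is hypothetical: the field names, the fictional county, and the numbers are invented, and scaling the counts per 100,000 residents is my own assumption so that large and small counties can be compared.

```python
def add_covid_features(listing, covid_by_county, population_by_county):
    """Return a copy of the listing with COVID-19 case and death features added."""
    county = listing["county"]
    stats = covid_by_county.get(county, {"cases": 0, "deaths": 0})
    population = population_by_county[county]
    features = dict(listing)
    features["covid_cases"] = stats["cases"]
    features["covid_deaths"] = stats["deaths"]
    # Assumption: per-100k rates make counties of different sizes comparable.
    features["covid_cases_per_100k"] = stats["cases"] * 100_000 / population
    features["covid_deaths_per_100k"] = stats["deaths"] * 100_000 / population
    return features


# Invented figures for a fictional county, for illustration only.
covid_by_county = {"County A": {"cases": 1_500, "deaths": 100}}
population_by_county = {"County A": 2_000_000}
listing = {"county": "County A", "bedrooms": 3, "sqft": 1_800}

row = add_covid_features(listing, covid_by_county, population_by_county)
print(row)
```

Whether such features actually improve a price model would have to be tested against sales data; the sketch only shows how the inputs could be assembled.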


