Cook County Assessor Model & Valuation Data Release
What is the purpose of this data?
One of the goals Assessor Kaegi set was the publication of residential Computer Assisted Mass Appraisal (CAMA) code and data. These data sets fulfill part of that goal.
This data, in conjunction with our published code, will allow any technically proficient member of the public to reconstruct our first-pass residential valuation process. We have intentionally avoided using expensive software; our entire modeling process is done in R, a free statistical programming language, using RStudio, a development environment for R. This helps minimize the barriers between our internal process and scrutiny by journalists and academics.
Our modeling process draws data from a number of storage locations, which are not accessible to the public. In lieu of such access, we have replicated this data to publish through the County’s Open Data Portal. These data act as replacements for the queries in the CAMA code that cannot be used outside the office, allowing the code to function in any environment.
Complete and consistent data is central to successfully producing accurate, fair assessments. In order to give the public the clearest picture of the state of our data, we have re-created the data as it exists in our production databases.
At the time of publication, the current assessor had held office for four months. As such, we offer two disclaimers:
- First, in instances where this data conflicts with the taxpayer’s assessment notice, the taxpayer’s notice takes precedence.
- Second, we are publishing data that may contain errors, so it is provided as-is. For this first publication, we have attempted to document instances of ambiguity and inaccuracy.
One example of a field with issues is 'Other Improvements'. There is a disconnect between the paper property inspection cards used by CCAO field staff and the data system that information is entered into. The field cards have three separate fields to record 'Other Improvements.' Such improvements include pools, private tennis courts, yoga sheds, etc. The AS400, the system of record in the office, has only a single field in which to record other improvements. In instances where a property has multiple improvements, all of them were entered into this field without a delimiter. This has made it impossible to determine algorithmically whether a 12 is a 1 and a 2, or a 12, rendering this field mostly useless for modeling purposes.
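The ambiguity described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not part of the CCAO's code. The list of valid improvement codes (1 through 15) and the function name possible_readings are assumptions made for this example; the real code list is not given here.

```python
def possible_readings(raw, valid_codes):
    """Return every way to split the string `raw` into a sequence of valid codes."""
    if raw == "":
        return [[]]  # exactly one way to read an empty string: no codes
    readings = []
    for end in range(1, len(raw) + 1):
        head = raw[:end]
        if head in valid_codes:
            # Keep this leading code and try every reading of the remainder.
            for rest in possible_readings(raw[end:], valid_codes):
                readings.append([head] + rest)
    return readings


# Hypothetical code list (an assumption): improvement types numbered 1 to 15.
VALID_CODES = {str(n) for n in range(1, 16)}

for value in ["7", "12", "123"]:
    readings = possible_readings(value, VALID_CODES)
    status = "unambiguous" if len(readings) == 1 else "ambiguous"
    print(value, readings, status)
```

Under these assumptions, any stored value with more than one reading cannot be decoded without outside information, such as the original field card.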
Another example is the 'Age' field. In a more advanced data system, you might capture multiple age characteristics: age of original structure, age of interior, age of bathrooms, age of kitchen, etc. In the AS400 system, one of the CCAO’s legacy systems, we can only store a single field, 'Age'. This means that this field is mixed-use, sometimes capturing original structure age, sometimes capturing effective age, and sometimes capturing age since the most recent major renovation. While this field is predictive of property value, it is not well defined.
Other examples are documented in our data dictionary.
How should I use this data?
Our residential modeling code can be downloaded and modified to run on your local PC or Mac using the data published through the Open Data Portal. Whether you are an academic, a journalist, or a property tax professional, we hope that this portal and our code are useful to you. Please be aware that the CCAO’s code is published under a GNU Affero General Public License, and you should not use CCAO’s code in any manner that conflicts with this license.
The CCAO is currently working on a collaborator policy. When a final policy is published, the CCAO will welcome suggestions on code, modeling, and/or data improvements.
What shouldn't I try to do with this data?
Don’t use this data to look up basic information about your property – there is an easier and quicker way to do that. If you are a taxpayer looking for explanations about your property’s values, appeal status, or other questions pertaining to your assessment, please visit www.cookcountyassessor.com, or call (312) 443-7550 to speak with a taxpayer information specialist.
Where does this data fit in the assessment system?
When we think about predicting market values for residential properties, we must answer two basic questions: what data do we use to characterize values for each submarket area, and what is the universe of properties that we need to value? Table 1, Model Data, is the answer to the first question, and Table 2, Assessment Data, is the answer to the second. First Pass Values contains the results of the process at each step. These steps are outlined in our code.
What is Model Data?
Model Data contains every valid arm’s-length transaction in a specified geographic area and time period. We define a valid arm’s-length transaction as a sale in which the buyer and seller act independently and do not have any relationship to each other. We have included property characteristics at the time of sale, as well as location and property attributes for contextualization. Property characteristics include the number of rooms, bathrooms, size of garage, exterior construction of the property, and whether the home has a finished basement. Location and property attributes include census tract, assessor neighborhood code, a geographically determined location factor, and street address. We use Model Data to estimate a wide range of predictive models that help us characterize home values in a given area. We then select the best performing models to use to value properties in Assessment Data.
What is Assessment Data?
Where Model Data is a data set of sales, Assessment Data is a data set of properties, even ones that have not sold in a long time. Because these properties still need to be valued, we use the best performing models from Model Data to estimate the market value for the properties contained within the Assessment Data table.
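The relationship between the two tables can be sketched as follows: fit candidate models on sales, keep the one with the lowest error on held-out sales, and apply it to every property that needs a value. This is a minimal illustration in Python, not the CCAO's R code. The data values, the single predictor (building square feet), the two candidate models, and the error measure are all assumptions chosen to keep the example short.

```python
from statistics import mean

# Hypothetical Model Data: (building square feet, sale price) for sold homes.
model_data = [(1000, 200_000), (1200, 236_000), (1500, 305_000),
              (1800, 355_000), (2000, 402_000), (2400, 478_000)]

# Hypothetical Assessment Data: square feet for every property to be valued,
# whether or not it sold recently.
assessment_data = {"pin_A": 1100, "pin_B": 1600, "pin_C": 2200}


def fit_mean(rows):
    """Candidate 1: predict the average sale price for every property."""
    avg = mean(price for _, price in rows)
    return lambda sqft: avg


def fit_linear(rows):
    """Candidate 2: one-variable ordinary least squares on square feet."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda sqft: intercept + slope * sqft


def holdout_error(fit, rows):
    """Mean absolute percentage error on sales held out from fitting."""
    train, test = rows[::2], rows[1::2]
    model = fit(train)
    return mean(abs(model(x) - y) / y for x, y in test)


candidates = {"mean_only": fit_mean, "linear": fit_linear}
errors = {name: holdout_error(fit, model_data) for name, fit in candidates.items()}
best_name = min(errors, key=errors.get)

# Refit the winning candidate on all sales, then value every property.
best_model = candidates[best_name](model_data)
estimates = {pin: round(best_model(sqft)) for pin, sqft in assessment_data.items()}
print(best_name, estimates)
```

The point of the sketch is the division of labor: Model Data is used only for fitting and comparing models, while Assessment Data supplies the properties that receive estimates.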
What are first pass values?
First Pass Values are the values upon which re-assessment notices are based. They are the product of our modeling process and post-modeling adjustments. In this data, we have provided the value produced at each step in the valuation process. Each post-modeling adjustment produces a new set of estimated values (values 2 through 7). These values, and the resulting ratios, are stored in Table 3, First Pass Values, which reports the estimated market values of properties at each stage in the process.
First Pass Values are not final assessments. After first-pass notices are mailed, the assessor finalizes assessments in township order and sends those assessments to the Board of Review (BOR), and then to the Property Tax Appeal Board. Later adjustments, such as Certificates of Error, may also change assessments.
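One way to compare the values recorded at each stage is to compute a ratio for each sold property and summarize it per stage. The sketch below is an illustration in Python under stated assumptions: it takes "ratio" to mean estimated value divided by sale price, uses three stages instead of the full set for brevity, and uses invented field names and figures that do not come from Table 3.

```python
from statistics import median

# Hypothetical rows (an assumption): one sold property each, with its sale
# price and its estimated value after each stage of the process.
rows = [
    {"pin": "A", "sale_price": 200_000, "values": [180_000, 190_000, 196_000]},
    {"pin": "B", "sale_price": 300_000, "values": [330_000, 315_000, 303_000]},
    {"pin": "C", "sale_price": 250_000, "values": [250_000, 250_000, 245_000]},
]


def median_ratio_by_stage(rows):
    """Median of (estimated value / sale price) across properties, per stage."""
    n_stages = len(rows[0]["values"])
    return [
        median(row["values"][stage] / row["sale_price"] for row in rows)
        for stage in range(n_stages)
    ]


for stage, ratio in enumerate(median_ratio_by_stage(rows), start=1):
    print(f"stage {stage}: median ratio {ratio:.3f}")
```

A median ratio near 1 would indicate that estimates at that stage are, on the whole, close to sale prices; the spread of individual ratios matters as well and is not captured by this summary.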
I emailed the database owner over a week ago - why haven't they responded?
The CCAO has limited human resources. We really want to answer all of your questions, but the central mission of the office is, first and foremost, the production of assessments for taxpayers. We will respond to all questions in due course.
A note about replication
This data was published on April 16, 2019. Since then, we may have made changes to our valuation scripts available on GitLab. In order to minimize the extent to which you will have to alter our scripts to make them run with the data we have made available here, we recommend using version 8194639e of Residential, main branch and version e1122159 of Utility, main branch.