The lighter side of real estate

Best Cities in State: Criteria Selection Methodology

TL;DR Answer

We use seven evenly weighted criteria to determine the best city in the state. We selected criteria that do not compound on each other, so that no one type of city (large, small, coastal, etc.) gets an advantage. The criteria we settled on were:

  • Total Amenities
  • Quality of Life
  • Total Crimes
  • Tax Rates
  • Unemployment
  • Commute Time
  • Weather

Background & Assumptions

  • We want criteria that are applicable to a large number of cities and/or states. So something like “Most Turkish delight stores per capita” wouldn’t work because the vast majority of towns, and even cities, don’t have a Turkish delight store.
  • We want to make sure that the correlations are representative of the United States as a whole and not just of a particular geographic subsection of the United States.
  • For the sake of argument, we define low correlation as under 33%, middle correlation as 33-67%, and high correlation as 67-100%.
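
As a quick illustration of those three bands, here is a minimal Python sketch. Two things in it are our own assumptions rather than part of the original write-up: the sign of the correlation is ignored (only its strength counts), and a value sitting exactly on a boundary (33% or 67%) goes into the higher band. The function name `correlation_band` is made up for this example.

```python
def correlation_band(r: float) -> str:
    """Label a correlation coefficient as 'low', 'middle', or 'high'."""
    strength = abs(r)  # assumption: a strong negative correlation still counts as strong
    if strength < 0.33:
        return "low"
    if strength < 0.67:  # assumption: exactly 0.33 falls in the middle band
        return "middle"
    return "high"  # assumption: exactly 0.67 falls in the high band


for value in (0.10, -0.50, 0.90):
    print(value, correlation_band(value))
```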


  1. We downloaded the USPS’s ZIP code listings, then removed duplicates (some cities have several ZIP codes) and irrelevant entries (Puerto Rico, etc.).
  2. This left us with 21,405 rows of data, which would be inefficient for both data collection and analysis. So we used a random number generator to draw a random sample from this national data.
  3. We decided to go with N = 184, which gave us a confidence level of 90% and a margin of error (confidence interval) of ±6.10%, plenty sufficient for Saturday Night Scientists.
  4. We gathered the data from the Census, BLS, Sperling’s Best Places, and Yelp.
  5. After omitting unusable data (industry fractions, etc.), we used Excel’s Data Analysis package to automatically create a Correlation Matrix.
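
The sampling in steps 2 and 3 can be sketched in a few lines of Python. This is not the tool we actually used; it is a rough reconstruction under some assumptions: a simple random sample without replacement, the standard margin-of-error formula for a proportion with the worst-case proportion of 0.5, and a finite population correction. The helper names and the placeholder `cities` list are hypothetical.

```python
import math
import random
from statistics import NormalDist


def margin_of_error(sample_size, population_size, confidence=0.90, proportion=0.5):
    """Margin of error for a proportion, with finite population correction."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # Assumption: the sample is drawn without replacement from a finite list.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc


def draw_sample(rows, sample_size, seed=None):
    """Pick sample_size distinct rows uniformly at random."""
    return random.Random(seed).sample(rows, sample_size)


# Hypothetical stand-in for the 21,405 de-duplicated city rows.
cities = [f"city_{i}" for i in range(21_405)]
sample = draw_sample(cities, 184, seed=42)
moe = margin_of_error(len(sample), len(cities))
print(f"n = {len(sample)}, margin of error = {moe:.2%}")
```

Under these assumptions the margin of error lands at roughly 6%, in the same ballpark as the figure quoted in step 3. The exact decimal depends on the rounding and corrections a given sample-size calculator applies.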

Possible Criteria

  • Population
  • Amenities (Total)
  • Cost of Living
  • Price of Gas
  • Total Crimes/100K
  • Violent Crimes/100K
  • High School Graduation Rate
  • Students/Teacher
  • Median Household Income
  • Commute Time
  • Median Home Price
  • Median Rent
  • Temp
  • Air Quality
  • Unemployment Rate
  • Recent Job Growth
  • Sales Taxes
  • Income Taxes

Correlation Magic
[Figure: Correlation Matrix]


  • One of our original hypotheses was that a bunch of our criteria would be highly correlated with population. However, it turns out that only Total Amenities is highly correlated with it (which should be patently obvious). We found the lack of other strong correlations particularly surprising.
  • As we probably all could have guessed, Cost of Living, Median Home Price, Median Rent, Median Household Income and High School Graduation Rate are all highly correlated. We’ll call these the “Quality of Life” factors.
  • Another one that should be obvious is the high correlation between Total Crimes and Violent Crimes.
  • Tax Rates (Income and Sales) both have mostly low correlations with every other variable.
  • Unemployment Rate can still stand on its own, as it has very low correlation with other variables.
  • Commute Time is actually less correlated with population than we originally thought.
  • Temperature and Air Quality could technically stand alone, or work together nicely as a “Weather” factor.
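
To show how observations like these fall out of a correlation matrix, here is a small Python sketch that computes Pearson correlations between columns and lists the pairs that clear the "high" threshold. We built the real matrix in Excel, so this is only an equivalent illustration; the helper names are hypothetical and the numbers in `toy` are made up for five imaginary cities.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def highly_correlated_pairs(columns, threshold=0.67):
    """Return (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, round(r, 2)))
    return pairs


# Made-up values for five hypothetical cities, for illustration only.
toy = {
    "Median Home Price": [150, 220, 310, 400, 520],
    "Median Rent": [700, 900, 1200, 1500, 1900],
    "Unemployment Rate": [6.1, 4.0, 7.5, 5.2, 6.0],
}
pairs = highly_correlated_pairs(toy)
print(pairs)
```

In this toy data only the home price and rent columns pair up, which mirrors the reasoning above: variables that move together get folded into one factor, and variables that don't can stand on their own.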

Resulting Criteria

All top-level criteria will be given an equal weight in the ranking calculation:

  • Total Amenities (Total # of reviewed businesses per Yelp)
  • Quality of Life (Cost of Living, Median Home Price, Median Rent, Median Household Income, High School Grad Rate)
  • Total Crimes
  • Tax Rates (Sales, Income)
  • Unemployment
  • Commute Time
  • Weather (Temperature, Air Quality)
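
One simple way to read "equal weight" is sketched below in Python. Treat it as an assumption, not a description of our actual spreadsheet: each city is ranked on every criterion (1 = best), and the ranks are averaged with the same weight, so the lowest average wins. The function names, the three placeholder cities, and their values are all hypothetical, and only three of the seven criteria are shown to keep it short. Ties are not handled.

```python
def rank_positions(scores, higher_is_better):
    """Map each city to its rank (1 = best) on a single criterion."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {city: position for position, city in enumerate(ordered, start=1)}


def overall_scores(criteria):
    """Average each city's rank across all criteria, every criterion weighted equally."""
    per_criterion = [
        rank_positions(scores, higher_is_better)
        for scores, higher_is_better in criteria.values()
    ]
    cities = per_criterion[0].keys()
    return {
        city: sum(ranks[city] for ranks in per_criterion) / len(per_criterion)
        for city in cities
    }


# Made-up values for three hypothetical cities. The boolean says whether a
# higher raw value is better for that criterion.
criteria = {
    "Total Amenities": ({"A": 500, "B": 120, "C": 300}, True),
    "Total Crimes": ({"A": 4200, "B": 1500, "C": 2600}, False),
    "Unemployment": ({"A": 5.0, "B": 6.5, "C": 4.1}, False),
}
scores = overall_scores(criteria)
best = min(scores, key=scores.get)  # lowest average rank wins
print(scores, best)
```

Averaging ranks rather than raw values keeps a criterion measured in big numbers (crimes per 100K) from drowning out one measured in small numbers (unemployment rate), which is the whole point of equal weighting.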