# A Question About Runner’s World’s Methodology for Best Running Cities

As an avid reader of Runner’s World, I was excited when the current issue contained not only an article titled “America’s 50 Best Running Cities,” but also the data and methodology for choosing the cities. If you read this blog, you know how I love me some reproducible research.

The methodology, according to the magazine:

We started with a list of 150 U.S. cities with populations of more than 160,000 that had the highest number of households per capita reporting participation in running within the last 12 months (according to the SimplyMap 2014 census study). Then we gathered data from myriad sources to create five indexes of special importance to runners, ranking the cities in each index from 1 to 150. We then weighted the indexes and tallied up the scores to create the final list.

The indexes and weights are described below (with more-detailed descriptions in the article):

• Run (40%) – Presence of RRCA- and USATF-sanctioned clubs, as well as races and running stores.
• Parks (20%) – The number of (and access to) trails, open spaces, running tracks, and other fitness facilities.
• Climate (20%) – An index of ideal running weather, including precipitation levels, air quality and daily average temperatures closest to 55 degrees Fahrenheit.
• Food (10%) – Analysis of residents’ access to healthful food and farmer’s markets.
• Safety (10%) – Measure of crime and traffic incidents involving pedestrians.

My original idea for this post was to explore the different variables on a map and maybe adjust the weights a little to see different outcomes. (Safety, in my opinion, was underweighted, especially since traffic deaths and injuries are a major problem in San Francisco.) But I ran into an issue.

Using the data (the full table was included in the magazine), I decided to recreate the weighted scores using the R script below:

Note: I also tried the script with R’s weighted.mean function and received the same results. The math is the same; I was just double-checking my arithmetic.
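For anyone who wants to check the arithmetic without R, here is a minimal sketch in Python. It is not the original script. It assumes the weighted score is simply the weighted mean of a city's five index ranks, using the magazine's stated weights, and the names `WEIGHTS_PCT` and `weighted_score` are made up for illustration.

```python
# Index weights in percent, as listed in the magazine's methodology.
WEIGHTS_PCT = {"run": 40, "parks": 20, "climate": 20, "food": 10, "safety": 10}


def weighted_score(ranks):
    """Weighted mean of a city's five index ranks (lower is better)."""
    total = sum(WEIGHTS_PCT[name] * ranks[name] for name in WEIGHTS_PCT)
    return total / sum(WEIGHTS_PCT.values())


# Index ranks copied from the published table.
washington = {"run": 5, "parks": 4, "climate": 77, "food": 2, "safety": 132}
portland = {"run": 7, "parks": 9, "climate": 37, "food": 24, "safety": 98}

print(weighted_score(washington))  # 31.6
print(weighted_score(portland))    # 24.2
```

Because every input is a rank, a lower score means a better city, so under this assumption Portland's 24.2 beats Washington's 31.6.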

The table below shows the outcome of the script, listed in the magazine’s published order with the recomputed weighted score in the last column. Each index is a rank, so a lower score is better. You can see the issue right away by comparing each city’s ranking (the first column) with its score. Portland, which came in 6th, scores 24.20 and is suddenly above Washington, D.C., which came in 5th with 31.60. (And being a current resident of the D.C. area and having visited Portland, I agree with that new ranking because … well, weather, mostly.)

| Ranking | City | Run | Parks | Climate | Food | Safety | Weighted Score |
|---|---|---|---|---|---|---|---|
| 1 | San Francisco, CA | 1 | 5 | 6 | 19 | 146 | 19.10 |
| 2 | Seattle, WA | 3 | 3 | 47 | 17 | 93 | 22.20 |
| 3 | Boston, MA | 9 | 2 | 48 | 9 | 116 | 26.10 |
| 4 | San Diego, CA | 10 | 34 | 9 | 30 | 97 | 25.30 |
| 5 | Washington, DC | 5 | 4 | 77 | 2 | 132 | 31.60 |
| 6 | Portland, OR | 7 | 9 | 37 | 24 | 98 | 24.20 |
| 7 | Minneapolis, MN | 18 | 6 | 100 | 3 | 87 | 37.40 |
| 8 | New York, NY | 4 | 20 | 42 | 39 | 108 | 28.70 |
| 9 | Omaha, NE | 28 | 11 | 90 | 26 | 20 | 36.00 |
| 10 | Denver, CO | 20 | 7 | 68 | 51 | 122 | 40.30 |
| 11 | Chicago, IL | 8 | 13 | 124 | 13 | 111 | 43.00 |
| 12 | Madison, WI | 38 | 19 | 103 | 1 | 8 | 40.50 |
| 13 | Colorado Springs, CO | 12 | 71 | 67 | 41 | 10 | 37.50 |
| 14 | San Jose, CA | 43 | 29 | 24 | 12 | 92 | 38.20 |
| 15 | Los Angeles, CA | 33 | 48 | 12 | 44 | 128 | 42.40 |
| 16 | Rochester, NY | 29 | 55 | 40 | 22 | 68 | 39.60 |
| 17 | Pittsburgh, PA | 25 | 16 | 78 | 52 | 57 | 39.70 |
| 18 | Tucson, AZ | 22 | 82 | 32 | 67 | 66 | 44.90 |
| 19 | Raleigh, NC | 13 | 65 | 86 | 59 | 31 | 44.40 |
| 20 | Boise, ID | 62 | 68 | 22 | 18 | 4 | 45.00 |
| 21 | Oakland, CA | 63 | 10 | 16 | 16 | 137 | 45.70 |
| 22 | Philadelphia, PA | 11 | 31 | 73 | 56 | 142 | 45.00 |
| 23 | Sacramento, CA | 23 | 33 | 49 | 78 | 104 | 43.80 |
| 24 | St. Louis, MO | 17 | 8 | 127 | 92 | 101 | 53.10 |
| 25 | Buffalo, NY | 44 | 18 | 66 | 25 | 77 | 44.60 |
| 26 | Virginia Beach, VA | 46 | 67 | 36 | 53 | 18 | 46.10 |
| 27 | St. Paul, MN | 53 | 21 | 94 | 14 | 50 | 50.60 |
| 28 | Richmond, VA | 34 | 72 | 83 | 69 | 15 | 53.00 |
| 29 | Santa Rosa, CA | 103 | 63 | 14 | 5 | 3 | 57.40 |
| 30 | Charlotte, NC | 14 | 102 | 70 | 116 | 55 | 57.10 |
| 31 | Las Vegas, NV | 19 | 126 | 27 | 108 | 135 | 62.50 |
| 32 | Tampa, FL | 15 | 24 | 143 | 119 | 119 | 63.20 |
| 33 | Lincoln, NE | 72 | 37 | 91 | 23 | 1 | 56.80 |
| 34 | Albuquerque, NM | 54 | 52 | 30 | 86 | 129 | 59.50 |
| 35 | Cleveland, OH | 60 | 32 | 72 | 77 | 40 | 56.50 |
| 36 | Cincinnati, OH | 42 | 23 | 118 | 90 | 65 | 60.50 |
| 37 | Milwaukee, WI | 64 | 25 | 54 | 47 | 112 | 57.30 |
| 38 | Atlanta, GA | 32 | 28 | 82 | 120 | 32 | 50.00 |
| 39 | Des Moines, IA | 69 | 62 | 110 | 7 | 145 | 77.20 |
| 40 | Irvine, CA | 106 | 51 | 4 | 63 | 45 | 64.20 |
| 41 | Salt Lake City, UT | 58 | 76 | 43 | 83 | 46 | 59.90 |
| 42 | Baltimore, MD | 57 | 15 | 69 | 87 | 138 | 62.10 |
| 43 | Spokane, WA | 73 | 93 | 35 | 50 | 26 | 62.40 |
| 44 | Honolulu, HI | 65 | 47 | 126 | 4 | 80 | 69.00 |
| 45 | Indianapolis, IN | 2 | 119 | 133 | 123 | 120 | 75.50 |
| 46 | Phoenix, AZ | 30 | 109 | 62 | 103 | 118 | 68.30 |
| 47 | San Antonio, TX | 16 | 94 | 111 | 122 | 130 | 72.60 |
| 48 | Miami, FL | 37 | 22 | 150 | 34 | 148 | 67.40 |
| 49 | Oklahoma City, OK | 27 | 103 | 101 | 93 | 94 | 70.30 |
| 50 | Houston, TX | 6 | 77 | 122 | 143 | 139 | 70.40 |
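To see how far the recomputed scores drift from the published order, here is a rough Python sketch that re-ranks the top ten rows of the table by weighted score. It looks only at those ten rows, so the "recomputed" position is relative to that subset and not to all 50 cities, and the variable names are my own for illustration.

```python
# (published ranking, city, recomputed weighted score) for the top ten rows.
rows = [
    (1, "San Francisco, CA", 19.10),
    (2, "Seattle, WA", 22.20),
    (3, "Boston, MA", 26.10),
    (4, "San Diego, CA", 25.30),
    (5, "Washington, DC", 31.60),
    (6, "Portland, OR", 24.20),
    (7, "Minneapolis, MN", 37.40),
    (8, "New York, NY", 28.70),
    (9, "Omaha, NE", 36.00),
    (10, "Denver, CO", 40.30),
]

# Re-rank by weighted score; lower is better because the inputs are ranks.
by_score = sorted(rows, key=lambda row: row[2])

# Collect every city whose position changes after re-ranking.
moved = []
for new_rank, (published, city, score) in enumerate(by_score, start=1):
    if new_rank != published:
        moved.append((city, published, new_rank))
        print(f"{city}: published {published}, recomputed {new_rank}")
```

Within this subset, six of the ten cities change position. Portland climbs from 6th to 3rd and Washington, D.C. drops from 5th to 7th, so the Portland and D.C. swap is not the only mismatch.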

I thought it might be a data entry error, so I took 10 random observations from the dataframe and double-checked the numbers. Everything is correct.

So what did I do wrong?