Map of Bicycle-Friendliness in Cambridge, MA

This map is an entry for the Street Safety Challenge. Each colored circle sits at an intersection. The circle's color is a shade between red (bad) and green (good), based on a metric of bike-friendliness within a 0.25-mile radius of that intersection.

Where did the metric come from? Open data on Cambridge's traffic accidents, pothole-repair requests, bike theft, and existing bicycle infrastructure. I weighted these factors in a way that made sense to me, but you can weight them differently (see the "options" section below). For example, a confident cyclist who owns an expensive bike might be especially concerned with bike theft, while a new cyclist might want to know which streets have good bicycle infrastructure and few traffic accidents.

This page was created by Jennifer Melot. For a more detailed description of the process, data, and resources I used, see the "details" section below.

Pull the sliders to indicate how important each of the following is to you.
Bicycle infrastructure
Traffic accidents
Pothole reports
Bicycle thefts
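
As a rough illustration of what the sliders do, the four factors can be combined into a single weighted score. This is only a sketch, not the code behind this page: the function name `friendliness_score`, the assumption that every factor has already been scaled to the range 0 to 1, and the neutral 0.5 returned when all sliders are at zero are all hypothetical choices made for the example.

```python
def friendliness_score(factors, weights):
    """Combine factor values (each assumed to be scaled to [0, 1]) into one score.

    'infrastructure' counts in favor of a location; accidents, potholes,
    and thefts count against it. Returns a value in [0, 1], where 1 is
    the most bike-friendly.
    """
    positive = {"infrastructure"}
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.5  # assumption: no preference expressed, so stay neutral
    score = 0.0
    for name, weight in weights.items():
        value = factors[name]
        # Flip the "bad" factors so that 1.0 always means bike-friendly.
        contribution = value if name in positive else 1.0 - value
        score += weight * contribution
    return score / total_weight


# Example: a cyclist who cares most about theft and ignores potholes.
factors = {"infrastructure": 0.5, "accidents": 0.5, "potholes": 1.0, "thefts": 0.0}
weights = {"infrastructure": 1, "accidents": 1, "potholes": 0, "thefts": 2}
print(friendliness_score(factors, weights))
```

Dividing by the total weight keeps the score in the same range no matter how far the sliders are pulled, so only the relative importance of the factors matters.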


The specific data I used, and comments about how I processed each dataset, can be seen below. All of the text processing was done with a series of Python scripts, which can be found on my GitHub page.

ACCIDENTS: I used absolute numbers of all types of accidents instead of scaling by traffic volume, partly because a quick search did not yield traffic volume data I could easily incorporate. I did not restrict the data to bike-related accidents because I wanted to capture the unfriendliness of a street where vehicles frequently have accidents but where few cyclists ride (and which therefore has few bike-related accidents).

This was the only dataset that I was able to divide up by time of day in what I felt was a meaningful way.
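
Counting the absolute number of reports near an intersection could look something like the sketch below. It is not taken from the actual scripts: the function names, the use of the haversine formula, and the sample coordinates are assumptions made for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within_radius(center, points, radius_miles=0.25):
    """Count the (lat, lon) points that lie within radius_miles of center."""
    lat0, lon0 = center
    return sum(
        1
        for lat, lon in points
        if haversine_miles(lat0, lon0, lat, lon) <= radius_miles
    )


# Illustrative coordinates only, not real accident data.
intersection = (42.3736, -71.1097)
accidents = [
    (42.3736, -71.1097),  # at the intersection
    (42.3746, -71.1097),  # roughly 0.07 mi north
    (42.3836, -71.1097),  # roughly 0.7 mi north, outside the radius
]
print(count_within_radius(intersection, accidents))
```

The same counting function would work for any point dataset, such as theft reports or pothole locations.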

CRIME REPORTS: This dataset contained the bicycle theft data. Again, lacking a way to normalize for the number of bicycles parked at a given location, I used absolute numbers of reports. I had initially planned to divide this dataset by time of day, but unfortunately a large portion of the reports were not specific about when the bicycle was stolen.

POTHOLES: Each pothole location was counted once, to avoid bias toward potholes in high-traffic locations, which tend to be reported repeatedly. What I was trying to get at with this dataset was the unfriendliness of a road with lots of rough pothole patches -- excessively bumpy roads aren't comfortable on a bike! I think this is the least relevant or informative of the datasets I included.
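
A minimal sketch of counting each pothole location only once is shown below. How locations are matched is an assumption: here two reports count as the same location when their coordinates agree after rounding to five decimal places (about a meter), whereas the real data might be keyed by street address instead. The function name and sample values are hypothetical.

```python
def unique_pothole_locations(reports, precision=5):
    """Return one (lat, lon) per distinct location, keeping first-seen order.

    Two reports are treated as the same location when their coordinates
    match after rounding to `precision` decimal places.
    """
    seen = set()
    unique = []
    for lat, lon in reports:
        key = (round(lat, precision), round(lon, precision))
        if key not in seen:
            seen.add(key)
            unique.append((lat, lon))
    return unique


# Illustrative values only: the first two reports are the same pothole.
reports = [(42.37, -71.11), (42.37, -71.11), (42.38, -71.10)]
print(len(unique_pothole_locations(reports)))
```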

BIKE INFRASTRUCTURE: This was by far the most difficult dataset to incorporate. I put the work in because it is arguably the most important. I used the KML file from the Bike Facilities layer at the link above, and extracted the line segments corresponding to existing infrastructure. I treated all kinds of infrastructure (bike lanes, multi-use paths, etc.) the same, because it was not obvious to me how the different kinds of infrastructure were indicated in the KML file.

Once I had the line segments, for each intersection i, I calculated the total length of all the line segments within a 0.25 mi radius of i. The assumption here was that a location with a lot of bike infrastructure nearby is relatively bicycle friendly.
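
The length calculation can be sketched as follows. This is an illustration under stated assumptions rather than the actual script: it projects coordinates onto a flat plane around the intersection (reasonable at city scale), and it counts only the portion of each segment that falls inside the circle, whereas the original processing may have counted whole segments. All function names are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def to_local_miles(lat, lon, lat0, lon0):
    """Project (lat, lon) to flat x/y offsets in miles from (lat0, lon0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_MILES
    y = math.radians(lat - lat0) * EARTH_RADIUS_MILES
    return x, y


def clipped_length(p, q, radius):
    """Length of the part of segment p-q inside a circle centered at the origin."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    a = dx * dx + dy * dy
    if a == 0:
        return 0.0  # degenerate segment with no length
    # Solve |p + t*(q - p)|^2 = radius^2 for t along the segment.
    b = 2 * (p[0] * dx + p[1] * dy)
    c = p[0] ** 2 + p[1] ** 2 - radius ** 2
    disc = b * b - 4 * a * c
    if disc <= 0:
        return 0.0  # the line misses or only touches the circle
    root = math.sqrt(disc)
    t_in = max(0.0, (-b - root) / (2 * a))
    t_out = min(1.0, (-b + root) / (2 * a))
    return max(0.0, t_out - t_in) * math.sqrt(a)


def infrastructure_length(center, segments, radius_miles=0.25):
    """Total miles of segment length within radius_miles of center."""
    lat0, lon0 = center
    total = 0.0
    for (lat1, lon1), (lat2, lon2) in segments:
        p = to_local_miles(lat1, lon1, lat0, lon0)
        q = to_local_miles(lat2, lon2, lat0, lon0)
        total += clipped_length(p, q, radius_miles)
    return total
```

Each segment is given as a pair of (lat, lon) endpoints, so a polyline from the KML file would first be split into consecutive point pairs.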

INTERSECTIONS: I chose intersections as locations to display bike friendliness because they are (more or less) evenly dispersed, follow roadways, and were available as an open dataset.

I used the Google Maps API to draw the points, and the Google geocoder to get the coordinates of points when that information was not included with a dataset.

Note that intersections very close to the Cambridge city line may have less accurate scores: if the city line passes through the 0.25 mi radius circle around an intersection, part of that circle has no data, because my datasets stop at the Cambridge city limits. Given more time to develop this map (or data from surrounding towns), I could have scored these intersections more carefully. Still, it isn't apparent to me that this has had a large effect on the accuracy of the current map.