It takes a month for NYC’s Department of Housing Preservation and Development to resolve urgent 311 requests in some neighborhoods. Inspections are a bottleneck.

Background

Every year, hundreds of thousands of New Yorkers turn to 311 operators to escalate building repairs that landlords refuse to address. Many of these issues concern critical utilities such as heat and water, and require immediate attention. In this project, I analyze the text of millions of complaint resolutions to assess whether requests are equitably and promptly processed by the Department of Housing Preservation & Development (HPD). Using heat map tables, animated spatial maps, and network visualizations, I visualize and unpack the results.

Getting started

I used the NYC Open Data API to pull 6 million housing-related 311 requests filed between 2010 and 2020. Most of the information about how each complaint was processed, and whether it was resolved, is stored in an underutilized, unstructured paragraph of text. I therefore started with a text analysis technique that breaks these paragraphs into tokens suitable for quantitative analysis. A token is simply a phrase that captures an important piece of information in a paragraph. For example, if one resolution description says, “Resolved by contacting a tenant,” and another says, “Solved by calling someone in the building,” I could tag both of them with the tokens “Fixed” and “Contacted resident”.
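To make the tagging idea concrete, here is a minimal sketch in Python. It is not the code from the project: the `TOKEN_RULES` dictionary, its regular expressions, and the `tag_resolution` function are hypothetical names invented for illustration, and simple keyword patterns stand in for whatever text analysis technique was actually used.

```python
import re

# Hypothetical rules (an assumption for illustration): each token maps to
# regex patterns that signal it in a resolution description.
TOKEN_RULES = {
    "Fixed": [r"\bresolved\b", r"\bsolved\b", r"\bfixed\b"],
    "Contacted resident": [
        r"\b(contact\w*|call\w*)\b.*\b(tenant|resident|someone in the building)\b",
    ],
}


def tag_resolution(text):
    """Return every token whose patterns match the resolution text."""
    lowered = text.lower()
    return [
        token
        for token, patterns in TOKEN_RULES.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    ]


examples = [
    "Resolved by contacting a tenant",
    "Solved by calling someone in the building",
]
for description in examples:
    # Both descriptions receive the same two tokens despite different wording.
    print(description, "->", tag_resolution(description))
```

Once differently worded descriptions collapse to the same tokens, they can be counted, grouped, and compared like any other categorical field.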

Exploratory Dashboard

I built a dashboard of interactive heat map tables and animated maps to identify inequities and temporal trends that require further investigation.

Initial Insights

From the initial dashboard, I found three interesting trends and chose to dive deeper into the one I couldn’t explain: despite using the same inspection-based protocol to process all complaints, HPD takes two days to respond to issues in some neighborhoods and over a month in others. I assumed at first that this was a supply and demand issue: some neighborhoods register many more complaints than others, and HPD has a limited number of staff to respond. That does not explain the gap, though. The areas with the most complaints were in fact not the areas suffering from untenable wait times. To understand what was happening, I turned to a network analysis of how complaints are processed.
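The supply and demand check described above amounts to comparing complaint volume with typical wait time for each neighborhood. The sketch below shows one way to set up that comparison. The records are made up, the neighborhood labels "A" and "B" are placeholders, and `summarize` is a hypothetical helper rather than the project's code.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Made-up records for illustration: (neighborhood, date filed, date closed).
records = [
    ("A", date(2020, 1, 1), date(2020, 1, 3)),
    ("A", date(2020, 1, 5), date(2020, 1, 7)),
    ("A", date(2020, 2, 1), date(2020, 2, 4)),
    ("A", date(2020, 2, 10), date(2020, 2, 11)),
    ("B", date(2020, 1, 1), date(2020, 2, 5)),
    ("B", date(2020, 3, 1), date(2020, 4, 11)),
]


def summarize(records):
    """Return complaint count and median wait (in days) per neighborhood."""
    waits = defaultdict(list)
    for neighborhood, filed, closed in records:
        waits[neighborhood].append((closed - filed).days)
    return {
        neighborhood: {
            "complaints": len(days),
            "median_wait_days": median(days),
        }
        for neighborhood, days in waits.items()
    }


summary = summarize(records)
for neighborhood, stats in summary.items():
    print(neighborhood, stats)
```

In this toy data, neighborhood A files more complaints but waits far less than B, which is the kind of pattern that rules out volume alone as the explanation.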

Digging deeper with networks

I wanted to better understand the life cycle of the complaints. Were certain types of issues more likely to face roadblocks? Could discrepancies in processing explain the difference in wait times? To examine the variation in how complaints are resolved, I used the tokens from my text analysis to create a Sankey diagram that visualizes how complaints of each type are typically processed and what types of resolution this typically leads to. I found that, regardless of complaint type, HPD frequently assigns inspections that fail (e.g., nobody is home, or the inspector doesn’t have enough time to complete the inspection). HPD will need to address this inefficiency if it wants to decrease response times.
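A Sankey diagram is drawn from a list of weighted links between stages, so the tokens first have to be turned into source, target, and count triples. The following sketch shows that step under some assumptions: each complaint is represented as an ordered path of complaint type, processing step, and outcome, and the stage labels other than “Fixed” and “Contacted resident” are hypothetical. The `sankey_links` function is illustrative and not taken from the project.

```python
from collections import Counter

# Hypothetical token paths: (complaint type, processing step, outcome).
paths = [
    ("Heat", "Inspection assigned", "Inspection failed"),
    ("Heat", "Inspection assigned", "Violation issued"),
    ("Water", "Inspection assigned", "Inspection failed"),
    ("Water", "Contacted resident", "Fixed"),
]


def sankey_links(paths):
    """Count transitions between consecutive stages across all paths."""
    counts = Counter()
    for path in paths:
        # Pair each stage with the stage that follows it.
        for source, target in zip(path, path[1:]):
            counts[(source, target)] += 1
    return [
        {"source": source, "target": target, "value": value}
        for (source, target), value in counts.items()
    ]


links = sankey_links(paths)
for link in links:
    print(link)
```

Most plotting libraries that support Sankey diagrams accept links in roughly this source, target, value form, although the exact input format varies by library.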

Like most government data, the 311 records were littered with data integrity and consistency issues. You can find the Python code used to pull and clean the records, along with the code behind the visualizations, in my GitHub repo.

Want to learn more about this project? Read my blog posts on the analysis methodology and visualization design.
