Data & Methodology


The aim was to collect data that could speak to how the region of Western Queens has changed over time, specifically with respect to population density and demographics, housing density and costs, business, and zoning. The ultimate goal is to map these changes dating back to just before the establishment of public housing, nearly a century ago.


This first step shows these changes within the limits of publicly available and usable data. This restricts the use of census data, for which reliable spatial files extend back only to 2000. Data provided by the city was equally unreliable over longer time periods, with the majority of datasets extending back only to the early 2000s. A future step will focus on addressing this gap.

The following are details of the data and methodology used for this first step.

Data Sources:

This stage relied largely on five data sources:

  1. NYC Open Data, NYC Gov
  2. BYTES of the Big Apple, NYC Gov
  3. Newman Library, Baruch College, CUNY
  4. American FactFinder, US Census Bureau
  5. TIGER/Line Shapefiles, US Census Bureau

Method of Data Collection & Cleaning

Relevant data files were downloaded from the sources listed above. This was a broad sweep, pulling down everything that seemed like it might be useful. Where a file was offered in multiple formats, each format was downloaded.

Each file was then reviewed for its content and sorted into one of two folders: one for files to be cleaned, and one for files to be put on hold and revisited later or not at all. The sorting was based on relevance to my priorities for the data to be mapped.

Rather than manipulating the raw data files directly, the data was copied over to new spreadsheets and modified there.
The first round of modification identified which variables were of interest.
The second round of modification identified whether any variables should be combined or re-represented (as percentages, for example).
The third round of modification renamed variables to something that was still identifiable to a reader but could also be read by computers and programs as a variable name (e.g. Race White Alone was renamed to rac_white).
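The three rounds of modification above can be sketched in a few lines of Python. This is only an illustration of the idea, not the spreadsheet workflow actually used: the column label "Total Population", the names pop_total and pct_white, and the function clean_row are hypothetical, and only the rac_white rename comes from the text.

```python
# Hypothetical mapping from raw census column labels to short variable names.
# Only "Race White Alone" -> "rac_white" is taken from the write-up.
RENAME = {
    "Total Population": "pop_total",
    "Race White Alone": "rac_white",
}


def clean_row(raw_row):
    """Return a NEW dict; the raw row is left untouched."""
    # Rounds 1 and 3: keep only the variables of interest and rename them.
    row = {RENAME[k]: float(v) for k, v in raw_row.items() if k in RENAME}

    # Round 2: re-represent a count as a percentage of total population.
    total = row.get("pop_total")
    if total and "rac_white" in row:
        row["pct_white"] = round(100 * row["rac_white"] / total, 1)
    else:
        row["pct_white"] = None  # empty tract or missing column
    return row


# Made-up values for a single census tract.
raw = {"Total Population": "4000", "Race White Alone": "1000", "Unused Column": "7"}
cleaned = clean_row(raw)
print(cleaned)
```

Building a new dictionary instead of editing the input mirrors the choice to work on copies and leave the raw files alone.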

The next step was to map the data. The first phase of mapping used ArcMap.
First, the geographies had to be prepared. This meant gathering shapefiles for census tracts and for geographic markers like waterways, parks, and transit.
Second, census data for population and housing characteristics needed to be joined to the census tract shapefiles. This stepwise guide from the Scholar’s Lab was especially helpful in further modifying the data sheets and the shapefile to accommodate this join.
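The join itself was done in ArcMap, but the logic behind it is a simple key match between each tract shape and its row of census data. The sketch below assumes both sides carry an 11-digit tract identifier in a field called GEOID; that field name, the function names, and the sample values are hypothetical, and real files may use a different identifier column. A common reason the join fails is that one side stores the identifier as a number and the other as text, so the sketch coerces both to text first.

```python
def normalize_id(value):
    """Coerce a tract identifier to text so number and string keys can match."""
    return str(value).strip()


def join_attributes(tracts, table, key="GEOID"):
    """Attach census attributes to tract records that share the same identifier."""
    # Index the census table by its normalized identifier for quick lookup.
    lookup = {normalize_id(row[key]): row for row in table}

    joined = []
    for tract in tracts:
        tract_id = normalize_id(tract[key])
        match = lookup.get(tract_id, {})
        # Copy everything except the key itself; unmatched tracts gain nothing.
        attrs = {k: v for k, v in match.items() if k != key}
        joined.append({**tract, key: tract_id, **attrs})
    return joined


# Made-up example: the shapefile side holds text IDs, the spreadsheet side a number.
tracts = [{"GEOID": "36081000100"}, {"GEOID": "36081000200"}]
table = [{"GEOID": 36081000100, "rac_white": 1000}]

for record in join_attributes(tracts, table):
    print(record)
```

Tracts with no matching row are kept rather than dropped, which makes gaps in the census table easy to spot on the map.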

From here, the data could be explored and maps could be exported.

In addition, the joined data could be exported as whole shapefiles, and those shapefiles could be converted to GeoJSON files, which will be needed in the next phase of this project.
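For readers unfamiliar with the format, the sketch below shows roughly what one census tract looks like once it is in GeoJSON. It does not perform the shapefile conversion, which a GIS tool would normally handle; it only builds the target structure by hand using Python's standard json module. The coordinates, the GEOID value, the pct_white attribute, and the function name to_feature are all made up for illustration.

```python
import json


def to_feature(ring, properties):
    """Wrap one polygon outline ([longitude, latitude] pairs) as a GeoJSON Feature."""
    # GeoJSON expects a closed ring: the first and last positions must be identical.
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": properties,
    }


# Made-up rectangle, listed counterclockwise in longitude, latitude order.
ring = [[-73.95, 40.75], [-73.94, 40.75], [-73.94, 40.76], [-73.95, 40.76]]

collection = {
    "type": "FeatureCollection",
    "features": [to_feature(ring, {"GEOID": "36081000100", "pct_white": 25.0})],
}

geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Each tract keeps its cleaned variables under "properties", so the short variable names chosen during cleaning carry through to whatever reads the GeoJSON later.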

File Access

Data files at various stages have been made available via Google Drive.