Screenshot of Spreadsheets of datasets collected and reviewed

Data & Methodology – Phase 1

Overview/Goal:

To collect data that speaks to how Western Queens has changed over time, specifically with respect to population density and demographics, housing density and costs, business, and zoning. The ultimate goal is to map these changes dating back to just before the establishment of public housing, nearly a century ago.

Limitations:

This first step shows these changes within the limits of publicly available and usable data. That constrains the use of census data, for which reliable spatial files extend back only to 2000. Data provided by the city was equally unreliable over longer time periods, with the majority of datasets extending back only to the early 2000s. A future step will focus on addressing this gap.

The following are details of the data and methodology used for this first step.

Data Sources:

This stage relied largely on five data sources:

  1. NYC Open Data, NYC Gov
  2. BYTES of the Big Apple, NYC Gov
  3. Newman Library, Baruch College, CUNY
  4. American FactFinder, US Census Bureau
  5. Tiger / Line Shapefiles, US Census Bureau

In this Google spreadsheet, you can see the specific tables under consideration and follow links to where the data was originally pulled from.

Methodology:

GeoJSON files, shapefiles, CSVs, GDBs, and other file types were pulled from the sources listed above. The goal was to organize the data in these files in a Leaflet map to tell a story about changes in Western Queens over time. There were few changes to the actual data, but the files needed to be refined, combined, and converted before they could be presented on the Leaflet map. The processing and preparation of the data is outlined below by data type.

GeoJSON files plug directly into Leaflet with the right syntax. Plugins are also written in JavaScript, so they tend to play nice. These data files have required no manipulation from their original form.

Shapefiles can be used in Leaflet through plugins that either convert them to GeoJSON or let them behave like GeoJSON layers. So far, this has not worked for me. Instead, I will use GDAL, a command-line tool, to convert shapefiles to GeoJSON.
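As a rough sketch of that conversion step, the snippet below builds and runs an `ogr2ogr` call (the GDAL utility for converting vector formats) from Python. The function names and the file names in the usage note are hypothetical, and reprojecting to EPSG:4326 is my assumption, made because Leaflet expects longitude/latitude coordinates; this is not necessarily the exact command used for this project.

```python
import subprocess


def build_ogr2ogr_command(shapefile, geojson, target_srs="EPSG:4326"):
    """Return the ogr2ogr argument list that converts a shapefile to GeoJSON."""
    return [
        "ogr2ogr",
        "-f", "GeoJSON",       # output format driver
        "-t_srs", target_srs,  # reproject on output (assumed: WGS84 lon/lat)
        str(geojson),          # ogr2ogr takes the destination first...
        str(shapefile),        # ...and the source second
    ]


def convert(shapefile, geojson):
    """Run the conversion; requires GDAL's ogr2ogr to be on the PATH."""
    subprocess.run(build_ogr2ogr_command(shapefile, geojson), check=True)
```

Calling `convert("tracts.shp", "tracts.geojson")` (placeholder names) would write the GeoJSON file next to the shapefile, and `check=True` raises an error if `ogr2ogr` fails instead of failing silently.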

Initially, thirty data tables (ACS 2017 5-year dataset) describing population and housing characteristics were pulled. These were reviewed, and specific attributes were selected and reorganized into new data tables.
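To illustrate the "select and reorganize" step, here is a minimal sketch that keeps a few columns from a census-style CSV and renames them. The column codes, the friendly names, and the sample values are all made up for illustration; the actual headers depend on how each table was exported.

```python
import csv
import io


def select_columns(rows, keep):
    """Yield new rows containing only the columns in `keep`, renamed.

    `keep` maps an original column name to the name used in the new table.
    """
    for row in rows:
        yield {new: row[old] for old, new in keep.items()}


# Tiny inline sample standing in for a downloaded table (values are invented).
SAMPLE = """GEOID,B01003_001E,B25001_001E,unused
36081000100,4200,1800,x
36081000200,3100,1250,y
"""

# Hypothetical mapping: original header -> header in the reorganized table.
KEEP = {"GEOID": "GEOID", "B01003_001E": "total_pop", "B25001_001E": "housing_units"}

selected = list(select_columns(csv.DictReader(io.StringIO(SAMPLE)), KEEP))

# Write the reorganized table back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(KEEP.values()))
writer.writeheader()
writer.writerows(selected)
print(out.getvalue())
```

Values are read as strings on purpose, so tract identifiers keep any leading zeros.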

Find this data here (Databook). These are the CSVs.

Census data is not inherently geo-referenced. The suggested approach is to “join” the CSV with a TIGER/Line Shapefile (provided by the US Census Bureau) in ArcMap or QGIS. The joined layer can then be exported as one new shapefile, converted to GeoJSON using GDAL, and added to the map in Leaflet. To this end:

  1. NYC Geographies TIGER/Line Shapefiles for census tracts were collected by county. These were merged to create a single shapefile including census tracts across NYC counties using the instructions in this video. This process created two new files, one for 2010 (nycCT2010.shp) and one for 2000 (nycCT2000.shp).
  2. These two new shapefiles were joined with the CSVs containing the census data tables.
  3. Each joined layer was exported as a new shapefile that carries the attributes from the corresponding CSV.
  4. These shapefiles can be converted to GeoJSON using GDAL.
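The merge and join above were done in desktop GIS software. Purely to show the logic, here is a sketch of the same two operations done on GeoJSON in plain Python, which is a different technique from the ArcMap/QGIS workflow described here. The `GEOID` key name, the sample tract IDs, and the values are assumptions for illustration; the actual identifier field depends on the shapefile vintage.

```python
import csv
import io


def merge_collections(collections):
    """Combine several GeoJSON FeatureCollections (e.g. one per county) into one."""
    features = []
    for fc in collections:
        features.extend(fc["features"])
    return {"type": "FeatureCollection", "features": features}


def join_attributes(collection, rows, key="GEOID"):
    """Copy each CSV row's columns onto the feature with the same key.

    Returns the keys of features with no matching row, so gaps stay visible.
    """
    by_key = {row[key]: row for row in rows}
    unmatched = []
    for feature in collection["features"]:
        props = feature["properties"]
        row = by_key.get(props.get(key))
        if row is None:
            unmatched.append(props.get(key))
        else:
            props.update(row)
    return unmatched


def tract(geoid):
    """Build a placeholder feature; real tracts would carry polygon geometry."""
    return {"type": "Feature", "geometry": None, "properties": {"GEOID": geoid}}


# Made-up sample data: two county-level collections and one attribute table.
queens = {"type": "FeatureCollection",
          "features": [tract("36081000100"), tract("36081000200")]}
manhattan = {"type": "FeatureCollection", "features": [tract("36061000100")]}
CSV_TEXT = "GEOID,total_pop\n36081000100,4200\n36081000200,3100\n"

merged = merge_collections([queens, manhattan])
unmatched = join_attributes(merged, csv.DictReader(io.StringIO(CSV_TEXT)))
print(len(merged["features"]), "features;", "unmatched:", unmatched)
```

Returning the unmatched keys is a design choice: a join that silently drops tracts is hard to spot on a map, so it helps to see which identifiers found no row in the CSV.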

I have put the GDB and other database files on hold for now. These files were not necessarily unique; shapefiles conveying the same or similar data were also available. I want to review them at a later time because I’m curious whether using a database with Leaflet is the way to go. This is a next step.
