Contribute any data errors found here (plus tracking of corrections) #49

Open
1 task done
daveluo opened this issue Mar 23, 2020 · 4 comments
Labels
Data Gathering: Work for Gathering, Cleaning & Cataloging data

Comments


daveluo commented Mar 23, 2020

This issue serves as an ongoing thread (for now at least) to keep track of specific data errors found by us or anyone else using our datasets. These could be errors spotted while looking at the visualizations, the CSVs or GeoJSONs, or while working with the data as an input to your applications.

We're focusing on the hospital facility-level data as the primary place to find errors because it's where several datasets converge, and the data for higher spatial groupings (county, state, HRR) are all aggregated from this facility-level data. That said, flagging errors spotted in any of the data files here would be much appreciated!

Please add comments to this issue with any data point(s) you see that may be erroneous. The minimum info needed to flag a data error helpfully and lead to a quicker fix is:

  • Facility name, city, and state exactly as written in the data (or the CCM_ID field if you're looking at the CSV or GeoJSON files)
  • Which data field(s) are suspected or definitely incorrect, and a short description of why
  • If you have it, the correct value(s) and the data source you obtained it from (e.g. the health facility's website, you work there, cross-referencing against another dataset)
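
As a rough illustration of the checklist above, a report could be captured as a small record and checked for completeness before it is posted. This is only a sketch: the class, its field names (other than the idea of a CCM_ID), and the example values are hypothetical and are not the project's actual schema or tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataErrorReport:
    """One reported data error. Field names here are illustrative only."""
    fields_in_question: List[str]        # which data field(s) look wrong
    reason: str                          # short description of why
    facility_name: Optional[str] = None  # exactly as written in the data
    city: Optional[str] = None
    state: Optional[str] = None
    ccm_id: Optional[str] = None         # alternative to name/city/state
    correct_value: Optional[str] = None  # optional: the right value, if known
    source: Optional[str] = None         # optional: where that value came from

    def missing_info(self) -> List[str]:
        """Return the pieces of minimal info that are still missing."""
        missing = []
        has_identity = bool(self.ccm_id) or all(
            [self.facility_name, self.city, self.state]
        )
        if not has_identity:
            missing.append("facility name, city, state (or CCM_ID)")
        if not self.fields_in_question:
            missing.append("data field(s) in question")
        if not self.reason.strip():
            missing.append("reason the value looks wrong")
        if self.correct_value and not self.source:
            missing.append("source for the corrected value")
        return missing


# Placeholder values, not a real facility or a real report.
report = DataErrorReport(
    fields_in_question=["bed_count"],
    reason="Value looks far higher than the facility's size",
    facility_name="Example General Hospital",
    city="Exampleville",
    state="XX",
)
print(report.missing_info())  # an empty list means the report is complete
```

The correct value and its source are treated as optional, matching the "if you have it" wording above; the only extra rule assumed here is that a corrected value should come with a source.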

We will monitor and update this issue regularly to correct the data as it is reported and also address any systemic data ingestion or processing issues that may be causing more widespread errors (like seeing multiple facilities having the same type of error).

A spreadsheet has been created to track errors reported, staged for correction, and manually corrected here: https://docs.google.com/spreadsheets/d/1cTJeIMn_V1XQnWe2CPcJgi0m-iPIn3rdtY6HOHRMFbA/edit?usp=sharing

  • Set up a Google Form to report data errors that auto-feeds into the staging sheet of the above spreadsheet
daveluo changed the title from "Contribute data errors found here (plus tracking of manual corrections)" to "Contribute any data errors found here (plus tracking of corrections)" Mar 23, 2020
daveluo added the "Data Gathering" label (Work for Gathering, Cleaning & Cataloging data) Mar 23, 2020

daveluo commented Mar 24, 2020

Thanks to Andrew:

NYP is missing at Columbia University Medical Center at 166th street

This is also not in the DH data source, so we're going to add it manually.


daveluo commented Mar 25, 2020

Data errors or additions can now also be submitted via Google Form at https://forms.gle/vPqPpgAwqoUgep47A, which can also be found as the "Update data" link at the top of the map viz.

I'll keep this issue open for now to facilitate more data error reporting and for the tracking of corrections.


daveluo commented Mar 25, 2020

Analysis re: why:

NYP is missing at Columbia University Medical Center at 166th street

The facilities (NYP - Columbia UMC and the NYP Children's Hospital next door) were in neither of the 2 main hospital datasets that we merge and cross-reference against each other, so the omission slipped through. We have a 3rd dataset (HIFLD) that we haven't started using yet, mainly for lack of time, but it did have these facilities, so we manually added the info from there (supplemented by NY state data).

Going forward, we'll be prioritizing the integration of this 3rd facility dataset (#25) as another crosscheck and source of facility data, which should be done in the next few days. We can also keep state-by-state hospital licensing datasets on hand to manually crosscheck against as a 4th option.
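
To illustrate the kind of crosscheck described above, here is a minimal sketch that lists facilities present in a reference dataset but absent from the merged one. It is not the project's actual pipeline: the function names, the `name` and `state` keys, and the sample records are all hypothetical, and it assumes each record is a plain dict.

```python
import re
from typing import Dict, List, Tuple


def facility_key(name: str, state: str) -> Tuple[str, str]:
    """Normalize name and state so trivial formatting differences still match."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return (" ".join(cleaned.split()), state.strip().upper())


def find_missing(merged: List[Dict[str, str]],
                 reference: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return reference records whose normalized key is not in the merged data."""
    known = {facility_key(r["name"], r["state"]) for r in merged}
    return [r for r in reference
            if facility_key(r["name"], r["state"]) not in known]


# Placeholder records, not real facility data.
merged = [{"name": "Example General Hospital", "state": "NY"}]
reference = [
    {"name": "EXAMPLE GENERAL HOSPITAL", "state": "ny"},
    {"name": "Example Children's Hospital", "state": "NY"},
]

for record in find_missing(merged, reference):
    print(record["name"])  # candidates to review and possibly add manually
```

Exact matching on a normalized name will miss facilities that are spelled differently across datasets, so anything it flags (or fails to flag) would still need manual review; matching on address or coordinates is a plausible refinement but is not shown here.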

@alabamacajun

Data collection is slow due to the lack of information. I started in St John Parish, Louisiana, which had no data. The parish once had River Parishes Hospital, with a reported 82 beds. It was bought out and replaced with a 13-bed emergency care center, which is mostly an ER and outpatient facility. This happens to be one of the hot spots for death rates per capita. Seeing this case was sad, remembering that the original facility was a full hospital to which a doctors' center was later added.


2 participants