Added checklist item for downstream bias mitigation #119

Merged: 5 commits, Dec 12, 2020
1 change: 1 addition & 0 deletions README.md
@@ -158,6 +158,7 @@ Options:
- [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
- [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
- [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
- [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?

## B. Data Storage
- [ ] **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?
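The new A.4 item asks whether the collected data makes it possible to test downstream results for biased outcomes. As a minimal sketch of what such a test could look like (not part of deon; the group names, sample decisions, and the 0.8 "four-fifths rule" threshold are assumptions for the example), one could compare positive-outcome rates across protected groups:

```python
# Illustrative downstream bias check (checklist item A.4).
# Assumes model decisions are recorded alongside protected group status.

def selection_rates(outcomes):
    """Positive-outcome rate per group; `outcomes` maps group -> list of 0/1 decisions."""
    return {group: sum(vals) / len(vals) for group, vals in outcomes.items()}

def disparate_impact_ratio(outcomes):
    """Ratio of the lowest to the highest group selection rate."""
    rates = selection_rates(outcomes)
    return min(rates.values()) / max(rates.values())

# Hypothetical decisions for two groups (made-up data for illustration).
outcomes = {
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],  # rate 0.75
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],  # rate 0.375
}
ratio = disparate_impact_ratio(outcomes)
print(f"disparate impact ratio: {ratio:.2f}")  # 0.375 / 0.75 = 0.50
if ratio < 0.8:  # four-fifths rule of thumb
    print("warning: selection rates differ substantially across groups")
```

The point of A.4 is that this check is only possible if group status was collected in the first place; the threshold and metric here are just one common convention, not a recommendation from the checklist itself.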
5 changes: 4 additions & 1 deletion deon/assets/checklist.yml
@@ -1,5 +1,5 @@
title: Data Science Ethics Checklist
sections:
- title: Data Collection
section_id: A
lines:
@@ -12,6 +12,9 @@ sections:
- line_id: A.3
line_summary: Limit PII exposure
line: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
- line_id: A.4
line_summary: Downstream bias mitigation
line: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
- title: Data Storage
section_id: B
lines:
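The checklist.yml entries above are what gets rendered into the README checklist. A rough sketch of that rendering, using only the field names visible in this diff (`render_section` is a hypothetical helper for illustration, not deon's actual implementation):

```python
# Mirrors the structure shown in checklist.yml: a section has a title,
# a section_id, and a list of lines with line_id / line_summary / line.
section = {
    "title": "Data Collection",
    "section_id": "A",
    "lines": [
        {
            "line_id": "A.4",
            "line_summary": "Downstream bias mitigation",
            "line": (
                "Have we considered ways to enable testing downstream "
                "results for biased outcomes (e.g., collecting data on "
                "protected group status like race or gender)?"
            ),
        },
    ],
}

def render_section(section):
    """Render one section dict as the markdown checklist seen in README.md."""
    out = [f"## {section['section_id']}. {section['title']}"]
    for item in section["lines"]:
        out.append(f"- [ ] **{item['line_id']} {item['line_summary']}**: {item['line']}")
    return "\n".join(out)

print(render_section(section))
```

This is why the PR touches both files: the YAML is the source of truth, and the README line for A.4 is its rendered form.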
18 changes: 12 additions & 6 deletions deon/assets/examples_of_ethical_issues.yml
@@ -16,6 +16,14 @@
url: https://www.theguardian.com/technology/2014/jun/27/new-york-taxi-details-anonymised-data-researchers-warn
- text: Netflix prize dataset of movie rankings by 500,000 customers is easily de-anonymized through cross referencing with other publicly available datasets.
url: https://www.wired.com/2007/12/why-anonymous-data-sometimes-isnt/
- line_id: A.4
links:
- text: In six major cities, Amazon's same day delivery service excludes many predominantly black neighborhoods.
url: https://www.bloomberg.com/graphics/2016-amazon-same-day/
- text: Facial recognition software is significantly worse at identifying people with darker skin.
url: https://www.theregister.co.uk/2018/02/13/facial_recognition_software_is_better_at_white_men_than_black_women/
- text: -- Related academic study.
url: http://proceedings.mlr.press/v81/buolamwini18a.html
- line_id: B.1
links:
- text: Personal and financial data for more than 146 million people was stolen in Equifax data breach.
@@ -52,6 +60,8 @@
links:
- text: Misleading chart shown at Planned Parenthood hearing distorts actual trends of abortions vs. cancer screenings and preventative services.
url: https://www.politifact.com/truth-o-meter/statements/2015/oct/01/jason-chaffetz/chart-shown-planned-parenthood-hearing-misleading-/
- text: Georgia Dept. of Health graph of COVID-19 cases falsely suggests a steeper decline when dates are ordered by total cases rather than chronologically.
url: https://www.vox.com/covid-19-coronavirus-us-response-trump/2020/5/18/21262265/georgia-covid-19-cases-declining-reopening
- line_id: C.4
links:
- text: Strava heatmap of exercise routes reveals sensitive information on military bases and spy outposts.
@@ -62,8 +72,6 @@
url: https://www.bbc.com/news/magazine-22223190
- line_id: D.1
links:
- text: In six major cities, Amazon's same day delivery service excludes many predominantly black neighborhoods.
url: https://www.bloomberg.com/graphics/2016-amazon-same-day/
- text: Variables used to predict child abuse and neglect are direct measurements of poverty, unfairly targeting low-income families for child welfare scrutiny.
url: https://www.wired.com/story/excerpt-from-automating-inequality/
- text: Amazon scraps AI recruiting tool that showed bias against women.
@@ -74,6 +82,8 @@
url: https://www.whitecase.com/publications/insight/algorithms-and-bias-what-lenders-need-know
- line_id: D.2
links:
- text: Apple credit card offers smaller lines of credit to women than men.
url: https://www.wired.com/story/the-apple-card-didnt-see-genderand-thats-the-problem/
- text: Google Photos tags two African-Americans as gorillas.
url: https://www.forbes.com/sites/mzhang/2015/07/01/google-photos-tags-two-african-americans-as-gorillas-through-facial-recognition-software/#12bdb1fd713d
- text: With COMPAS, a risk-assessment algorithm used in criminal sentencing, black defendants are almost twice as likely as white defendants to be mislabeled as likely to reoffend.
@@ -84,10 +94,6 @@
url: https://www.liebertpub.com/doi/pdf/10.1089/big.2016.0047
- text: Google's speech recognition software doesn't recognize women's voices as well as men's.
url: https://www.dailydot.com/debug/google-voice-recognition-gender-bias/
- text: Facial recognition software is significantly worse at identifying people with darker skin.
url: https://www.theregister.co.uk/2018/02/13/facial_recognition_software_is_better_at_white_men_than_black_women/
- text: -- Related academic study.
url: http://proceedings.mlr.press/v81/buolamwini18a.html
- text: Google searches involving black-sounding names are more likely to serve up ads suggestive of a criminal record than white-sounding names.
url: https://www.technologyreview.com/s/510646/racism-is-poisoning-online-ad-delivery-says-harvard-professor/
- text: -- Related academic study.