Dropping in this issue per Emily's suggestion.

Curious whether you've seen any gap around monitoring in Section E: Deployment when discussing the checklist in workshops, etc. The items in there right now touch on redress, unintended use, etc., but not really on monitoring the model's impact on the individuals affected once it is deployed. This gets discussed, e.g., in *Weapons of Math Destruction*, and it has been coming up more as more models are deployed.

Example: the Dutch Prime Minister and his entire cabinet resigned after investigations revealed that 26,000 innocent families had been wrongly accused of social benefits fraud, partly due to a discriminatory algorithm.

I might phrase the item something like:

**E.1 Human review**: What is our plan for monitoring the impacts of the model at scale on the humans behind the data points, especially in cases where the model performs relatively poorly or with lower confidence?

(and then incrementing the other E items by one, E.2 through E.5)

This also feels clearly different from concept drift, which is more about the data distribution changing relative to model development and less about whether the model is being used as intended at scale.
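If it helps with implementation, here is a minimal sketch of the renumbering, assuming the section is kept somewhere as an ordered list of items (the structure and existing item texts below are hypothetical placeholders, not necessarily the checklist's actual source format):

```python
# Illustrative sketch only: a hypothetical list-of-dicts for Section E.
# Existing items are stand-ins based on the ones mentioned above.
section_e = [
    {"id": "E.1", "summary": "Redress", "text": "..."},
    {"id": "E.2", "summary": "Concept drift", "text": "..."},
    {"id": "E.3", "summary": "Unintended use", "text": "..."},
    {"id": "E.4", "summary": "...", "text": "..."},
]

new_item = {
    "id": "E.1",
    "summary": "Human review",
    "text": (
        "What is our plan for monitoring the impacts of the model at scale "
        "on the humans behind the data points, especially in cases where "
        "the model performs relatively poorly or with lower confidence?"
    ),
}

# Prepend the new item, then renumber so the existing items become E.2-E.5.
section_e.insert(0, new_item)
for i, item in enumerate(section_e, start=1):
    item["id"] = f"E.{i}"
```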
I think this is a good item to add to the list, and I agree on the numbering. This is basically "Does this model work as we intend at all?". The current E.1 speaks to redress when something goes wrong, but "are we actually checking whether something goes wrong?" is left implicit.
Thanks @jayqi! I'd be glad to make a PR, but I don't know the proper way to add an item (last time I edited the files directly in multiple places, which Emily fixed when implementing).
If this issue is addressed, I'd be glad to be a beta tester for it, following #89.