
Inferring Sensitive Attributes from Model Explanations

Code for the paper titled "Inferring Sensitive Attributes from Model Explanations" published in ACM CIKM 2022.

Requirements

You need conda. Create a virtual environment and install requirements:

conda env create -f environment.yml

To activate:

conda activate attinf-explanations

To update the env:

conda env update --name attinf-explanations --file environment.yml

or

conda activate attinf-explanations
conda env update --file environment.yml

Dataset

Link to datasets: https://drive.google.com/drive/folders/1bUH02Y9I6_NVrfo5_8PwWtdklk15rXPJ

Usage

Evaluate attribute inference attacks against explanations

python -m src.attribute_inference --dataset {LAW,MEPS,CENSUS,CREDIT,COMPAS} --explanations {IntegratedGradients,smoothgrad,DeepLift,GradientShap} --attfeature {both,expl}

--attfeature selects whether the attack model uses only the explanations (expl) or both the model's predictions and the explanations (both) as attack features.
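For illustration, a minimal sketch of what such an attack could look like, assuming the explanations have already been computed and a scikit-learn-style attack classifier is used; the variable and function names (run_attack, explanations, predictions, sensitive_attr) are hypothetical and not taken from this repository:

# Hypothetical sketch of an attribute inference attack on explanations.
# explanations: (n_samples, n_features) attribution values
# predictions: (n_samples, n_classes) target model outputs
# sensitive_attr: (n_samples,) sensitive attribute labels
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def run_attack(explanations, predictions, sensitive_attr, attfeature="expl"):
    # "expl": the attack sees only explanations;
    # "both": the attack sees explanations concatenated with predictions.
    if attfeature == "both":
        features = np.concatenate([explanations, predictions], axis=1)
    else:
        features = explanations
    X_train, X_test, y_train, y_test = train_test_split(
        features, sensitive_attr, test_size=0.5, random_state=0
    )
    attack_model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500)
    attack_model.fit(X_train, y_train)
    return accuracy_score(y_test, attack_model.predict(X_test))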

Attacking using entire explanations for both sensitive and non-sensitive attributes

python -m src.attribute_inference --dataset {LAW,MEPS,CENSUS,CREDIT,COMPAS} --explanations {IntegratedGradients,smoothgrad,DeepLift,GradientShap} --attfeature expl --with_sattr True

Attacking using only explanations corresponding to sensitive attributes

python -m src.infer_s_from_phis --dataset {LAW,MEPS,CENSUS,CREDIT,COMPAS} --explanations {IntegratedGradients,smoothgrad,DeepLift,GradientShap}
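This variant restricts the attack features to the explanation values of the sensitive attributes themselves. A rough sketch, reusing the hypothetical run_attack from the sketch above; the column index is illustrative and depends on the dataset:

# Keep only the attribution column(s) corresponding to the sensitive attribute(s).
sensitive_idx = [3]  # hypothetical position of the sensitive attribute
phi_s = explanations[:, sensitive_idx]
attack_acc = run_attack(phi_s, predictions, sensitive_attr, attfeature="expl")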

Update (2024): Bug Fix

There was a bug in one of the parameters used to generate explanations: "target" was set to 0 for every input, but it must be set to the class of the input. This has been fixed. The attack accuracies now differ from, and in some cases are better than, those reported in the paper, since the gradients are computed with respect to the correct class. The paper's conclusion that model explanations leak sensitive attributes remains valid.
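For context, the fix amounts to passing each input's class as the target when computing attributions. A minimal sketch using Captum's IntegratedGradients (the explanation methods listed above correspond to Captum attribution classes, but the actual code in this repository may differ; the toy model and inputs are illustrative only):

import torch
from captum.attr import IntegratedGradients

# Toy stand-in for the trained target model and a batch of inputs.
model = torch.nn.Sequential(torch.nn.Linear(10, 2))
inputs = torch.randn(8, 10)

ig = IntegratedGradients(model)

# Buggy version: attributions were always computed w.r.t. class 0.
# attributions = ig.attribute(inputs, target=0)

# Fixed version: attributions computed w.r.t. each input's predicted class.
preds = model(inputs).argmax(dim=1)
attributions = ig.attribute(inputs, target=preds)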
