Hello,
I am trying to understand the auxiliary function trick for importance scores that are linear with respect to the black-box model. In the paper it boils down to this:
From the text I don't quite get the meaning of $\tilde{x}$.
What is effectively done when one wants to use a feature importance attribution method on an embedding network $f$ for a specific sample/image $x$?
Do we calculate the dot product of the vector $f(x)$ with itself and apply the method to that scalar output (i.e. $x = \tilde{x}$)? Or does $\tilde{x}$ come from somewhere else, and we compute $f(x) \cdot f(\tilde{x})$?
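To make my reading of the trick concrete, here is a minimal sketch of the first interpretation, assuming $\tilde{x}$ ranges over perturbed inputs while $f(x)$ is held fixed, so the auxiliary function $g_x(\tilde{x}) = f(\tilde{x}) \cdot f(x)$ is scalar and any standard attribution method can be applied to it (the linear embedding `W`, the function names, and the gradient-times-input attribution are all my own toy choices, not from the paper):

```python
# Toy sketch of the auxiliary-function trick: wrap the vector-valued embedding
# f into a scalar function g_x(x_tilde) = <f(x_tilde), f(x)> with x frozen,
# then run a scalar attribution method on g_x, evaluated at x_tilde = x.
# A linear embedding f(x) = W @ x stands in for the black-box network.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))  # toy embedding: R^3 -> R^4

def f(x):
    return W @ x

def g(x_tilde, x_fixed):
    # Scalar auxiliary function: inner product with the frozen embedding f(x).
    return f(x_tilde) @ f(x_fixed)

def grad_times_input(x):
    # For this linear toy model, the gradient of g w.r.t. x_tilde (with
    # x_fixed frozen) is W.T @ f(x); gradient * input gives per-feature scores.
    grad = W.T @ f(x)
    return grad * x

x = rng.standard_normal(3)
scores = grad_times_input(x)
# In the linear case the scores sum exactly to g(x, x) = ||f(x)||^2.
print(np.allclose(scores.sum(), g(x, x)))  # True
```

Is this the intended construction, or is $\tilde{x}$ fixed to something else (e.g. a separate reference sample)?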
Also, could you please point to literature or elaborate on what it means for a feature importance score to be "linear with respect to the black-box"?
I would really appreciate an answer!
Kind regards