
Meaning of x-tilde and "importance score linear wrt the black-box" #2

Open
dennysemko opened this issue May 24, 2023 · 0 comments
Hello,
I am trying to understand the auxiliary function trick for importance scores linear wrt the black-box model.
In the paper it boils down to this:

$$b_i(f,x) \equiv a_i(g_x,x)$$

$$g_x = \sum_{j=1}^{d_H}f_j(x) \cdot f_j(\tilde{x})$$

From the text I don't quite get what the meaning of $\tilde{x}$ is.

What is effectively done when one wants to use a feature importance attribution method on an embedding network $f$ for a specific sample/image $x$?
Do we calculate the dot product of the vector $f(x)$ with itself and apply the method to that scalar output (i.e. $\tilde{x} = x$)? Or does $\tilde{x}$ come from somewhere else, so that we compute $f(x) \cdot f(\tilde{x})$ for a different $\tilde{x}$?
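To make the first reading concrete, here is a minimal sketch of what I think is being described (PyTorch; the toy linear "embedding network" and the plain gradient saliency are stand-ins I chose for illustration, not anything from the paper):

```python
import torch

# Toy stand-in for the black-box embedding network f: R^4 -> R^3.
# Weights are arbitrary; any encoder would do.
torch.manual_seed(0)
f = torch.nn.Linear(4, 3)

x = torch.randn(4)  # the specific sample we want to explain


def g_x(u):
    # Auxiliary scalar function: inner product of f(u) with the *fixed*
    # embedding f(x), i.e. g_x(u) = sum_j f_j(x) * f_j(u).
    # f(x) is detached so gradients only flow through the f(u) argument.
    return torch.dot(f(u), f(x).detach())


# Reading 1: set x_tilde = x, then apply the attribution method (plain
# gradient saliency here) to the scalar g_x, evaluated at x itself.
u = x.clone().requires_grad_(True)
g_x(u).backward()
saliency = u.grad  # one importance score per input feature of x
print(saliency)
```

Is this what is meant, with the attribution method of one's choice substituted for the plain gradient?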

Also, could you please point to literature or elaborate on what it means for a feature importance score to be "linear with respect to the black-box"?
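My current guess is that it means linearity of the score in its first argument, i.e.

$$a_i(\alpha f + \beta g, x) = \alpha \, a_i(f, x) + \beta \, a_i(g, x),$$

which would hold, for example, for plain gradient saliency, since differentiation is linear in $f$ — but please correct me if the paper means something else.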

I would really appreciate an answer!
Kind regards
