The issue seems to be related to the VGG16 model. You could try the Xception model: Xception uses depthwise separable convolutions, which should keep the model from consistently pointing to filter 155 for different images.
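A minimal sketch of the suggested swap. Passing `weights=None` here just builds the architecture for inspection without downloading the ImageNet weights; the `input_shape` is an arbitrary valid choice, not something the original reproduces:

```python
from tensorflow import keras

# Build Xception without downloading weights, only to inspect its layers.
model = keras.applications.Xception(weights=None, include_top=False,
                                    input_shape=(128, 128, 3))

# Unlike VGG16's plain Conv2D stacks, Xception is built from
# depthwise separable convolutions.
has_sepconv = any(isinstance(layer, keras.layers.SeparableConv2D)
                  for layer in model.layers)
```

For the actual experiment you would of course pass `weights="imagenet"`, as with the VGG16 backbone.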
TensorFlow version
2.11.0
Custom code
Yes
OS platform and distribution
Windows
Python version
3.7.16
Hello,
I'm working with a gradient-based interpretability method (based on the Grad-CAM code from Keras), and I'm running into a result that seems inconsistent with what one would expect from backpropagation.
I am working with a VGG16 pretrained on ImageNet, and I am interested in finding the most relevant filters for a given class.
I start by forward-propagating an image through the network, and then, from the relevant output bin, I compute the gradients with respect to the layer in question (just like in the Keras tutorial).
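The gradient step I mean is the usual `tf.GradientTape` setup from the Keras Grad-CAM tutorial. Here is a self-contained sketch using a small toy model in place of VGG16 (the layer name `target_conv`, the shapes, and the class index are all made up for illustration):

```python
import tensorflow as tf

# Toy stand-in for VGG16: a conv layer whose activations we inspect,
# followed by a classifier head.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, activation="relu", name="target_conv")(inputs)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(inputs, outputs)

# Model returning both the target layer's activations and the predictions.
grad_model = tf.keras.Model(
    model.inputs,
    [model.get_layer("target_conv").output, model.output],
)

img = tf.random.uniform((1, 32, 32, 3))
with tf.GradientTape() as tape:
    conv_out, preds = grad_model(img)
    class_channel = preds[:, 0]  # score of the class ("bin") of interest

# d(class score) / d(conv activations), shape (1, H, W, filters)
grads = tape.gradient(class_channel, conv_out)
```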
Then, from the pooled gradients (`pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))`), I find the top-K highest/most pertinent filters.

From this experiment, I run into two strange results.
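Concretely, the pooling and top-K selection amount to something like the following (the gradient tensor here is random, and the `(1, 14, 14, 512)` shape is just VGG16's last conv block, used for illustration):

```python
import tensorflow as tf

# Hypothetical gradients w.r.t. a conv layer's output: (batch, H, W, filters)
grads = tf.random.normal((1, 14, 14, 512))

# Average over batch and spatial dims -> one importance score per filter
pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))  # shape: (512,)

# Top-K filters by signed pooled gradient; use tf.abs(pooled_grads)
# instead if magnitude is what matters.
k = 5
top = tf.math.top_k(pooled_grads, k=k)
top_filters = top.indices.numpy()
```

Note that `top_k` on the signed values picks the most *positively* influential filters; filters with large negative gradients would be missed unless you rank by absolute value.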
For almost any image I pass through (even from completely different classes), the network almost always assigns the most importance to the same single filter.
And this result I understand even less: many times the gradients point "strongly" to a filter even though the filter's output is zero/negative (before the ReLU). From the backpropagation equations, a negative response should result in a null gradient, right?
If $Activation_{in} \cdot W + b$ is negative, then $\frac{dY_{class}}{d\,Activation_{in}}$ should be 0, right?
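A tiny NumPy check of this claim, with backprop written out by hand for a one-layer "class score" (all numbers are made up): the unit whose pre-activation is negative contributes exactly zero gradient, because the ReLU derivative gates it off.

```python
import numpy as np

# Forward pass: z = x @ W + b, a = relu(z), y = a @ v (scalar "class score")
x = np.array([1.0, 2.0])
W = np.array([[-1.0, 0.5],
              [-1.0, 0.5]])
b = np.array([0.0, 0.0])
v = np.array([1.0, 1.0])

z = x @ W + b              # pre-activations: unit 0 is negative, unit 1 positive
a = np.maximum(z, 0.0)     # ReLU
y = a @ v

# Backward pass: dy/dz = v * relu'(z), where relu'(z) = 1 if z > 0 else 0
dy_dz = v * (z > 0.0)
dy_dx = W @ dy_dz          # gradient reaching the inputs

# Unit 0 has z < 0, so dy_dz[0] == 0: no gradient flows through a dead ReLU.
```

So the intuition in the question is correct *for the gradient flowing through that unit's ReLU*. One possible resolution worth checking: if the "filter output" being inspected is the pre-activation (`conv` layer output before a separate `Activation`/ReLU layer), or if the gradient is taken at a point upstream of the ReLU, a negative response can still carry a nonzero gradient.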
I provided 3 images.
All 3 images consistently point to Filter155 (observation 1).
And for Img3.JPEG, among the Top5 most relevant filters, Filter336 has a strong gradient and yet a completely null output.
Is the problem in my code, in the gradient computations, or just in my understanding?
Thanks for your help.
Liam
Standalone code to reproduce the issue
Relevant log output