Can GradCAM be applied to CNN regression models with multiple input images? #544

Open
njujinchun opened this issue Dec 9, 2024 · 4 comments

Comments

@njujinchun

Hello, I am working on a CNN model designed for regression tasks with multiple input images (channels). The goal is to predict a target output image based on multiple predictor images. I would like to know if GradCAM can be used to quantify the contribution of each input image to the predictions. If so, could you please provide guidance or examples on how to implement this?
Thank you very much for your time and assistance!

@jacobgil
Owner

Hi,
Sorry for the late reply, hope it is still relevant.

Do you want to quantify the contribution of each input image as a whole,
or do you want visual contributions inside each image (e.g., highlighting which pixels contributed more in image #3)?

For the first option you might need something custom, depending on how the images look.
One way would be to use ablation: zero out each channel (or use a similar ablation, such as replacing it with a constant value or smoothing the image), run the model, and check how much the model output changes.
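
For readers who want to try the ablation idea, here is a minimal sketch rather than a definitive implementation. It assumes the inputs are a NumPy array of shape (N, C, H, W) with one predictor image per channel. `predict` and `toy_predict` are hypothetical stand-ins for the trained model and are not part of pytorch-grad-cam. The score is the mean absolute change in the prediction when one channel is replaced by a constant, which is one reasonable regression analogue of a confidence drop.

```python
import numpy as np


def channel_ablation_scores(predict, x, fill_value=0.0):
    """Score each input channel by how much the prediction moves when it is ablated.

    predict: hypothetical callable mapping an array of shape (N, C, H, W)
             to an array of predictions (any shape).
    x:       input array of shape (N, C, H, W).
    Returns one score per channel: the mean absolute change in the prediction.
    """
    baseline = np.asarray(predict(x), dtype=float)
    scores = []
    for c in range(x.shape[1]):
        ablated = x.copy()
        ablated[:, c] = fill_value  # replace the whole channel with a constant
        changed = np.asarray(predict(ablated), dtype=float)
        scores.append(float(np.mean(np.abs(changed - baseline))))
    return scores


# Toy stand-in for a trained model (assumption for illustration only):
# the output depends strongly on channel 0 and weakly on channel 1.
def toy_predict(x):
    return 2.0 * x[:, 0] + 0.5 * x[:, 1]


rng = np.random.default_rng(0)
x = rng.uniform(1.0, 2.0, size=(4, 2, 8, 8))
scores = channel_ablation_scores(toy_predict, x)
print(scores)  # the first channel gets the larger score
```

With a PyTorch model, `predict` could wrap the forward pass, converting the array with `torch.from_numpy` and returning the detached output as a NumPy array. If zero already has a meaning in the data (for example, a mask value), a different `fill_value` such as the channel mean may be a fairer baseline.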

@njujinchun
Author

Hi,

Thank you very much for your reply and clarification. I greatly appreciate your time and assistance.

The problem I am addressing involves using a CNN to model the regression relationship between remotely sensed evapotranspiration (target) and temperature & precipitation (input) images for land regions (with ocean pixels assigned a constant value of zero). Specifically, I am interested in identifying which input pixels contribute the most to predicting evapotranspiration in a region like the Amazon River Basin.

Based on your explanation, it seems my problem aligns with the second case you mentioned—analyzing visual contributions within each image.

Thank you again for your insights, and I look forward to any further guidance or suggestions you may have.

Best regards,

Shaoxing

@jacobgil
Owner

I would start with the low-hanging fruit, which is getting a CAM image that identifies any relevant pixels.
Does the model have a single output (the regression output)?
If yes, you can use RawScoresOutputTarget to get a CAM for pixels that promote a higher regression output.

Following the example in the Readme:

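from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import RawScoresOutputTarget
targets = [RawScoresOutputTarget()]
with GradCAM(model=model, target_layers=target_layers) as cam:
    grayscale_cam = cam(input_tensor=input_tensor, targets=targets)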

The problem is that this does not tell us whether a pixel was important in the temperature image, the precipitation image, or both. It only tells us the pixel was relevant for at least one of them.
If this is important for you, one thing you could do is zero out the temperature image (or use an alternative ablation that makes sense to you, like replacing it with a constant value or blurring it), and then get the CAM, which should correspond to the CAM for precipitation.
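
The three ablations mentioned above (zeroing, a constant value, blurring) could be sketched as follows. This is an illustrative helper under stated assumptions, not code from pytorch-grad-cam: `ablate_channel` and `box_blur` are hypothetical names, the input is assumed to be a NumPy array of shape (N, C, H, W), the "constant" variant uses each image's own mean, and the blur is a simple mean filter.

```python
import numpy as np


def box_blur(img, k=5):
    """Blur a 2-D array with a k x k mean filter, padding the borders with edge values."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def ablate_channel(x, channel, mode="zero", k=5):
    """Return a float copy of x (N, C, H, W) with one channel ablated.

    mode="zero": set the channel to 0.
    mode="mean": replace each image in the channel with its own mean value.
    mode="blur": smooth each image in the channel with a k x k mean filter.
    """
    out = x.astype(float, copy=True)
    if mode == "zero":
        out[:, channel] = 0.0
    elif mode == "mean":
        out[:, channel] = out[:, channel].mean(axis=(1, 2), keepdims=True)
    elif mode == "blur":
        for n in range(out.shape[0]):
            out[n, channel] = box_blur(out[n, channel], k)
    else:
        raise ValueError(f"unknown ablation mode: {mode!r}")
    return out


# Example with random data; channel 0 is assumed to be temperature here.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 2, 16, 16))
x_zero = ablate_channel(x, channel=0, mode="zero")
x_mean = ablate_channel(x, channel=0, mode="mean")
x_blur = ablate_channel(x, channel=0, mode="blur")
```

The ablated array would then be converted to a tensor (for instance with `torch.from_numpy(x_zero).float()`) and passed as `input_tensor` to the `cam(...)` call. Since ocean pixels in this dataset are already set to zero, the "mean" or "blur" variants may be worth comparing against plain zeroing.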

@njujinchun
Author

Thank you very much for your detailed reply and explanations. I apologize for the delay in responding, as I just returned from holiday.
My model produces only a single output image. I will implement your suggestions and let you know if I need further assistance. Thank you again for your support—I truly appreciate it!
