
About Loss of InfoNCE and Cluster_results #22

Open
lsyysl9711 opened this issue Mar 15, 2023 · 2 comments

Comments

@lsyysl9711

Hi,

  1. I notice that the labels created in the InfoNCE loss are always a zero vector:

     labels = torch.zeros(logits.shape[0], dtype=torch.long).cuda()

     I think this is wrong, because then the loss would always be zero. Did I misunderstand the code?

  2. When creating the cluster_result dictionary, I found that only the eval dataset is taken into consideration (PCL/main_pcl.py, line 299 in 964da1f):

     cluster_result['im2cluster'].append(torch.zeros(len(eval_dataset), dtype=torch.long).cuda())

     What is the motivation behind this operation? I think it should be run on the training set.

@lerogo

lerogo commented Dec 4, 2023


Same question.

@Volibear1234


For the first question, you can refer to the MoCo v1 code, where the InfoNCE loss is implemented directly with cross-entropy: the positive logit is placed in column 0 of logits, so a target of 0 selects the positive pair for every sample, and the loss is not zero in general. As for the second question, they use the eval_dataset to build the negative prototypes, and in this line of code:

output, target, output_proto, target_proto = model(im_q = images[0], im_k = images[1],
                                                   cluster_result = cluster_result, index = index)

the index passed in comes from the train_loader, so the computation is still based on the train_dataset.
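To make the first point concrete, here is a minimal self-contained sketch (the batch size and number of negatives are made up for illustration) showing that all-zeros labels combined with cross-entropy do not produce a zero loss, because in MoCo-style InfoNCE the positive logit sits in column 0:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes: 4 queries, each with 1 positive + 8 negative logits.
# MoCo-style InfoNCE puts the positive logit in column 0, so the correct
# class index for every row is 0 -- hence the all-zeros label tensor.
logits = torch.randn(4, 1 + 8)
labels = torch.zeros(logits.shape[0], dtype=torch.long)

# cross_entropy computes -log softmax(logits)[:, 0], averaged over rows;
# it approaches zero only when the positive logit dominates the negatives.
loss = F.cross_entropy(logits, labels)
print(loss.item())
```

Running this prints a strictly positive loss for random logits; the zeros in labels only encode *where* the positive is, not that the loss vanishes.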
