
About Loss of InfoNCE and Cluster_results #22

Open
lsyysl9711 opened this issue Mar 15, 2023 · 2 comments

Comments

@lsyysl9711

Hi,

  1. I notice that the labels created in the InfoNCE loss are always a zero vector:

     labels = torch.zeros(logits.shape[0], dtype=torch.long).cuda()

     I think this is wrong, because then the loss would always be zero. Did I misunderstand the code?

  2. When creating the cluster_result dictionary, I found that only the eval dataset is taken into consideration (PCL/main_pcl.py, line 299 in 964da1f):

     cluster_result['im2cluster'].append(torch.zeros(len(eval_dataset), dtype=torch.long).cuda())

     What is the motivation behind this operation? I think it should be run on the training set.

@lerogo

lerogo commented Dec 4, 2023


Same question.

@Volibear1234


For the first question, you can refer to the MoCo v1 code, where the InfoNCE loss is implemented directly with cross-entropy: the positive logit is placed in column 0 of logits, so a target of 0 selects the positive pair for every sample, and the loss is not zero in general. As for the second question, they use the eval_dataset to build the negative prototypes, and in this line of code:

output, target, output_proto, target_proto = model(im_q = images[0], im_k = images[1],
                                                   cluster_result = cluster_result, index = index)

the index passed in comes from the train_loader, so the computation is still based on the train_dataset.
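To make the first point concrete, here is a minimal self-contained sketch (the batch size and number of negatives are made up for illustration) showing that all-zeros labels combined with cross-entropy do not produce a zero loss, because in MoCo-style InfoNCE the positive logit sits in column 0:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes: 4 queries, each with 1 positive + 8 negative logits.
# MoCo-style InfoNCE puts the positive logit in column 0, so the correct
# class index for every row is 0 -- hence the all-zeros label tensor.
logits = torch.randn(4, 1 + 8)
labels = torch.zeros(logits.shape[0], dtype=torch.long)

# cross_entropy computes -log softmax(logits)[:, 0], averaged over rows;
# it approaches zero only when the positive logit dominates the negatives.
loss = F.cross_entropy(logits, labels)
print(loss.item())
```

Running this prints a strictly positive loss for random logits; the zeros in labels only encode *where* the positive is, not that the loss vanishes.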
