classification loss - negative sample labels #6
Comments
@DishaDRao I have fully trained with the existing code and got strange output which doesn't make sense at all. Did you get some result on it? |
Well, I did not try this code. However, I went through the code base of the original implementation (the winning team's) and understood where this labelling came from. Basically, in the original implementation, the target labels for negative anchor boxes ('neg_labels') are given the label '-1'. Hence it makes sense to write 'neg_labels + 1' during the loss computation to make it 0 (0 stands for no object and 1 stands for an object). However, in the current code base, the target labels for negative anchor boxes are already given the label '0', so it doesn't make sense to write 'neg_labels + 1' during the loss computation. In short, I think it's a mistake here, and I suggest running this code without adding 1 to neg_labels. Hope this works. If not, then the issue is in some other part of the code! |
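The label arithmetic described above can be sketched in plain Python. This is only an illustration, not the repository's PyTorch code: the helper `bce` and the sample probability `neg_prob` are assumptions made for the example. It shows that adding 1 turns a '-1' target into 0, but turns a '0' target into 1, so a negative anchor would be trained as if it were an object.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one predicted probability and one 0/1 target."""
    eps = 1e-12  # guards against log(0)
    return -(target * math.log(prob + eps)
             + (1 - target) * math.log(1 - prob + eps))

# Hypothetical prediction: the network is fairly sure there is no nodule here.
neg_prob = 0.1

# Convention A (original implementation, per the comment): negatives stored as -1.
target_a = -1 + 1  # becomes 0, i.e. "no object"

# Convention B (this repository, per the comment): negatives already stored as 0.
target_b = 0 + 1   # becomes 1, i.e. "object", which is the wrong target

loss_a = bce(neg_prob, target_a)
loss_b = bce(neg_prob, target_b)
print(f"target {target_a}: loss {loss_a:.3f}")
print(f"target {target_b}: loss {loss_b:.3f}")
```

Under convention B the correct low-probability prediction is penalised heavily, which would push the network toward predicting objects everywhere for negative anchors.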
Hello, could you kindly help me with the testing code (how to test the trained model) so that I can get the predicted nodules? |
Testing code, please. |
@DishaDRao However, even though I changed the loss function, I couldn't get any meaningful result. Yes, I guess some other part has an issue. Let me point out a possible one. The shape of the output tensor is (32, 32, 32, 3, 5), and the values inside this output tensor are repeated in every single cell. I think this is due to the very heavily imbalanced positive vs. negative ratio inside the target tensor. After a fair number of training iterations, the network ends up predicting negative for every value. To solve this, assigning multiple anchors to each GT nodule and randomly sampling negative target cells may be necessary. Could you tell me if you have any suggestions, or any other code base you would recommend? |
I didn't write test code for this model. Until this is solved, testing is meaningless. Anyway, this is the test scheme I was planning to use once the model gives meaningful output:
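To make the imbalance concrete, here is a rough count for the (32, 32, 32, 3, 5) output mentioned above, together with a sketch of random negative sampling. The assumptions are loud ones: a single ground-truth nodule matched to a single anchor, and a hypothetical helper `sample_negatives` that is not part of the repository.

```python
import random

grid = (32, 32, 32)
anchors_per_cell = 3
total_anchors = grid[0] * grid[1] * grid[2] * anchors_per_cell  # 98304

# Assumption: one nodule in the crop, matched to exactly one anchor.
num_pos = 1
num_neg = total_anchors - num_pos
print(f"anchors: {total_anchors}, positives: {num_pos}, negatives: {num_neg}")
print(f"positive fraction: {num_pos / total_anchors:.6%}")

def sample_negatives(neg_indices, k, seed=0):
    """Keep a random subset of k negative anchors (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(neg_indices, min(k, len(neg_indices)))

# Pretend anchor 0 is the positive; every other flat index is a negative.
neg_indices = list(range(1, total_anchors))
kept = sample_negatives(neg_indices, k=2)
print(f"negatives kept for the loss: {len(kept)} of {num_neg}")
```

With a ratio this skewed, a network that outputs "negative" everywhere already gets almost every anchor right, which is consistent with the collapsed output described above.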
You can download sampleSubmission.csv and noduleCADEvaluationLUNA16.py from the LUNA16's official site. |
The class imbalance problem is actually handled in the loss function. Even though the target labels may contain the positive-to-negative ratio that you mentioned, the loss function deals with it through 'hard negative mining' (similar to your idea of randomly sampling the negatives), which restricts the number of negative anchor boxes to 2 (depending on the batch size) per mini-batch. That means the network sees an equal (or 1:2) ratio of positive to negative anchor boxes during the loss computation. I strongly believe the problem in this code is how the rest of the targets are labelled. The anchor boxes for bounding box regression should be labelled based on their IoU and center-to-center parameterization with respect to a ground-truth box (as in standard Faster R-CNN). I don't see how that is done in this code. If the target itself doesn't have the right (position) labels, then I wouldn't expect any meaningful results after training. (Giving it the benefit of the doubt, even if the targets are labelled correctly, testing requires de-parameterizing the predictions, which can be done only once the target computation is deciphered.) In short, I wouldn't use this code for training. This repository is good for understanding the preprocessing and augmentation part, but for an actual implementation I would recommend checking out the original code bases ('lfz/DSB2017' or 'wentaozhu/DeepLung'). They are extremely similar, but the latter repo is simpler, and it worked for me! (P.S. This repo has a Google Colab provided at the end. However, I didn't use it or check it out. I wanted a deeper understanding, hence skipped it entirely ;) ) |
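The hard negative mining idea mentioned above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the code from either repository: `hard_negative_mining`, `num_hard`, and the sample probabilities are hypothetical, and "hardest" is taken to mean the negatives the network scores most confidently as objects.

```python
def hard_negative_mining(neg_probs, num_hard):
    """Return indices of the negatives with the highest predicted object probability."""
    k = min(num_hard, len(neg_probs))
    # Sort negative anchors from most to least confidently (and wrongly) "object".
    order = sorted(range(len(neg_probs)), key=lambda i: neg_probs[i], reverse=True)
    return order[:k]

# Hypothetical predicted object probabilities for six negative anchors.
neg_probs = [0.02, 0.91, 0.10, 0.65, 0.01, 0.30]

hard = hard_negative_mining(neg_probs, num_hard=2)
print("negatives kept for the loss:", hard)
print("their probabilities:", [neg_probs[i] for i in hard])
```

Only the selected negatives would then enter the classification loss, so the easy negatives that dominate the target tensor no longer swamp the few positives.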
Thank you very much for your advice. I'm using the 'wentaozhu/DeepLung' repository for training and evaluation with the LUNA16 dataset, and have started getting meaningful FROC results. Many thanks! :-) |
Hello @DishaDRao and @naoe1999, kindly help please. I tried to follow your conversation and advice, and I went through the wentaozhu/DeepLung repository. Unfortunately, in the loss code I find the same thing with the labels (+1). But during training with that code, I found that the loss does not decrease. I am not sure whether I have to remove the (+1) on the labels in the code. |
@DishaDRao were you referring to the point below? This is from the data.py file in the wentaozhu repository. class LabelMapping(object):
|
Hi, if you're following the wentaozhu/DeepLung repository, you need not change anything in the loss function or in data.py. The negative samples are labelled correctly. As mentioned in my previous comment, the +1 in the loss function is there to turn the negative labels into 0. So it's there for a purpose! With Wentao's code, the loss problem that you're facing must be due to something else, probably your dataset or training method. Maybe you should look into their issues section. |
@DishaDRao I am training on Google Colab. What I have done is reduce the batch size, and I am also not using DataParallel in training because I use a single GPU. Furthermore, changes in the PyTorch version must be causing some issues, such as int casts that need to be added in some places. I have been at it for almost two months now and I am going crazy. I have restarted the process again and again, but with no success. If you don't mind, please share your data.py, main.py and layers.py files with me. |
This is the change I have made in training: def train(data_loader, net, loss, epoch, optimizer, get_lr, save_freq, save_dir):
|
In the data file also, ...
|
Hi,
In the following snippet from the loss.py file:
''classify_loss = 0.5 * self.classify_loss(
pos_prob, pos_labels[:, 0]) + 0.5 * self.classify_loss(
neg_prob, neg_labels + 1) ''
why is the target label for the sigmoid loss of negative samples given as 'neg_labels + 1'?
Shouldn't it be just 'neg_labels' (as the value of 'neg_labels' is itself initialized to 0)?