
Training accuracy does not change while training entire resnet18 from scratch #19

Open
sleepingcat4 opened this issue Aug 19, 2023 · 5 comments


@sleepingcat4

I was using the c2q_transfer_learning_ants_bees.ipynb notebook and let all the layers of resnet18 train from scratch on the ants_bees dataset. I did not change anything else in the code, but while training the model, the accuracy does not improve beyond 55%.

Can you guys suggest what might be causing this?

Code I changed:

import torch.nn as nn
import torchvision

# Quantumnet, quantum, classical_model, n_qubits and device are defined
# earlier in the notebook.
model_hybrid = torchvision.models.resnet18(pretrained=False)

# Train every layer from scratch instead of freezing the backbone.
for param in model_hybrid.parameters():
    param.requires_grad = True

if quantum:
    model_hybrid.fc = Quantumnet()

elif classical_model == '512_2':
    model_hybrid.fc = nn.Linear(512, 2)

elif classical_model == '512_nq_2':
    model_hybrid.fc = nn.Sequential(nn.Linear(512, n_qubits), nn.ReLU(), nn.Linear(n_qubits, 2))

elif classical_model == '551_512_2':
    model_hybrid.fc = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 2))

# Use CUDA or CPU according to the "device" object.
model_hybrid = model_hybrid.to(device)

Training Log:

Training started:
/usr/local/lib/python3.10/dist-packages/torch/optim/lr_scheduler.py:139: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
Phase: train Epoch: 1/30 Loss: 0.7062 Acc: 0.4877        
Phase: val   Epoch: 1/30 Loss: 0.7001 Acc: 0.4575        
Phase: train Epoch: 2/30 Loss: 0.6950 Acc: 0.5041        
Phase: val   Epoch: 2/30 Loss: 0.6983 Acc: 0.4575        
Phase: train Epoch: 3/30 Loss: 0.6939 Acc: 0.5041        
Phase: val   Epoch: 3/30 Loss: 0.6965 Acc: 0.4575        
Phase: train Epoch: 4/30 Loss: 0.6945 Acc: 0.5041        
Phase: val   Epoch: 4/30 Loss: 0.6950 Acc: 0.4575        
Phase: train Epoch: 5/30 Loss: 0.6935 Acc: 0.5041        
Phase: val   Epoch: 5/30 Loss: 0.6951 Acc: 0.4575        
Phase: train Epoch: 6/30 Loss: 0.6935 Acc: 0.5041        
Phase: val   Epoch: 6/30 Loss: 0.6947 Acc: 0.4575        
Phase: train Epoch: 7/30 Loss: 0.6934 Acc: 0.5041
@CatalinaAlbornoz

Hi @sleepingcat4,

There are many reasons why this could be happening.
The quantum model may not be expressive enough to train from scratch, there may be an issue with your optimizer, you may need to modify the model architecture, or it may be something else entirely.
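
For the optimizer point, a minimal sketch of a sanity check (not taken from the notebook; the use of Adam and the learning rate here are assumptions) is to confirm that the optimizer covers all parameters, not just the new fc head, when everything is trained from scratch:

import torch.optim as optim

# Count the tensors that will receive gradients; for a full ResNet-18 this
# should be far more than the two or three tensors of the fc head alone.
trainable = [p for p in model_hybrid.parameters() if p.requires_grad]
print(f"trainable parameter tensors: {len(trainable)}")

# Hypothetical optimizer over *all* parameters. If the optimizer is still
# built from model_hybrid.fc.parameters() only, the backbone never updates
# even though requires_grad is True.
optimizer_hybrid = optim.Adam(model_hybrid.parameters(), lr=1e-4)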

A while ago it was pointed out that both the classical and quantum implementations of transfer learning had some issues (see here) so this may or may not be related to that.

Answering this will require some research on your side and trying out different options to see what works.

I hope this motivates you to look deeper into the problem. Please let us know if you find the answer!

@sleepingcat4

Thanks for your response. I suspected the same. I am aware of the limitations of transfer learning in classical deep learning, but it's interesting to learn that the quantum version faces the same problem. If I want to modify the quantum network, what is the best approach?

Should I add more VQC layers (the classical DL equivalent of adding CNN layers), or would you suggest something different? I will also look into more research papers.

@CatalinaAlbornoz

Hi @sleepingcat4 ,

Adding more layers in QNNs doesn't necessarily help the way it does in classical CNNs. You can try it, but I doubt the solution lies there. Something you can try (although I don't know exactly how you would do it here) is changing the cost function. You can take inspiration from our demo on local cost functions. This is really just a guess, so please trust the papers you read more than my guess 😄 .
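
For a rough picture of the local-cost idea (a sketch only, not the demo's code; the device, qubit count, embedding and entangling templates are all assumptions), the cost is built from a single-qubit expectation value rather than a global observable over all qubits:

import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def local_cost_circuit(weights, features):
    # Encode the (pre-processed) features as rotation angles.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    qml.BasicEntanglerLayers(weights, wires=range(n_qubits))
    # "Local" cost: read out a single qubit instead of a global observable.
    return qml.expval(qml.PauliZ(0))

weights = np.array(np.random.uniform(0, np.pi, size=(3, n_qubits)), requires_grad=True)
features = np.array([0.1, 0.2, 0.3, 0.4])
print(local_cost_circuit(weights, features))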

@sleepingcat4

@CatalinaAlbornoz I think I have identified the problem. The way the tutorial was written oversimplifies a lot of QCNN concepts.

I read a few papers; there isn't really a gold standard for what a QCNN should be, but the researchers who achieved 90-100% accuracy exploited the Hilbert space and used the MERA method to train the quantum network from scratch.
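
As a very rough, hand-rolled illustration of a MERA-style hierarchical layout (the qubit count, gate choices and coarse-graining pattern are assumptions, not any paper's circuit), two-qubit blocks act on neighbouring pairs first and then on a coarser set of qubits:

import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

def block(theta, wires):
    # Two-qubit building block: entangle, then rotate.
    qml.CNOT(wires=wires)
    qml.RY(theta[0], wires=wires[0])
    qml.RY(theta[1], wires=wires[1])

@qml.qnode(dev)
def hierarchical_circuit(params, features):
    # Encode the input features as single-qubit rotations.
    for w in range(n_qubits):
        qml.RY(features[w], wires=w)
    # Bottom layer: blocks on neighbouring pairs.
    block(params[0], wires=[0, 1])
    block(params[1], wires=[2, 3])
    # Coarser layer: one block across the two pairs.
    block(params[2], wires=[1, 2])
    return qml.expval(qml.PauliZ(1))

params = np.array(np.random.uniform(0, np.pi, size=(3, 2)), requires_grad=True)
features = np.array([0.2, 0.5, 0.7, 0.9])
print(hierarchical_circuit(params, features))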

Otherwise, they took each pixel value and converted it into an angle via the arctan() function.
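
For the pixel-to-angle step, a minimal sketch (the image size and lack of extra scaling are assumptions): arctan maps each pixel value into (-π/2, π/2), so the result can be used directly as a rotation angle.

import torch

def pixels_to_angles(img: torch.Tensor) -> torch.Tensor:
    # Map every pixel value into (-pi/2, pi/2) so it can serve as a rotation angle.
    return torch.atan(img)

angles = pixels_to_angles(torch.rand(1, 1, 28, 28))  # hypothetical 28x28 grayscale image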

And thanks for the barren plateaus technique. I had used it previously but completely forgot about it.

I would love to hear your thoughts on this : )

@CatalinaAlbornoz

That's very interesting @sleepingcat4!
In line with what you mention, the paper here and the PennyLane demo on the variational classifier (part 2) use a more involved state preparation, where you have to pre-process your data classically before doing any training. The graphs in the demo show very clearly how important this pre-processing is.
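
To make the pre-processing point concrete (a sketch with assumed values, not the demo's exact code), the kind of classical preparation used there pads and normalises each feature vector so it forms a valid amplitude vector for a small quantum state before any training happens:

import numpy as np

def preprocess(features, pad=0.3):
    # Pad a 2-feature sample to 4 entries and normalise it so it forms a
    # valid amplitude vector for a 2-qubit state.
    padded = np.concatenate([features, [pad, 0.0]])
    return padded / np.linalg.norm(padded)

print(preprocess(np.array([0.4, 0.8])))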
