contrastive loss function in 02-holidays-siamese-network.ipynb #2

Open
ellen-liu opened this issue Nov 21, 2017 · 2 comments
ellen-liu commented Nov 21, 2017

I don't think I completely understand this part of the code:

distance = Lambda(cosine_distance, 
                  output_shape=cosine_distance_output_shape)([vector_left, vector_right])


fc1 = Dense(128, kernel_initializer="glorot_uniform")(distance)
fc1 = Dropout(0.2)(fc1)
fc1 = Activation("relu")(fc1)

pred = Dense(2, kernel_initializer="glorot_uniform")(fc1)
pred = Activation("softmax")(pred)

Where does the contrastive divergence loss come in? I'm trying to understand siamese networks conceptually right now and I'm not sure if my assumptions are correct at this point.

@sujitpal
Owner

Hi @ellen-liu, sorry, my docs and naming are misleading. I used the mnist_siamese.py code in keras/examples as my template, but I framed my own problem with the holiday photos as a 2-class classification problem rather than a regression problem.

In the Keras example, the Lambda computes the final distance between the two images as a single continuous value. In my code, the Lambda instead does an element-wise multiplication between the two image vectors, returning a vector of the same size as the input image vectors. The name "cosine_distance" is incorrect and misleading: I started with that implementation, changed it midway, and forgot to rename it. The intuition behind the product is that it magnifies the dimensions in which the two images are similar to each other. This product vector is then fed into a 2-layer network to produce a 2-class prediction.
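The element-wise product idea can be sketched in plain numpy. The vectors and values below are illustrative only (not from the notebook); the point is that dimensions where both image vectors are large dominate the product, while dimensions where they disagree are suppressed:

```python
import numpy as np

# Two hypothetical image feature vectors.
left = np.array([0.9, 0.1, 0.8, 0.0])
right = np.array([0.8, 0.9, 0.7, 0.1])

# Element-wise product: same length as each input vector,
# large only where BOTH inputs are large.
product = left * right
```

Here `product[0]` (both vectors large) is much bigger than `product[1]` (only one vector large), which is the "magnify the similar places" intuition.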

@sujitpal
Owner

Also, I think my example may not actually be a Siamese network, since there is no weight sharing. In retrospect, what I should have done is something like the architecture described in this TripAdvisor Engineering blog post (scroll down to Model Architecture to see the architecture diagram). They use pre-trained networks as I did, but the weights in the 3-layer FCN head are shared. Although not explicitly stated, the caption says the objective is to maximize the difference between the outputs at the merge, so I suspect the left and right instances of the FCN feed into a Lambda layer that optimizes the contrastive loss, so that similar images return values closer to 1 and dissimilar images return values closer to 0.

There is also this SO page, which provides a TensorFlow implementation of contrastive loss.
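For reference, the standard contrastive loss (Hadsell et al., 2006) can be sketched in numpy; this is not the SO answer's TensorFlow code, just the underlying formula. Note that under this loss smaller distances mean more similar pairs, which differs from the 1-for-similar / 0-for-dissimilar framing above:

```python
import numpy as np

def contrastive_loss(y, d, margin=1.0):
    """Standard contrastive loss.

    y: 1 for similar pairs, 0 for dissimilar pairs.
    d: Euclidean distance between the two embedded vectors.
    Similar pairs are pulled together (d^2 term); dissimilar
    pairs are pushed apart until d exceeds the margin.
    """
    return np.mean(y * d**2 + (1 - y) * np.maximum(margin - d, 0.0)**2)
```

A similar pair at distance 0 and a dissimilar pair beyond the margin both incur zero loss; a dissimilar pair at distance 0 incurs the full margin² penalty.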

Thank you for bringing up this question. I will try out these ideas, put up a new notebook with the implementation soon, and then close out this issue.
