
Questions about the usage of two-view refinement #1

Open · Nyohohoho opened this issue Sep 13, 2020 · 4 comments

@Nyohohoho

Thank you very much for your amazing work and kind sharing.

I am sorry to bother you during your busy time, but could you explain how to use your two-view refinement code?
I currently have an algorithm that extracts matched keypoints from two images.
I save them as:

keypoints1 (with shape (2000, 2)) and keypoints2 (with shape (2000, 2)),
where 2000 is the number of keypoints and 2 corresponds to the two coordinates (x and y).

They are already matched, i.e., (keypoints1[i][0], keypoints1[i][1]) corresponds to (keypoints2[i][0], keypoints2[i][1]).
In this case, how can I apply your two-view refinement?
Since I am currently focusing only on the two-view scenario, I would like to try your powerful method.

I would really appreciate it if you could help me with this naive question.
Thank you again for the great contributions to the community.

@mihaidusmanu (Owner) commented Sep 13, 2020

Currently, there is no easy way to run the refinement on two views only. I will try to explain the process here; it is similar to compute_match_graph.py.

You first have to call refine_matches_coarse_to_fine as follows:

displacements12 = refine_matches_coarse_to_fine(
    image1, keypoints1,
    image2, keypoints2,
    matches,
    net, device, batch_size, symmetric=False, grid=False
)

where keypoints1 is Ax2, keypoints2 is Bx2, and matches is Mx2 (the first and second columns correspond to the feature index in image 1 and image 2, respectively). This will return an Mx2 array corresponding to the "correction" that needs to be applied to the keypoint of the second image for each match.
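For concreteness, here is a minimal sketch of the expected input shapes (the keypoint values are random placeholders, and image1, image2, net, device, and batch_size come from your own setup):

import numpy as np

# A = B = 500 keypoints per image; each row is (x, y) in pixels.
keypoints1 = np.random.rand(500, 2).astype(np.float32) * 512.0  # Ax2
keypoints2 = np.random.rand(500, 2).astype(np.float32) * 512.0  # Bx2

# M = 3 matches; column 0 indexes keypoints1, column 1 indexes keypoints2.
matches = np.array([[0, 17], [1, 4], [2, 256]])  # Mx2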

Please note that the keypoints are expected in (x, y) format, where x points right and y points down, while the returned flow is in (y, x) format.

To update the keypoints, you can do something along the lines of:

dx = displacements12[:, 1]
dy = displacements12[:, 0]
keypoints2[matches[:, 1], 0] += dx * 16
keypoints2[matches[:, 1], 1] += dy * 16
# Only valid if feature extraction is run at full resolution.
# Otherwise, you also need to multiply by the downsampling factor between the
# original image 2 and the image2 used in the call to refine_matches_coarse_to_fine.

In your case, you can set matches to the identity, i.e., np.stack([np.arange(2000), np.arange(2000)]).T.
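Putting this together for your case, a rough end-to-end sketch (untested; loading image1, image2, and the network follows your own setup, and the factor 16 assumes features were extracted at full resolution):

import numpy as np

n = keypoints1.shape[0]  # 2000 in your case
matches = np.stack([np.arange(n), np.arange(n)]).T  # identity matches, Mx2

displacements12 = refine_matches_coarse_to_fine(
    image1, keypoints1,
    image2, keypoints2,
    matches,
    net, device, batch_size, symmetric=False, grid=False
)

# The flow is returned as (y, x) while keypoints are stored as (x, y).
keypoints2_refined = keypoints2.copy()
keypoints2_refined[matches[:, 1], 0] += displacements12[:, 1] * 16  # x
keypoints2_refined[matches[:, 1], 1] += displacements12[:, 0] * 16  # y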

Let me know if you run into any issues!

I will try to prepare a quick script for the two-view case and add it to the repository!

@lihanlun

Hi mihaidusmanu,
Thanks a lot for your amazing work and kind sharing. I am very sorry to bother you. I found that the output of refine_matches_coarse_to_fine is always less than one pixel (tested on the Herzjesu and Fountain datasets). Even if I add an extra one-pixel offset to the keypoint locations, the output of refine_matches_coarse_to_fine is still less than one pixel. Does this mean the function can only handle errors of less than one pixel?
I would really appreciate it if you could answer my question.
Thanks again for sharing your code.

@mihaidusmanu (Owner) commented Dec 17, 2021

Hello. Inside our pipeline, we use 33x33 patches for refinement, and the coordinates inside these patches are normalized such that the top-left corner is (-1, -1) and the bottom-right is (1, 1). The output of refine_matches_coarse_to_fine is normalized accordingly. If you want to get pixel displacements, you need to multiply by 16 (to undo the normalization) and potentially also by the scaling factor used during feature extraction. I have edited my previous comment to address this.

You can refer to the following snippet, for instance:

keypoints[:, :2] += displacements * 16
keypoints[:, :2] += 0.5
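As a quick sanity check on the magnitudes (illustrative numbers only; the downsampling factor of 2 is hypothetical):

# Converting a single normalized displacement to pixels.
norm_disp = 0.05    # raw output of refine_matches_coarse_to_fine
patch_half = 16     # 33x33 patch -> multiplying by 16 undoes the normalization
downsample = 2.0    # hypothetical scale between original image and network input
pixel_disp = norm_disp * patch_half * downsample  # 0.05 * 16 * 2 = 1.6 px

So displacements below 1 in normalized units can still correspond to more than one pixel in the original image.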

Regarding keypoints moving by more than one pixel: that is definitely possible, but it heavily depends on the initial features you are trying to refine. For SIFT, there might be very few keypoints that move by a large amount, while for learned features the number will be higher.

@lihanlun

Oh, I understand. Thank you very much for your help.
