feat: ✨ scale_detections function added to adjust bbox,masks,obb for scaled images #1711
base: develop
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
from typing import Tuple | ||
|
||
import cv2 | ||
import numpy as np | ||
|
||
from supervision.detection.core import ORIENTED_BOX_COORDINATES, Detections | ||
|
||
|
||
def scale_detections( | ||
Review comment: I suggest renaming to either I'd like some advice from @SkalskiP here - I've explained more in the global review comment. |
||
detections: Detections, | ||
letterbox_wh: Tuple[int, int], | ||
resolution_wh: Tuple[int, int], | ||
) -> Detections: | ||
""" | ||
Scale the bounding box coordinates and, when present, the masks and oriented | |
bounding boxes to fit a new resolution, accounting for the letterbox padding | |
applied during resizing, and return the result as a Detections object. | |
|
||
Args: | ||
detections (Detections): The Detections object to be scaled. | ||
letterbox_wh (Tuple[int, int]): The width and height of the letterboxed image. | ||
resolution_wh (Tuple[int, int]): The target width and height for scaling. | ||
|
||
Returns: | ||
Detections: A new Detections object scaled to the target resolution. | |
""" | ||
Review comment: The doc should come with an example showing how to use this. It should make it evident that letterboxing can be undone using this operation.
Review comment: We should also find a place in the docs where we could showcase the scenario of:
Without this, discovery of the method will be very low. |
||
|
||
if detections.xyxy is None: | ||
return detections | ||
|
||
input_w, input_h = resolution_wh | ||
letterbox_w, letterbox_h = letterbox_wh | ||
|
||
target_ratio = letterbox_w / letterbox_h | ||
image_ratio = input_w / input_h | ||
|
||
if image_ratio >= target_ratio: | ||
width_new = letterbox_w | ||
height_new = int(letterbox_w / image_ratio) | ||
else: | ||
height_new = letterbox_h | ||
width_new = int(letterbox_h * image_ratio) | ||
|
||
scale = input_w / width_new | ||
padding_top = (letterbox_h - height_new) // 2 | ||
padding_left = (letterbox_w - width_new) // 2 | ||
|
||
boxes = detections.xyxy.copy() | ||
boxes[:, [0, 2]] -= padding_left | ||
boxes[:, [1, 3]] -= padding_top | ||
Review comment: We should use |
||
boxes[:, [0, 2]] *= scale | ||
boxes[:, [1, 3]] *= scale | ||
|
||
scaled_mask = None | ||
if detections.mask is not None: | ||
masks = [] | ||
for mask in detections.mask: | ||
mask = mask[ | ||
padding_top : padding_top + height_new, | ||
padding_left : padding_left + width_new, | ||
] | ||
Review comment: We can use |
||
scaled_mask_i = cv2.resize( | ||
mask.astype(np.uint8), | ||
(input_w, input_h), | ||
interpolation=cv2.INTER_LINEAR, | ||
).astype(bool) | ||
masks.append(scaled_mask_i) | ||
scaled_mask = np.array(masks) | ||
|
||
if ORIENTED_BOX_COORDINATES in detections.data: | ||
obbs = np.array(detections.data[ORIENTED_BOX_COORDINATES]).copy() | ||
obbs[:, :, 0] -= padding_left | ||
obbs[:, :, 1] -= padding_top | ||
Review comment: We can use |
||
obbs[:, :, 0] *= scale | ||
obbs[:, :, 1] *= scale | ||
detections.data[ORIENTED_BOX_COORDINATES] = obbs | ||
|
||
return Detections( | ||
xyxy=boxes, | ||
mask=scaled_mask, | ||
confidence=detections.confidence, | ||
class_id=detections.class_id, | ||
tracker_id=detections.tracker_id, | ||
data=detections.data, | ||
metadata=detections.metadata, | ||
) |
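The review asks for an example that makes it evident letterboxing can be undone. The following is a minimal pure-Python sketch of the box arithmetic in the diff above, not the PR's implementation: `letterbox_geometry` and `unletterbox_boxes` are hypothetical names, the letterbox is assumed to be centered, and `round` is used where the diff uses `int`.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def letterbox_geometry(
    resolution_wh: Tuple[int, int], letterbox_wh: Tuple[int, int]
) -> Tuple[float, int, int]:
    """Return (scale, padding_left, padding_top) for undoing a centered letterbox."""
    input_w, input_h = resolution_wh
    letterbox_w, letterbox_h = letterbox_wh

    # Integer cross-multiplication avoids comparing two float aspect ratios.
    if input_w * letterbox_h >= letterbox_w * input_h:
        # Original image is relatively wider: it fills the letterbox width.
        width_new = letterbox_w
        height_new = round(letterbox_w * input_h / input_w)
    else:
        # Original image is relatively taller: it fills the letterbox height.
        height_new = letterbox_h
        width_new = round(letterbox_h * input_w / input_h)

    scale = input_w / width_new
    padding_left = (letterbox_w - width_new) // 2
    padding_top = (letterbox_h - height_new) // 2
    return scale, padding_left, padding_top


def unletterbox_boxes(
    boxes: List[Box], resolution_wh: Tuple[int, int], letterbox_wh: Tuple[int, int]
) -> List[Box]:
    """Map boxes from letterbox coordinates back to the original resolution."""
    scale, pad_left, pad_top = letterbox_geometry(resolution_wh, letterbox_wh)
    return [
        (
            (x1 - pad_left) * scale,
            (y1 - pad_top) * scale,
            (x2 - pad_left) * scale,
            (y2 - pad_top) * scale,
        )
        for x1, y1, x2, y2 in boxes
    ]


# A 1280x720 frame letterboxed into 640x640: content is 640x360 with 140 px bars
# above and below, so the scale back to the original is 2.0.
restored = unletterbox_boxes([(100, 200, 300, 400)], (1280, 720), (640, 640))
print(restored)  # [(200.0, 120.0, 600.0, 520.0)]
```

The padding is subtracted before scaling because the padding is measured in letterbox pixels, not in original-image pixels.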
Review comment:
@onuralpszr can we change the function name to something specific to letterboxing? Otherwise it will confuse users.
Review comment:
@hardikdava scale_letterbox_detections, maybe? What do you think, @LinasKo?
Review comment:
Un-letterboxing feels like a very narrow use case.
We should provide general functions which also enable letterbox reversal, and highlight that case in the examples & docs.
While it might be frustrating, let me take the time to review this in-depth over the weekend & start of next week, for reversing API decisions is hard.
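One way to read "general functions which also enable letterbox reversal" is as two small primitives that compose. The sketch below is only an illustration of that idea: `translate_xyxy` and `scale_xyxy` are hypothetical helpers, not existing supervision functions, and scaling is assumed to happen about the image origin.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def translate_xyxy(boxes: List[Box], offset_xy: Tuple[float, float]) -> List[Box]:
    """Shift every box by (dx, dy)."""
    dx, dy = offset_xy
    return [(x1 + dx, y1 + dy, x2 + dx, y2 + dy) for x1, y1, x2, y2 in boxes]


def scale_xyxy(boxes: List[Box], factor_x: float, factor_y: float) -> List[Box]:
    """Scale box coordinates about the image origin (0, 0)."""
    return [
        (x1 * factor_x, y1 * factor_y, x2 * factor_x, y2 * factor_y)
        for x1, y1, x2, y2 in boxes
    ]


boxes = [(100.0, 200.0, 300.0, 400.0)]

# General case: detections from a 640x640 stretched copy of a 1280x720 frame.
stretched = scale_xyxy(boxes, 1280 / 640, 720 / 640)

# Letterbox reversal as a special case: remove the 140 px top bar, then scale
# uniformly by 2.0 (a 1280x720 frame letterboxed into 640x640).
unletterboxed = scale_xyxy(translate_xyxy(boxes, (0, -140)), 2.0, 2.0)

print(stretched)      # [(200.0, 225.0, 600.0, 450.0)]
print(unletterboxed)  # [(200.0, 120.0, 600.0, 520.0)]
```

With primitives like these, un-letterboxing becomes a documented recipe rather than a dedicated API entry point.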
Review comment:
@LinasKo I can add an enum parameter and rename the other two parameters to src_res and target_res. Based on the enum value, the function would apply either letterbox-dependent scaling or normal scaling. I can also add extra cases to handle normal scaling without "padding"?
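The enum proposal might look roughly like the sketch below. Only `src_res` and `target_res` come from the comment; `ScaleMode`, its members, and `scale_box` are hypothetical names, and the letterbox branch assumes centered padding and a rounded content size.

```python
from enum import Enum
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


class ScaleMode(Enum):
    STRETCH = "stretch"      # plain resize, no padding involved
    LETTERBOX = "letterbox"  # src_res is a padded canvas around the content


def scale_box(
    box: Box,
    src_res: Tuple[int, int],
    target_res: Tuple[int, int],
    mode: ScaleMode = ScaleMode.STRETCH,
) -> Box:
    """Map a box from src_res coordinates to target_res coordinates."""
    src_w, src_h = src_res
    target_w, target_h = target_res
    x1, y1, x2, y2 = box

    if mode is ScaleMode.STRETCH:
        # Independent factors per axis; the aspect ratio may change.
        fx, fy = target_w / src_w, target_h / src_h
        return (x1 * fx, y1 * fy, x2 * fx, y2 * fy)

    if mode is ScaleMode.LETTERBOX:
        # Ratio by which the target-sized image was shrunk to fit the canvas.
        ratio = min(src_w / target_w, src_h / target_h)
        content_w = round(target_w * ratio)
        content_h = round(target_h * ratio)
        pad_left = (src_w - content_w) // 2
        pad_top = (src_h - content_h) // 2
        # Remove the padding first, then undo the uniform shrink.
        return (
            (x1 - pad_left) / ratio,
            (y1 - pad_top) / ratio,
            (x2 - pad_left) / ratio,
            (y2 - pad_top) / ratio,
        )

    raise ValueError(f"Unsupported mode: {mode}")


box = (100.0, 200.0, 300.0, 400.0)
print(scale_box(box, (640, 640), (1280, 720), ScaleMode.STRETCH))
print(scale_box(box, (640, 640), (1280, 720), ScaleMode.LETTERBOX))
```

A single entry point keeps the call site uniform, at the cost of a mode flag that changes how the same two resolutions are interpreted.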