
Add oriented_box_iou_batch function to detection.utils #1502

Open · wants to merge 3 commits into develop

Conversation

@patel-zeel commented Sep 5, 2024

Description

Implemented oriented_box_iou_batch function in detection.utils as discussed with @LinasKo in #1295.

No additional dependencies are required for this change.

Context

  • This function computes IoU (intersection over union) for oriented/rotated bounding boxes (aka OBB).

Important details for the reviewers:

  1. The implementation is generic and works for axis-aligned, oriented, and irregularly shaped polygons. There is no check in the function to verify that the passed box is an oriented rectangular bounding box.
  2. The polygon_to_mask function requires a resolution_wh argument, but I have bypassed it by constructing a pseudo-image that covers both the true and detected boxes (see the sketch right after this snippet). I am assuming that box values are always passed in pixel coordinates at the true image scale, not normalized to [0, 1].
max_height = max(boxes_true[:, :, 0].max(), boxes_detection[:, :, 0].max()) + 1
# adding 1 because we are 0-indexed
max_width = max(boxes_true[:, :, 1].max(), boxes_detection[:, :, 1].max()) + 1
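
For context, here is a minimal sketch of the mask-based approach described in point 2. This is an illustrative paraphrase, not the exact code in this PR; the helper name, the (N, 4, 2) input shape, and the rounding to integer vertices are assumptions.

import numpy as np
from supervision.detection.utils import polygon_to_mask

def oriented_box_iou_batch_sketch(
    boxes_true: np.ndarray, boxes_detection: np.ndarray
) -> np.ndarray:
    """Pairwise IoU for oriented boxes given as (N, 4, 2) arrays of (x, y) corners."""
    boxes_true = boxes_true.reshape(-1, 4, 2)
    boxes_detection = boxes_detection.reshape(-1, 4, 2)

    # Pseudo-image just large enough to contain every polygon
    # (assumes pixel-scale, non-negative coordinates).
    max_x = int(max(boxes_true[..., 0].max(), boxes_detection[..., 0].max())) + 1
    max_y = int(max(boxes_true[..., 1].max(), boxes_detection[..., 1].max())) + 1
    resolution_wh = (max_x, max_y)

    # Rasterize each polygon; cv2.fillPoly (used inside polygon_to_mask)
    # expects integer vertices, hence the explicit cast.
    masks_true = [
        polygon_to_mask(box.round().astype(np.int32), resolution_wh).astype(bool)
        for box in boxes_true
    ]
    masks_detection = [
        polygon_to_mask(box.round().astype(np.int32), resolution_wh).astype(bool)
        for box in boxes_detection
    ]

    # Pairwise IoU over the rasterized masks.
    ious = np.zeros((len(masks_true), len(masks_detection)))
    for i, mask_true in enumerate(masks_true):
        for j, mask_detection in enumerate(masks_detection):
            intersection = np.logical_and(mask_true, mask_detection).sum()
            union = np.logical_or(mask_true, mask_detection).sum()
            ious[i, j] = intersection / union if union > 0 else 0.0
    return ious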

Questions for the reviewers:

  1. Once this function is reviewed and merged, should it be adopted as the generic IoU function and replace box_iou_batch?
  2. Are there corner cases where we must apply np.nan_to_num(ious) before returning ious? (A minimal example of such a case follows this list.)
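
As an illustration of the corner case behind question 2 (the arrays below are made-up values, not taken from the implementation): if both masks turn out empty, the union is 0, the plain division yields NaN, and np.nan_to_num maps it back to 0.

import numpy as np

intersection = np.array([0.0])
union = np.array([0.0])
with np.errstate(invalid="ignore"):  # silence the 0/0 warning
    ious = intersection / union      # array([nan])
ious = np.nan_to_num(ious)           # array([0.])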

Type of change

  • New feature (non-breaking change which adds functionality)

How has this change been tested, please provide a testcase or example of how you tested the change?

I have not added unit tests yet, but I have verified the function with the code snippet below. The examples and plots are shown first, followed by the code snippet:

Examples and plots

box_detection = np.array([[1, 1], [2, 0], [4, 2], [3, 3]]) # overlap
box_true = np.array([[1, 0], [0, 1], [3, 4], [4, 3]])

(plot: partial overlap between the true and detected boxes)

box_detection = np.array([[4, 1], [5, 0], [5, 2], [6, 1]]) # no overlap
box_true = np.array([[1, 0], [0, 1], [3, 4], [4, 3]])

(plot: no overlap between the true and detected boxes)

box_detection = np.array([[1, 0], [0, 1], [3, 4], [4, 3]]) # perfect match

(plot: detected box identical to the true box, perfect match)

Code snippet

import numpy as np
import matplotlib.pyplot as plt
from supervision.detection.utils import polygon_to_mask, oriented_box_iou_batch

# config
box_detection = np.array([[1, 1], [2, 0], [4, 2], [3, 3]]) # overlap
# box_detection = np.array([[4, 1], [5, 0], [5, 2], [6, 1]]) # no overlap
# box_detection = np.array([[1, 0], [0, 1], [3, 4], [4, 3]]) # perfect match

box_true = np.array([[1, 0], [0, 1], [3, 4], [4, 3]])

height = 5
width = 8

# plotting function
def plot_img(img, alpha, ax):
    mappable = ax.imshow(img, interpolation="none", alpha=alpha, vmin=0, vmax=255, cmap="coolwarm")
    ax.set_xticks(np.arange(0.5, width, 1))
    ax.set_xticklabels(np.arange(0, width, 1))
    ax.set_yticks(np.arange(0.5, height, 1))
    ax.set_yticklabels(np.arange(0, height, 1))
    ax.grid(True, which='both', linestyle='--', linewidth=1, color='k')
    return mappable

# create image
img = np.zeros((height, width)).astype(int) + 128

# manual process
mask_true = polygon_to_mask(box_true, (img.shape[1], img.shape[0])).astype(int)
mask_detection = polygon_to_mask(box_detection, (img.shape[1], img.shape[0])).astype(int)
union_pixels = np.sum((mask_true + mask_detection).clip(0, 1))
intersection_pixels = np.sum((mask_true * mask_detection))
iou = intersection_pixels / union_pixels

# Using the function
func_iou = oriented_box_iou_batch(box_true, box_detection)
assert np.isclose(iou, func_iou), f"{iou=}, {func_iou=}"

# plot
fig, axs = plt.subplots(1, 3, figsize=(12, 3.5))

img_true = (img + (mask_true * 255)).clip(0, 255)
plot_img(img_true, 1, axs[0])
axs[0].set_title("True")

img_detection = (img - (mask_detection * 255)).clip(0, 255)
plot_img(img_detection, 1, axs[1])
axs[1].set_title("Detection")

plot_img(img_true, 0.8, axs[2])
plot_img(img_detection, 0.5, axs[2])
axs[2].set_title("Overlay")

manual = f"Manual: Intersection={intersection_pixels}, Union={union_pixels}, IoU={intersection_pixels}/{union_pixels}={intersection_pixels/union_pixels}"
automatic = f"{oriented_box_iou_batch(box_true, box_detection)=}"
fig.suptitle(f"{manual}\n{automatic}", ha='center')
plt.show()

Any specific deployment considerations (this section is not yet edited)

For example, documentation changes, usability, usage/costs, secrets, etc.

Docs (this section is not yet edited)

  • Docs updated? What were the changes:


CLAassistant commented Sep 5, 2024

CLA assistant check
All committers have signed the CLA.


LinasKo commented Sep 5, 2024

Excellent work, @patel-zeel! I especially like the visuals.

To answer your questions:

  1. No, the usage of box_iou_batch and mask_iou_batch in the codebase should remain unchanged. Especially the former, as it provides a faster and less complex implementation than the one required by OBB. Keeping oriented_box_iou_batch separate also allows its implementation to vary independently of others.
  2. I wonder how small an OBB can be. Suppose you had an xyxyxyxy where all x values are equal and all y values are equal too. I don't know whether a model can produce that in practice, but let's try to support it. Could you check what masks are created and what IoU is produced?

@patel-zeel commented

Excellent work, @patel-zeel! I especially like the visuals.

Thank you, @LinasKo. While making the plots, I realized that the gridlines pass through the middle of each pixel, making it hard to visually count the union, intersection, and IoU. I therefore shifted the gridlines by half a pixel so that each pixel aligns with a grid cell rather than sitting on the intersections of gridlines. It was fun working on this!

Another question:

  • Do we need to (and can we) vectorize this further? In my trials, cv2.fillPoly could not be vectorized across the batch.

Now, coming to other points raised by you:

No, the usage of box_iou_batch and mask_iou_batch in the codebase should remain unchanged. Especially the former, as it provides a faster and less complex implementation than the one required by OBB. Keeping oriented_box_iou_batch separate also allows its implementation to vary independently of others.

Yes, that makes sense! Thank you.

I wonder how small an OBB can be. Suppose you had an xyxyxyxy where all x values are equal, and all y values are equal too. I don't know if a model can produce that in reality, let's try to support it. Could you check what masks are created and what IoU is produced?

With all corner values identical, cv2.fillPoly rasterizes the polygon as a single pixel, so this case should be handled without error. I am showing two more examples here (a minimal reproduction is sketched after them):

box_detection = np.array([[0, 0], [0, 0], [0, 0], [0, 0]])
box_true = np.array([[0, 0], [0, 0], [0, 0], [0, 0]])

(plot: both boxes collapse to the single pixel (0, 0))

box_detection = np.array([[1, 1], [1, 1], [1, 1], [1, 1]])
box_true = np.array([[0, 0], [0, 0], [0, 0], [0, 0]])

(plot: single-pixel boxes at (1, 1) and (0, 0))
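
For completeness, a minimal reproduction of the degenerate case (the 8×5 pseudo-image size is an arbitrary assumption):

import numpy as np
from supervision.detection.utils import polygon_to_mask

# All four corners coincide at (1, 1); cast to int32 for cv2.fillPoly.
degenerate_box = np.array([[1, 1], [1, 1], [1, 1], [1, 1]], dtype=np.int32)
mask = polygon_to_mask(degenerate_box, (8, 5)).astype(int)
print(mask.sum())         # 1: the collapsed polygon rasterizes to a single pixel
print(np.argwhere(mask))  # [[1 1]]: the pixel at row 1, column 1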

@patel-zeel commented

@LinasKo, this is a gentle ping to let you know that I am ready to make the desired changes whenever you get a chance to review this PR again.


LinasKo commented Sep 19, 2024

Hi @patel-zeel, I have it in my sights - I'll 100% include it before the new release, and very likely in the next few days. Apologies for the long wait!

@patel-zeel commented

Thank you for the quick response, @LinasKo. No worries at all.


LinasKo commented Sep 19, 2024

Hi @patel-zeel, here's my feedback:

  1. Please pull before making any edits - I pushed a small change, adding the new function to the global scope and the docs.
  2. xyxyxyxy are expected to be of type np.float32, but an error is thrown when that is the case.
  3. I don't think we should be adding +1 to the width / height. The result is too large. An xyxy box [0, 0, 0, 0] should have an area of 0, and so should a mask that is entirely empty. We'd like OBB to match that behaviour.

Here's the Colab I worked with.
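
A minimal, hypothetical illustration of point 2 (the explicit cast is one possible workaround, not necessarily the fix adopted in this PR): cv2.fillPoly, which polygon_to_mask relies on, expects integer vertex coordinates, so float32 corners need to be cast before rasterization.

import numpy as np
from supervision.detection.utils import polygon_to_mask

# Assumed float32 corners in xyxyxyxy layout, reshaped to (4, 2).
xyxyxyxy = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 4.0], [4.0, 3.0]], dtype=np.float32)
polygon = xyxyxyxy.round().astype(np.int32)  # integer vertices for cv2.fillPoly
mask = polygon_to_mask(polygon, (8, 5))      # the 8x5 pseudo-image size is arbitrary here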
