Added heuristic for bounding bbox ordering #835

Merged
merged 7 commits into from
Feb 26, 2023

MLDovakin
Contributor

I have been using your open-source framework for quite some time, and recently I needed it for a text detection task. The important part was that I had to keep the words in left-to-right order, but the documentation showed no such feature.

I first tried OpenCV-based bounding box ordering, but that didn't work because there were too many overlapping bounding boxes. Then I tried sorting by Y coordinate with a threshold: bounding boxes on the same line should differ only minimally in vertical position, so boxes whose Y values fall within the threshold are ordered by X instead. This gave much better results than the other methods I tried, and I would like to add this functionality here.

Here are examples of Y-threshold ordering:

1. The result of get_sliced_prediction()
   Screenshot from 2023-02-23 17-26-19

2. The bounding boxes drawn in sorted order
   Screenshot from 2023-02-23 17-29-51

Plotting code


from functools import cmp_to_key

import cv2
from google.colab.patches import cv2_imshow  # display helper, assuming a Colab notebook


def bbox_sort(a, b, thresh):
    # Comparator for (x, y, w, h) boxes.
    # Boxes whose top edges differ by at most `thresh` pixels are treated
    # as being on the same line and are ordered left to right.
    if abs(a[1] - b[1]) <= thresh:
        return a[0] - b[0]

    # Otherwise the box that is higher in the image comes first.
    return a[1] - b[1]


# `result` is the output of get_sliced_prediction()
my_list = []

for ann in result.to_coco_annotations():
    # Cast to int so that there are no OpenCV errors when drawing
    x, y, w, h = (int(v) for v in ann["bbox"])
    my_list.append((x, y, w, h))

thresh = 10  # maximum Y difference, in pixels, for two boxes to share a line
cnts = sorted(my_list, key=cmp_to_key(lambda a, b: bbox_sort(a, b, thresh)))

img = cv2.imread("/content/detect_images/output_01.jpg")
red = (0, 0, 255)  # BGR
font = cv2.FONT_HERSHEY_SIMPLEX

# Mark the top-left corner of each box and label it with its sorted index
for k, (x, y, w, h) in enumerate(cnts):
    cv2.circle(img, (x, y), 5, red, -1)
    cv2.putText(img, str(k), (x, y), font, 1, (120, 166, 50), 2)

cv2_imshow(img)
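
One thing to keep in mind: the comparator decides "same line" pairwise, so with three or more boxes on slightly skewed text the answers can disagree with each other, and the sorted order may then depend on the input order. A possible alternative, which is not what this PR implements, is to group the boxes into lines first and sort by X inside each line. The sketch below assumes (x, y, w, h) tuples as in the plotting code; `order_by_lines` and the sample boxes are hypothetical names and values used only for illustration.

```python
def order_by_lines(bboxes, thresh=10):
    """Order (x, y, w, h) boxes line by line, then left to right.

    Hypothetical helper for illustration; not part of the library API.
    """
    lines = []  # each entry is the list of boxes on one text line
    for box in sorted(bboxes, key=lambda b: b[1]):
        # Stay on the current line while this box's top edge is within
        # `thresh` pixels of the first box placed on that line.
        if lines and box[1] - lines[-1][0][1] <= thresh:
            lines[-1].append(box)
        else:
            lines.append([box])
    # Within each line, read left to right.
    return [box for line in lines for box in sorted(line, key=lambda b: b[0])]


# Hypothetical boxes: two words on one line, one word on the next line.
boxes = [(120, 52, 40, 20), (10, 48, 40, 20), (15, 90, 40, 20)]
ordered = order_by_lines(boxes)
```

Because each line is anchored to its first box, the grouping is deterministic for a given set of boxes. The trade-off is that a line sloping by more than `thresh` pixels from its first box is split in two.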

@fcakyon
Collaborator

fcakyon commented Feb 23, 2023

Can you please reformat your code, then commit and push again as detailed in the contributing section of the README? :)

@MLDovakin MLDovakin closed this Feb 24, 2023
@MLDovakin MLDovakin reopened this Feb 24, 2023
@fcakyon fcakyon added this pull request to the merge queue Feb 26, 2023
Merged via the queue into obss:main with commit 761c0ff Feb 26, 2023
@fcakyon
Collaborator

fcakyon commented Feb 26, 2023

@MLDovakin thanks a lot for your contributions!
