
Problems with tracker.update_with_detections(detections) #1215

Open
CodingMechineer opened this issue May 21, 2024 · 18 comments
Labels
bug Something isn't working

Comments

@CodingMechineer

CodingMechineer commented May 21, 2024

Search before asking

  • I have searched the Supervision issues and found no similar bug report.

Bug

Somehow, I lose predicted bounding boxes in this line:

tracker.update_with_detections(detections)

In the plot from Ultralytics, everything is fine. However, after the line above executes, I lose some bounding boxes; in this example, I lose two.

This is the plot from Ultralytics, the way it should be:
[image]

This is the plot after the Roboflow labeling; some predictions are missing:
[image]

Can somebody help me with this issue?

Environment

  • Supervision 0.20.0
  • Python 3.12.3
  • Ultralytics 8.2.18

Minimal Reproducible Example

Code:

import cv2
import supervision as sv
from ultralytics import YOLO

model_path = "path/to/your/model.pt"
video_path = "path/to/your/video.mp4"

cap = cv2.VideoCapture(video_path)
model = YOLO(model_path)
box_annotator = sv.BoundingBoxAnnotator()
label_annotator = sv.LabelAnnotator()
tracker = sv.ByteTrack()

while True:
    ret, frame = cap.read()
    if not ret:  # stop when the video ends or a frame can't be read
        break

    results = model(frame, verbose=False)[0]
    print(f"CLS_YOLO-model: {results.boxes.cls}")

    results_2 = model.predict(frame,
                        show=True,  # The plot from the Ultralytics library
                        conf=0.5,
                        save=False,
                        )

    detections = sv.Detections.from_ultralytics(results)
    print(f"ClassID_Supervision_1: {detections.class_id}")  # Between this and the next print, predictions are lost

    detections = tracker.update_with_detections(detections)  # The detections get lost here

    labels = [
        f"{results.names[class_id]} {confidence:0.2f}"
        for confidence, class_id
        in zip(detections.confidence, detections.class_id)
    ]

    print(f"ClassID_Supervision_2: {detections.class_id}")  # Here two predictions from the Ultralytics model are lost

    annotated_frame = frame.copy()

    annotated_frame = box_annotator.annotate(
        annotated_frame,
        detections
    )

    labeled_frame = label_annotator.annotate(
        annotated_frame,
        detections,
        labels
    )

    print(f"ClassID_Supervision_3: {detections.class_id}")
    print(f"{len(detections)} detections, Labels: {labels}")

    cv2.imshow('Predictions', labeled_frame)  # The frame annotated with Supervision
    if cv2.waitKey(1) & 0xFF == ord('q'):  # waitKey is required for imshow to refresh
        break

cap.release()
cv2.destroyAllWindows()

Prints in console:

CLS_YOLO-model: tensor([1., 1., 1., 1.], device='cuda:0') --> Class IDs of the predicted bounding boxes
ClassID_Supervision_1: [1 1 1 1] --> Converted to Supervision
ClassID_Supervision_2: [1 1] --> After the tracker call, class IDs are lost
ClassID_Supervision_3: [1 1]
2 detections, Labels: ['Spot 0.87', 'Spot 0.86']

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
CodingMechineer added the bug (Something isn't working) label on May 21, 2024
@LinasKo
Collaborator

LinasKo commented May 21, 2024

Hi @CodingMechineer 👋

Let's do one quick test - does installing supervision==0.21.0.rc5 change anything?

@SkalskiP
Collaborator

@CodingMechineer, we accidentally shipped a tracking bug in supervision==0.20.0. Try using supervision==0.19.0 or the supervision==0.21.0.rc5 pre-release.

@CodingMechineer
Author

@LinasKo @SkalskiP I installed both supervision==0.21.0.rc5 and supervision==0.19.0. However, with both versions I have the same problem.

Top: YOLO predictions
Bottom: Supervision tracker
[image: Screenshot 2024-05-21 153420]

@SkalskiP
Collaborator

@CodingMechineer Could you share with us the exact version of the model and the video file you are using?

@CodingMechineer
Author

CodingMechineer commented May 21, 2024

@SkalskiP Sure! Please let me know if there is an issue on my end.
I zipped the video, the model, the code, and a requirements.txt file. Unfortunately, the video and model files are too big, so GitHub doesn't let me upload everything here.

https://1drv.ms/u/s!AjTS76M8DCeYm8djbuYtiGXfXFNvsQ?e=GleN38

@LinasKo
Collaborator

LinasKo commented May 22, 2024

Hi @CodingMechineer 👋

The tracker uses detection overlap and motion-model prediction to estimate which detections represent the same object in sequential frames, then filters out what it can't match. While the details are a bit complicated, a quick way to influence the result is to increase the object area shown to the tracker.

So, my quick suggestion: check if padding the boxes solves your problem.

That is, insert detections.xyxy = sv.pad_boxes(detections.xyxy, px=10, py=10) between the calls to from_ultralytics and update_with_detections.

Here's how it looks on my end. This way, all holes are detected, even after tracking.
[image]

Does that solve your problem? 😉

@CodingMechineer
Author

Unfortunately not, some spots and peanuts are still not tracked.

[image]

@LinasKo
Collaborator

LinasKo commented May 23, 2024

That's unfortunate. Here are the next steps to try:

  1. I assume you're already using supervision==0.21.0.rc5 - only that version and later have pad_boxes. If not, you should switch to supervision==0.21.0.rc5.
  2. Next, trying a few parameter values might help, especially since I think padding already captures 99% of the cases (on my machine, padding applied the same way you did worked really well):
    1. Try changing px and py in pad_boxes.
    2. Try setting a different track_activation_threshold and minimum_matching_threshold in the tracker. If the expected FPS is different than 30, you should set it too, as a tracker argument.

Are you always running this on videos, or on a stream too? If both, I wonder whether it performs similarly on the live stream and on a video of the same stream.

@SkalskiP
Collaborator

@LinasKo, do you have any idea why this happens?

@LinasKo
Collaborator

LinasKo commented May 23, 2024

@SkalskiP, no. I dug for an hour or so, plotted sequential detections (there's typically >50% overlap, yet they disappear). I played with some values, but I'd need to plot/print out steps of the algorithm to learn how it sees the world.

@SkalskiP
Collaborator

@LinasKo this should not happen. I'm worried because I have no idea why it's happening. @rolson24 would you have time to take a look?

@CodingMechineer
Author

> That's unfortunate. Here's the next steps to try:
>
>   1. I assume you're already using supervision==0.21.0.rc5 - only later versions have pad_boxes. If not, you should switch to supervision==0.21.0.rc5.
>   2. Next, trying out a few values of parameters might help, especially since I think padding already captures 99% of the cases (on my machine the padding worked really well, applied the same way you did)
>     1. Try changing px and py in pad_boxes
>     2. Try setting a different track_activation_threshold and minimum_matching_threshold in tracker. If the expected FPS is different than 30, you should set it too, as a tracker argument.
>
> Are you always running this on videos, or on a stream too? If both, I wonder if it performs similarly on the live stream and the video of the same stream.

I may run this on a stream in the future; currently, I only run it on video files. I also tried the Ultralytics library for the same task. That works completely fine, so I'm continuing with it.

The code looks something like this:

import cv2
from ultralytics import YOLO

model_path = 'best.pt'
video_path = '../001 - DATA/099 - Test_videos/Test_video_0.avi'

cap = cv2.VideoCapture(video_path)
model = YOLO(model_path)

while cap.isOpened():
    success, frame = cap.read()
    if not success:  # stop when the video ends
        break

    results = model.track(frame, persist=True)

    if results[0].boxes.id is not None:
        boxes = results[0].boxes.xyxy.cpu()
        track_ids = results[0].boxes.id.int().cpu().tolist()
        clss = results[0].boxes.cls.cpu().tolist()
        confs = results[0].boxes.conf.cpu().tolist()

    # Do all the plotting and processing

cap.release()

Maybe this can help you. Please let me know if I can do anything else for you.

@SkalskiP
Collaborator

@CodingMechineer By the way, if you use model.track in Ultralytics, you can still call detections = sv.Detections.from_ultralytics(results), and the tracker_id will be extracted from the result object.

@SkalskiP
Collaborator

SkalskiP commented Jun 5, 2024

@LinasKo and @CodingMechineer, is that issue still active?

@LinasKo
Collaborator

LinasKo commented Jun 5, 2024

Yup, we'll need to look at this in the future.

@rolson24
Contributor

> @LinasKo this should not happen. I'm worried because I have no idea why it's happening. @rolson24 would you have time to take a look?

Hi @SkalskiP,
Sorry, I've been super busy with school. I can take a look and try to see what's going on.

@rolson24
Contributor

Hi @CodingMechineer, @SkalskiP, and @LinasKo

I took a look at it with @CodingMechineer's code, and it seems the tracker is working as expected. It unfortunately fails for @CodingMechineer because the motion predictor (Kalman filter) in the tracker uses the first two frames of a track to determine an object's speed and direction. For the first association, between the initial detection frame and the second detection frame, the tracker therefore uses the overlap between the two detections, NOT between the prediction and the second frame. In this specific video, the objects move very quickly, so for the first two frames of some tracks there is almost no overlap (the untracked ones are <30%; it needs to be >30%), meaning no track gets established. The tracker must then start over with an entirely new track for that object because it could not establish a motion model, and for the next frame there is no hope of the overlap between frame 1 and frame 3 exceeding 30% in this example.

I improved the performance for this example by setting minimum_matching_threshold to 0.9 when initializing ByteTrack(), and I further improved it by adjusting the parameter for activating a new track to 0.9 (the initial two-frame overlap then only needs to be greater than 10% rather than 30%).
This second change is in the source code and was mainly just to test my theory of what is happening; I would not recommend it.

For your specific example @CodingMechineer, if tracking performance is essential to your project, I believe you would need to either record at a higher framerate or slow down the device used to sort the peanuts. Both options would reduce how far an object moves between frames. The last thing we could try is to initialize the motion model to move generally left to right, if this specific location is the only place you need to run this code. That would let the tracker pick up objects better in the first two frames. Unfortunately, it would require changing the source code a bit and experimenting with the Kalman filter, something I am willing to help with, but which we would probably not be able to put into the supervision API.

@LinasKo @SkalskiP This issue seems to come from how ByteTrack is designed to be flexible for varied and unexpected tracking scenarios. If we wanted to fix the tracking for these types of repetitive, predictable computer vision tasks, it would be better to design a second tracker that can better handle high-speed and predictable types of motion for tasks like this.

@CodingMechineer
Author

Thank you for your detailed investigation @rolson24!

I made the same observation regarding the framerate as you explained. With the video from the example, I had the stated problem only with the Roboflow library, not with Ultralytics. However, when the device runs faster at the same framerate, I get the same problem with the Ultralytics library too. Thus, I must make sure the movement of the objects between frames is small enough that the tracking works satisfactorily. Probably the object movement between frames fell within a range where it worked with Ultralytics but not with Roboflow.

In summary, I need to make sure my framerate is high enough so that the object overlap is big enough and the tracking works accordingly. Hence, there is no need to change the source code.

Thanks everybody for your help!
