YOLOv5 (6.0/6.1) brief summary #6998
@glenn-jocher hi, today I briefly summarized YOLOv5 (v6.0). Please check whether there are any problems or suggest improvements. Some schematic diagrams and further content will be added later. Thank you for your great work.
hi, 'prediction layers(P3, P4, P5) are weighted differently', how do I find it in the code, and further, modify it? |
@WZMIAOMIAO thx! |
@WZMIAOMIAO awesome summary, nice work! @zlj-ky yes the balancing parameters are there, we tuned these manually on COCO. The idea is to balance losses from each layer (just like we balance losses across loss components (box, obj, class)). The reason I didn't turn these into learnable weights is that as absolute values the gradient would always want to drag them to zero to minimize the loss. I suppose we could constantly normalize them so they all sum to 1 to avoid this effect. Might be an interesting experiment, and this might help the balancing adapt better to different datasets and image sizes etc. |
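The normalization idea mentioned in the comment above (keeping the per-layer weights summing to 1 so they cannot all collapse to zero) could look roughly like the following. This is only a sketch: the helper names `normalize_weights` and `balanced_obj_loss` are hypothetical and are not part of YOLOv5, and plain Python floats stand in for tensors.

```python
def normalize_weights(weights):
    """Rescale positive weights so they sum to 1 while preserving their ratios."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def balanced_obj_loss(layer_losses, weights):
    """Weighted sum of per-layer objectness losses using normalized weights."""
    normed = normalize_weights(weights)
    return sum(w * loss for w, loss in zip(normed, layer_losses))


# P3, P4, P5 balance values quoted in this thread
balance = [4.0, 1.0, 0.4]
normed = normalize_weights(balance)
```

Because only the ratios between the weights survive normalization, shrinking every weight toward zero no longer lowers the loss, which is the effect the comment is trying to avoid.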
@glenn-jocher Could we add this brief summary to the document? |
@WZMIAOMIAO yes maybe it's a good idea to document this somewhere. Which document do you mean though? |
@glenn-jocher I think it could be added to the |
* Add Architecture Summary to README Tutorials Per #6998 (comment) * Update README.md
@WZMIAOMIAO all done in #7146! Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐ |
@HERIUN build_targets() implements an anchor-label assignment strategy so we can calculate the losses between assigned anchor-label pairs.
@glenn-jocher what's the adjustment strategy for the balancing parameters? How can they be changed to learnable weights?
@xinxin342 the balance params are here (Line 112 in c9a3b14). You'd have to convert them to nn.Parameter types assigned to an existing class and set their requires_grad to True.
@glenn-jocher |
@zlj-ky that seems like a good approach, but you might need to place self.w inside the model so it's affected by model.train(), model.eval(), etc. You can just place it inside models.yolo.Detect and then access it like this. (Note your code is out of date):

    class ComputeLoss:
        sort_obj_iou = False

        def __init__(self, model, autobalance=False):
            device = next(model.parameters()).device  # get model device
            h = model.hyp  # hyperparameters

            # Define criteria
            BCEcls = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['cls_pw']], device=device))
            BCEobj = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([h['obj_pw']], device=device))

            # Class label smoothing https://arxiv.org/pdf/1902.04103.pdf eqn 3
            self.cp, self.cn = smooth_BCE(eps=h.get('label_smoothing', 0.0))  # positive, negative BCE targets

            # Focal loss
            g = h['fl_gamma']  # focal loss gamma
            if g > 0:
                BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)

            m = de_parallel(model).model[-1]  # Detect() module
            self.balance = {3: [4.0, 1.0, 0.4]}.get(m.nl, [4.0, 1.0, 0.25, 0.06, 0.02])  # P3-P7
            self.ssi = list(m.stride).index(16) if autobalance else 0  # stride 16 index
            self.BCEcls, self.BCEobj, self.gr, self.hyp, self.autobalance = BCEcls, BCEobj, 1.0, h, autobalance
            self.na = m.na  # number of anchors
            self.nc = m.nc  # number of classes
            self.nl = m.nl  # number of layers
            self.anchors = m.anchors
            self.w = m.w  # <------------------------ NEW CODE
            self.device = device

This might or might not work, as I don't know if this will create a copy or access the Detect parameter. Even if you get this to work, though, it's not clear that these are learnable parameters, as I'm not sure if they can be correlated to the gradient directly. The optimizer seeks to reduce loss, so the rebalance may just weigh the lower loss components higher to reduce loss, which may not have the desired effect. The same concept applies to anchors, which don't seem learnable during training either.
@glenn-jocher Thank you for sharing your views on this matter and for your patient guidance. I will try it later.
@WZMIAOMIAO @glenn-jocher Hi, thanks for your nice work! I have two questions. First, how can I print the output of every layer? (I'd like to change the first layer's kernel to a smaller size, which may help with small object detection.) Second, I also want to add an output for object tracking ([x, y, w, h, nc] -> [x, y, w, h, nc, id]), but I don't know which loss function to use for it.
Hi @glenn-jocher |
@kadirnar BiFPN and PANet are nearly identical, in a P3-P5 output model the only difference is a single shortcut. There are versions of all 3 heads available here: As always all design decisions are based on empirical results. |
Hello, can we get the results of the ablation experiments, such as the mAP results for SPP vs. SPPF and Focus vs. Conv on big datasets?
@divided-by-7 I'm sorry, we don't have this R&D saved in a presentable manner.
@WZMIAOMIAO Could you please summarize the YOLOv5 Instance Segmentation Model Structure? especially the keywords definition of output0 float32[1,25200,117] and output1 float32[1,32,160,160]. Thank you very much in advance! |
Dear @glenn-jocher @WZMIAOMIAO |
Hi! What do k, s, p, and c represent in the model structure, respectively? |
This is a simple question. k = kernel size, s = stride, p = padding, c = channel dims |
Okay, thank you very much! |
Hello @glenn-jocher or anyone who knows the answer. I am trying to understand the build targets process a little more. When you say GTx%1>0.5 and GTy%1>0.5 is the % just the modulus? If it is the modulo operator, then why is this used? Thanks, Karl Gardner |
@WZMIAOMIAO @glenn-jocher or anyone who knows. I am trying to understand more about the model structure. Is there an article that discusses and explains the YOLOv5 structure? Thanks! |
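Hi @glenn-jocher, what is the formula by which a 640x640x3 input image becomes 320x320x64 with k=3, s=2, p=1?
@gracesmrngkr this transformation is governed by the following formula:
$$ \text{output size} = \left\lfloor \frac{\text{input size} - \text{kernel size} + 2 \times \text{padding}}{\text{stride}} \right\rfloor + 1 $$
So in this case, with an input size of 640, a kernel size of 3, a stride of 2, and padding of 1, the output size would be $\lfloor (640 - 3 + 2) / 2 \rfloor + 1 = 319 + 1 = 320$.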
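The standard convolution output-size arithmetic can be checked with a small helper. This is a minimal sketch: `conv_output_size` is a hypothetical name, and it covers only the spatial size. The 64 output channels in the question come from the number of filters in the layer, not from this formula.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


# Values from the question above: 640 input, k=3, s=2, p=1
out = conv_output_size(640, kernel=3, stride=2, padding=1)
print(out)
```

The same function applies independently to height and width, so a 640x640 input maps to `out` x `out`.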
Content
1. Model Structure
YOLOv5 (v6.0/6.1) consists of:
- Backbone: New CSP-Darknet53
- Neck: SPPF, New CSP-PAN
- Head: YOLOv3 Head
Model structure (yolov5l.yaml):
Some minor changes compared to previous versions:
1. Replace the Focus structure with a 6x6 Conv2d (more efficient, refer to "Is the Focus layer equivalent to a simple Conv layer?" #4825)
2. Replace the SPP structure with SPPF (more than double the speed)
test code
result:
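A minimal sketch of why a sequential SPPF-style stack can give the same pooled features as parallel SPP-style pools. It assumes SPP uses stride-1 max pools with kernel sizes 5, 9 and 13 and SPPF chains three kernel-5 pools, and it uses a 1D pure-Python pool (`maxpool1d` is a hypothetical helper) rather than the PyTorch modules, so it demonstrates the equivalence only, not the speed difference.

```python
import random


def maxpool1d(xs, k):
    """Stride-1 max pool with 'same' output length; out-of-range positions are ignored."""
    r = k // 2
    n = len(xs)
    return [max(xs[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]


random.seed(0)
x = [random.random() for _ in range(32)]

# SPP-style: three parallel pools with growing kernels
spp = [maxpool1d(x, 5), maxpool1d(x, 9), maxpool1d(x, 13)]

# SPPF-style: one k=5 pool applied three times in sequence
y1 = maxpool1d(x, 5)
y2 = maxpool1d(y1, 5)
y3 = maxpool1d(y2, 5)
sppf = [y1, y2, y3]

print(spp == sppf)
```

Two chained kernel-5 pools cover a window of 9 and three cover a window of 13, so the sequential version reuses earlier work instead of scanning the larger windows from scratch.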
2. Data Augmentation
3. Training Strategies
4. Others
4.1 Compute Losses
The YOLOv5 loss consists of three parts: a box (location) loss, an objectness loss, and a class loss.
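As a rough sketch, the three components mentioned in this thread (box, objectness, class) can be combined as a weighted sum. The function name `total_loss` and the gain values below are illustrative assumptions, not values confirmed by this document.

```python
def total_loss(box_loss, obj_loss, cls_loss, gains):
    """Weighted sum of the three loss components."""
    return (gains["box"] * box_loss
            + gains["obj"] * obj_loss
            + gains["cls"] * cls_loss)


# Illustrative gains only (assumed for this example)
gains = {"box": 0.05, "obj": 1.0, "cls": 0.5}
loss = total_loss(box_loss=1.0, obj_loss=2.0, cls_loss=4.0, gains=gains)
```

Changing a gain shifts how strongly the optimizer is pushed to reduce that component relative to the others.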
4.2 Balance Losses
The objectness losses of the three prediction layers (P3, P4, P5) are weighted differently. The balance weights are [4.0, 1.0, 0.4] respectively.
4.3 Eliminate Grid Sensitivity
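In YOLOv2 and YOLOv3, the formula for calculating the predicted target information is:
$$ b_x = \sigma(t_x) + c_x, \quad b_y = \sigma(t_y) + c_y, \quad b_w = p_w \cdot e^{t_w}, \quad b_h = p_h \cdot e^{t_h} $$
In YOLOv5, the formula is:
$$ b_x = (2 \cdot \sigma(t_x) - 0.5) + c_x, \quad b_y = (2 \cdot \sigma(t_y) - 0.5) + c_y, \quad b_w = p_w \cdot (2 \cdot \sigma(t_w))^2, \quad b_h = p_h \cdot (2 \cdot \sigma(t_h))^2 $$
Here $t_x, t_y, t_w, t_h$ are the raw network outputs, $(c_x, c_y)$ is the top-left corner of the grid cell, and $(p_w, p_h)$ is the anchor size.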
Compare the center point offset before and after scaling. The center point offset range is adjusted from (0, 1) to (-0.5, 1.5), so the offset can easily reach 0 or 1.
Compare the height and width scaling ratio (relative to the anchor) before and after adjustment. The original yolo/darknet box equations have a serious flaw: width and height are completely unbounded because they are simply out=exp(in). This is dangerous, as it can lead to runaway gradients, instabilities, NaN losses and ultimately a complete loss of training. Refer to this issue.
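The difference between the two decodings can be sketched numerically. This is a minimal illustration, assuming the usual formulation in which the center offset is `sigmoid(t)` for YOLOv2/v3 and `2 * sigmoid(t) - 0.5` for YOLOv5, and the size factor is `exp(t)` versus `(2 * sigmoid(t)) ** 2`. The function names are hypothetical; coordinates are in grid-cell units.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def decode_yolov3(tx, ty, tw, th, cx, cy, pw, ph):
    """YOLOv2/v3 style: offset in (0, 1), width/height factor exp(t) is unbounded."""
    return (sigmoid(tx) + cx, sigmoid(ty) + cy,
            pw * math.exp(tw), ph * math.exp(th))


def decode_yolov5(tx, ty, tw, th, cx, cy, pw, ph):
    """YOLOv5 style: offset in (-0.5, 1.5), width/height factor bounded by 4."""
    return (2 * sigmoid(tx) - 0.5 + cx, 2 * sigmoid(ty) - 0.5 + cy,
            pw * (2 * sigmoid(tw)) ** 2, ph * (2 * sigmoid(th)) ** 2)


# A large raw width output explodes under exp() but stays bounded in YOLOv5
v3 = decode_yolov3(0.0, 0.0, 50.0, 0.0, cx=3, cy=4, pw=2.0, ph=2.0)
v5 = decode_yolov5(0.0, 0.0, 50.0, 0.0, cx=3, cy=4, pw=2.0, ph=2.0)
```

With the bounded form, the predicted width can never exceed four times the anchor width, whatever the raw output is.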
4.4 Build Targets
Match positive samples:
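A minimal sketch of one way positive-sample matching along these lines can work. It rests on two assumptions that are not spelled out here: a ground-truth box matches an anchor when its width and height ratios to the anchor (and their inverses) all stay below a threshold, taken as 4.0, and the ground-truth center is also assigned to up to two neighboring grid cells depending on which half of its cell it falls in (the `% 1` fractional-part test asked about earlier in the thread). The names `anchor_matches` and `candidate_cells` are hypothetical.

```python
def anchor_matches(gt_wh, anchor_wh, anchor_t=4.0):
    """True if the GT/anchor width and height ratios are all within the threshold."""
    rw = gt_wh[0] / anchor_wh[0]
    rh = gt_wh[1] / anchor_wh[1]
    r_max = max(rw, 1.0 / rw, rh, 1.0 / rh)
    return r_max < anchor_t


def candidate_cells(gx, gy, grid_w, grid_h):
    """Cell containing the GT center plus the nearest horizontal/vertical neighbors."""
    i, j = int(gx), int(gy)
    cells = [(i, j)]
    fx, fy = gx % 1, gy % 1  # fractional position inside the cell
    if fx < 0.5 and gx > 1:
        cells.append((i - 1, j))  # center is in the left half: add left neighbor
    if fy < 0.5 and gy > 1:
        cells.append((i, j - 1))  # center is in the top half: add upper neighbor
    if fx > 0.5 and gx < grid_w - 1:
        cells.append((i + 1, j))  # center is in the right half: add right neighbor
    if fy > 0.5 and gy < grid_h - 1:
        cells.append((i, j + 1))  # center is in the bottom half: add lower neighbor
    return cells


cells = candidate_cells(3.2, 4.7, grid_w=20, grid_h=20)
```

Assigning neighboring cells is only possible because the center offset range extends beyond a single cell, to (-0.5, 1.5).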
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on MacOS, Windows, and Ubuntu every 24 hours and on every commit.