[Q]wrong steps appears in all charts #2444

Closed
besbesmany opened this issue Jul 24, 2021 · 22 comments

@besbesmany

besbesmany commented Jul 24, 2021

I have an augmented dataset from Roboflow.
When I run train.py, the steps (epochs) in the charts are suddenly doubled, and the chart shows the accuracy increasing right away.
The yellow run is now at epoch 190,
and yesterday the orange run reached nearly 220,

but both show double the number of steps.

This is my training command:
!python train.py --img 416 --batch 16 --epochs 500 --data aerialRobo.yaml
--weights yolov5x.pt --cache --project train/ --name AugOriginalAerial416
--bbox_interval 30 --save_period 30 --device 0 --upload_dataset

What should I do?

W B Chart 7_24_2021, 10_36_03 PM

@besbesmany
Author

This is the same figure in wandb and TensorBoard.
The wandb chart has double the steps.
There is a problem with the wandb chart; how can I solve it?

W B Chart 7_25_2021, 7_16_51 AM

@vanpelt
Contributor

vanpelt commented Jul 25, 2021

@besbesmany the default "step" with wandb increments every time wandb.log is called. You can log other steps alongside metrics by adding them to the wandb.log calls. It looks like we currently aren't logging any additional steps in our default yolo implementation. I'll sync with our team about adding this to a future yolo release.
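
For readers with a custom training loop, here is a minimal sketch of what "logging other steps alongside metrics" could look like. It does not apply to the stock yolov5 integration discussed in this thread. The project name, the `epoch` key, and the `build_payload` and `log_with_explicit_epoch` helpers are illustrative assumptions; `wandb.init`, `wandb.define_metric`, and `wandb.log` are the actual library calls.

```python
def build_payload(epoch, metrics):
    """Merge one epoch's metrics with an explicit 'epoch' key."""
    payload = dict(metrics)
    payload["epoch"] = epoch
    return payload


def log_with_explicit_epoch(per_epoch_metrics, project="YOUR_PROJECT"):
    """Log each epoch's metrics so that 'epoch' can serve as the x-axis."""
    import wandb  # imported lazily so build_payload works without wandb installed

    run = wandb.init(project=project)
    # Ask W&B to plot everything under metrics/ against our own "epoch"
    # value instead of the internal _step counter.
    wandb.define_metric("epoch")
    wandb.define_metric("metrics/*", step_metric="epoch")
    for epoch, metrics in enumerate(per_epoch_metrics):
        wandb.log(build_payload(epoch, metrics))
    run.finish()
```

With this layout the x-axis follows the logged `epoch` value, so an extra wandb.log call per epoch would no longer stretch the chart.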

@besbesmany
Author

Can you give me a code example of how to do that? I really don't understand how to do it.
This is the YOLO code; how can I make wandb.log record the steps?
https://colab.research.google.com/drive/1AQ64y-DV51aXRpowlw79bXanhmsYEz59?usp=sharing

@besbesmany
Author

Can you please give me a Colab file that shows how to call and modify wandb.log(...) and wandb.init(project="YOUR_PROJECT")
with yolov5? I couldn't find any full example with YOLO.

I still haven't used the full features of wandb.
Please help me.

@vanpelt
Contributor

vanpelt commented Jul 26, 2021

@besbesmany you won't be able to modify the way steps are logged with the default yolov5 implementation. We'll need to release a new version of the yolov5 integration to support this.

@besbesmany
Author

When will this version be released?

@besbesmany
Author

Can I show the same charts with the steps divided by 2, so that I get the correct chart?

@vanpelt
Contributor

vanpelt commented Jul 27, 2021

Yes, but it's not clear to me why it's important. Can you explain to me why the actual value of the step is important to you? I would imagine if all runs were using the same step simply being able to compare them is the important thing.

You can click the "edit" icon (pencil) when you mouse over a chart. Then select the "Expressions" tab. From there you can add an X-Axis expression as shown below:

image

@besbesmany
Author

Thank you, sir, for your time; I really appreciate it.
I ran several experiments with yolov5, but the error appears only in the last 4 runs.

I trained for 500 epochs and have nearly 9 experiments;
only the last 4 have that error (doubled steps).

So, can I apply step/2 to certain charts only?
Also, I can't change the expression to _step/2; the field is not editable.
What should I do now?


@vanpelt
Contributor

vanpelt commented Jul 28, 2021

Unfortunately there isn't a way to make the expression work for only 4 of the runs. You'll need to run them again if you want them to be the same as the others.

To change the x-axis for a specific chart just modify the X Axis Expression. In your screenshot you clicked "Add an expression", that's not what you want.

image
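
Because the chart expression applies to every run on the panel, one possible workaround outside the UI is to export the run histories and halve the step only for the affected runs before plotting them yourself. The sketch below is plain Python over already-exported rows and calls no W&B API. The row layout, the run names, and the assumption that the affected runs logged exactly twice per epoch are all hypothetical.

```python
def rescale_steps(rows, factor=2):
    """Return copies of the rows with '_step' divided by `factor`."""
    return [{**row, "_step": row["_step"] / factor} for row in rows]


def align_runs(histories, doubled_runs, factor=2):
    """Rescale only the runs named in `doubled_runs`; copy the rest unchanged."""
    aligned = {}
    for name, rows in histories.items():
        if name in doubled_runs:
            aligned[name] = rescale_steps(rows, factor)
        else:
            aligned[name] = [dict(row) for row in rows]
    return aligned


# Hypothetical exported histories: one older run and one run with doubled steps.
histories = {
    "run_old": [{"_step": 0, "mAP": 0.10}, {"_step": 1, "mAP": 0.20}],
    "run_new": [{"_step": 0, "mAP": 0.10}, {"_step": 2, "mAP": 0.20}],
}
aligned = align_runs(histories, doubled_runs={"run_new"})
```

After alignment both runs share the same step scale, so they can be compared on one axis in any plotting tool.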

@ghost

ghost commented Jul 29, 2021

I'm facing the same problem.

When I run the same model several times with some minor changes (which I believe shouldn't affect the logged step), the charts show different step counts...

@vanpelt
Contributor

vanpelt commented Jul 30, 2021

@Sun-Rider every time you run wandb.log() we increment the "_step" counter. If you don't want to increment the "_step" counter, you can add the argument commit=False to your wandb.log(); a subsequent call to wandb.log() without the commit argument will then increment the step. If you want full control of your steps, you can log them as separate metrics in wandb.log, e.g. wandb.log({"acc": acc, "my_step": i}). Then in the UI you can choose "my_step" as the x-axis for the "acc" metric.
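
To make the counter behaviour described above concrete, here is a toy model in plain Python. It is not wandb's implementation, only a sketch of the stated rule: every committed log call advances the step, and commit=False holds data back until the next committed call. The `StepCounterModel` class and `final_step` helper are hypothetical names.

```python
class StepCounterModel:
    """Toy model of the described step behaviour; NOT wandb's real code."""

    def __init__(self):
        self.step = 0
        self.pending = {}   # data held back by commit=False
        self.history = []   # list of (step, row) tuples

    def log(self, row, commit=True):
        self.pending.update(row)
        if commit:
            # A committed call writes everything pending at the current
            # step and then advances the counter.
            self.history.append((self.step, dict(self.pending)))
            self.pending.clear()
            self.step += 1


def final_step(calls_per_epoch, epochs, hold_back=False):
    """Count steps after training when log is called several times per epoch."""
    model = StepCounterModel()
    for epoch in range(epochs):
        for i in range(calls_per_epoch):
            is_last = i == calls_per_epoch - 1
            # With hold_back, only the last call in each epoch commits.
            model.log({f"metric_{i}": epoch}, commit=is_last or not hold_back)
    return model.step
```

In this model, two committed calls per epoch give twice as many steps as epochs, which matches the doubling reported in this thread. Holding back all but the last call brings the count back to one step per epoch.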

@AyushExel
Contributor

@Sun-Rider @besbesmany This is not intended and is definitely a bug. It was probably introduced by a revamp of the logger API that went out a few days ago, which unifies the wandb and tb loggers. Earlier, wandb.log was called only once per epoch, so your steps should always be equal to the total number of epochs. I have filed an issue on YOLOv5 so I don't forget to fix this.
ultralytics/yolov5#4248

Thank you for reporting this. Feel free to tag me in any other such issues :)

@besbesmany
Author

@AyushExel when will this be solved, if you can say, sir?
Thanks for your time with me

@AyushExel
Contributor

AyushExel commented Jul 31, 2021

@besbesmany we're working on multiple features right now, so you can expect this to be fixed sometime next week. The doubling of step count won't mess with the accuracy of your metrics. I'll let you know once the fix is merged. All you'll need to do is run git pull in the yolov5/ directory.

I'm working on revamping the README for the W&B integration. I'll put up a PR for it soon, but if you want a short reference of things you can do with yolo, W&B artifacts, and tables, I've made this doc public -> https://wandbai.notion.site/YoloV5-W-B-ec06daa32df44b6bb918d26759fdbb8f

It contains all the advanced features of the integration and some FAQs. Thanks for your patience!

@besbesmany
Author

@AyushExel thank you, sir.
Please add these features too, because I get a lot of errors:

1- Get the max value of a chart in train.py.
I used wandb.run.define_metric("metrics/mAP_0.5", summary="max"), but it gives me this error:
Traceback (most recent call last):
File "train.py", line 47, in
from utils.wandb_logging.wandb_utils import WandbLogger, check_wandb_resume
ModuleNotFoundError: No module named 'utils.wandb_logging'

2- Also get the precision-recall curve.
3- Training data accuracy.
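
For the first request, a minimal sketch of registering "max" summaries on an active run. `define_metric(name, summary="max")` is the call quoted above; the `SUMMARY_SPECS` table and `apply_summary_specs` helper are hypothetical, and the key `metrics/mAP_0.5:0.95` is an assumed name for the second metric. The sketch does not address the ModuleNotFoundError in the traceback.

```python
# Metric name -> summary aggregation to keep for the run.
SUMMARY_SPECS = {
    "metrics/mAP_0.5": "max",
    "metrics/mAP_0.5:0.95": "max",  # assumed key name, adjust to what your run logs
}


def apply_summary_specs(run, specs=SUMMARY_SPECS):
    """Call run.define_metric for each entry so the summary keeps the max value."""
    for name, summary in specs.items():
        run.define_metric(name, summary=summary)
    return list(specs)


# Usage, assuming wandb is installed and a run has already been started:
#   import wandb
#   wandb.init(project="YOUR_PROJECT")
#   apply_summary_specs(wandb.run)
```

Keeping the metric names in one table means a further metric can be added without touching the registration code.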

@AyushExel
Contributor

AyushExel commented Jul 31, 2021

@besbesmany wait, you're getting the error in the existing train script? I'm talking about point 1.
2. will be a bit of work. I'll have to ask the maintainer if he's open to doing something like that.
3. What do you mean by data accuracy exactly? Are there some metrics that we can compute? If so, please share some papers/implementations that we can use for reference
EDIT: Are you talking about something like this -> https://github.com/dbolya/tide ?

@besbesmany
Author

On the first point: Mr. @vanpelt (many thanks to him) gave me the code to add to train.py to get the max mAP. It worked great for 3 runs, then it gave me that error (No module named 'utils.wandb_logging'). I found this folder inside the yolov5 folder, but Colab couldn't see it; I don't know why...
So, if you can, please make the max and average for mAP 0.5 and 0.5:0.95 available by default in the charts.

2- I need the PR curve and F-score, if possible, to compare several runs together.

3- I want a training data accuracy chart. yolov5 only has metrics/mAP_0.5 for validation data; I would like to compare training and validation data together, if possible.

Thanks a lot for your interest in my needs.

@AyushExel
Contributor

AyushExel commented Jul 31, 2021

@besbesmany Okay. 1. should be simple enough. I'll do it.
2. same as before, I'll need to ask Glenn
3. This isn't optimal. Think of how much computation and time will be required if you calculate these metrics on a train split of large datasets like coco. I agree that train metrics do provide some info but choosing the best model is generally based on a validation dataset. You may raise a feature request on YOLOv5 github for adding optional calculation of train split metrics as well.

@AyushExel
Contributor

AyushExel commented Aug 1, 2021

@besbesmany Your original issue of step counts being doubled is fixed. See the latest run that I did for 3 epochs https://wandb.ai/cayush/yoloV5/runs/2fbkc326?workspace=user-cayush.
There are only 3 steps. Feel free to close this issue

@besbesmany
Author

thank you sir

@github-actions
Contributor

github-actions bot commented Oct 1, 2021

This issue is stale because it has been open 60 days with no activity.

@github-actions github-actions bot added the stale label Oct 1, 2021
@sydholl sydholl closed this as completed Dec 16, 2021