
ValueError: not enough values to unpack (expected 7, got 6) #2

Open
WUHE-art opened this issue Jul 19, 2021 · 11 comments

@WUHE-art commented Jul 19, 2021

Hello, when I run `python coco_scripts/train.py --exp_name captioning_model --batch_size 100 --lr 5e-4`, the following error arises: `ValueError: not enough values to unpack (expected 7, got 6)`. Have you ever encountered this problem? Looking forward to your reply. Thanks.
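For context, a minimal plain-Python illustration of what this error means (the values here are made up, not from the repo):

```python
# train.py unpacks 7 names from each batch, but the dataloader yields
# only 6 values per batch, which raises exactly this ValueError.
batch = (1, 2, 3, 4, 5, 6)       # six values...
a, b, c, d, e, f, g = batch      # ...unpacked into seven names
# ValueError: not enough values to unpack (expected 7, got 6)
```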

@mad-red (Owner) commented Jul 20, 2021

change "from speaksee.data import ImageDetectionsField" into "from data import ImageDetectionsField"

@WUHE-art (Author) commented:

Hello, the error didn't disappear when I changed `from speaksee.data import TextField, ImageDetectionsField`, but when I changed `for it, (detections, _, ctrl_det_seqs, ctrl_det_gts, ctrl_det_seqs_test, _, caps_gt)` to `for it, (detections, ctrl_det_seqs, ctrl_det_gts, ctrl_det_seqs_test, _, caps_gt)`, the program ran fine. I don't know why.
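For anyone else hitting this, a runnable sketch of the modified loop header (the dataloader here is a dummy stand-in; `dataloader_train` and the placeholder values are assumptions, not the repo's code):

```python
# Dummy dataloader that yields one 6-tuple per batch, mimicking what the
# local data module apparently produces.
dataloader_train = [("dets", "seqs", "gts", "seqs_test", None, ["a caption"])]

# Modified header matching the 6-tuple; the original expected 7 values,
# with an extra "_" between detections and ctrl_det_seqs.
for it, (detections, ctrl_det_seqs, ctrl_det_gts,
         ctrl_det_seqs_test, _, caps_gt) in enumerate(dataloader_train):
    print(it, detections, caps_gt)
```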

@mad-red (Owner) commented Jul 20, 2021

the deleted "_" stands for imgids, and the training doesn't need it

@WUHE-art (Author) commented:

Hello, I have tried to reproduce your results with all parameters left unmodified, but the score after training with the cross-entropy loss is only about 170, which is far from your paper. Looking forward to your reply. Thanks.

@mad-red (Owner) commented Jul 27, 2021

Do you mean the score of the whole framework, or only the validation score during training of the captioning module?

@WUHE-art (Author) commented:

The final score for the entire framework is very low, far from the score in your paper.

@mad-red (Owner) commented Jul 28, 2021

Sorry for the problem you've met. I'll retrain the model using the uploaded code to find the cause and fix it.

@WUHE-art (Author) commented Aug 3, 2021

Hello, I trained the model on 2080 GPUs for a total of 13 days; here are the final test results. They don't look good. I don't know if it's the graphics card.
[Four screenshots of the final test results, captured 2021-08-03]

@mad-red (Owner) commented Aug 3, 2021

It seems strange. Are you using the captioning model after Reinforcement Learning?

@WUHE-art (Author) commented:

Hello, I am very interested in your article, and recently I read the paper's code. There are a few things in it I haven't been able to work out.
In train.py, at line 99, `detections` stands for the object features detected by Faster R-CNN, and `ctrl_det_seqs` stands for the ordered set of regions. I don't know if I understand this correctly.
Besides, I don't know what `ctrl_det_gts` stands for, nor what `ctrl_det_seqs_test` and `caps_gt` at line 148 represent.
Thank you very much.

@mad-red (Owner) commented Aug 25, 2021

Hi, thanks for your interest.
`ctrl_det_gts` is a sequence of 0s and 1s: 1 means the current time step ends the description of a region and the next time step begins the next region; 0 means the description has not ended.
Compared to `ctrl_det_seqs`, `ctrl_det_seqs_test` additionally deduplicates regions of the same class. If we only use Faster R-CNN detections to generate captions with the captioning module, we don't know how many times a region of a given class appears in the caption, so deduplication is applied when testing the captioning module alone.
`caps_gt` is the ground-truth captions.
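A made-up illustration of these formats (the region classes, alignment, and caption are all invented for the example, not taken from the dataset):

```python
# Caption words aligned to detected regions: two words describe a "dog"
# region, three a "frisbee" region, two more a second "dog" region.
ctrl_det_seqs = ["dog", "dog", "frisbee", "frisbee", "frisbee", "dog", "dog"]

# 1 marks the time step that ends a region's description (the next step
# moves on to the next region); 0 means the description continues.
ctrl_det_gts = [0, 1, 0, 0, 1, 0, 1]

# For testing the captioning module alone, regions of the same class are
# deduplicated, since we cannot know how often a class recurs in the caption.
ctrl_det_seqs_test = ["dog", "frisbee"]

# Ground-truth caption(s) for the image.
caps_gt = ["a dog catching a frisbee beside another dog"]
```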
