Training end-to-end from scratch #21
Could you share your steps for training the model and anything we need to pay attention to? I have downloaded ILSVRC2017 and I am going to train the D&T model. Since no RPN proposals are provided for ILSVRC2017, what changes should I make in the source code? I found the RPN proposals in video_generate_random_minibatch.m and got stuck there. Thank you very much.
Sure:
The last column is the total frames in the video snippet, and the column to the left of it is the sampled frame number.
I'm still trying to reach >70% val mAP.
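For reference, a minimal sketch of reading such an image-set line (just an illustration, not code from this repo; only the meaning of the last two columns comes from the description above, and the example path is a placeholder):

```python
def parse_vid_line(line):
    """Parse one image-set line; only the last two columns are interpreted:
    second-to-last = sampled frame number, last = total frames in the snippet."""
    fields = line.split()
    total_frames = int(fields[-1])   # total frames in the video snippet
    frame_number = int(fields[-2])   # sampled frame number
    return fields[:-2], frame_number, total_frames

# hypothetical example line -- the leading fields are placeholders
prefix, frame_number, total_frames = parse_vid_line("some/video_snippet_dir 1 10 300")
```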
conf:
As mentioned above, I'm not using the MATLAB implementation. I'm writing my own Python implementation using PyTorch and only using this repo as a guide. I have 2 TitanX GPUs (12 GB each). I think 2 GB is too small; you probably need at least 8 GB. As for the hyperparameters, your best strategy is to read the MATLAB source code, but my guess:
@Feynman27
For now, I got the mAP on the full ImageNet VID validation set as below. I think the main difference between us is the training data and the learning rate schedule: you can use the DET+VID dataset to train your network and see how that improves your mAP. I am still trying to reach >70% val mAP. The thing I don't get is why the MATLAB version baseline reaches 74.2%; any idea?
Great. Happy to hear someone else is building a PyTorch implementation! I've just started training with the alternating VID+DET heuristic. We'll see if that boosts the performance.
My result: R-FCN single-frame baseline + fine-tuning on D&T with 1e-4 lr gives 69.7%, obtained by alternately sampling from VID or DET in each iteration, but I don't know how to improve my result further.
And for now, I am trying to use alternating sampling from VID or DET in each iteration everywhere, not only when fine-tuning D&T from R-FCN as the paper says.
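For anyone reimplementing this, a minimal sketch of what that per-iteration alternation could look like in PyTorch (my own illustration; the loader names are placeholders, not code from this repo):

```python
def alternating_vid_det(vid_loader, det_loader, num_iters):
    """Yield (iteration, batch), alternating between the VID and DET loaders.

    vid_loader / det_loader are assumed to be iterables of minibatches
    (e.g. torch.utils.data.DataLoader); each is restarted when exhausted.
    """
    vid_iter, det_iter = iter(vid_loader), iter(det_loader)
    for it in range(num_iters):
        use_vid = (it % 2 == 0)          # even iteration -> VID, odd -> DET
        try:
            batch = next(vid_iter) if use_vid else next(det_iter)
        except StopIteration:
            if use_vid:
                vid_iter = iter(vid_loader)
                batch = next(vid_iter)
            else:
                det_iter = iter(det_loader)
                batch = next(det_iter)
        yield it, batch
```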
I just trained the R-FCN single-frame baseline using ImageNet VID+DET and reached a frame mAP of 70.3% on the ImageNet VID validation set. Still several percentage points away from the 74% paper result under the same conditions, but much better than training on ImageNet VID alone. My initial lr was 1e-3, decayed every 3 epochs up to 11 epochs. Hopefully, initializing the D(&T loss) network with these weights will squeeze out another percentage point or two.
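For reference, that schedule maps onto something like the following in PyTorch (just a sketch: the optimizer choice, momentum/weight decay, placeholder model, and the 0.1 decay factor are my assumptions; only the 1e-3 start, the 3-epoch step, and the 11 epochs come from the description above):

```python
import torch

model = torch.nn.Conv2d(3, 8, 3)     # placeholder for the R-FCN network
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
# start at 1e-3, decay every 3 epochs (assumed factor 0.1), train 11 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

for epoch in range(11):
    # ... run one training epoch here ...
    optimizer.step()     # stand-in for the per-iteration updates
    scheduler.step()     # decay the learning rate every 3 epochs
```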
Sure. Early in my experiments I also got an R-FCN single-frame baseline mAP of 70.9%, but there is still a long way to go to reach 74.6%. Keep me posted with further experiment results!
I am doing step 1 of the Setup on Ubuntu 16.04. While compiling caffe-rfcn downloaded from https://github.com/feichtenhofer/caffe-rfcn, the build fails with:
/usr/local/include/boost/system/error_code.hpp:233:21: error: looser throw specifier for ‘virtual const char* boost::system::error_category::std_category::name() const’
[Makefile.config contents omitted]
I am using CUDA 8.0, cuDNN 4, and MATLAB 2014b.
Hi~ can I ask some questions? How do I get the rois_disp, bbox_targets_disp and bbox_loss_weights_disp that are fed into the network if I want to train the model in PyTorch? Thank you for your reply @Feynman27 @cc786537662
The ROI targets are predicted offsets of the boxes relative to your anchors: see Eq. 2 from this paper. The final bbox target layer should predict refined displacements relative to the initial bbox predictions from above. The bbox loss weights are used to mask out background ROIs so that only the foreground ROIs are used in the L1 bbox loss. There's also an "outside" weight (called lambda in the paper above) that weights the L1 bbox loss in the overall multi-task loss. Here's a good PyTorch implementation: https://github.com/jwyang/faster-rcnn.pytorch.
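To make that concrete, here is a rough PyTorch sketch of the two pieces (my own illustration, not taken from that repo): Eq. 2-style box-regression targets relative to the ROIs/anchors, and a smooth-L1 box loss where the inside weights zero out background ROIs and an outside weight plays the role of lambda:

```python
import torch

def bbox_transform(rois, gt_boxes):
    """Eq. 2-style regression targets of gt_boxes relative to rois.

    Boxes are (x1, y1, x2, y2); rois play the role of anchors / proposals."""
    rw = rois[:, 2] - rois[:, 0] + 1.0
    rh = rois[:, 3] - rois[:, 1] + 1.0
    rx = rois[:, 0] + 0.5 * rw
    ry = rois[:, 1] + 0.5 * rh

    gw = gt_boxes[:, 2] - gt_boxes[:, 0] + 1.0
    gh = gt_boxes[:, 3] - gt_boxes[:, 1] + 1.0
    gx = gt_boxes[:, 0] + 0.5 * gw
    gy = gt_boxes[:, 1] + 0.5 * gh

    tx = (gx - rx) / rw
    ty = (gy - ry) / rh
    tw = torch.log(gw / rw)
    th = torch.log(gh / rh)
    return torch.stack((tx, ty, tw, th), dim=1)

def masked_smooth_l1(pred, targets, inside_w, outside_w):
    """Smooth-L1 box loss; inside_w zeroes out background ROI rows and
    outside_w is the lambda-like weight on the box term."""
    diff = inside_w * (pred - targets)
    loss = torch.where(diff.abs() < 1.0, 0.5 * diff ** 2, diff.abs() - 0.5)
    return (outside_w * loss).sum() / max(pred.size(0), 1)
```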
Thank you for your reply~ I still don't understand the displacement part; I find the displacement in the MATLAB code very difficult to understand.
Hi~ I understand the displacement now, thank you very much.
Hi, sorry to disturb you again. I changed tau to 10 and the result is very bad, but the result with tau equal to 1 is good. Have you run the code with tau equal to 10? This question has troubled me for a long time. Thank you very much for your reply @Feynman27 @cc786537662 @feichtenhofer
@Feynman27 @cc786537662 @Cris-zj Can you share your code? I am trying to apply this code to my own data, but it's hard for me; I have no idea about MATLAB.
@Feynman27 @Cris-zj Hi, I want to ask you some questions. When I try to run the training scripts, I also cannot get the rois_disp, bbox_targets_disp and bbox_loss_weights_disp in the input data. How did you solve this problem? Thank you~
@zorrocai Hey... I am working on D&T in MATLAB. Have you tested the code in MATLAB?
@naren142 No, I didn't. I am just trying to rewrite D&T in PyTorch.
Hi, everyone. I have rewritten the D&T architecture in PyTorch, but I found that there are many tricks needed to reach the original result in the paper, such as the "linking tracklets to object tubes" post-processing and the pretrained RPN, R-FCN, and so on. I am very busy at the moment; if anyone is interested in this project, please contact me. Maybe we can work together.
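For anyone picking this up, the "linking to object tubes" step is essentially a per-class dynamic program over per-frame detections. A stripped-down sketch of my reading of it follows (plain IoU stands in for the track-based overlap term the paper uses, and each frame is assumed to contain at least one detection of the class):

```python
import numpy as np

def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def link_tube(frames):
    """frames[t] is a list of (box, score) for one class in frame t.
    Returns one detection index per frame tracing the highest-scoring path,
    where the link score between consecutive detections is
    score_i + score_j + IoU(box_i, box_j)."""
    best = [np.zeros(len(frames[0]))]            # accumulated path scores
    back = []                                    # backpointers per frame
    for t in range(1, len(frames)):
        cur = np.full(len(frames[t]), -np.inf)
        ptr = np.zeros(len(frames[t]), dtype=int)
        for j, (bj, sj) in enumerate(frames[t]):
            for i, (bi, si) in enumerate(frames[t - 1]):
                cand = best[t - 1][i] + si + sj + iou(bi, bj)
                if cand > cur[j]:
                    cur[j], ptr[j] = cand, i
        best.append(cur)
        back.append(ptr)
    path = [int(np.argmax(best[-1]))]            # backtrack from the best end
    for ptr in reversed(back):
        path.append(int(ptr[path[-1]]))
    return list(reversed(path))
```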
@zorrocai Hey, can you please share your email?
@naren142 [email protected] |
Has anyone had any luck training this model end-to-end without external object proposals or without pretraining the R-FCN network on ImageNet DET? I've been trying to train the D(&T loss) model in PyTorch and have only reached a frame mean AP of ~64% on the full ImageNet VID validation set. Some implementation notes: