
Encountered a problem when using the mask prediction module as a preprocessing module for other algorithms #11

Open
blandness1217 opened this issue Jul 27, 2023 · 14 comments


@blandness1217

Hello, thank you very much for providing such outstanding work. I noticed that you once said, 'The mask prediction module in this method can be used as a preprocessing module for other point cloud registration methods, although these registration methods may not be designed for partially overlapping point cloud registration.' You have also successfully combined the mask prediction module with methods such as PointNetLK, DeepGMR, and DCP, and achieved significant results. I have been trying to combine the mask prediction module with a method for fully overlapping point cloud registration for a month, but it has not been successful yet. Therefore, I would like to request that you provide me with the code to combine the mask prediction module with other methods, so that I can learn the specific details of use. My email is [email protected]. Thank you very much!

@hxwork
Owner

hxwork commented Jul 27, 2023

Hi,

Thanks for your interest! Unfortunately, we cannot release the code of this part right now, but I may try my best to help you find the problem. Could you please provide more details about your implementation, such as the dataset you used, how you trained the mask prediction module, and which fully overlapping point cloud registration method you used?

@blandness1217
Author

Hi, I am very grateful for your quick and enthusiastic response. Let me explain my current situation.

First, I am using Feature-Metric Registration (FMR), a method designed for fully overlapping point cloud registration. Second, I am using the ModelNet40 dataset, and my strategy for generating partially overlapping point cloud pairs is inspired by the way RPMNet generates them.

I noticed that OMNet consists of four modules: encoder, fusion, decoder, and regression, and I have only incorporated the encoder, fusion, and decoder into the training process of FMR. Since I was worried about the accuracy of the pred_mask from an untrained decoder, I first trained the mask prediction module (encoder + fusion + decoder) separately for 20 epochs. During this stage no transformation matrix is produced, so I could not update xyz_src_iter at the end of each training iteration as OMNet does. After 20 epochs, I integrated the training of the mask prediction module with the FMR network: the input point clouds p0 and p1 are first passed through the mask prediction module, and the pred_mask output by the decoder is applied to them. My reasoning was that this would give the two clouds the same shape; I then replaced the points whose coordinates had become (0, 0, 0) with points whose coordinates are not (0, 0, 0), and fed the results into the FMR network for training.
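For clarity, here is a minimal sketch of the zero-replacement step I describe above (the function and variable names are my own placeholders, not identifiers from OMNet or FMR):

```python
import torch

def drop_zeroed_and_resample(masked_points, n_points=1024):
    # masked_points: (N, 3), where non-overlapping points were zeroed to (0, 0, 0)
    # by the hard mask. Remove the zeroed points, then resample with replacement
    # so both clouds end up with the same fixed shape before going into FMR.
    keep = masked_points.abs().sum(dim=-1) > 0  # True for surviving points
    kept = masked_points[keep]
    if kept.shape[0] == 0:                      # degenerate case: empty mask
        return masked_points
    idx = torch.randint(0, kept.shape[0], (n_points,))
    return kept[idx]                            # (n_points, 3)
```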

@hxwork
Owner

hxwork commented Jul 27, 2023

I see. Here are my suggestions:

  1. You may check the classification accuracy of the mask prediction module. The accuracy should be around 0.9.
  2. The pred_mask should be used after argmax, since it is applied as a "hard" mask (see the sketch after this list).
  3. I am not sure whether training the mask prediction module together with FMR is helpful. If the accuracy of the mask is not that bad, you may freeze it and feed only the masked input to FMR (perhaps one already trained on the fully overlapping dataset) to see if you obtain better results.
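For point 2, a minimal sketch of what I mean by a "hard" mask (the tensor names here are illustrative, not the exact identifiers in my code):

```python
import torch

points = torch.randn(1024, 3)                     # placeholder source point cloud
logits = torch.randn(1024, 2)                     # placeholder per-point 2-class output
hard_mask = torch.argmax(logits, dim=-1).float()  # (1024,) in {0, 1}; 1 = overlapping
masked = points * hard_mask.unsqueeze(-1)         # non-overlapping points become (0, 0, 0)
```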

@blandness1217
Author

Your suggestions are very helpful, and I will give them a try. But I still have a question about the third suggestion.
I would like to know whether, when you used the mask prediction module as a preprocessing module for other algorithms, you trained it separately and then sent the masked input to those algorithms, or embedded its training into the training process of the other algorithms. Looking forward to your reply.

@hxwork
Owner

hxwork commented Jul 30, 2023

If I remember correctly, I trained them separately.

@blandness1217
Author

Thank you very much for your reply. I made some attempts over the past two days, but they failed.
I attempted to change the output of the regression module in OMNet to handle the data I generated myself (partially overlapping point clouds p0 and p1, plus a 4x4 transformation matrix (p0->p1)). The reason is that my ground truth is a 4x4 transformation matrix, while the output of the original regression module is a 7D vector, and I ran into difficulties computing the loss. I therefore changed the output of the regression module to a 12D vector: the first nine values form the rotation matrix and the last three form the translation vector, which are then combined into a 4x4 transformation matrix. I also made some changes to the loss function (though I am not sure they are correct).
In short, compared with the original OMNet, I changed the regression module and the loss function, and the mask prediction module I eventually trained could not actually predict the mask, which makes me believe something is wrong.
I now plan to train the mask prediction module with unmodified OMNet and the processed dataset you provided. I hope to post some good news here soon. If you have any suggestions, I would be very glad to hear them.
Thank you again for your patience.

@hxwork
Owner

hxwork commented Aug 1, 2023

I see. Actually, it may not be appropriate to directly regress the rotation matrix, since a rotation has only three degrees of freedom, not the nine entries of a 3x3 matrix. There are packages that provide easy ways to convert a quaternion to a rotation matrix, such as PyTorch3D. I hope this helps solve your problem.
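For example, a minimal sketch using PyTorch3D (the helper name and the 7D layout are illustrative; double-check the quaternion convention, as PyTorch3D expects the real part first):

```python
import torch
from pytorch3d.transforms import quaternion_to_matrix

def pose_7d_to_matrix(pose_7d):
    # pose_7d: (B, 7) = quaternion (w, x, y, z) + translation (tx, ty, tz).
    quat = torch.nn.functional.normalize(pose_7d[:, :4], dim=-1)
    rot = quaternion_to_matrix(quat)               # (B, 3, 3)
    tf = torch.eye(4, dtype=pose_7d.dtype, device=pose_7d.device)
    tf = tf.repeat(pose_7d.shape[0], 1, 1)         # (B, 4, 4)
    tf[:, :3, :3] = rot
    tf[:, :3, 3] = pose_7d[:, 4:]
    return tf
```

This way, the regression head can keep its 7D output while you still compute the loss against a 4x4 ground-truth matrix.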

@blandness1217
Author

I'm sorry to bother you again after such a long time, but I do need to ask you a question.
My current practice is to train the mask prediction module separately (using the same network as yours, except that the partially overlapping dataset I use is generated by myself), and at the same time to train the unmodified FMR network separately (using the fully overlapping dataset).
My idea is to add the mask prediction module as a preprocessing module so that the FMR network gains the ability to register partially overlapping point clouds. The specific implementation is as follows: for partially overlapping point clouds, the overlap masks are first predicted by the mask prediction module and then applied to the point clouds to turn them into clouds of the same shape. The FMR network is then used to predict the transformation matrix between these same-shape point clouds.
Now for the practical situation: the overlap masks I predict may not be very accurate, because I visualized p0 and p1 before and after applying the masks. In principle their shapes should be the same after masking, but in the visualization there are still significant differences between the two.
Additionally, during training I output in real time the overlap percentage between src_pred_mask (and likewise ref_pred_mask) and the ground-truth overlap mask. This percentage stays between 50% and 90% even after many epochs. When I did the same with OMNet, the overlap rate stabilized above 90% after only a small number of epochs.
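For reference, the overlap percentage above is computed roughly like this (a sketch with my own variable names):

```python
import torch

def mask_overlap_percentage(pred_logits, gt_mask):
    # pred_logits: (B, N, 2) per-point logits; gt_mask: (B, N) in {0, 1}.
    # Fraction of points whose hard prediction agrees with the ground-truth mask.
    pred = torch.argmax(pred_logits, dim=-1)
    return (pred == gt_mask).float().mean().item() * 100.0
```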
I suspect there is a problem with my training data, since we use the same network and only the dataset differs. On inspection, I found that my dataset is not fixed: the point cloud data used in each training epoch is processed and generated on the fly, so I never train on exactly the same point clouds twice. I would like to know whether this could be the reason for my low overlap percentage during training, and whether the training data you used is fixed?
Looking forward to your reply.

@hxwork
Owner

hxwork commented Sep 2, 2023

Hi,

If I understand your problem correctly, you can obtain higher overlapping-mask prediction accuracy using my code, but not a normal result when switching to your own code and data. I am not sure about the reason, since I do not have your code and data, and it would take me a long time to help you check your code.

As for your confusion about data preprocessing during training, the answer is easy to find in my code. During evaluation, a flag named "deterministic" is turned on and the data is fixed. Note that this flag is turned off during training, which means the data augmentation is random.
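As a general illustration of the pattern (not a verbatim excerpt from my code), the idea looks like this:

```python
import numpy as np

def get_sample_rng(index, deterministic):
    # With the flag on (evaluation), sample `index` always gets the same RNG,
    # so the crop/rotation applied to it is fixed across runs. With the flag
    # off (training), every epoch draws fresh random augmentations.
    if deterministic:
        return np.random.RandomState(index)
    return np.random.RandomState()
```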

So, I recommend you modify my code to meet your requirements directly. This may reduce the probability of writing bugs.

@blandness1217
Author

I'm sorry to bother you again, but I encountered some issues during visualization while using your code.
I have not changed your code; I only created a new .py file for visualization.
The specific procedure is as follows: first create the model, then download val_model_best.pth (from OS_RPMNet_clean) linked in the README and load it; create val_ds, read two samples from it and concatenate them (the batch size B must be ≥ 2, otherwise an error is raised); run the model on the concatenated data (B=2); take the endpoints and read src_cls_pred and ref_cls_pred from the fourth iteration of the first sample; apply argmax to turn them into masks; and finally execute "data0['points_src_masked'] = torch.mul(src_pred_mask4, data0['points_src_tensor'])" and "data0['points_ref_masked'] = torch.mul(ref_pred_mask4, data0['points_ref_tensor'])". Unfortunately, after applying the masks, the shapes of src and ref are not the same.
I would like to know where my problem lies. If possible, could you share your visualization code? I would like to achieve an effect similar to Fig. 5 in your paper. Looking forward to your reply.

@hxwork
Owner

hxwork commented Sep 9, 2023

If your mask prediction accuracy is around 90% as you mentioned before, you should obtain similar shapes after applying the masks to the input point clouds. For Fig. 5, we show the predicted mask after applying argmax, and the predicted overlapping and non-overlapping points are shown in red and blue, respectively. As for the error map visualization of the global features, please refer to the description in Sec. 5.1.
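If it helps, one way to get a Fig. 5-style rendering with Open3D (an illustrative sketch, not the exact plotting code we used):

```python
import numpy as np
import open3d as o3d

def show_mask(points, hard_mask):
    # points: (N, 3) array; hard_mask: (N,) in {0, 1} after argmax.
    # Predicted overlapping points are drawn in red, non-overlapping in blue.
    colors = np.where(hard_mask[:, None] == 1, [1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points, dtype=np.float64))
    pcd.colors = o3d.utility.Vector3dVector(colors)
    o3d.visualization.draw_geometries([pcd])
```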

@blandness1217
Author

I'm sorry to bother you again, but I have encountered a very confusing problem. During training, I set "transform_type" to "modelnet_os_rpmnet_clean". After training for 1000 epochs, I loaded the model saved during training as "self_model_latest.pth". Then I generated a new set of data using the same processing method as the training set. However, the model's performance at predicting overlapping regions on this new data was very poor.

The following is the output at a certain moment during the training process:
Epoch: 667, lr=0.0001 total loss: 1.0918(1.1927): 44%|████████████████████ | 57/131 [00:58<01:14, 1.01s/it]
Iteration 0: overlap between src predicted mask and ground-truth mask: 93.22698744769873%
Iteration 0: overlap between ref predicted mask and ground-truth mask: 93.192119944212%
Iteration 1: overlap between src predicted mask and ground-truth mask: 96.59170153417016%
Iteration 1: overlap between ref predicted mask and ground-truth mask: 97.14958158995816%
Iteration 2: overlap between src predicted mask and ground-truth mask: 93.91562064156206%
Iteration 2: overlap between ref predicted mask and ground-truth mask: 93.93959205020921%
Iteration 3: overlap between src predicted mask and ground-truth mask: 92.33568688981869%
Iteration 3: overlap between ref predicted mask and ground-truth mask: 91.98265341701534%

Here is the output from a certain test process:
Using custom loading net
Iteration 0: overlap between src predicted mask and ground-truth mask: 60.94839609483961%
Iteration 0: overlap between ref predicted mask and ground-truth mask: 59.9721059972106%
Iteration 1: overlap between src predicted mask and ground-truth mask: 52.30125523012552%
Iteration 1: overlap between ref predicted mask and ground-truth mask: 53.97489539748954%
Iteration 2: overlap between src predicted mask and ground-truth mask: 41.84100418410041%
Iteration 2: overlap between ref predicted mask and ground-truth mask: 40.72524407252441%
Iteration 3: overlap between src predicted mask and ground-truth mask: 37.79637377963738%
Iteration 3: overlap between ref predicted mask and ground-truth mask: 46.02510460251046%

I can't understand why the performance is good during training but poor when testing on data generated the same way as the training set. One more thing: an error occurs when executing "model.load_state_dict(state["state_dict"])", so I used a custom loading path instead (hence the "Using custom loading net" message). Could this be the cause of the problem? Looking forward to your reply.

@hxwork
Owner

hxwork commented Sep 11, 2023

Hi,

Maybe you should check if the pre-trained weights are loaded successfully.

@blandness1217
Author

Hi, your suggestion is correct and it helped me solve the problem.
Before executing "model.load_state_dict(state["state_dict"])", "model = torch.nn.DataParallel(model)" is needed to ensure that the model parameters are loaded correctly.
Thank you very much for your patient assistance. Please keep this issue open.
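For anyone hitting the same error, here is a sketch of the two usual fixes (the checkpoint path is just an example):

```python
import torch

def load_pretrained(model, ckpt_path="val_model_best.pth"):
    # Checkpoints saved from a torch.nn.DataParallel model prefix every
    # parameter name with "module.", so wrap the model before loading.
    state = torch.load(ckpt_path, map_location="cpu")
    model = torch.nn.DataParallel(model)
    model.load_state_dict(state["state_dict"])
    return model

def load_pretrained_bare(model, ckpt_path="val_model_best.pth"):
    # Equivalent fix: strip the "module." prefix and load into the bare model.
    state = torch.load(ckpt_path, map_location="cpu")
    cleaned = {k[len("module."):] if k.startswith("module.") else k: v
               for k, v in state["state_dict"].items()}
    model.load_state_dict(cleaned)
    return model
```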
