I am currently training the PoseAug model and have encountered an issue with the expected accuracy improvement. My baseline model yields a Mean Per Joint Position Error (MPJPE) of 31.97 mm, while the PoseAug model produces an MPJPE of 30.71 mm — an improvement of only about 1.3 mm, whereas the repository mentions an expected improvement of at least 4 mm.
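(For reference, MPJPE here is the standard mean Euclidean distance between predicted and ground-truth joints; a minimal NumPy sketch, where the `(N, J, 3)` array shape is my assumption:)

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per Joint Position Error in the input units (mm here).

    pred, gt: (N, J, 3) arrays of N poses with J joints in 3D.
    """
    # Euclidean distance per joint, then average over joints and poses.
    return np.linalg.norm(pred - gt, axis=-1).mean()
```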
Input Details:
I am using videos of 100 different people walking in various styles, as well as videos of the same person performing different walks (normal walk, fast walk, and forward and backward in both directions).
PoseAug Input:
ViTPose 2D outputs and 3D ground-truth values.
Configuration:
I have kept all other configurations as they are in the repository.
The only changes I made to the code were modifying the data loader and adding an epsilon value wherever division occurs, to guard against division by zero.
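For concreteness, this is the pattern I applied (a minimal sketch; `normalize_bones` is a hypothetical example, not an actual repository function):

```python
import torch

EPS = 1e-8  # small constant added to denominators to avoid NaNs

def normalize_bones(bone_vecs: torch.Tensor) -> torch.Tensor:
    """Scale (..., 3) bone vectors to unit length without dividing by zero."""
    lengths = torch.norm(bone_vecs, dim=-1, keepdim=True)
    return bone_vecs / (lengths + EPS)
```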
Questions:
Could you provide any insight into why the model is not improving as much as expected?
Are there specific adjustments or additional data types that might help improve the accuracy further?
Thank you for your help!
Hi, thank you for your interest. Since, as you mentioned, some of the videos are of the same person, you could try training a model focused on that one person only; this should improve the accuracy.
You can refer to "data_extra/bone_length_npy/hm36s15678_bl_templates.npy" and replace the bone lengths in all the 2D/3D data and augmented data with your person's bone lengths.
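A rough sketch of that substitution, assuming the template file stores one bone-length vector per row (the array shape, bone ordering, file names, and helper below are assumptions to verify against the repository's data-loading code):

```python
import numpy as np

# Load the shipped Human3.6M bone-length templates (path from the repo).
templates = np.load("data_extra/bone_length_npy/hm36s15678_bl_templates.npy")

def bone_lengths(pose3d, bones):
    """Per-bone lengths of one (J, 3) ground-truth pose."""
    return np.array([np.linalg.norm(pose3d[c] - pose3d[p]) for p, c in bones])

# Hypothetical inputs: one 3D ground-truth frame of your subject and the
# skeleton's (parent, child) joint-index pairs in the repo's bone order.
my_pose3d = np.load("my_subject_gt_frame.npy")  # assumed shape (J, 3)
bones = [(0, 1), (1, 2), (2, 3)]                # placeholder pairs only

subject_bl = bone_lengths(my_pose3d, bones)

# Replace every template row with this subject's lengths so the augmented
# poses keep the subject's skeleton proportions.
custom = np.tile(subject_bl, (templates.shape[0], 1))
np.save("data_extra/bone_length_npy/custom_bl_templates.npy", custom)
```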