Training data for NYU dataset #12

Open
fanglinpu opened this issue Aug 3, 2018 · 6 comments

@fanglinpu
I find that you use all three views of depth images for training in \denseReg-master\data\nyu.py:
```python
def loadAnnotation(self, is_trun=False):
    '''is_trun:
    True: to load 14 joints from self.keep_list
    False: to load all joints
    '''
    t1 = time.time()
    path = os.path.join(self.src_dir, 'joint_data.mat')
    mat = sio.loadmat(path)
    camera_num = 1 if self.subset=='testing' else 3
    joints = [mat['joint_xyz'][idx] for idx in range(camera_num)]
    names = [['depth_{}_{:07d}.png'.format(camera_idx+1, idx+1) for idx in range(len(joints[camera_idx]))] for camera_idx in range(camera_num)]
```

But for a fair comparison, only view 1 images should be used for training.

@melonwan (Owner) commented Aug 3, 2018

Thanks a lot for pointing this out. Actually, we've only used the first view for training; see another part of the code in dataset.py/nyu, line 64, where only the first 1/3 of the data is fed. I leave this interface available for ease of other usage.
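To illustrate the slicing described above: if the annotations from the three cameras are concatenated in view order, keeping only the first third restricts training to view 1. This is a hypothetical sketch, not the repository's code; `keep_first_view` is a made-up helper name.

```python
def keep_first_view(annotations, num_views=3):
    """Return only the view-1 portion of a view-ordered annotation list.

    Assumes the list concatenates equal-length per-view blocks in view order,
    so the first len(annotations) // num_views entries belong to view 1.
    """
    per_view = len(annotations) // num_views
    return annotations[:per_view]

# Illustrative file names in NYU's depth_{view}_{frame}.png naming scheme:
anns = ['depth_{}_{:07d}.png'.format(v, i) for v in (1, 2, 3) for i in range(1, 4)]
print(keep_first_view(anns))  # only the depth_1_* entries remain
```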

@fanglinpu (Author)

Thank you for your answer.

@fanglinpu (Author) commented Aug 3, 2018

I am a little bit confused about the depth normalization: why do the hand depth values range from com[2]-D_RANGE to com[2]+D_RANGE*0.5? The corresponding code is as follows:

```python
def norm_dm(dms, coms):
    def fn(elems):
        dm, com = elems[0], elems[1]
        max_depth = com[2]+D_RANGE*0.5
        min_depth = com[2]-D_RANGE*0.5
        mask = tf.logical_and(tf.less(dm, max_depth), tf.greater(dm, min_depth-D_RANGE*0.5))
        normed_dm = tf.where(mask, tf.divide(dm-min_depth, D_RANGE), -1.0*tf.ones_like(dm))
        return [normed_dm, com]

    norm_dms, _ = tf.map_fn(fn, [dms, coms])

    return norm_dms
```

I think the hand depth values should range from com[2]-D_RANGE*0.5 to com[2]+D_RANGE*0.5, but the provided code is:

```python
mask = tf.logical_and(tf.less(dm, max_depth), tf.greater(dm, min_depth-D_RANGE*0.5))
```
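Restated in NumPy (a sketch, not the repository's TensorFlow code; `D_RANGE` and the sample depths here are made-up illustrative values), the mask as written accepts depths in the asymmetric interval (com[2]-D_RANGE, com[2]+D_RANGE*0.5):

```python
import numpy as np

D_RANGE = 300.0  # illustrative value, not taken from the repository

def norm_dm_np(dm, com_z):
    """Mirror the posted normalization for a single depth map."""
    max_depth = com_z + D_RANGE * 0.5
    min_depth = com_z - D_RANGE * 0.5
    # As in the posted code, the lower bound is min_depth - D_RANGE*0.5,
    # i.e. com_z - D_RANGE, not the symmetric com_z - D_RANGE*0.5.
    mask = (dm < max_depth) & (dm > min_depth - D_RANGE * 0.5)
    return np.where(mask, (dm - min_depth) / D_RANGE, -1.0)

# With com_z = 700: accepted range is (400, 850), not (550, 850).
dm = np.array([400.0, 550.0, 700.0, 900.0])
print(norm_dm_np(dm, com_z=700.0))  # [-1.   0.   0.5 -1. ]
```

Note that depths in (com_z - D_RANGE, com_z - D_RANGE*0.5) fall inside the mask but normalize to negative values, which is presumably the behavior being questioned.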

@melonwan (Owner) commented Aug 5, 2018

These are just some trial-and-error hacky bits.

@fanglinpu (Author)

For the ICVL and MSRA datasets, the cropped image from the testing set is also obtained by exploiting the ground-truth pose; I think this is inappropriate.

@melonwan (Owner) commented Aug 7, 2018

MSRA provides a bounding box as a starting point. It is very easy to crop out the hand from ICVL with heuristics, e.g. depth thresholding.
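A minimal sketch of the depth-thresholding heuristic mentioned above, assuming (as in typical ICVL-style captures) that the hand is the closest object to the camera and 0 marks missing depth. The function name and `margin` value are illustrative, not from the repository.

```python
import numpy as np

def crop_hand(depth, margin=150.0):
    """Return a bounding box (r0, r1, c0, c1) around the nearest blob.

    Keeps pixels within `margin` mm of the nearest valid depth, then
    takes the tight bounding box of that mask.
    """
    valid = depth > 0                        # 0 usually marks missing depth
    nearest = depth[valid].min()             # hand assumed closest to camera
    mask = valid & (depth < nearest + margin)
    rows, cols = np.where(mask)
    return rows.min(), rows.max() + 1, cols.min(), cols.max() + 1

# Synthetic example: a 10x10 background at 1000 mm with a "hand" at 400 mm.
depth = np.full((10, 10), 1000.0)
depth[2:5, 3:6] = 400.0
print(crop_hand(depth))  # (2, 5, 3, 6)
```

In practice one would also reject spurious single-pixel minima (e.g. with a small connected-component or median filter) before thresholding.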
