Error in projection_distribution (Distributional DQN) ? #8

pclucas14 · 2018-04-19T15:44:46Z

Hi,

I have a question regarding the projection_distribution method. It seems that when you project back onto the support/bins, in these lines:

proj_dist.view(-1).index_add_(0, (l + offset).view(-1), (next_dist * (u.float() - b)).view(-1))
proj_dist.view(-1).index_add_(0, (u + offset).view(-1), (next_dist * (b - l.float())).view(-1))

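the distribution next_dist has already been scaled by the support, in the line

next_dist = target_model(next_state).data.cpu() * support

This does not seem right: next_dist then holds probability-weighted atom values rather than probabilities, so the final projected distribution does not sum to one. It seems one should instead keep the raw probabilities for the projection and use the scaled version only to pick the greedy action, something like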

next_dist_raw = target_model(next_state).data.cpu()
next_dist = next_dist_raw * support
next_action = next_dist.sum(2).max(1)[1]
next_action = next_action.unsqueeze(1).unsqueeze(1).expand(next_dist.size(0), 1, next_dist.size(2))
next_dist = next_dist.gather(1, next_action).squeeze(1)
next_dist_raw = next_dist_raw.gather(1, next_action).squeeze(1)

proj_dist.view(-1).index_add_(0, (l + offset).view(-1), (next_dist_raw * (u.float() - b)).view(-1))
proj_dist.view(-1).index_add_(0, (u + offset).view(-1), (next_dist_raw * (b - l.float())).view(-1))

This results in a projected distribution that carries the same total mass as the original one.
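
To make the mass argument concrete, here is a minimal sketch of the projection for a single sample. It is written in plain Python rather than PyTorch so it runs without dependencies, and it is not the repository's code: the function name project, the 5-atom support, and the values of reward, gamma, v_min and v_max are all made up for illustration. The two `+=` lines play the role of the two index_add_ calls above. The sketch also puts all mass on one bin when b lands exactly on an atom, which is an extra assumption of mine and not something the snippets above do.

```python
import math

def project(weights, support, reward, gamma, v_min, v_max):
    """Project per-atom weights onto the fixed support (single sample).

    Hypothetical helper for illustration; `weights` is whatever gets
    passed in place of next_dist in the index_add_ calls.
    """
    n = len(support)
    delta_z = (v_max - v_min) / (n - 1)
    proj = [0.0] * n
    for w, z in zip(weights, support):
        # Bellman-shifted atom, clamped to the support range.
        tz = min(max(reward + gamma * z, v_min), v_max)
        # Fractional bin index, clamped to guard against rounding.
        b = min(max((tz - v_min) / delta_z, 0.0), n - 1.0)
        l, u = math.floor(b), math.ceil(b)
        if l == u:
            # b sits exactly on a bin: put all the weight there (assumption).
            proj[l] += w
        else:
            proj[l] += w * (u - b)  # share going to the lower bin
            proj[u] += w * (b - l)  # share going to the upper bin
    return proj

# Made-up example: 5 atoms on [-10, 10].
support = [-10.0, -5.0, 0.0, 5.0, 10.0]
probs = [0.1, 0.1, 0.2, 0.3, 0.3]                 # raw probabilities, sum to 1
scaled = [p * z for p, z in zip(probs, support)]  # probabilities * support

raw_proj = project(probs, support, reward=1.0, gamma=0.5, v_min=-10.0, v_max=10.0)
scaled_proj = project(scaled, support, reward=1.0, gamma=0.5, v_min=-10.0, v_max=10.0)

print(sum(raw_proj))     # total mass when projecting the raw probabilities
print(sum(scaled_proj))  # total "mass" when projecting probabilities * support
```

Each atom's weight is split between two neighbouring bins with shares that add up to one, so the total of the output equals the total of the input. With the raw probabilities that total is 1 (up to rounding). With the scaled version it is sum(p * z), the expected value of the next-state distribution (3 in this made-up example), which is why the projected result is not a valid distribution.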

Thank you,
Lucas


miilue · 2023-04-19T07:00:08Z

I agree with you. I'm also a bit confused by this author's implementation of the Distributional DQN.
