-
Notifications
You must be signed in to change notification settings - Fork 764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrapping MFAS and 1dsfm python example #535
Conversation
Arguments: | ||
measurements {BinaryMeasurementsUnit3}- List of translation direction from the first node to | ||
the second node in the coordinate frame of the first node. | ||
rotations {Values} -- Estimated rotations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, Akshay. Which way do the estimated rotations point? wRc
, or cRw
? Could we clarify that in the docstring?
translations = gtsam.TranslationRecovery(inlier_measurements).run() | ||
|
||
poses = gtsam.Values() | ||
for key in rotations.keys(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename key
to something more descriptive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what to rename it, any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
didn't quite follow this part, what does each key represent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could rename it as wRc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Key is a type in GTSAM. If I am not wrong, it is equivalent to a size_t (like an unsigned integer). Translation Averaging, like much of other gtsam code, uses Keys to represent Camera IDs. gtsam.Values is used to store the rotations, and it supports lookup by the key of the camera.
I don't think calling it wRc would be correct, since it is not a rotation. Maybe we could call it camera_idx in external code, but doing so here would hide the fact that it is actually a gtsam Key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The term keypair used at another place in the code is a pair of keys. The pybind wrapper allows us to treat it like a tuple of ints in python.
|
||
# Some hyperparameters. | ||
max_1dsfm_projection_directions = 50 | ||
outlier_weight_threshold = 0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this specified in the paper, or a hyperparameter for the user to tune based on their dataset? might be nice to add a small clarification. perhaps these should be module level constants, like OUTLIER_WEIGHT_THRESHOLD
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean a translation averaging python module? It is not part of the MFAS module, since its used outside. Should we capitalize the names here?
Yes, the values are from Kyle Wilson's code. He actually uses 48 for max_1dsfm_projection_directions. I will update that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Would we expect a user to choose different hyperparameters than these? if so, these would go into argparse.
From what you mentioned, it seems like these are relatively fixed constants, in which case I would just define these at the top of this file as OUTLIER_WEIGHT_THRESHOLD
etc, otherwise they come across as magic numbers
translation_directions = [] | ||
for i in range(0, len(poses) - 2): | ||
# Add the rotation | ||
rotations.insert(i, poses[i].rotation()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice to clarify what rotations represents: is it wRc
or cRw
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have clarified in the comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Akshay. I think it would be even more readable if we did it in the code itself. What do you think about the following:
# Using toy dataset in SfMdata for example.
wTc_list = SFMdata.createPoses(gtsam.Cal3_S2(50.0, 50.0, 0.0, 50.0, 50.0))
# Rotations of the cameras in the world frame
wRc_list = gtsam.Values()
# Normalized translation directions for pairs of cameras - from camera i to camera j,
# in the coordinate frame of the first camera.
jti_directions = []
for i in range(0, len(poses) - 2):
# Add the rotation.
wRc_list.insert(i, wTc_list[i].rotation())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I use wRc_values
instead of wRc_list
? Because it is not really a list.
Also, I think direction from i to j should be itj_direction
since i is the reference frame?
And if this direction is expressed in the world frame, can I call it w_itj_direction
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm i'm a bit confused here. The relative direction in the world frame is just temporary (for the purpose of generating actual relative translation directions, in a camera's own frame), right?
Because the E matrix will only give us in one's own frame.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if so, i’m good with any of your suggestions about the names for it
if avg_outlier_weights[(Z.key1(), Z.key2())] < outlier_weight_threshold] | ||
|
||
# Run the optimizer to obtain translations for normalized directions. | ||
w_translations = gtsam.TranslationRecovery( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still a bit hesitant about our notations here. My preference is too be overly explicit, like this: https://github.com/argoai/argoverse-api#a-note-regarding-coordinate-transforms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the concern, at the same time, I think the notation that you point to p_dst = dst_SE3_src * p_src
might contradict with the GTSAM convention (if I am not wrong). Ideally I think we should prefer dst_p = dst_SE3_src * src_p
. Is there a variable name you recommend here? I called them translations to be somewhat consistent in the file (we have relative_translations), but w_T_c
would work as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm good with either of those. I just get very confused when I read only rotation
or pose
. I would much prefer if we use something like the following throughout this whole file:
wTc
for SE(3) transformations from camera to world
wtc
for translations from camera to world
wRc
for rotations from camera to world
for keypair, weight in outlier_weight_dict.items(): | ||
avg_outlier_weights[keypair] += weight / len(outlier_weights) | ||
|
||
# Remove w_relative_tranlsations that have weight greater than threshold, these are outliers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo on tranlsations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does keypair mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keypair is actually a tuple (i, j), a pair of camera indices. I called it keypair because that is what the underlying C++ structure is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. thanks for the clarification. We still have a typo w_relative_tranlsations
, and that variable should be renamed in the comment too
|
||
|
||
def get_data() -> Tuple[gtsam.Values, List[gtsam.BinaryMeasurementUnit3]]: | ||
""""Returns data from SfMData.createPoses(). This contains global rotations and unit translations directions.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i might add a little more here, or in the module docstring, explaining where the data comes from. Are we perturbing known poses with noise, and then trying to recover the ground truth from the noisy rotations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can still improve this docstring -- i don't think it explains all that is happening here in this function
translation_directions = [] | ||
for i in range(0, len(poses) - 2): | ||
# Add the rotation | ||
rotations.insert(i, poses[i].rotation()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Akshay. I think it would be even more readable if we did it in the code itself. What do you think about the following:
# Using toy dataset in SfMdata for example.
wTc_list = SFMdata.createPoses(gtsam.Cal3_S2(50.0, 50.0, 0.0, 50.0, 50.0))
# Rotations of the cameras in the world frame
wRc_list = gtsam.Values()
# Normalized translation directions for pairs of cameras - from camera i to camera j,
# in the coordinate frame of the first camera.
jti_directions = []
for i in range(0, len(poses) - 2):
# Add the rotation.
wRc_list.insert(i, wTc_list[i].rotation())
if avg_outlier_weights[(Z.key1(), Z.key2())] < outlier_weight_threshold] | ||
|
||
# Run the optimizer to obtain translations for normalized directions. | ||
w_translations = gtsam.TranslationRecovery( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm good with either of those. I just get very confused when I read only rotation
or pose
. I would much prefer if we use something like the following throughout this whole file:
wTc
for SE(3) transformations from camera to world
wtc
for translations from camera to world
wRc
for rotations from camera to world
translations = gtsam.TranslationRecovery(inlier_measurements).run() | ||
|
||
poses = gtsam.Values() | ||
for key in rotations.keys(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we could rename it as wRc
?
Thanks @johnwlambert. Can I use |
Much improved with the new variable names, thanks Akshay. |
# Create unit translation measurements with next two poses. | ||
for j in range(i + 1, i + 3): | ||
i_iZj = gtsam.Unit3(wTc[i].rotation().unrotate( | ||
wTc[j].translation() - wTc[i].translation())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following would be much more readable to me :-)
# compute the ray from pose i to pose j, in the world reference frame
w_iZj = wTc[j].translation() - wTc[i].translation()
wRi = wTc[i].rotation()
iRw = wRi.inverse()
# place the ray into i's coordinate frame
i_iZj = gtsam.Unit3(iRw.rotate(w_iZj))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we could use the unrotate
method right? Instead of taking the inverse and then rotating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, w_iZj in your code is not a ray, if you are defining ray as the unit translation direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrotate works too, i find it a bit less intuitive, but your call
wRc_values: Rotations of the cameras in the world frame. | ||
|
||
Returns: | ||
Values: Estimated poses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. i think this return arg should be named wTc_values
, not Values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are variable names mentioned in return args? This is the type not the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from what i understand, the variable name can be listed in return args. but the type does not need to be repeated there, since it's redundant with the return arg type hint in the function signature
Hi Akshay -- looks good. I just left last 4 things, approving now, I trust you'll address them before merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK to me, but with pybind we should be able to use native python collections, right @ProfFan ? I’m ok with doing that in another PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments.
Otherwise, LGTM
gtsam/sfm/MFAS.cpp
Outdated
@@ -7,6 +7,8 @@ | |||
|
|||
#include <gtsam/sfm/MFAS.h> | |||
|
|||
#include <boost/shared_ptr.hpp> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is shared_ptr being used right now? I see it has been removed from one instance.
gtsam/sfm/tests/testMFAS.cpp
Outdated
#include <CppUnitLite/TestHarness.h> | ||
|
||
#include <iostream> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need iostream now?
return wRc_values, i_iZj_list | ||
|
||
|
||
def prune_to_inliers(w_iZj_list: gtsam.BinaryMeasurementsUnit3) -> gtsam.BinaryMeasurementsUnit3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe prune_ouliers? or filter_outliers?
This PR provides a python wrapper to the MFAS class.
It adds an example that shows how 1dsfm uses MFAS and translation recovery together to estimate global translations from unit translation measurements given global rotations.
There is also a small change to the MFAS API. The unused shared_ptr member has been removed.