Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrapping MFAS and 1dsfm python example #535

Merged
merged 12 commits into from
Oct 3, 2020

Conversation

akshay-krishnan
Copy link
Contributor

This PR provides a python wrapper to the MFAS class.
It adds an example that shows how 1dsfm uses MFAS and translation recovery together to estimate global translations from unit translation measurements given global rotations.
There is also a small change to the MFAS API. The unused shared_ptr member has been removed.

Arguments:
measurements {BinaryMeasurementsUnit3}- List of translation direction from the first node to
the second node in the coordinate frame of the first node.
rotations {Values} -- Estimated rotations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, Akshay. Which way do the estimated rotations point? wRc, or cRw? Could we clarify that in the docstring?

translations = gtsam.TranslationRecovery(inlier_measurements).run()

poses = gtsam.Values()
for key in rotations.keys():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we rename key to something more descriptive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what to rename it, any suggestions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't quite follow this part, what does each key represent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could rename it as wRc?

Copy link
Contributor Author

@akshay-krishnan akshay-krishnan Sep 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Key is a type in GTSAM. If I am not wrong, it is equivalent to a size_t (like an unsigned integer). Translation Averaging, like much of other gtsam code, uses Keys to represent Camera IDs. gtsam.Values is used to store the rotations, and it supports lookup by the key of the camera.
I don't think calling it wRc would be correct, since it is not a rotation. Maybe we could call it camera_idx in external code, but doing so here would hide the fact that it is actually a gtsam Key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term keypair used at another place in the code is a pair of keys. The pybind wrapper allows us to treat it like a tuple of ints in python.


# Some hyperparameters.
max_1dsfm_projection_directions = 50
outlier_weight_threshold = 0.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this specified in the paper, or a hyperparameter for the user to tune based on their dataset? might be nice to add a small clarification. perhaps these should be module level constants, like OUTLIER_WEIGHT_THRESHOLD?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean a translation averaging python module? It is not part of the MFAS module, since its used outside. Should we capitalize the names here?
Yes, the values are from Kyle Wilson's code. He actually uses 48 for max_1dsfm_projection_directions. I will update that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Would we expect a user to choose different hyperparameters than these? if so, these would go into argparse.

From what you mentioned, it seems like these are relatively fixed constants, in which case I would just define these at the top of this file as OUTLIER_WEIGHT_THRESHOLD etc, otherwise they come across as magic numbers

translation_directions = []
for i in range(0, len(poses) - 2):
# Add the rotation
rotations.insert(i, poses[i].rotation())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to clarify what rotations represents: is it wRc or cRw here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have clarified in the comment above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Akshay. I think it would be even more readable if we did it in the code itself. What do you think about the following:

    # Using toy dataset in SfMdata for example.
    wTc_list = SFMdata.createPoses(gtsam.Cal3_S2(50.0, 50.0, 0.0, 50.0, 50.0))
    # Rotations of the cameras in the world frame
    wRc_list = gtsam.Values()
    # Normalized translation directions for pairs of cameras - from camera i to camera j,
    # in the coordinate frame of the first camera.
    jti_directions = []
    for i in range(0, len(poses) - 2):
        # Add the rotation.
        wRc_list.insert(i, wTc_list[i].rotation())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I use wRc_values instead of wRc_list? Because it is not really a list.
Also, I think direction from i to j should be itj_direction since i is the reference frame?
And if this direction is expressed in the world frame, can I call it w_itj_direction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm i'm a bit confused here. The relative direction in the world frame is just temporary (for the purpose of generating actual relative translation directions, in a camera's own frame), right?

Because the E matrix will only give us in one's own frame.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if so, i’m good with any of your suggestions about the names for it

if avg_outlier_weights[(Z.key1(), Z.key2())] < outlier_weight_threshold]

# Run the optimizer to obtain translations for normalized directions.
w_translations = gtsam.TranslationRecovery(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still a bit hesitant about our notations here. My preference is too be overly explicit, like this: https://github.com/argoai/argoverse-api#a-note-regarding-coordinate-transforms

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the concern, at the same time, I think the notation that you point to p_dst = dst_SE3_src * p_src might contradict with the GTSAM convention (if I am not wrong). Ideally I think we should prefer dst_p = dst_SE3_src * src_p. Is there a variable name you recommend here? I called them translations to be somewhat consistent in the file (we have relative_translations), but w_T_c would work as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm good with either of those. I just get very confused when I read only rotation or pose. I would much prefer if we use something like the following throughout this whole file:
wTc for SE(3) transformations from camera to world
wtc for translations from camera to world
wRc for rotations from camera to world

for keypair, weight in outlier_weight_dict.items():
avg_outlier_weights[keypair] += weight / len(outlier_weights)

# Remove w_relative_tranlsations that have weight greater than threshold, these are outliers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo on tranlsations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does keypair mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keypair is actually a tuple (i, j), a pair of camera indices. I called it keypair because that is what the underlying C++ structure is called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. thanks for the clarification. We still have a typo w_relative_tranlsations, and that variable should be renamed in the comment too

python/gtsam/examples/TranslationAveragingExample.py Outdated Show resolved Hide resolved


def get_data() -> Tuple[gtsam.Values, List[gtsam.BinaryMeasurementUnit3]]:
""""Returns data from SfMData.createPoses(). This contains global rotations and unit translations directions."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i might add a little more here, or in the module docstring, explaining where the data comes from. Are we perturbing known poses with noise, and then trying to recover the ground truth from the noisy rotations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can still improve this docstring -- i don't think it explains all that is happening here in this function

translation_directions = []
for i in range(0, len(poses) - 2):
# Add the rotation
rotations.insert(i, poses[i].rotation())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Akshay. I think it would be even more readable if we did it in the code itself. What do you think about the following:

    # Using toy dataset in SfMdata for example.
    wTc_list = SFMdata.createPoses(gtsam.Cal3_S2(50.0, 50.0, 0.0, 50.0, 50.0))
    # Rotations of the cameras in the world frame
    wRc_list = gtsam.Values()
    # Normalized translation directions for pairs of cameras - from camera i to camera j,
    # in the coordinate frame of the first camera.
    jti_directions = []
    for i in range(0, len(poses) - 2):
        # Add the rotation.
        wRc_list.insert(i, wTc_list[i].rotation())

if avg_outlier_weights[(Z.key1(), Z.key2())] < outlier_weight_threshold]

# Run the optimizer to obtain translations for normalized directions.
w_translations = gtsam.TranslationRecovery(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm good with either of those. I just get very confused when I read only rotation or pose. I would much prefer if we use something like the following throughout this whole file:
wTc for SE(3) transformations from camera to world
wtc for translations from camera to world
wRc for rotations from camera to world

translations = gtsam.TranslationRecovery(inlier_measurements).run()

poses = gtsam.Values()
for key in rotations.keys():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we could rename it as wRc?

@akshay-krishnan
Copy link
Contributor Author

akshay-krishnan commented Sep 29, 2020

Thanks @johnwlambert. Can I use wRc_values instead of wRc_list? Because it is not really a list.
Also, I think direction from i to j should be itj_direction since i is the reference frame?
And if this direction is expressed in the world frame, can I call it w_itj_direction?

@johnwlambert
Copy link
Contributor

Much improved with the new variable names, thanks Akshay.

# Create unit translation measurements with next two poses.
for j in range(i + 1, i + 3):
i_iZj = gtsam.Unit3(wTc[i].rotation().unrotate(
wTc[j].translation() - wTc[i].translation()))
Copy link
Contributor

@johnwlambert johnwlambert Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following would be much more readable to me :-)

# compute the ray from pose i to pose j, in the world reference frame
w_iZj = wTc[j].translation() - wTc[i].translation()
wRi = wTc[i].rotation()
iRw = wRi.inverse()
# place the ray into i's coordinate frame
i_iZj = gtsam.Unit3(iRw.rotate(w_iZj))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we could use the unrotate method right? Instead of taking the inverse and then rotating.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, w_iZj in your code is not a ray, if you are defining ray as the unit translation direction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrotate works too, i find it a bit less intuitive, but your call

wRc_values: Rotations of the cameras in the world frame.

Returns:
Values: Estimated poses.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. i think this return arg should be named wTc_values, not Values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are variable names mentioned in return args? This is the type not the name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from what i understand, the variable name can be listed in return args. but the type does not need to be repeated there, since it's redundant with the return arg type hint in the function signature

@johnwlambert
Copy link
Contributor

Hi Akshay -- looks good. I just left last 4 things, approving now, I trust you'll address them before merge.

Copy link
Member

@dellaert dellaert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me, but with pybind we should be able to use native python collections, right @ProfFan ? I’m ok with doing that in another PR

Copy link
Contributor

@ayushbaid ayushbaid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

Otherwise, LGTM

@@ -7,6 +7,8 @@

#include <gtsam/sfm/MFAS.h>

#include <boost/shared_ptr.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is shared_ptr being used right now? I see it has been removed from one instance.

#include <CppUnitLite/TestHarness.h>

#include <iostream>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need iostream now?

return wRc_values, i_iZj_list


def prune_to_inliers(w_iZj_list: gtsam.BinaryMeasurementsUnit3) -> gtsam.BinaryMeasurementsUnit3:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe prune_ouliers? or filter_outliers?

@akshay-krishnan akshay-krishnan merged commit 627c015 into develop Oct 3, 2020
@akshay-krishnan akshay-krishnan deleted the feature/1dsfm_example branch October 3, 2020 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants