Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor add_target_to_bin and remove_target_from_bin #994

Closed
lukpueh opened this issue Mar 6, 2020 · 3 comments
Closed

Refactor add_target_to_bin and remove_target_from_bin #994

lukpueh opened this issue Mar 6, 2020 · 3 comments

Comments

@lukpueh
Copy link
Member

lukpueh commented Mar 6, 2020

Description of issue or feature request:
The TUF repository tool provides a functionality to distribute a large number of target files over multiple delegated roles (see delegate_hashed_bins).

To add a target to the corresponding bin, identified by the hash of the target path, or remove one from a bin the methods remove add_target_to_bin and remove_target_from_bin are provided. These methods are wrappers around add_target and remove_target, which don't call those methods directly but via a dispatcher helper _locate_and_update_target_in_bin, which also implements common functionality for the wrappers.

These methods should be refactored for the reasons and as outlined below:

Current behavior:

https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2749-L2756

Expected behavior:

  • simplify architecture to allow adding a custom argument to add_target_to_bin that gets passed through
  • remove redundant code
  • pass hash_prefix_length or number_of_bins instead of inferring it from the metadata (the calling code should know this).
  • make bin search O(1), e.g.:
def find_bin_for_hash(hash_prefix, bin_size):
  prefix_len = len(hash_prefix)
  hash_prefix = int(hash_prefix, 16)
  low = hash_prefix  - (hash_prefix % bin_size)
  high = low + bin_size - 1
  return f"{low:0{prefix_len}x}-{high:0{prefix_len}x}"
@lukpueh
Copy link
Member Author

lukpueh commented Mar 9, 2020

Note, the snippet above assumes that the hash_prefix is correctly left-zero-padded. It might be safer to pass target_hash and number_of_bins, and calculate prefix_length, prefix and bin_size. Then the correct padding wouldn't be required.

@lukpueh
Copy link
Member Author

lukpueh commented Mar 17, 2020

If I'm not mistaken this should be as easy as

def add_target_to_bin(self, target_filepath, number_of_bins):
   path_hash = _get_hash(target_filepath) # TODO: add helper using 'securesystemslib.hash'
   bin_name = _ find_bin_for_hash(path_hash, number_of_bins) # TODO: Slightly modify and add 'find_bin_for_hash' from above
   self._delegated_roles[hashed_bin_name]).add_target(target_filepath)

... (modulo testing, sanitizing inputs, failing gracefully).

And the same should work for remove_target_from_bin and remove_target.

joshuagl added a commit to joshuagl/tuf that referenced this issue Mar 20, 2020
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.

Based sketches by Lukas Puehringer in issues theupdateframework#994 & theupdateframework#995

Signed-off-by: Joshua Lock <[email protected]>
joshuagl added a commit to joshuagl/tuf that referenced this issue Mar 24, 2020
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.

Based sketches by Lukas Puehringer in issues theupdateframework#994 & theupdateframework#995

Signed-off-by: Joshua Lock <[email protected]>
joshuagl added a commit to joshuagl/tuf that referenced this issue Mar 26, 2020
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.

Based sketches by Lukas Puehringer in issues theupdateframework#994 & theupdateframework#995

Signed-off-by: Joshua Lock <[email protected]>
joshuagl added a commit to joshuagl/tuf that referenced this issue Mar 26, 2020
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.

Based sketches by Lukas Puehringer in issues theupdateframework#994 & theupdateframework#995

Signed-off-by: Joshua Lock <[email protected]>
joshuagl added a commit to joshuagl/tuf that referenced this issue Mar 30, 2020
Add a helper function to determine the name of a bin that a hashed
targetfile will be delegated to.

Based sketches by Lukas Puehringer in issues theupdateframework#994 & theupdateframework#995

Signed-off-by: Joshua Lock <[email protected]>
@lukpueh
Copy link
Member Author

lukpueh commented Apr 1, 2020

Fixed in #1007

@lukpueh lukpueh closed this as completed Apr 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant