Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ngclient rollback improvements #1524

Merged
merged 4 commits into from
Aug 24, 2021

Conversation

jku
Copy link
Member

@jku jku commented Aug 18, 2021

Fixes #1498

Improve rollback protection in ngclient

Spec does not explicitly say so but the intent is that "local" timestamp and snapshot metadata can be trusted for rollback checks of new versions of the same metadata even if the local metadata is expired. This is similar to root (but in that case the intent is explicitly explained in the spec).

Likewise intent is that "local" snapshot can be trusted for rollback checks of new versions of the snapshot even if local snapshot does not match the meta version in current timestamp.

Implement the above by moving the expiry and meta version checks to the "next" metadata update: check timestamp expiry just before snapshot is updated, and check snapshot expiry and meta version just before targets is updated. This means that when e.g. a second timestamp (that was downloaded from remote) is loaded, the current (first) timestamp is used for rollback checks even if it is expired -- but when snapshot is loaded the expiry of the current, final timestamp is checked.

Some additional details -- note especially the two divergences from the spec letter (but IMO not intent):

  • the expiry/meta checks are now done every time the "next" metadata updates: e.g. snapshot expiry and meta version are checked every time a targets metadata is updated -- it would be fine to do it only once (when initial "targets" is first updated) but this is simple to implement and the cost is likely minimal
  • metadata does get written to disk before we know if it fulfills the "complete" requirements for expiry and meta version: this is not what the spec says but seems correct to me as the data is "trusted" (signed by threshold of correct keys) and spec documents similar process for root. Even if the final metadata really is expired and the update fails, if the remote later starts serving non-expired metadata, the client will succeed in a later update (and will correctly use the earlier expired metadata for rollback checks!)
  • As Martin points out in comments, the "next" metadata may in some cases be downloaded before the final checks are done (so a snapshot file may be downloaded, but not parsed, before final timestamp expiry is checked): this does not have security implications but is not the order spec demands
  • This does not solve the issue with snapshot hashes/length in timestamp meta: it seems to require TrustedMetadataSet API change so I left it in separate issue: ngclient: review snapshot hashes/length check #1523

@jku jku force-pushed the ngclient-rollback-improvements branch 2 times, most recently from 90d95af to bb86000 Compare August 18, 2021 12:56
Jussi Kukkonen added 4 commits August 19, 2021 11:09
The usefulness was debatable to begin with, and now that it has become
clear that rollback protection requires a second "final verification"
step for all three root, timestamp and snapshot it is clear that
root_update_finished() is not good design.

update_root() still accepts expired root metadata but now the final
root expiry is checked when the "next" metadata (timestamp) is loaded.

Signed-off-by: Jussi Kukkonen <[email protected]>
While this is not explicitly said in the spec, the intention is that
expired timestamp and snapshot should be used for rollback protection
checks on newer timestamp/snapshot (but not for anything else).

Move the expiry checks to the "next" metadata update: timestamp expiry
is checked when snapshot is loaded, and snapshot expiry is checked
when targets is loaded.

Signed-off-by: Jussi Kukkonen <[email protected]>
Spec does not explicitly say so but the intent is that a snapshot
metadata can be trusted for rollback protection checks of newer
snapshots even if current snapshot version does not match the version
in current timestamp meta.

Only do the snapshot version check for the "final" snapshot by doing it
when targets is updated.

Improve test names and comments.

Signed-off-by: Jussi Kukkonen <[email protected]>
Doing the work inline is less code and easier to understand.

Signed-off-by: Jussi Kukkonen <[email protected]>
@jku jku force-pushed the ngclient-rollback-improvements branch from bb86000 to c8696d9 Compare August 19, 2021 08:23
@jku
Copy link
Member Author

jku commented Aug 19, 2021

I'm pretty happy with the result. One disadvantage is that a call to update_snapshot() can lead to "Timestamp is expired" error: that is a bit surprising but it is documented and I think beats the alternative (requiring the caller to explicitly decide when root/timestamp/snapshot is "final" and should be checked for expiry etc).

I may have gone a bit overboard with the code comments but this seems like a tricky situation so comments seemed useful: if someone manages to shorten them to one-liners, I'll be happy to edit...

@jku jku marked this pull request as ready for review August 19, 2021 08:35
Copy link
Collaborator

@MVrachev MVrachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec does not explicitly say so but the intent is that "local" timestamp and snapshot metadata can be trusted for rollback checks of new versions of the same metadata even if the local metadata is expired.

The above sounds so logical and I am wondering how we haven't thought about it before.

I have a couple of questions and I will number them for easier cross-reference.

Likewise intent is that "local" snapshot can be trusted for rollback checks of new versions of the snapshot even if local snapshot does not match the meta version in the current timestamp.

That is true because timestamp could have been updated and as a consequence its timestamp.meta["snapshot"].version could be bumped right?

By checking only the last timestamp/snapshot for expiration isn't there a risk of an attacker sending an intermediate expired version of timestamp/snapshot?

What I liked in the initial implementation was that it closely mapped to the client workflow defined in the specification and now some steps are swapped: for example step 5.5.6 will happen between steps 5.6.1 and 5.6.2.
I imagine this could be confusing (even though you added good explanatory comments).
This is something that should be addressed in the specification.
From what I understood, @joshuagl will file an issue about this?

I am even wondering won't it be better if we fix the problem in the specification first and then work on this pr?
Then you will be able to reference the specific step like client workflow 5.6.6. and quote the spec as an argument why the expiration checks are moved.

# client workflow 5.3.10: Make sure final root is not expired.
if self.root.signed.is_expired(self.reference_time):
raise exceptions.ExpiredMetadataError("Final root.json is expired")
# No need to check for 5.3.11 (fast forward attack recovery):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me think: did we do that check before?

Copy link
Member Author

@jku jku Aug 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current ngclient code has a similar comment, and does not do the check. the legacy updater does do this check because it may have untrusted metadata in memory

@@ -250,6 +258,11 @@ def update_snapshot(self, data: bytes):
raise RuntimeError("Cannot update snapshot after targets")
logger.debug("Updating snapshot")

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# client workflow 5.4.4: Make sure the final timestamp is not expired.

Maybe it will be even clearer if you add the step number of the client workflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would but unfortunately it also requires keeping these numbers updated with spec changes... so I'm more inclined to remove the step numbers I did use in the PR then I am to add more of them.

@@ -336,6 +349,11 @@ def update_delegated_targets(
if self.snapshot is None:
raise RuntimeError("Cannot load targets before snapshot")

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# client workflow 5.5.4 and 5.5.6

Maybe it will be even clearer if we add the steps numbers of the client workflow?

@jku
Copy link
Member Author

jku commented Aug 20, 2021

Sorry, I accidentally edited your comment while trying to answer it: I tried to revert but github does not make it easy.

Likewise intent is that "local" snapshot can be trusted for rollback checks of new versions of the snapshot even if local snapshot does not match the meta version in the current timestamp.

That is true because timestamp could have been updated and as a consequence its timestamp.meta["snapshot"].version could be bumped right?

yes, in fact that's required: if you update snapshot the only way to make that snapshot part of the repo for the clients is to publish a new timestamp with updated meta.

By checking only the last timestamp/snapshot for expiration isn't there a risk of an attacker sending an intermediate expired version of timestamp/snapshot?

for both of these only two versions are potentially loaded:

  • local version may be loaded (but a local version is not required to exist)
  • the remote version may be loaded (spec includes always downloading both timestamp and snapshot but especially for snapshot that seems like a specification bug: local version may be the exact version we need, there's no need to load in that case)

One of these is the final version that ends up getting expiry checked -- and in any case even if a buggy client somehow loaded more expired snapshots during the process that would be fine: no security issues that I can see.

What I liked in the initial implementation was that it closely mapped to the client workflow defined in the specification and now some steps are swapped: for example step 5.5.6 will happen between steps 5.6.1 and 5.6.2.

This is a good point! the expiry (and meta version) checks currently jump over two spec items:

  • writing local metadata to disk (this I covered in my PR cover letter)
  • downloading "next" metadata (5.6.1 in your example)

I believe the first jump is correct and this seems to be what legacy updater does as well. The second item is an implementation detail and could be avoided by having snapshot_is_final() type API... but I dislike that because it requires more work from the caller.

I am even wondering won't it be better if we fix the problem in the specification first and then work on this pr?

I don't really want to wait for the spec:

  • ngclient is currently lacking a feature that legacy updater has (the rollback protection) and I consider that a bug
  • the spec changes are likely to be nontrivial: I think it requires defining how local metadata is loaded as well as this tricky interaction between expiry / meta version and rollback checks
  • we keep finding new things (or at least I do!) by implementing this: I believe the spec tasks are getting better defined because this PR is progressing

@jku
Copy link
Member Author

jku commented Aug 20, 2021

Spec does not explicitly say so but the intent is that "local" timestamp and snapshot metadata can be trusted for rollback checks of new versions of the same metadata even if the local metadata is expired.

The above sounds so logical and I am wondering how we haven't thought about it before.

This is a combo of two things:

  • spec is almost completely silent on when and how to load local metadata (it just expects there to be "trusted metadata" already at some point in the workflow without explaining where it came from)
  • it's hard to figure out which of the rollback protections are actually in use in legacy updater (I'm unsure about the expiry cases for local metadata): in any case they load all local metadata in the beginning, and only unload if keys are rotated -- this let's the implementation just forget about issues like "is our currently loaded snapshot actually the snapshot that our new timestamp considers correct", it just assumes this potential problem will be fixed sometime later in the process.

@MVrachev
Copy link
Collaborator

By checking only the last timestamp/snapshot for expiration isn't there a risk of an attacker sending an intermediate expired version of timestamp/snapshot?

for both of these only two versions are potentially loaded:

  • local version may be loaded (but a local version is not required to exist)
  • the remote version may be loaded (spec includes always downloading both timestamp and snapshot but especially for snapshot that seems like a specification bug: local version may be the exact version we need, there's no need to load in that case)

One of these is the final version that ends up getting expiry checked -- and in any case even if a buggy client somehow loaded more expired snapshots during the process that would be fine: no security issues that I can see.

I think I understand what you mean.
Even if a user is fed by an attacker with expired timestamp/snapshot data when calling update_timestamp/update_snapshot that won't be a security concern, because the next time when update_snapshot (for timestamp) or update_targets (for snapshot) is called an expiry check will be done.

I am even wondering won't it be better if we fix the problem in the specification first and then work on this pr?

I don't really want to wait for the spec:

  • ngclient is currently lacking a feature that legacy updater has (the rollback protection) and I consider that a bug

Do the legacy updater checks this? From your next comment:

it's hard to figure out which of the rollback protections are actually in use in legacy updater (I'm unsure about the expiry cases for local metadata): in any case they load all local metadata in the beginning, and only unload if keys are rotated -- this let's the implementation just forget about issues like "is our currently loaded snapshot actually the snapshot that our new timestamp considers correct", it just assumes this potential problem will be fixed sometime later in the process.

seems like the old updater doesn't take care of this case.

  • the spec changes are likely to be nontrivial: I think it requires defining how local metadata is loaded as well as this tricky interaction between expiry / meta version and rollback checks
  • we keep finding new things (or at least I do!) by implementing this: I believe the spec tasks are getting better defined because this PR is progressing

I agree that this pr helps to define what should be changed in the spec.
I am just worried that this moves steps from the update process which is basically the heart of TUF and I don't want us to be in a situation when we are not following the spec update process for a while.
Let's at least make sure we have an issue and discussion going at the spec repo and we are moving in that direction, can we?

@jku
Copy link
Member Author

jku commented Aug 21, 2021

seems like the old updater doesn't take care of this case.

I did not say this. I said it's hard to figure out if it does the checks in all cases (because the checks are done in places that are rather unobvious at least to me -- it's difficult to be sure that all possible scenarios are actually covered... but they may well be).

The legacy updater definitely has rollback protections in place that ngclient does not have, this is not in question.

@jku
Copy link
Member Author

jku commented Aug 23, 2021

I am just worried that this moves steps from the update process which is basically the heart of TUF and I don't want us to be in a situation when we are not following the spec update process for a while.
Let's at least make sure we have an issue and discussion going at the spec repo and we are moving in that direction, can we?

This should cover it maybe theupdateframework/specification#179

I think fixing the spec might be complex enough (difficult to get right, and also difficult to negotiate WRT existing implementation decisions) that I don't like to just wait for the spec to be updated... we should do our best effort to implement what we think is the spec intent.

@jku jku requested review from joshuagl and sechkova August 23, 2021 13:05
@MVrachev
Copy link
Collaborator

I am just worried that this moves steps from the update process which is basically the heart of TUF and I don't want us to be in a situation when we are not following the spec update process for a while.
Let's at least make sure we have an issue and discussion going at the spec repo and we are moving in that direction, can we?

This should cover it maybe theupdateframework/specification#179

Yes, it does.

I think fixing the spec might be complex enough (difficult to get right, and also difficult to negotiate WRT existing implementation decisions) that I don't like to just wait for the spec to be updated... we should do our best effort to implement what we think is the spec intent.

Now that at least we have documented what you have found in the spec repo and hopefully discussion can start I agree that it
makes sense to do what we believe it's right from a security standpoint and document it well (as I think you did).

@joshuagl
Copy link
Member

  • metadata does get written to disk before we know if it fulfills the "complete" requirements for expiry and meta version: this is not what the spec says but seems correct to me as the data is "trusted" (signed by threshold of correct keys) and spec documents similar process for root. Even if the final metadata really is expired and the update fails, if the remote later starts serving non-expired metadata, the client will succeed in a later update (and will correctly use the earlier expired metadata for rollback checks!)

Agree, this seems OK. The well-defined process for root is because we explicitly download intermediate versions we expect to be expired, whereas for the other top-level files we do not. There have been some preliminary efforts to define how to load existing downloaded metadata for rollback protection here theupdateframework/taps#128, but I think the size of the spec change is a blocker. I really want to find time to work on theupdateframework/specification#121

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The removal of root_update_finished() makes the API cleaner.

@sechkova what do you think?

Copy link
Contributor

@sechkova sechkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with these changes too.

@jku
Copy link
Member Author

jku commented Aug 24, 2021

Thank you, let's see how this works out!

@jku jku merged commit e242a75 into theupdateframework:develop Aug 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ngclient: improve rollback protection
4 participants