Allow for automatic content discovery for cross-mounting blobs #275

Merged: 3 commits, Jun 23, 2021
13 changes: 13 additions & 0 deletions FAQ.md
@@ -68,3 +68,16 @@ choose to expire it after, for example, a minute or an hour, in the case that yo
**Q: What happens if the `<tagname>` (last) parameter does not exist?**

There is no suggested behavior in the specification for what to do if the tag does not exist. Registries might consider ignoring the parameter, or assuming a non-existent tag is at the start or the end of the sorted list. Treating it as the start of the list would imply returning the entire set of tags. Treating it as the end of the list would imply returning an empty list, as it references the last tag onward (an empty set).

**Q: How are clients expected to adopt (and probe for) automatic mount origin discovery?**

Mounting a blob is designed to fail gracefully: if a blob cannot be cross-mounted, the registry initiates an upload instead.
Clients should try the automatic mount origin discovery mechanism when they do not know of an origin repository in the registry that holds the requisite blob.
Nonconformant registries may return a status code other than `201` or `202`.
A client that wants to be defensive against nonconformant registries should, on receiving any status code other than `201` or `202`, fall back to [pushing the blob](https://github.com/opencontainers/distribution-spec/blob/main/spec.md#pushing-blobs).
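
The client-side flow above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `mount_url` and `next_step`, the host `registry.example`, and the returned labels are assumptions made for the example. Only the endpoint shape and the meaning of `201` and `202` come from the specification text.

```python
from urllib.parse import urlencode


def mount_url(registry, name, digest, origin=None):
    """Build the POST URL for a cross-repository mount.

    Leaving `origin` as None omits the `from` parameter, which asks the
    registry to discover the mount origin automatically (if it supports that).
    """
    params = {"mount": digest}
    if origin is not None:
        params["from"] = origin
    return f"{registry}/v2/{name}/blobs/uploads/?{urlencode(params)}"


def next_step(status):
    """Map the registry's response status to the client's next action.

    The labels returned here are hypothetical names for this sketch.
    """
    if status == 201:
        return "mounted"        # blob was mounted; nothing to upload
    if status == 202:
        return "upload"         # upload session started; continue it
    return "fallback_push"      # nonconformant response; start a regular push


url = mount_url("https://registry.example", "library/app", "sha256:abc")
```

A defensive client would issue the `POST` to `url`, pass the response status to `next_step`, and start a regular blob push whenever the result is `fallback_push`.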

**Q: How come `from` is required on cross-repo mount for some registries?**

Mounting without having to specify `from`, also known as automatic mount origin discovery, requires the registry to determine whether a blob exists in any repository.
If the existence check for the blob is done first, an immediate failure indicates that the blob is not present.
On the other hand, if the registry needs to perform further work to determine whether the mounter can access the blob, it could create an information disclosure risk by leaking the presence of a blob with that digest in the registry.
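
One way a registry might limit that risk is to search only repositories the caller can already read, and to answer identically whether the blob is absent or merely inaccessible. The sketch below assumes a hypothetical in-memory `blob_index` and a hypothetical `handle_automatic_mount` function; real registries have their own storage and access-control models, and this does not address timing differences between the two paths.

```python
def handle_automatic_mount(digest, readable_repos, blob_index):
    """Return the status code for a mount request that omits `from`.

    blob_index maps a digest to the set of repositories holding that blob
    (a hypothetical structure for this sketch). Only repositories the caller
    can read are considered, so "not present" and "present but not readable"
    both yield the same 202 response.
    """
    holders = blob_index.get(digest, set())
    if holders & set(readable_repos):
        return 201  # blob found in a readable repository: mount it
    return 202      # otherwise start an upload session


blob_index = {
    "sha256:aaa": {"public/base"},
    "sha256:bbb": {"private/secret"},
}
readable = {"public/base"}
```

With this data, a caller who can read only `public/base` gets `201` for `sha256:aaa` and `202` for both `sha256:bbb` and an unknown digest.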
4 changes: 3 additions & 1 deletion spec.md
@@ -357,7 +357,7 @@ Here, `<blob-location>` is a pullable blob URL.

##### Mounting a blob from another repository

- If a necessary blob exists already in another repository, it can be mounted into a different repository via a `POST`
+ If a necessary blob exists already in another repository within the same registry, it can be mounted into a different repository via a `POST`
request in the following format:

`/v2/<name>/blobs/uploads/?mount=<digest>&from=<other_name>` <sup>[end-11](#endpoints)</sup>.
@@ -376,6 +376,8 @@ The Location header will contain the registry URL to access the accepted layer f
header returns the canonical digest of the uploaded blob which MAY differ from the provided digest. Most clients MAY
ignore the value but if it is used, the client SHOULD verify the value against the uploaded blob data.

The registry MAY treat the `from` parameter as optional, and it MAY cross-mount the blob if it can be found.
Contributor:

Should we change "if it can be found" to "if it can be found and can be accessed"?

Member:

To me, that ACL implication is included in "can be found", because ideally if it cannot be accessed then it should not be found. Or is there another use case you're thinking about?

Contributor Author:

Although it is outside of the specification, I would suggest that we indicate that it is not a good idea to implement this feature on any registry with cross-repo ACLs, due to the possibility of information disclosure. I do not think "found" implies any notion of "access" in the security sense.

So, I would rather say something like:
"If the registry does not implement a security model which allows for attenuation of access to reading blobs, it MAY treat the `from` parameter as optional, and it MAY cross-mount the blob if it can be found."

Member:

👍

Member:

Agreed on adding more to the wording here. While we don't define an ACL model, this is a potential pitfall introduced by a very small change, and it is worth calling out. The updated language is reasonable, but I think it can be simplified to just mention read access. As @jonjohnsonjr brought up, it could be that the registry only searches a list of blobs that are known to be public and therefore readable by anyone.

Member:

We could leverage this for MCR/ACR as well, to avoid pushing these well-known public layers multiple times.


Alternatively, if a registry does not support cross-repository mounting or is unable to mount the requested blob,
it SHOULD return a `202`. This indicates that the upload session has begun and that the client MAY proceed with the upload.
Contributor:

Do we need to specify that the `202` response will also include a `Location` header for uploading the blob?

Contributor Author:

I think that's unrelated to this change, right?

Contributor:

It's part of the "else" path we hit when the blob isn't found. But yes, to your point, that path isn't being changed by this PR, so we could clarify that in a separate PR.

