-
Notifications
You must be signed in to change notification settings - Fork 30
Update referrers API to return a list of descriptors #3
Update referrers API to return a list of descriptors #3
Conversation
Signed-off-by: Michael Brown <[email protected]>
2b20d0c
to
08b93b4
Compare
manifest-referrers-api.md
Outdated
|
||
The `referrers` API returns all artifacts that have a `subjectManifest` to given manifest digest. Referenced artifact requests are scoped to a repository, ensuring access rights for the repository can be used as authorization for the referenced artifacts. | ||
|
||
Artifact references are defined in the [oras.artifact.manifest spec][oras.artifact.manifest-spec] through the [`subjectManifest`][oras.artifact.manifest-spec-manifests] property. | ||
|
||
## Request All Artifact References | ||
|
||
The referrers api is sits alongside the [distribution-spec][oci-distribution-spec] paths avoiding any conflict with existing or new distribution apis. Pathing within the referrers api provides consistent repo/namespace paths, enabling registry operators to implement consistent auth access, using existing tokens for content. | ||
The referrers api is sits alongside the [distribution-spec][oci-distribution-spec] paths avoiding any conflict with existing or new distribution apis. Pathing within the referrers api provides consistent repo/namespace paths, enabling registry operators to implement consistent auth access, using existing tokens for content. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to call out on how to include nextLink in the request parameters? as a query string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I would prefer that nextLink
actually be a nextToken
. I don't like APIs returning a link that clients will (possibly) follow blindly. I was going to propose a nextToken
which could be passed into a subsequent call to the API via a query parameter, most likely ?nextToken={nextToken}
. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a bit weird to invent a new pagination mechanism. Can you do the same thing that we do for tag listing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally agree 100% and did not know about the pagination that existed for the tags API. Will open a new issue (#4).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't think its weird to have a new pagination listing that does support richer capabilities and yes +1 on nextToken
. The distribution spec makes it harder for clients to follow and also makes it harder for a data provider to follow pagination downstream. ACR had quite a few issues and not totally not a fan. This is not under /v2 so does not need to follow that pagination mode.
@jonjohnsonjr - does GCR actually implement the tag listing - google/go-containerregistry#897 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tags listing is based on lexical ordering with next and last, I am not sure what ordering these reference artifacts will have so as to follow that pattern.
@michaelb990 Yes, nextToken or skipToken is commonly used as continuation tokens that are sent via query string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does GCR actually implement the tag listing
GCR doesn't implement tag listing because it returns the entire dataset, but it does implement pagination for catalog using the same pagination mechanism.
This is not under /v2 so does not need to follow that pagination mode.
I think that's a bit myopic if we want this to eventually get integrated into the spec. I don't really want to have to implement two different kinds of pagination within a client.
Tags listing is based on lexical ordering with next and last, I am not sure what ordering these reference artifacts will have so as to follow that pattern.
You could still use the Link
header, as existing clients already know to follow the Link
header as per https://docs.docker.com/registry/spec/api/#pagination
Compliant client implementations should always use the Link header value when proceeding through results linearly. The client may construct URLs to skip forward in the catalog.
...
The above process should then be repeated until the Link header is no longer set.
I'd be fine with not supporting lexical ordering (or really, any total ordering) as long as whatever pagination mechanism used here sets an appropriate Link
header such that I can just follow it. You can put whatever tokens you want in the Link
url.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate this discussion, but let's move it to #4. I didn't intend to change pagination with this PR, and tried to focus my changes on the format for the items in the list of references. Sounds like it definitely deserves more discussion on that issue before we make a PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jonjohnsonjr I see your point of Link
header and sounds like we can align . I think we can improve upon this and support returning a skipToken/nextToken in the Link header.
Link: <<url>?n=2&skipToken; rel="next"
GET /v2/<name>/tags/list?n=<n from the request>&nextToken=<>
Regarding ordering - Lexical doesn't make sense so it's worth calling out here. I would however lean towards reverse chronological since you typically want the newest first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate this discussion, but let's move it to #4. I didn't intend to change pagination with this PR, and tried to focus my changes on the format for the items in the list of references. Sounds like it definitely deserves more discussion on that issue before we make a PR.
+1 for splitting paging and the response type.
@@ -1,14 +1,14 @@ | |||
# Manifest Referrers API | |||
|
|||
[ORAS Artifact-manifest](./artifact-manifest.md) provides the ability to reference artifacts to existing artifacts. Reference artifacts include Notary v2 signatures, SBoMs and many other types. Artifacts that reference other artifacts SHOULD NOT be tagged, as they are considered enhancements to the artifacts they reference. To discover referenced artifacts a manifest referrers API is provided. An artifact client, such as a Notary v2 client would parse the returned manifests, determining which manifest type they will pull and process. | |||
[ORAS Artifact-manifest](./artifact-manifest.md) provides the ability to reference artifacts to existing artifacts. Reference artifacts include Notary v2 signatures, SBoMs and many other types. Artifacts that reference other artifacts SHOULD NOT be tagged, as they are considered enhancements to the artifacts they reference. To discover referenced artifacts a manifest referrers API is provided. An artifact client, such as a Notary v2 client would parse the returned manifest descriptors, determining which manifest type they will pull and process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaelb990 we have been discussing about the pros and cons of returning manifest versus manifest descriptors. The data points that we were having are
- Clients are recommended to validate the data returned from server. This applies to the “artifactType” property that is returned in the response. If clients has to assure this value by querying the manifest, it would be cost effective if the original manifest is returned as part of response.
- Most flexibility for client-side filteration. Also, annotations being free form might complex the registry implementations to support indexing or filtering on them. The registry server might have to process the manifest to extract these annotations or other filterable properties. If so, it would be beneficial to return the entire manifest. This will extend the client side filtering beyond annotations of just the manifest but also on any of the enclosed blobs.
- This option doesn't help with the response size problem to the full extent. The E2E scenarios call out filtering based on both “artifactType” and annotations. The cost of returning annotations versus the entire manifest is same on the registry server. However, the latter option has a benefit of optimizing the network roundtrips on clients.
Therefore, we are leaning towards returning the manifest as part of the list result and we have a spec for our proposal.
https://gist.github.com/aviral26/ca4b0c1989fd978e74be75cbf3f3ea92 . Can you please let us know your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the best of both worlds is to return a list of descriptors with the data inlined: opencontainers/image-spec#826
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be open to a list of descriptors and a second call to get the manifest OR a list of descriptors that use the data
field to inline the manifest (maybe optionally based on the size of the manifest, but that might just get complicated). I would prefer not introducing a new descriptor-like object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently data
has not finalized and we can consider it when #826 is settled and we can assume that manifest descriptors can be query from the /manifests
api and/or /blobs
We should be free to decide on another property (e.g. palyoad
) if needed. Moving to ORAS was specifically to not block on OCI but generally work with the intention of moving things back to OCI.
I think the item we discussed about returning annotations indexing content from the manifest. When content is encrypted with customer keys hoisting content from customers manifest into an indexed set was challenging and that is why we were considering returning the full encoded manifest inside a property upto a max response size of say 10MB. From a specification perspective - this is optional and the client will follow the digest to retrieve the manifest if it is not present.
Also we didn't want to use data primarily due to the OCI specification ambiguity.
Lastly regarding descriptor - Since the descriptor doesn't enforce list of properties it will be hard to implement strict client libraries if the server returns fields that they is not understood. I am actually leaning towards explicitly calling out the fields that will be returned in this response so that clients aren't expected to ignore fields as per descriptor specifications.
https://github.com/opencontainers/image-spec/blob/main/descriptor.md#reserved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and we can assume that manifest descriptors can be query from the /manifests api and/or /blobs
I don't follow this. Is that related to opencontainers/image-spec#826 or non-sequitur?
I am actually leaning towards explicitly calling out the fields that will be returned in this response so that clients aren't expected to ignore fields as per descriptor specifications.
I'm confused about this. You're essentially freezing the spec indefinitely, and it makes upgrades almost impossible. I may be biased here because most of my professional career has revolved around protobufs, but ignoring unknown fields seems like a necessity if you have any kind of protocol that you hope to evolve: https://developers.google.com/protocol-buffers/docs/proto#updating
Any new fields that you add should be optional or repeated. This means that any messages serialized by code using your "old" message format can be parsed by your new generated code, as they won't be missing any required elements. You should set up sensible default values for these elements so that new code can properly interact with messages generated by old code. Similarly, messages created by your new code can be parsed by your old code: old binaries simply ignore the new field when parsing. However, the unknown fields are not discarded, and if the message is later serialized, the unknown fields are serialized along with it – so if the message is passed on to new code, the new fields are still available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently data has not finalized and we can consider it when #826 is settled
I think that the efforts to define the data
field in a manifest in opencontainers/image-spec#826 doesn't have to block us from using it in a different context, specifically in the list of references for a manifest.
If we made the referrers API a list of descriptors that included the data
field (and defined it to mean the base64 encoded manifest), then we would get effectively what you're proposing, but be able to re-use an OCI object instead of having to invent a new one. It would look like this:
{
"references": [
{
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
"mediaType": "application/vnd.oci.artifact.manifest.v1+json",
"size": 312,
"data": "ICAgIHsKICAgICAgZGlnZXN0OiBzaGEyNTY6M2MzYTQ2MDRhNTQ1Y2RjMTI3NDU2ZDk0ZTQyMWNkMzU1YmNhNWI1MjhmNGE5YzE5MDViMTVkYTJlYjRhNGM2YiwKICAgICAgbWVkaWFUeXBlOiBhcHBsaWNhdGlvbi92bmQub2NpLmFydGlmYWN0Lm1hbmlmZXN0LnYxK2pzb24sCiAgICAgIGFydGlmYWN0VHlwZTogY25jZi5ub3RhcnkudjIsCiAgICAgIHNpemU6IDMxMiwKICAgIH0sCg=="
},
{
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
"mediaType": "application/vnd.oci.artifact.manifest.v1+json",
"size": 237,
"data": "ICAgIHsKICAgICAgZGlnZXN0OiBzaGEyNTY6M2MzYTQ2MDRhNTQ1Y2RjMTI3NDU2ZDk0ZTQyMWNkMzU1YmNhNWI1MjhmNGE5YzE5MDViMTVkYTJlYjRhNGM2YiwKICAgICAgbWVkaWFUeXBlOiBhcHBsaWNhdGlvbi92bmQub2NpLmFydGlmYWN0Lm1hbmlmZXN0LnYxK2pzb24sCiAgICAgIGFydGlmYWN0VHlwZTogZXhhbXBsZS5zYm9tLnYwLAogICAgICBzaXplOiAyMzcsCiAgICAgIGRhdGE6IGFHVnNiRzg9CiAgICB9Cg=="
}
],
"@nextLink": "{opaqueUrl}"
}
This would also match exactly with the existing descriptor and not require defining which annotations get up-lifted OR adding a new artifactType
field.
and we can assume that manifest descriptors can be query from the /manifests api and/or /blobs
Descriptors returned by the references API will always be for manifests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, "data" field is optional, constrained to size limits set by registry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently data has not finalized
I think the semantics are pretty settled and unlikely to change (we have had a quorum to merge, barring some wordsmithing in the considerations). The discussion is mostly centered around how registry operators feel about the field, which hopefully we can resolve via opencontainers/distribution-spec#293
Given that the semantics here are identical to the proposal, I think it's fine (and good) to use data
in this way.
I think that the efforts to define the data field in a manifest in opencontainers/image-spec#826 doesn't have to block us from using it in a different context, specifically in the list of references for a manifest.
That PR isn't about defining the data
field in a manifest, it's about defining the data
field for any descriptor in any context. Your use here is precisely one of the intended use cases.
Just to clarify, "data" field is optional, constrained to size limits set by registry?
That's my understanding. Clients should be prepared to fetch any content in the response via subsequent requests, but if the registry chooses to inline the associated content for a descriptor in the data
field, the client can save the roundtrip.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the efforts to define the data field in a manifest in opencontainers/image-spec#826 doesn't have to block us from using it in a different context, specifically in the list of references for a manifest.
This context of using data
as a response type makes more sense, as it's not a required duplication of content persisted in the registry, rather a payload optimization.
If we made the referrers API a list of descriptors that included the data field (and defined it to mean the base64 encoded manifest), then we would get effectively what you're proposing, but be able to re-use an OCI object instead of having to invent a new one. It would look like this:
I generally like this. The one concern is how do we account for artifactType
filtering?
If the artifactType
is embedded in the OPTIONAL data
element, and the data
element is empty as it exceeded the max capacity, how does a client know to filter cncf.notary.v2
signatures, vs spdx.sbom
or other artifactTypes
?
I'm thinking we should include the artifactType
in the descriptor so it's always available with/without the data
element being populated.
{
"references": [
{
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
"mediaType": "application/vnd.oci.artifact.manifest.v1+json",
"size": 312,
"artifactType": "cncf.notary.v2",
"data": "ICAgIHsKICAgICAgZGlnZXN0OiBzaGEyNTY6M2MzYTQ2MDRhNTQ1Y2RjMTI3NDU2ZDk0ZTQyMWNkMzU1YmNhNWI1MjhmNGE5YzE5MDViMTVkYTJlYjRhNGM2YiwKICAgICAgbWVkaWFUeXBlOiBhcHBsaWNhdGlvbi92bmQub2NpLmFydGlmYWN0Lm1hbmlmZXN0LnYxK2pzb24sCiAgICAgIGFydGlmYWN0VHlwZTogY25jZi5ub3RhcnkudjIsCiAgICAgIHNpemU6IDMxMiwKICAgIH0sCg=="
},
{
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
"mediaType": "application/vnd.oci.artifact.manifest.v1+json",
"size": 237,
"artifactType": "spdx.sbom.v3",
"data": "ICAgIHsKICAgICAgZGlnZXN0OiBzaGEyNTY6M2MzYTQ2MDRhNTQ1Y2RjMTI3NDU2ZDk0ZTQyMWNkMzU1YmNhNWI1MjhmNGE5YzE5MDViMTVkYTJlYjRhNGM2YiwKICAgICAgbWVkaWFUeXBlOiBhcHBsaWNhdGlvbi92bmQub2NpLmFydGlmYWN0Lm1hbmlmZXN0LnYxK2pzb24sCiAgICAgIGFydGlmYWN0VHlwZTogZXhhbXBsZS5zYm9tLnYwLAogICAgICBzaXplOiAyMzcsCiAgICAgIGRhdGE6IGFHVnNiRzg9CiAgICB9Cg=="
}
],
"@nextLink": "{opaqueUrl}"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two general approaches here:
- Return a list of artifact-descriptors, which add
artifactType
to the image-descriptor to differentiate each artifact from a common manifestType. (SBOM formats, Scan Results, Signature Formats, TBD) - Return a collection of manifests, (binary encoded)
The implementation discussion has surfaced a number of challenges.
- Customer Managed Key Encryption (CMK) has complicated some of the designs for what can be stored as clear text for indexing, used by the
referres
api. - When/where would the annotations be lifted from the manifest if returned in an artifact-descriptor response?
- How many round trips should a client make? Where's the balance?
While I was thinking the modified artifact-descriptor would be the best balance, I'm now more inclined to return encoded manifests.
It does mean the response of the referrers
API will be larger, but it reduces compute on the registry for when to lift the annotations and reduces clients from having to query the manifest for each reference type the client requires.
It solves the annotation encryption issue as these aren't indexed and lifted into an artifact-descriptor.
descriptor.md
Outdated
@@ -40,8 +40,11 @@ The following fields contain the primary properties that constitute an Artifact | |||
|
|||
- **`annotations`** *string-string map* | |||
|
|||
This OPTIONAL property contains arbitrary metadata for this descriptor. | |||
This OPTIONAL property MUST use the [annotation rules](annotations.md#rules). | |||
This OPTIONAL property contains the annotations from the annotations field in the artifact manifest for this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good clarification for which annotations are lifetd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating, based on recent conversations.
With the proposal to OPTIONALLY return the encoded manifest as a data
property, and the complexity of server-side processing to lift annotations, I think we're now saying there's no lifting, rather the annotations are only available within the manifest. This simplifies the response.
manifest-referrers-api.md
Outdated
need to determine which artifact to retrieve. For example, Notary v2 MAY use an annotation: | ||
`"org.cncf.notary.v2.signature.subject": "wabbit-networks.io"`, which the client could use to determine which signature | ||
to pull from the registry. Using annotations, clients can reduce round trips and the data returned to determine which | ||
artifacts the specific client may require reducing network traffic and API calls. Future support of the `data` field in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we defer the data
element out of this conversation? Combining too many concepts only complicates the scope of what we need to land now.
Request removing:
Future support of the
data
field in the list of layers/blobs will allow for a signature or other "small" payload to be retrieved with only a single additional call to the registry (a GET for the full manifest).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above usage of data is fair. As long as we define that the data element in this context is used to return encoded manifests and can be queried from the /manifests API.
Also ensure that this is OPTIONAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: the artifacts-spec, and the referrers/
api only return cncf.oras.artifact-manifests
as they're the only type of manifests that support the subjectManifest
. The discussion to use data
to be an OPTIONAL property, that returns the encoded manifest is a different conversation from the manifest including the signature in a duplicative descriptor.data
element of a signature perissted as a blob.
So, I still think we should remove the text regarding future support of the data
filed as it's not relevant to the response discussion.
descriptor.md
Outdated
@@ -40,8 +40,11 @@ The following fields contain the primary properties that constitute an Artifact | |||
|
|||
- **`annotations`** *string-string map* | |||
|
|||
This OPTIONAL property contains arbitrary metadata for this descriptor. | |||
This OPTIONAL property MUST use the [annotation rules](annotations.md#rules). | |||
This OPTIONAL property contains the annotations from the annotations field in the artifact manifest for this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating, based on recent conversations.
With the proposal to OPTIONALLY return the encoded manifest as a data
property, and the complexity of server-side processing to lift annotations, I think we're now saying there's no lifting, rather the annotations are only available within the manifest. This simplifies the response.
@@ -1,14 +1,14 @@ | |||
# Manifest Referrers API | |||
|
|||
[ORAS Artifact-manifest](./artifact-manifest.md) provides the ability to reference artifacts to existing artifacts. Reference artifacts include Notary v2 signatures, SBoMs and many other types. Artifacts that reference other artifacts SHOULD NOT be tagged, as they are considered enhancements to the artifacts they reference. To discover referenced artifacts a manifest referrers API is provided. An artifact client, such as a Notary v2 client would parse the returned manifests, determining which manifest type they will pull and process. | |||
[ORAS Artifact-manifest](./artifact-manifest.md) provides the ability to reference artifacts to existing artifacts. Reference artifacts include Notary v2 signatures, SBoMs and many other types. Artifacts that reference other artifacts SHOULD NOT be tagged, as they are considered enhancements to the artifacts they reference. To discover referenced artifacts a manifest referrers API is provided. An artifact client, such as a Notary v2 client would parse the returned manifest descriptors, determining which manifest type they will pull and process. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the efforts to define the data field in a manifest in opencontainers/image-spec#826 doesn't have to block us from using it in a different context, specifically in the list of references for a manifest.
This context of using data
as a response type makes more sense, as it's not a required duplication of content persisted in the registry, rather a payload optimization.
If we made the referrers API a list of descriptors that included the data field (and defined it to mean the base64 encoded manifest), then we would get effectively what you're proposing, but be able to re-use an OCI object instead of having to invent a new one. It would look like this:
I generally like this. The one concern is how do we account for artifactType
filtering?
If the artifactType
is embedded in the OPTIONAL data
element, and the data
element is empty as it exceeded the max capacity, how does a client know to filter cncf.notary.v2
signatures, vs spdx.sbom
or other artifactTypes
?
I'm thinking we should include the artifactType
in the descriptor so it's always available with/without the data
element being populated.
{
"references": [
{
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
"mediaType": "application/vnd.oci.artifact.manifest.v1+json",
"size": 312,
"artifactType": "cncf.notary.v2",
"data": "ICAgIHsKICAgICAgZGlnZXN0OiBzaGEyNTY6M2MzYTQ2MDRhNTQ1Y2RjMTI3NDU2ZDk0ZTQyMWNkMzU1YmNhNWI1MjhmNGE5YzE5MDViMTVkYTJlYjRhNGM2YiwKICAgICAgbWVkaWFUeXBlOiBhcHBsaWNhdGlvbi92bmQub2NpLmFydGlmYWN0Lm1hbmlmZXN0LnYxK2pzb24sCiAgICAgIGFydGlmYWN0VHlwZTogY25jZi5ub3RhcnkudjIsCiAgICAgIHNpemU6IDMxMiwKICAgIH0sCg=="
},
{
"digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b",
"mediaType": "application/vnd.oci.artifact.manifest.v1+json",
"size": 237,
"artifactType": "spdx.sbom.v3",
"data": "ICAgIHsKICAgICAgZGlnZXN0OiBzaGEyNTY6M2MzYTQ2MDRhNTQ1Y2RjMTI3NDU2ZDk0ZTQyMWNkMzU1YmNhNWI1MjhmNGE5YzE5MDViMTVkYTJlYjRhNGM2YiwKICAgICAgbWVkaWFUeXBlOiBhcHBsaWNhdGlvbi92bmQub2NpLmFydGlmYWN0Lm1hbmlmZXN0LnYxK2pzb24sCiAgICAgIGFydGlmYWN0VHlwZTogZXhhbXBsZS5zYm9tLnYwLAogICAgICBzaXplOiAyMzcsCiAgICAgIGRhdGE6IGFHVnNiRzg9CiAgICB9Cg=="
}
],
"@nextLink": "{opaqueUrl}"
}
manifest-referrers-api.md
Outdated
|
||
The `referrers` API returns all artifacts that have a `subjectManifest` to given manifest digest. Referenced artifact requests are scoped to a repository, ensuring access rights for the repository can be used as authorization for the referenced artifacts. | ||
|
||
Artifact references are defined in the [oras.artifact.manifest spec][oras.artifact.manifest-spec] through the [`subjectManifest`][oras.artifact.manifest-spec-manifests] property. | ||
|
||
## Request All Artifact References | ||
|
||
The referrers api is sits alongside the [distribution-spec][oci-distribution-spec] paths avoiding any conflict with existing or new distribution apis. Pathing within the referrers api provides consistent repo/namespace paths, enabling registry operators to implement consistent auth access, using existing tokens for content. | ||
The referrers api is sits alongside the [distribution-spec][oci-distribution-spec] paths avoiding any conflict with existing or new distribution apis. Pathing within the referrers api provides consistent repo/namespace paths, enabling registry operators to implement consistent auth access, using existing tokens for content. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The referrers api is sits alongside the
api is sits
manifest-referrers-api.md
Outdated
need to determine which artifact to retrieve. For example, Notary v2 MAY use an annotation: | ||
`"org.cncf.notary.v2.signature.subject": "wabbit-networks.io"`, which the client could use to determine which signature | ||
to pull from the registry. Using annotations, clients can reduce round trips and the data returned to determine which | ||
artifacts the specific client may require reducing network traffic and API calls. Future support of the `data` field in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: the artifacts-spec, and the referrers/
api only return cncf.oras.artifact-manifests
as they're the only type of manifests that support the subjectManifest
. The discussion to use data
to be an OPTIONAL property, that returns the encoded manifest is a different conversation from the manifest including the signature in a duplicative descriptor.data
element of a signature perissted as a blob.
So, I still think we should remove the text regarding future support of the data
filed as it's not relevant to the response discussion.
To help frame the conversation, I've added a discussion: Artifact-spec |
e55357f
to
d91cd37
Compare
Signed-off-by: Michael Brown <[email protected]>
d91cd37
to
3acd83c
Compare
Signed-off-by: Michael Brown <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @michaelb990 for capturing the heart of the latest round of discussions. This pivots the response back to using descriptors, which does require a round trip for the manifest. But, it also removes the problematic un-encoded manifest
response that couldn't be trusted, requiring a round trip for the manifests the client did need. While providing non-trusted manifests helped provide client-side filtering for which manifests needed to be returned, it didn't meet the balance.
Based on the various discussions, including the [baseline requirements]#7), I think we've landed on:
- server-side filtering of
artifactType
is optional, so it needs to be returned for client-side filtering. Without theartifactType
, the client would have to retrieve all manifests to parse theartifactType
, allowing the notary client to limit pulling notary manifests, while an sbom client can limit pulling sbom manifest. - there are some encryption scenarios, which need further verification whether registry operators can index the
artifactType
so it may not be returned, requiring the client to pull all manifests, to identify which manifests apply to the specific client. - the
data
element could be an optional way to embed the encoded manifest, reducing the additional round trip for retrieving artifact specific manifest, However, depending on the size of the manifests, the server may exclude thedata
from thereferrers/
API response. This means clients must be capable of pulling the manifest anyway. By deferring the encodeddata
element until later, we're not losing a user scenario, just deferring a possible optimization that not all clients would be able to take advantage of.
I noticed a few inconsistencies in some names (all nits and likely from my orirignal PR), like application/vnd.cncf.notary , cncf.notary
vs. org.cncf.notary
. Just to get this PR merged, I'd suggest we LGTM this, and I can do a quick scrub of the nits and PR the changes for consistency.
LGTM
Updated this PR to remove the manifest encoding for now and just return a list of descriptors. I will watch opencontainers/image-spec#826 to see how the |
As our discussion for support of The use of the One of my concerns with It's the possibility that an optional encoded manifest that may be returned in the referrers API may include duplicative |
Hey Steve, I just want to clarify that the As I've explained before, registries have the flexibility to reject manifests that they deem too large, if storing them would be burdensome, and we've discussed trying to find a reasonable size that client can rely on. I would encourage you to join in that conversation: opencontainers/distribution-spec#293 The concerns around unwinding the value of deduplication are pretty insubstantial. Data denormalization is a common optimization in distributed systems with well understood trade-offs. While I agree with you that more deduplication is something to be desired, your concerns regarding data embedded within a manifest are a drop in the bucket compared to what we're currently leaving on the table by continuing to distribute images as compressed tarballs. Given your passion for maximal deduplication, I would encourage you pick up the torch on revisiting how layers are encoded. Some more context:
|
Fixes #2 .