-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add restriction on 'clap' property #232
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea.
It should be part of the "Changes" section.
index.bs
Outdated
|
||
The semantics of the clean aperture property ('clap') as defined in [[!HEIF]] apply. In addition to the restrictions on transformative item property ordering specified in [[!MIAF]], the following restriction also applies: | ||
|
||
<assert>The origin of the 'clap' item property shall be anchored to 0,0 (top-left) unless the full, un-cropped image item is included as a secondary item in an image collection.</assert> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative: essential
shall be equal to 0 for a 'clap' item property. Readers shall treat a clap
item property as non-essential even if essential
is 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would mean it's fine for a parser to always ignore the clap
, even when the item is an input to some other derived item.
It would also mean it's fine for a parser to ignore a top-left anchored clap
, which would lead to padding being shown for images that need the clap
to deal with HW limitations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I was thinking about implementations that parse but ignore the clap
values and that do not expose the secondary non-hidden item.
As an example there is the libavif API that decodes to an avifImage
which has a clap
field. That clap
field is ignored by avifdec
which always outputs a PNG containing the full, uncropped image. I guess this is not compliant but I can imagine others doing the same, just accessing the pixels and ignoring the multiple obscure metadata chunks.
So I would have liked to adapt the spec to just accept the current state of things, at least for items that do not have further transformations, but that is beyond this PR I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I see your point.
I think the real fix would be for avifdec
to apply the crop when writing to PNG though. From what I understand from the conversation where Chrome resurrected support for clap
, there was a real use-case where it was needed for some HW that had size limitations. The current behaviour from avifdec
will decode and show the padding for those files which I don't think is the correct behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Chrome applying clap
now? AOMediaCodec/libavif#1177 can be closed then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed AOMediaCodec/libavif#2427.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Chrome applying
clap
now? AOMediaCodec/libavif#1177 can be closed then.
I think so at least:
chromium/chromium@3a13337
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index.bs
Outdated
|
||
The semantics of the clean aperture property ('clap') as defined in [[!HEIF]] apply. In addition to the restrictions on transformative item property ordering specified in [[!MIAF]], the following restriction also applies: | ||
|
||
<assert>The origin of the 'clap' item property shall be anchored to 0,0 (top-left) unless the full, un-cropped image item is included as a secondary item in an image collection.</assert> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As said during the call, I would have liked stronger constraints such as dimensions but this is too complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirm that Chrome requires the clean aperture to be anchored at (0, 0) (top left) of the coded input image, and does not impose other requirements on the clean aperture.
We can modify Chrome to require that less than 64 pixels are cropped from the width or height of the coded input image. This will support encoders that require width or height to be a multiple of 64. After gaining experience with this constraint in Chrome, we can then add it to the AVIF spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can modify Chrome to require that less than 64 pixels are cropped from the width or height of the coded input image. This will support encoders that require width or height to be a multiple of 64. After gaining experience with this constraint in Chrome, we can then add it to the AVIF spec.
I understand the want for having stronger constraints, but we've had stuff like that come back and bite us multiple times in the past. We currently know of no AV1 HW that requires larger alignment than 64, but for HEVC we have had HW that requires horizontal alignments of:
- 64
- 128
- 192
- 256
And vertical alignments of:
- 64
- 96
- 128
Assuming that all future AV1 HW will stick to 64x64 sounds like a dangerous assumption to me.
Threat:
The main thing we want to protect against is some kind of editing service that promises to do lossless cropping of your image and that simply applies a clap
to the file without the user knowing that the full image is still present.
Valid use cases we want to be able to support:
- Dealing with HW limitations
- Useful tool for more advanced capture concepts
From my memory of the discussion, everyone could agree on two mitigations that would more or less completely mitigate the threat while still supporting the valid use-cases:
- Limit the
clap
to be anchored to the top-left corner. (This makes it useless as a general purpose cropping tool)
unless
- The full un-cropped image is also present and visible in the file. (This ensures that if used for generic cropping, the user is made aware that the content is still present.)
From meeting: Maybe we need to specify that the anchoring is top-left of input rather than output |
|
||
The semantics of the clean aperture property ('clap') as defined in [[!HEIF]] apply. In addition to the restrictions on transformative item property ordering specified in [[!MIAF]], the following restriction also applies: | ||
|
||
<assert>The origin of the 'clap' item property shall be anchored to 0,0 (top-left) of the input image unless the full, un-cropped image item is included as a secondary non-hidden image item.</assert> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leo, Yannis: Is it straightforward to check the second condition? It would be good to try implementing this check (at least mentally) to see how hard it is for compliant applications to enforce this requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
From HEIF:
A hidden image item has (flags & 1) equal to 1 in its ItemInfoEntry.
So that it is easy to check.
The presence of a clap
property and whether it covers the full image should be straightforward too.
However an image item can be hidden or not, and cropped or not, but these are properties tied to the item, so I do not see how to have a cropped version and a non-cropped version of the same image item. Is it why collections were mentioned at first? Can one crop a collection? I guess it could be done through iden
for example but checking if the full image is available through derived image items sounds a bit of work.
Should it be "two image items referring to the same coded image bytes"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it could be done through iden for example but checking if the full image is available through derived image items sounds a bit of work.
Should it be "two image items referring to the same coded image bytes"?
The way I envision someone doing this is per NOTE 2 in HEIF section 6.6.2.1 Identity derivation:
NOTE 2 A derived image of type 'iden' can be used, for example, when it is desirable to
have both the original version of a coded image item and a cropped version of the same
coded image item (obtained through the 'clap' transformative item property) as non-
hidden image items.
There are other ways it could be done, but this feels like the most straight forward and is relatively easy for a parser to check.
If someone uses an alternative scheme to do it, I think that should be allowed by the spec. It might not necessarily be implemented in all parsers, but that is something that can later be implemented if it turns out to be useful.
fc505b9
to
01fcade
Compare
SHA: adc56d5 Reason: push, by leo-barnes Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This closes #188
Preview | Diff