Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal to remove non-complient images from Jpeg test suite and introduce new validation check #1894

Closed
br3aker opened this issue Dec 14, 2021 · 6 comments · Fixed by #1899

Comments

@br3aker
Copy link
Contributor

br3aker commented Dec 14, 2021

Jpeg test suite has invalid image. It was introduced due to the index out of range exception but right now it's used to actually test jpeg decoder with reference image.

This image is invalid according to the itu spec:
image

As it has Hi parameter equal to 10:
image

Current implementation does not validate this value in any way and tries to decode spectral data with input parameters. Test suite assumes this image to be parsable just because libjpeg can parse it. Problem is that this image is small enough not to be affected by artifacts, any real world size image would produce pixel mess with invalid sampling rate. Thus this image should not be used to test decoder.

As for the solution, we can force invalid values to 1 or do % 4 but I don't think it's a good way to handle this. It's more convenient to get an exception of what is wrong (ideally with byte(s) start index) rather than random collection of pixels.

P.S.
This is not really relevant to the 'no itu spec compliance' reasoning but it also blocks my decoder optimization with ugly redundant checks.

@JimBobSquarePants
Copy link
Member

It's more convenient to get an exception of what is wrong (ideally with byte(s) start index) rather than random collection of pixels.

If libjpeg can actually load the thing; (System.Drawing doesn't seem capable - OOM, nor Windows 11, or Paint.NET) then I'm afraid that's off the table. We have to be able to decode anything libjpeg does otherwise people complain at us.

If it cannot, we should throw as you suggest. As I recall our reference was generated by our own decoder.

@antonfirsov
Copy link
Member

Note that #824 is a fuzz-test case, not a real world image. What's most important here is to make 100% sure to avoid buffer overflows and unexpected exceptions. If it really blocks a measurable optimization we can get away by throwing ImageFormatException instead of decoding mess. But again: there should be code that does necessary checks in advance and prevents IndexOutOfRangeException and buffer overflow.

As I recall our reference was generated by our own decoder.

I used ImageMagick to generate the jpeg references, since it uses libjpeg.

@JimBobSquarePants
Copy link
Member

That one I generated though.
https://github.com/SixLabors/ImageSharp/commits/master/tests/Images/Input/Jpg/issues/fuzz/Issue824-IndexOutOfRangeException-C.jpg

@br3aker
Copy link
Contributor Author

br3aker commented Dec 15, 2021

But again: there should be code that does necessary checks in advance and prevents IndexOutOfRangeException and buffer overflow.

In one of latest decoder code fixes I've checked that there's no overflow vulnerabilities in the spectral decoder code, marker sections decoder is still affected though. This whole issue is about the itu spec compliance.

I used ImageMagick to generate the jpeg references, since it uses libjpeg.

Don't they use turbo version? That might have a different way of handling things (I didn't check, just guessing).

If libjpeg can actually load the thing

Not sure about previous versions but latest code ver.9d from this site https://ijg.org/ won't decode this image as per this line:

    if (compptr->h_samp_factor<=0 || compptr->h_samp_factor>MAX_SAMP_FACTOR ||
	compptr->v_samp_factor<=0 || compptr->v_samp_factor>MAX_SAMP_FACTOR)
      ERREXIT(cinfo, JERR_BAD_SAMPLING);

MAX_SAMP_FACTOR is set to 4.

You can find it in the jdinput.c file via this func LOCAL(void) initial_setup (j_decompress_ptr cinfo) or ctrl+F MAX_SAMP_FACTOR.

Same goes for libjpeg-turbo:

https://github.com/libjpeg-turbo/libjpeg-turbo/blob/a01857cff6a2b7b3283ef1d0ecb6b01d06b597f8/jdinput.c#L69-L73

I didn't check it in practice as I don't have access to a PC with a C++ compiler right now - I will provide more info a bit later.

@antonfirsov
Copy link
Member

Don't they use turbo version? That might have a different way of handling things (I didn't check, just guessing).

They can build against different versions. I would assume Magick.NET uses libjpeg-turbo, @dlemstra can you confirm?

I'm fine throwing here, especially if latest libjpeg-turbo also throws.

@br3aker
Copy link
Contributor Author

br3aker commented Dec 16, 2021

Update: libjpeg-turbo does not decode this image with following error message: Bogus sampling factors. Marker decoding code of original libjpeg is the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants