Automate compression method detection #1616

pleprince · 2024-01-18T18:22:15Z

This change lets cmake generate ImfCompression.h and ImfCompressor.cpp based on metadata found in each compressor's header (i.e. ImfRleCompression.h). One header can declare multiple compressors.

Additional metadata has been added to all compression headers and new compression methods should be able to leverage this mecanism easily.
The binaries have been updated to leverage the new system. For example, new compression methods are automatically recognized by exrmaketiled and displayed in the command-line help.

This is a pure cmake implementation (no additional dependency).

pleprince · 2024-01-18T19:37:48Z

I left this PR as draft, as people tend to have strong views about building stuff.

peterhillman · 2024-01-18T21:06:31Z

That's really neat! Those functions that detail the compression types are going to be really useful. The automatic generation certainly simplifies adding a compression type, but that doesn't happen very frequently.
When Cmake parses the compression IDs, it detects duplicate IDs but should it also detect gaps? E.g. if there's no compression type 12 but there is a 13? I think user code assumes that the compression codes are contiguous (for example when populating menus of compression types) and some of the tests in the library do the same. Also NUM_COMPRESSION_METHODS would be wrong/misleading if there was a break. If they are contiguous, IdToDesc can be a vector<CompressionDesc> instead of a map.

I wonder whether it might be worth moving CompressionDesc, IdToDesc and CompressionNameToId to ImfCompressor.cpp, i.e. make them non-exported private symbols rather than part of the public API? The implementation of the new methods would also become private in ImfCompressor.cpp, rather than inline in ImfCompression.h. That would allow for changes to the CompressionDesc structure etc without breaking the ABI. (For example if GPU-compatible codecs are added in the future it would be good to identify which ones are supported)

pleprince · 2024-01-19T11:33:07Z

Thanks for all the suggestions @peterhillman !
I added sanity checking for gaps in the id list: it will make the build fail.
Next, I will look into your second suggestion.

I also noticed two CI problems:

The SNYK step is failing. No idea how to resolve that.
The bazel build is failing, which suggest I will have to implement the same mecanism in bazel ? Maybe @Vertexwahn can advise ?

See CMakeLists.txt for details. Signed-off-by: Philippe Leprince <[email protected]>

Signed-off-by: Philippe Leprince <[email protected]>

@peterhillman

suggested by @peterhillman. Signed-off-by: Philippe Leprince <[email protected]>

Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-01-19T12:27:12Z

@peterhillman Shouldn't I also generate src/lib/OpenEXR/ImfCRgbaFile.h ? I am not familiar with it but it looks like a subset of the c++ implementation...

Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-01-19T14:28:47Z

If they are contiguous, IdToDesc can be a vector<CompressionDesc> instead of a map.

Not really: we are using the map to quickly query a compression method's specs in getCompressionNameFromId(), getCompressionDescriptionFromId(), isValidCompressionId(), getCompressionNamesString(), getCompressionNumScanlines(), isLossyCompressionId() and isDeepCompressionId().

peterhillman · 2024-01-20T22:39:18Z

@pleprince yes, you did need to generate ImfCRgbaFile too, since that's a pure C wrapper so can't include the C++ files. Speaking of C, the OpenEXRCore implementation would still have hard-coded compression types. If/when the C++ library is reimplemented on top of OpenEXRCore, then all this automatic compression method stuff would disappear, unless it was reimplemented in OpenEXRCore. Doing a similar thing with CMake parsing files also seems possible for OpenEXRCore, but would be easier if there was some refactoring to centralize all compression logic into a single file, to limit the amount of files that CMake has to mess with at configure time. Maybe @kdt3rd has thoughts about that?

With IdToDesc, I understood that it's a map indexed by an enum. I was thinking you could make a vector of CompressionDescs, and index it by casting Compression to an unsigned int n, checking n is less than NUM_COMPRESSION_METHODS, then reading IdToDesc[n]. In fact, IdToDesc can be a plain const array, rather than a stl vector, which should be even less overhead.

And regarding bazel and other build systems, it isn't necessary that the files are generated every time CMake is run, since they should be the same for everyone who is building the library. Is there a way to arrange it so library developers can manually run the file generation step using CMake whenever they modify the compression code, and commit the resultant files into the repository? Then only a CMake implementation would be required, no matter what build system is used elsewhere.

Vertexwahn · 2024-01-21T23:13:59Z

@pleprince

I wonder if the same thing could not be achieved with pure C++ template or macro magic? For instance Mitsuba has a nice MI_EXPORT_PLUGIN macro. We could have some kind of EXR_REGISTER_COMPRESSION macro. I would really prefer a solution that is build system independent. Look for instance at OpenEXRCore/openexr_version.h. In the old days CMake defined the version and generated a version.h file. Now it`s the other way around - we define the version in the C++ files and CMake has to read it.
That makes it easier to deal with different build systems (think about make, Autotools, SCons, Premake, Ninja, Meson, FASTbuild, Buck2, ...)
Reading the configuration from C++ comments via CMake to generate a C++ header feels awkward. I advice you to rethink this and not accept the change in the current form.

If you see code generation as the only option I would go with Python (or C++) (there are IDEs with nice debugging & refactoring support for Python - this is not the case for CMake). We could then run the Python script to update the compressors. The Python script could also be called from a CMake build or a Bazel build, e.g.:

py_binary(
    name = "my_code_generator",
    srcs = ["my_code_gen.py"],
)

Regarding the Bazel integration - would go with Python here:

py_binary(
    name = "generate_ImfCompression_h",
    srcs = ["generate_ImfCompression_h.py"],
)

genrule(
    name = "generate_ImfCompression_h_gen",
    outs = ["ImfCompression.h"],
    cmd = "./$(location //:generate_ImfCompression_h) > \"$@\"",
    tools = ["//:generate_ImfCompression_h"],
)

cc_binary(
    name = "OpenEXRDemo",
    srcs = [
    	...
    	"ImfCompression.h",
    ],
)

generate_ImfCompression_h.py:

#!/usr/bin/env python3

print("print text that should go into ImfCompression.h")

Another option would be to use "pure" Bazel (Starlark) or to include a header that was pre-genreated with CMake.

cary-ilm · 2024-01-22T00:35:32Z

Looks like this uses cmake_path which was added in cmake 3.20, but OpenEXR currently requires only v3.12, hence the build failure.

Could you possibly provide an alternate implementation?

pleprince · 2024-01-22T08:29:43Z

@Vertexwahn
I read a bit about bazel over the weekend and it seems, by design, incapable of doing the same thing.
I will certainly NOT use cpp to solve this: having had to maintain large macro-based setups, I know how painfully confusing and hard to maintain it can get.
@peterhillman' suggestion to create a run-once setup seems more promising.

Note that I had originally written a python script to do this, but realized it was introducing a new build-time dependency.

Signed-off-by: Philippe Leprince <[email protected]>

@cary-ilm

… by @cary-ilm. Signed-off-by: Philippe Leprince <[email protected]>

All files are generated during configuration, making them available for bazel. See comments at the top of src/lib/OpenEXR/CMakeLists.txt. Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-01-22T14:46:56Z

I should have addressed all comments now. I will now write some tests for sanity checking.

pleprince · 2024-01-22T18:07:52Z

I am a bit confused... How do you make a test fail ?
If I throw an exception, I get:
3/16 Test #3: OpenEXR.testCompressionApi .......Subprocess aborted***Exception: 0.11 sec
... but all the tests have some sort of exception handling or don't pass a non-0 return code. That basically means that tests ALWAYS PASS if they don't crash.
Thanks for your help.

cary-ilm · 2024-01-22T18:10:14Z

Typically tests use assert.

pleprince · 2024-01-22T21:17:19Z

It can not work in release mode: assert() is a macro that evaluates to nothing when NDEBUG is defined.
Are you running tests in debug mode ?

cary-ilm · 2024-01-22T21:42:53Z

Test files all do this:

#ifdef NDEBUG
#    undef NDEBUG
#endif

src/lib/OpenEXR/ImfCompression.h.in

src/lib/OpenEXR/ImfCRgbaFile.h

@peterhillman

spotted by @peterhillman Signed-off-by: Philippe Leprince <[email protected]>

@peterhillman

suggested by @peterhillman. Signed-off-by: Philippe Leprince <[email protected]>

Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-01-23T13:36:17Z

There are a number of other tests that could use the compression API function to enumerate appropriate compression methods:

src/test/OpenEXRCoreTest/deep.cpp
src/test/OpenEXRTest/testCopyDeepScanLine.cpp
src/test/OpenEXRTest/testCopyDeepTiled.cpp
src/test/OpenEXRTest/testDeepScanLineBasic.cpp
src/test/OpenEXRTest/testDeepScanLineHuge.cpp
src/test/OpenEXRTest/testDeepTiledBasic.cpp

I also left aside src/wrappers/python/Imath.py, although it enumerates the compression methods and could be generated too.

What do you reckon ?

@peterhillman

Suggested by @peterhillman. Signed-off-by: Philippe Leprince <[email protected]>

Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-01-23T16:33:49Z

If the CI passes, I will remove the draft status of this PR.
@Vertexwahn I'll let you improve the bazel build, if you want to. I tried to add a genrule locally to run the cmake step, but no luck with that. If anything bazel makes me appreciate cmake... 😆

peterhillman · 2024-01-23T21:04:34Z

An alternative to generating Imath.py could be adding a test that makes sure that the python Compression object is consistent with the C++ types. That way, Imath.py stays freely editable, but a new compression type cannot be added without manually adding it to Imath.py too.

The OpenEXRCoreTest is tricky because the Core library itself doesn't have access to the compression API. Again, a test could be generated to confirm the C++ and Core APIs are consistent.

Personally I'd be a little wary of changing existing tests: although it's easy to confirm that test still passes after the change, it's harder to confirm it still correctly detects all the same issues the original test did. Perhaps it would be better to update those when new compression types are added.

cary-ilm · 2024-01-23T23:29:58Z

This implementation of Imath.py needs to change significantly, so I wouldn't bother with it now. It's a stub substituting for the proper Imath module, but it also contains stuff it shouldn't. Compression types shouldn't appear in Imath. This is part of the ongoing adoption of the openexr python bindings, still in progress.

pleprince · 2024-01-24T14:53:33Z

Thanks for the feedback @peterhillman and @cary-ilm .
I will leave things as they are and hit "Ready for review".

…n-method-detection

pleprince · 2024-01-25T13:43:13Z

I updated the PR with the latest in main, to make sure all tests and workflows pass. Cheers.

pleprince · 2024-02-14T09:06:05Z

Hi @peterhillman & @cary-ilm
Is this still under consideration for merging ? Is there anything more I should do ?
Cheers

cary-ilm · 2024-02-14T19:05:44Z

@pleprince, we discussed this briefly in the last steering committee meeting. We're concerned that is a fair bit of overhead and additional infrastructure with potential future maintenance burden, all to facilitate something that happens very rarely, since we don't expect to add compression types to the format frequently. It's true that we're contemplating a couple instances of that now, your zstd contribution in particular, but our sense is that we'd actually prefer the more straightforward brute force approach. I appreciate all the careful thought you put into the design. @peterhillman, can you make any further recommendations?

peterhillman · 2024-02-14T20:03:10Z

I think API extensions are definitely worthwhile, providing methods for accessing names for each compression type and whether they are lossy or lossless etc, and we should keep those. The test you added makes it easy to spot a mistake in manually updating files whenever there's a new compression type.
So, maybe remove the automatic generation from the cmake file, and the .in files it reads, but basically leave the other files as you have them?

pleprince · 2024-02-15T09:55:38Z

Fair enough: I will do that. Thanks :-)

@cary-ilm

Thanks to @cary-ilm and @peterhillman. Signed-off-by: Philippe Leprince <[email protected]> Fixed: I missed a couple of lines in the CI setup. Signed-off-by: Philippe Leprince <[email protected]> fix whitespace to minimize diff Signed-off-by: Philippe Leprince <[email protected]> fix whitespace to minimize diff Signed-off-by: Philippe Leprince <[email protected]> fix whitespace to minimize diff Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-02-20T13:04:32Z

@cary-ilm , @peterhillman, please review. Cheers

peterhillman

Looks good, just a couple of queries

src/lib/OpenEXR/ImfCompressor.cpp

src/lib/OpenEXR/ImfCompression.h

Signed-off-by: Philippe Leprince <[email protected]>

…rom ImfCompressor to ImfCompression. - simplify logic in isLossyCompression and isValidDeepCompression. - additional clarifications in comments / doc strings. Signed-off-by: Philippe Leprince <[email protected]>

pleprince · 2024-02-28T09:01:42Z

If / when this gets merged, I will update the Zstd PR. Cheers

cary-ilm · 2024-02-28T17:25:37Z

This looks good, I'll look it over in more detail in a bit but we'll merge it shortly, for the next minor release. Thanks!

It would be good to include a comment, maybe at the top of ImfCompression.h near the declaration of the enum, describing the various parts that need editing when adding a compression type.

cary-ilm · 2024-03-05T23:18:05Z

I'm going to merge this as is, we can follow up later with further changes as needed.

Stating the obvious, but this covers only the C++ API, not OpenEXRCore, which should be kept in sync.

pleprince added 4 commits January 19, 2024 12:37

cmake generates relevant files to integrate all compression methods.

851f42d

See CMakeLists.txt for details. Signed-off-by: Philippe Leprince <[email protected]>

update utilities

6ea5896

Signed-off-by: Philippe Leprince <[email protected]>

fix gitignore

616c443

Signed-off-by: Philippe Leprince <[email protected]>

sanity check for gaps in enum ids.

175efde

suggested by @peterhillman. Signed-off-by: Philippe Leprince <[email protected]>

pleprince force-pushed the Automate-compression-method-detection branch from 5e93f23 to 175efde Compare January 19, 2024 11:38

Use isValidCompressionId() instead of hard-coded test.

28e6d7a

Signed-off-by: Philippe Leprince <[email protected]>

generate ImfCRgbaFile.h

47dff96

Signed-off-by: Philippe Leprince <[email protected]>

pleprince added 5 commits January 22, 2024 11:24

move implementation to ImfCompression.cpp to hide CompressionDesc.

5e59339

Signed-off-by: Philippe Leprince <[email protected]>

convert IdToDesc to a static array.

a2bae86

Signed-off-by: Philippe Leprince <[email protected]>

ImfCompressor.h was not correctly removed from the list.

2757184

Signed-off-by: Philippe Leprince <[email protected]>

replace cmake_path() with string() if cmake version < 3.20.0. Flagged…

35b4661

… by @cary-ilm. Signed-off-by: Philippe Leprince <[email protected]>

Only generate the files with option UPDATE_COMPRESSOR_LIST=ON.

4e5c205

All files are generated during configuration, making them available for bazel. See comments at the top of src/lib/OpenEXR/CMakeLists.txt. Signed-off-by: Philippe Leprince <[email protected]>

peterhillman reviewed Jan 23, 2024

View reviewed changes

src/lib/OpenEXR/ImfCompression.h.in Outdated Show resolved Hide resolved

peterhillman reviewed Jan 23, 2024

View reviewed changes

src/lib/OpenEXR/ImfCRgbaFile.h Outdated Show resolved Hide resolved

pleprince added 2 commits January 23, 2024 13:16

Fix incorrect defines.

43eed95

spotted by @peterhillman Signed-off-by: Philippe Leprince <[email protected]>

Improve DO NOT EDIT messages.

ec4791f

suggested by @peterhillman. Signed-off-by: Philippe Leprince <[email protected]>

Add new testCompressionApi test.

85a86c9

Signed-off-by: Philippe Leprince <[email protected]>

pleprince added 2 commits January 23, 2024 14:45

Enable UPDATE_COMPRESSOR_LIST in ci_workflow.yml.

4b37dbf

Suggested by @peterhillman. Signed-off-by: Philippe Leprince <[email protected]>

Add missing ImfCompression.cpp to BUILD.bazel

9e1eb81

Signed-off-by: Philippe Leprince <[email protected]>

pleprince marked this pull request as ready for review January 24, 2024 14:53

Merge remote-tracking branch 'upstream/main' into Automate-compressio…

b15ef90

…n-method-detection

peterhillman mentioned this pull request Feb 16, 2024

CompressionAttribute::readValueFrom() should use isValidCompression()? #1641

Closed

pleprince force-pushed the Automate-compression-method-detection branch from d961e74 to bd763ef Compare February 20, 2024 13:01

pleprince requested a review from peterhillman February 20, 2024 13:03

peterhillman approved these changes Feb 26, 2024

View reviewed changes

src/lib/OpenEXR/ImfCompressor.cpp Outdated Show resolved Hide resolved

src/lib/OpenEXR/ImfCompression.h Show resolved Hide resolved

pleprince added 2 commits February 26, 2024 17:11

Restore comments describing the codecs.

87fd8f6

Signed-off-by: Philippe Leprince <[email protected]>

pleprince requested a review from peterhillman February 28, 2024 09:00

cary-ilm merged commit 60898ad into AcademySoftwareFoundation:main Mar 5, 2024
27 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Automate compression method detection #1616

Automate compression method detection #1616

pleprince commented Jan 18, 2024

pleprince commented Jan 18, 2024

peterhillman commented Jan 18, 2024

pleprince commented Jan 19, 2024

pleprince commented Jan 19, 2024

pleprince commented Jan 19, 2024

peterhillman commented Jan 20, 2024

Vertexwahn commented Jan 21, 2024

cary-ilm commented Jan 22, 2024

pleprince commented Jan 22, 2024 •

edited

Loading

pleprince commented Jan 22, 2024

pleprince commented Jan 22, 2024

cary-ilm commented Jan 22, 2024

pleprince commented Jan 22, 2024

cary-ilm commented Jan 22, 2024

pleprince commented Jan 23, 2024

pleprince commented Jan 23, 2024

peterhillman commented Jan 23, 2024

cary-ilm commented Jan 23, 2024

pleprince commented Jan 24, 2024

pleprince commented Jan 25, 2024

pleprince commented Feb 14, 2024

cary-ilm commented Feb 14, 2024

peterhillman commented Feb 14, 2024

pleprince commented Feb 15, 2024

pleprince commented Feb 20, 2024

peterhillman left a comment

pleprince commented Feb 28, 2024 •

edited

Loading

cary-ilm commented Feb 28, 2024

cary-ilm commented Mar 5, 2024

Automate compression method detection #1616

Automate compression method detection #1616

Conversation

pleprince commented Jan 18, 2024

pleprince commented Jan 18, 2024

peterhillman commented Jan 18, 2024

pleprince commented Jan 19, 2024

pleprince commented Jan 19, 2024

pleprince commented Jan 19, 2024

peterhillman commented Jan 20, 2024

Vertexwahn commented Jan 21, 2024

cary-ilm commented Jan 22, 2024

pleprince commented Jan 22, 2024 • edited Loading

pleprince commented Jan 22, 2024

pleprince commented Jan 22, 2024

cary-ilm commented Jan 22, 2024

pleprince commented Jan 22, 2024

cary-ilm commented Jan 22, 2024

pleprince commented Jan 23, 2024

pleprince commented Jan 23, 2024

peterhillman commented Jan 23, 2024

cary-ilm commented Jan 23, 2024

pleprince commented Jan 24, 2024

pleprince commented Jan 25, 2024

pleprince commented Feb 14, 2024

cary-ilm commented Feb 14, 2024

peterhillman commented Feb 14, 2024

pleprince commented Feb 15, 2024

pleprince commented Feb 20, 2024

peterhillman left a comment

Choose a reason for hiding this comment

pleprince commented Feb 28, 2024 • edited Loading

cary-ilm commented Feb 28, 2024

cary-ilm commented Mar 5, 2024

pleprince commented Jan 22, 2024 •

edited

Loading

pleprince commented Feb 28, 2024 •

edited

Loading