Pull Request to integrate read only HEIF support in Exiv2 using libheif #1044

Closed
wants to merge 12 commits into from

Conversation

cgilles
Collaborator

@cgilles cgilles commented Oct 16, 2019

See my progress described in bug #318

Best

Gilles Caulier

@phako
Contributor

phako commented Oct 16, 2019

Why are you including libheif in this?

@cgilles
Collaborator Author

cgilles commented Oct 16, 2019

See my comments in #318...

@codecov

codecov bot commented Oct 16, 2019

Codecov Report

Merging #1044 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1044      +/-   ##
==========================================
- Coverage   71.24%   71.24%   -0.01%     
==========================================
  Files         148      148              
  Lines       19457    19456       -1     
==========================================
- Hits        13862    13861       -1     
  Misses       5595     5595

| Impacted Files | Coverage Δ |
|---|---|
| include/exiv2/error.hpp | 90.9% <ø> (ø) ⬆️ |
| src/error.cpp | 90.24% <ø> (ø) ⬆️ |
| src/image.cpp | 84.28% <ø> (ø) ⬆️ |
| src/params.cpp | 73.55% <0%> (-0.04%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e863d2...591bbb2. Read the comment docs.

Collaborator

@piponazo piponazo left a comment

Thanks for the contribution!

In general it looks good to me, but as I mentioned before, I do not like the idea of adding the source code of an external library to our repository. The main reason is that we could no longer upgrade to a more recent version of the library simply by requiring that version.

Over the next few days I will work on a conan recipe for this project so that we can consume the library easily. I expect to come back with some feedback this Sunday at the latest.

@@ -10,6 +10,7 @@ include(CMakePackageConfigHelpers)

include_directories(${CMAKE_CURRENT_BINARY_DIR})

include(libheif/LibHeifRules.cmake)
Collaborator

I really appreciate the contribution but I am against including the libheif source code into the Exiv2 repository.

Since we have good support for handling external dependencies via conan on all platforms and compilers, I propose creating a conan package for libheif that could be consumed by Exiv2 or other projects. We already have an old version of XMP inside our repository, and we offer the option of updating to a newer version of XMP via conan.

This is going to be a rainy weekend in my city, so I offer my help to take care of that 😸

// *****************************************************************************

// included header files
#include "image_int.hpp"
Collaborator

Other image headers such as jp2image.hpp include the header image.hpp, where the class Image is defined. When opening this file in an IDE, I get a lot of errors from the clang static analyser about files not being found.

Furthermore, it looks like this file should be named heifimage.hpp, without the _int suffix, and be moved to the folder include/exiv2 so that it is deployed together with the other supported file format headers.

throw(Error(kerInvalidSettingForImage, "Image comment", "HEIF"));
} // HeifImage::setComment

void HeifImage::readMetadata()
Collaborator

[suggestion] This function has almost 200 LoC. Would it be possible to extract some blocks of code into separate private methods or functions?

CHECK_INCLUDE_FILE(unistd.h HAVE_UNISTD_H)

if(HAVE_INTTYPES_H)
add_definitions(-DHAVE_INTTYPES_H)
Collaborator

As mentioned previously, I am against adding libheif directly to our repository. Another argument against it is that this cmake file adds definitions and include directories globally to the project (instead of adding them to a specific target or to specific source files).

@@ -26,8 +26,6 @@
*/
#pragma once

#include <libheif/heif.h>
Collaborator

👍

Member

@D4N D4N left a comment

I am definitely against bundling yet another library into exiv2's code base: we already bundle the xmpsdk and an INI parser. Also, if we bundle libheif, then we'll make packaging exiv2 extremely painful, as libheif is considered a no-go for many Linux distributions (you won't find it in Fedora for legal reasons and they got a dedicated legal team...). By bundling it, packagers will have to manually remove libheif, which is not a nice thing to do from our side.

Other general comments:

  • commit f46967d is already on master, no need to include it here
  • please drop the merge commits and rebase+squash the polish commits
  • please restore the original history from @1div0's branch, so that his edits are visible
  • there are a lot of changes in po/nl.po that look completely unrelated to this PR; please consider resubmitting them in a different PR or dropping them
  • no tests at all

// Read Exif chunk.

size_t data_size = heif_image_handle_get_metadata_size(handle, dataIds[i]);
uint8_t* const data = (uint8_t*) alloca(data_size);
Member

Please never ever use alloca, not even in plain C projects.
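
For illustration, one way to avoid alloca here is to let a std::vector own the buffer, so that an oversized or attacker-controlled data_size cannot exhaust the stack. This is a minimal sketch, not the PR's code: readBlock, kMaxMetadataSize and the fill callback are hypothetical names, and the callback merely stands in for a call such as heif_image_handle_get_metadata().

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical upper bound on a single metadata block (illustration only).
constexpr std::size_t kMaxMetadataSize = 64u * 1024u * 1024u;

// Reads a metadata block of `size` bytes into heap storage.
// `fill` is an assumed stand-in for the libheif call that copies the block;
// it receives `size` writable bytes and returns true on success.
std::vector<std::uint8_t> readBlock(std::size_t size,
                                    const std::function<bool(std::uint8_t*)>& fill)
{
    if (size == 0 || size > kMaxMetadataSize) {
        throw std::length_error("implausible metadata size");
    }
    std::vector<std::uint8_t> buffer(size);  // heap allocation, freed automatically
    if (!fill(buffer.data())) {
        throw std::runtime_error("failed to read metadata block");
    }
    return buffer;
}
```

A failed allocation then surfaces as std::bad_alloc instead of undefined behaviour on the stack, and the buffer is released on every exit path.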


}

heif_image_handle_release(handle);
Member

So unless this function returns cleanly, we get a leak?
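
A common way to close this kind of leak is to wrap the C handle in a std::unique_ptr with a custom deleter, so the release runs on every exit path, including exceptions. The sketch below uses stand-in types (FakeHandle, fakeRelease) so that it compiles on its own; in the PR the corresponding pieces would presumably be heif_image_handle and heif_image_handle_release().

```cpp
#include <memory>
#include <stdexcept>

// Stand-ins for the C API (assumptions, not libheif types).
struct FakeHandle { int* releaseCount; };
inline void fakeRelease(FakeHandle* h) { ++*h->releaseCount; delete h; }

// Deleter that forwards to the C-style release function.
struct HandleDeleter {
    void operator()(FakeHandle* h) const { fakeRelease(h); }
};
using HandlePtr = std::unique_ptr<FakeHandle, HandleDeleter>;

// Simulates a readMetadata() body that may throw part-way through.
void readWithGuard(int& releaseCount, bool fail)
{
    HandlePtr handle(new FakeHandle{&releaseCount});
    if (fail) {
        throw std::runtime_error("simulated read failure");
    }
    // ... pass handle.get() to the C API here ...
}   // handle is released here on both the normal and the exceptional path
```

With such a guard the explicit release call at the end of the function is no longer needed.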


} // HeifImage::readMetadata

void HeifImage::printStructure(std::ostream& /*out*/, PrintStructureOption /*option*/, int /*depth*/)
Member

What is the point of this function besides just throwing exceptions? If that's the point, then please rename it.


} // HeifImage::writeMetadata

void HeifImage::doWriteMetadata(BasicIo& outIo)
Member

[question] Do I understand it correctly that all this function does is write the HEIF header?

#ifdef EXIV2_DEBUG_MESSAGES
std::cout << "Exiv2::HeifImage::isHeifType() = " << matched << std::endl;
#endif
if (!advance || !matched)
Member

So if advance is true and there is a match, you'd still seek back. Is that intended?

(data[2] << 8) |
data[3]) + 4;

if (data_size > (size_t)skip)
Member

Suggested change
- if (data_size > (size_t)skip)
+ if (data_size > static_cast<size_t>(skip))

int skip = ((data[0] << 24) |
(data[1] << 16) |
(data[2] << 8) |
data[3]) + 4;
Member

This addition can overflow. Also, please add a sanity check for skip, this value can be anything.
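
A possible shape for the requested check, as a sketch only: the four length bytes are assembled in an unsigned 32-bit value, and the offset is compared against the remaining bytes before 4 is added, so the addition can no longer wrap. The function name exifPayloadOffset is hypothetical and not part of the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Returns the offset of the TIFF header inside an Exif metadata block whose
// first four bytes hold a big-endian offset, as in the reviewed code.
std::size_t exifPayloadOffset(const std::uint8_t* data, std::size_t dataSize)
{
    if (data == nullptr || dataSize < 4) {
        throw std::runtime_error("Exif block too short for its offset field");
    }
    // Widen each byte to an unsigned 32-bit value before shifting.
    const std::uint32_t offset = (static_cast<std::uint32_t>(data[0]) << 24) |
                                 (static_cast<std::uint32_t>(data[1]) << 16) |
                                 (static_cast<std::uint32_t>(data[2]) << 8) |
                                 static_cast<std::uint32_t>(data[3]);
    // Compare against the remaining bytes instead of computing offset + 4 first.
    if (offset >= dataSize - 4) {
        throw std::runtime_error("Exif offset points outside the block");
    }
    return static_cast<std::size_t>(offset) + 4;
}
```

For a block that begins 00 00 00 06 followed by "Exif\0\0" and the TIFF header, the function returns 10; a value such as ff ff ff ff is rejected instead of being used as a skip.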

std::cerr << "Exiv2::HeifImage:: HEIF exif container found with size:" << data_size - skip << std::endl;
#endif

// hexdump (std::cerr, data, data_size);
Member

Suggested change
- // hexdump (std::cerr, data, data_size);

Comment on lines +215 to +257
if (
(std::string(heif_image_handle_get_metadata_type(handle, dataIds[i])) == std::string("mime")) &&
(std::string(heif_image_handle_get_metadata_content_type(handle, dataIds[i])) == std::string("application/rdf+xml"))
)
{
// Read Xmp chunk.

size_t data_size = heif_image_handle_get_metadata_size(handle, dataIds[i]);
uint8_t* const data = (uint8_t*) alloca(data_size);
err = heif_image_handle_get_metadata(handle, dataIds[i], data);

if (err.code)
{
#ifdef EXIV2_DEBUG_MESSAGES
std::cerr << "Exiv2::HeifImage::readMetadata: " << err.message << std::endl;
#endif
throw Error(kerFailedToReadImageData);
}

#ifdef EXIV2_DEBUG_MESSAGES
std::cerr << "Exiv2::HeifImage:: HEIF Xmp container found with size:" << data_size << std::endl;
#endif

xmpPacket_.assign(reinterpret_cast<char *>(data), data_size);
std::string::size_type idx = xmpPacket_.find_first_of('<');

if (idx != std::string::npos && idx > 0)
{
#ifndef SUPPRESS_WARNINGS
EXV_WARNING << "Removing " << static_cast<uint32_t>(idx)
<< " characters from the beginning of the Xmp packet" << std::endl;
#endif
xmpPacket_ = xmpPacket_.substr(idx);
}

if (xmpPacket_.size() > 0 && XmpParser::decode(xmpData_, xmpPacket_))
{
#ifndef SUPPRESS_WARNINGS
EXV_WARNING << "Failed to decode Xmp metadata." << std::endl;
#endif
}
}

Member

[suggestion] This section bears quite some similarities to the previous and the following one. I'd consider refactoring it to remove the duplication.
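
One way to start on such a refactoring is to pull the self-contained steps into small helpers that each branch can call. As an illustration, the "drop everything before the first '<'" step from the quoted block could look like the sketch below; trimXmpPreamble is a hypothetical name, and the warning would stay with the caller.

```cpp
#include <string>

// Drops any bytes that precede the first '<' of an XMP packet.
// Returns the number of characters removed so the caller can log a warning.
std::string::size_type trimXmpPreamble(std::string& packet)
{
    const std::string::size_type idx = packet.find('<');
    if (idx == std::string::npos || idx == 0) {
        return 0;  // nothing to trim, or no markup at all
    }
    packet.erase(0, idx);
    return idx;
}
```

The buffer read and error check that repeat across the blocks are candidates for the same treatment.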

@piponazo
Collaborator

First piece of work in progress to improve the libheif CMake code, so that we can generate a clean conan recipe for it: strukturag/libheif#168

@piponazo
Collaborator

The first versions of the conan package are available here:
https://bintray.com/beta/#/piponazo/piponazo/libheif:piponazo?tab=overview

Please, let me know if you want to give it a try by yourself or if you need my help to incorporate these changes.

@D4N
Member

D4N commented Dec 10, 2019

@cgilles Do you plan to work on this further?

@piponazo
Collaborator

> @cgilles Do you plan to work on this further?

We are kind of blocked here until the changes I proposed in libheif are merged (it seems that the maintainers of the project have been a bit idle over the last few weeks):
strukturag/libheif#168

Afterwards, I will need to revisit the conan package I wrote for that.

@cryptomilk
Collaborator

FYI: I've fixed libheif and 1.6.1 works correctly, however HEIF is a patent minefield.

@clanmills
Collaborator

HEIF may be a patent minefield however that doesn't mean that reading the metadata from HEIF involves patents.

I believe the file-format is ISOBMFF (ISO Base Media File Format ISO/IEC 14496-12). Reading ISOBMFF cannot involve patent infringement. Decoding the embedded data blocks could involve patents, however metadata blocks (Exif, IPTC, ICC and XMP) are also public standards which do not involve patents.

I'm unclear about the libraries being linked by Exiv2 to undertake reading the file. Providing support to use another library cannot be a patent infringement by Exiv2. The dependent library may infringe patents (for example, to decode compressed data), however providing support in Exiv2 to link that library is not a patent infringement.

@cryptomilk
Collaborator

Using libheif is fine, but none of the Linux distributions will package it and thus exiv2 won't have support for HEIF files. Which isn't that bad, as image viewers won't have support either :-)

I'm pushing for AVIF as this is much more likely the next image codec to be widely used. Even Apple joined AOMedia.

@cgilles
Collaborator Author

cgilles commented Feb 6, 2020

Linux Mageia7 : libheif is present de facto...

@D4N
Member

D4N commented Feb 6, 2020

> I'm unclear about the libraries being linked by Exiv2 to undertake reading the file. Providing support to use another library cannot be a patent infringement by Exiv2. The dependent library may infringe patents (for example, to decode compressed data), however providing support in Exiv2 to link that library is not a patent infringement.

I'd love if it were like that, but unfortunately law is not that simple :( I've been told that if you even modify your own library to link against a library that infringes patents, you're putting yourself at risk. So while libheif claims that they don't infringe patents, that hasn't been verified by a lawyer and by supporting linking against it, we risk getting removed from Linux distros that are more restrictive with respect to licensing (Fedora, Debian, openSUSE, etc.)

@clanmills
Collaborator

You're right. Exiv2 could be booted out the distros and we don't want that to happen. If we had our own code to read ISOBMFF we would avoid HEIF patent issues. It's possible we already have code for another format (eg JP2) that only needs a little tweak.

@boardhead
Collaborator

A small addition to an MP4 parser gives you HEIF support

@cryptomilk
Collaborator

A general solution would be better as it can be used with AVIF too. So if you have an MP4 parser already and could extend it, then you can support HEIF and AVIF without linking in more libraries.

@clanmills
Collaborator

When I investigated CR3 (about 2 years ago), I discovered a GitHub repo for isobmffdump by pyke369. His code is short and to the point. https://github.com/pyke369/isobmffdump

I invited him to join Team Exiv2 with the following reply: pyke369/isobmffdump#1 (comment)

675 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ wc *.cpp
      83     250    2194 dumper.cpp
     260    1006    8054 isobmffdump.cpp
      87     224    2158 tiffs.cpp
     430    1480   12406 total
676 rmills@rmillsmbp:~/gnu/github/isobmff/pyke369/isobmffdump $ 

Here it's in action on some HEIF sample files:

667 rmills@rmillsmbp:~/gnu/github/isobmff/DigiDNA/ISOBMFF/Example-Files $ ../../../pyke369/isobmffdump/isobmffdump IMG1.HEIC 
@0         | ftyp [24]
@24        | meta [3922]
@3946      | mdat [1462352]
@1466298   | end
668 rmills@rmillsmbp:~/gnu/github/isobmff/DigiDNA/ISOBMFF/Example-Files $ ../../../pyke369/isobmffdump/isobmffdump IMG2.HEIC 
@0         | ftyp [24]
@24        | meta [4350]
@4374      | mdat [848580]
@852954    | end
669 rmills@rmillsmbp:~/gnu/github/isobmff/DigiDNA/ISOBMFF/Example-Files $

I believe the Exif, XMP, ICC and IPTC data blocks are stored in the mdat box/atom.
https://forums.developer.apple.com/thread/89759
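
To show how little code a top-level listing like the one above needs, here is a minimal sketch of an ISOBMFF box walker over an in-memory buffer. It is not taken from isobmffdump or Exiv2; Box, be32 and listTopLevelBoxes are hypothetical names. It relies only on the basic box header layout: a 32-bit big-endian size, a four-character type, a size of 1 meaning that a 64-bit size follows, and a size of 0 meaning that the box runs to the end of the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Box {
    std::uint64_t offset;  // position of the box header in the buffer
    std::uint64_t size;    // total box size, header included
    std::string type;      // four-character code, e.g. "ftyp"
};

// Reads a 32-bit big-endian value.
static std::uint32_t be32(const std::uint8_t* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) | (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) | static_cast<std::uint32_t>(p[3]);
}

// Lists top-level boxes and stops at the first malformed header.
std::vector<Box> listTopLevelBoxes(const std::vector<std::uint8_t>& buf)
{
    std::vector<Box> boxes;
    std::size_t pos = 0;
    while (buf.size() - pos >= 8) {
        std::uint64_t size = be32(&buf[pos]);
        std::size_t header = 8;
        if (size == 1) {  // 64-bit size follows the type field
            if (buf.size() - pos < 16) break;
            size = (static_cast<std::uint64_t>(be32(&buf[pos + 8])) << 32) | be32(&buf[pos + 12]);
            header = 16;
        } else if (size == 0) {  // box extends to the end of the buffer
            size = buf.size() - pos;
        }
        if (size < header || size > buf.size() - pos) break;  // reject bogus sizes
        boxes.push_back({pos, size, std::string(reinterpret_cast<const char*>(&buf[pos + 4]), 4)});
        pos += static_cast<std::size_t>(size);
    }
    return boxes;
}
```

Locating the Exif block itself would additionally require reading the item tables inside the meta box, which this sketch does not attempt.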

The file IMG1.HEIC has Exif metadata (including GPS data) which I can see with Preview.app on the Mac. The GPS data is Lausanne, Switzerland. Curious, eh, @piponazo ?

I can see the Exif data with my file dump utility:

  0x61f0    25072: .....Exif..MM.*.  ->  f0 00 00 00 06 45 78 69 66 00 00 4d 4d 00 2a 00
  0x6200    25088: ................  ->  00 00 08 00 0b 01 0f 00 02 00 00 00 06 00 00 00
  0x6210    25104: ................  ->  92 01 10 00 02 00 00 00 0e 00 00 00 98 01 12 00
  0x6220    25120: ................  ->  03 00 00 00 01 00 06 00 00 01 1a 00 05 00 00 00
  0x6230    25136: ................  ->  01 00 00 00 a6 01 1b 00 05 00 00 00 01 00 00 00

So it's buried in the mdat that starts at 4374 (of length 848580).

I don't think we're far off being able to extract the Exif data and parse it in Exiv2. I'm stuck in a hotel in Sydney, Australia and it's raining hard outside (60mm today). Maybe I'll investigate this while I'm stuck.

@1div0
Collaborator

1div0 commented Feb 7, 2020

@clanmills nJoy your trip. Flown by A380?

@clanmills
Collaborator

Singapore Boeing 777 LHR->SIN->SYD. Will visit Tuan and family in Singapore on the return. Tuan was a GSoC Student with Exiv2 in 2013. We may have to isolate ourselves when we get home because of this virus.

@clanmills
Collaborator

@cgilles We will do further work on this PR and will adopt a new approach to HEIF. #1066

The Exiv2 v0.27.3 is on track to ship on 2020-06-30. Exiv2 v0.27.3 RC1 will ship on 2020-04-30 and is expected to include support for HEIF.
#1018 (comment)

@clanmills
Collaborator

Topic: Exiv2 and ISOBMFF Support
Time: Jun 14, 2020 13:00 London

Join Zoom Meeting
https://us02web.zoom.us/j/82136730279?pwd=M3hCbll4cWN3ellJd2pCZkxjVEx3Zz09

Meeting ID: 821 3673 0279
Password: 1fDNUV

@piponazo
Collaborator

piponazo commented Jun 9, 2020

Ey @clanmills , did you organise that meeting with more people? Honestly I did not follow up the discussion about the topic but I would like to hear more about it. If I am not riding my bike that day I will try to show up in the meeting 😉

@clanmills
Collaborator

Thanks @piponazo. The meeting is open to anybody who shows up. I've put the invitation on every issue involving ISOBMFF and HEIF.

I decided to cancel #1066 in response to criticism from both @D4N and @phako. The subject has not gone away. The only feedback I've had about Exiv2 v0.27.3 RCs are dozens of emails every day asking "what is the legal problem with reading ISOBMFF?".

8 participants