Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readMetadata - improve warning & error handling #1405

Open
moon6969 opened this issue Nov 23, 2020 · 5 comments
Open

readMetadata - improve warning & error handling #1405

moon6969 opened this issue Nov 23, 2020 · 5 comments
Labels
refactoring Cleanup / Simplify code -> more readable / robust
Milestone

Comments

@moon6969
Copy link

moon6969 commented Nov 23, 2020

I propose readMetadata (or alternative) should:

  • Continue on error: return as much valid data as possible
  • return list of warnings: have a better framework for reporting warnings than stderr

Current implementation of ReadMetadata

(NB: I'm new to the mysteries of metadata and my C++ is quite rusty)
readMetadata is implemented per-image format (eg: in jpgimage.cpp, pngimage.cpp, etc.)
In general readMetadata will loop through the image metadata sections and call decoders based on the 3 main metadata types:
ExifParser::decode
IptcParser::decode
XmpParser::decode

Currently it's inconsistent - if errors occur during while reading metadata, the data might be

  • silently lost
  • lost with a stderr warning (#ifndef SUPPRESS_WARNINGS)
  • an exception raised

Once there is an error, often the other valid metadata in that image is not returned.

Continue on error requirement

  • Needs to be handled at multiple "stages" of the reading process:
    • Looping the metadata blobs - eg: failure to read XMP data should not prevent return of Exif data.
    • Looping the groups within a blob - eg: IPTC records, XMP namespaces
    • Looping the individual tags within a group
  • Should be consistent across all image formats and metadata types
  • if the error prevents further reading of that stage, at least return the valid data already gathered

Warnings requirement

  • part of the API rather than build time (#ifndef SUPPRESS_WARNINGS)
  • Not strings to stderr - perhaps an array of warning objects?
  • Warnings should indicate the stage/level etc. so calling app can figure out what was skipped or not
    EG: An invalid xmp namespace warning implies that namespace is skipped, but other XMP namespaces are OK (unless they also have a warning!)

Reason for proposal

I am writing a tool to re-format some collections of images while preserving as much metadata as possible.
I believe following specifications accurately is important, so want to ensure any images I save are compliant, but I was also inspired by this note from https://dev.exiv2.org/issues/1154#note-4 ...
My reasoning for being tolerant of data errors is that exiv2 is not a police force with the duty to enforce standards. Exiv2 exists to help users get at their metadata, whether it complies with the specs or not.

@clanmills
Copy link
Collaborator

clanmills commented Nov 23, 2020

@moon6969 I admire your efforts to contribute to Exiv2. readMetadata() is the "beating heart" of Exiv2. I don't know that this is a suitable place to apply your "rusty" C++ skills.

None-the-less, your proposal is basically sound.

There is another significant matter related to your proposal. It's been a long standing ambition of the project to have a "unified metadata container". Currently we have three containers for Exif, XMP and IPTC. There is an implementation of the "unified metadata container" in the unstable branch on SVN. If we're going to perform brain surgery on readMetadata(), I'd like to see the "unified metadata container" included in that effort.

Another subject is who is going to undertake this effort? The team of active contributors is small. While a lot of work on 'master' has been done by Dan, Luis and Rosen, I'm not sure there is a plan for what will (and what will not) be in Exiv2 v0.28 when it eventually ships.

I maintain the 0.27-maintenance branch and will not be implementing your proposal in that branch.

I want to see Exiv2 survive after I retire in January 2021. I will finish my book "Image Metadata and Exiv2 Architecture" by the end of 2020. The purpose of the book is to document everything I know about both Metadata and Exiv2 and I hope the effort will be rewarded by seeing proposals such as yours come to life as Exiv2 thrives in years to come.

@moon6969
Copy link
Author

I don't know that this is the best of place to apply your "rusty" C++ skills.

Totally agree! I don't have the skill-set or indeed time for such an undertaking, but can contribute to the discussion from my viewpoint as an API consumer.

@clanmills
Copy link
Collaborator

Recruiting contributors to an open-source project is not easy. Many folks make suggestions about what I should do for them. Recruiting Engineers to work on the code, is more difficult.

dt941217dhc0

I discuss contributors in the book: https://clanmills.com/exiv2/book/#11-17

I want to acknowledge the hard work and effort by Andreas, Luis, Dan, Ben, Brad and Neils who have done a lot of great work on Exiv2.

@clanmills
Copy link
Collaborator

@moon6969 I hope I haven't insulted you with the Dilbert Cartoon. Good ideas are ... good ideas. Implementation is something else. The challenge of insufficient engineering resources is immense.

When I joined Adobe, their tag line was "Inspiration becomes Reality". In my mind, I changed it to "Inspiration becomes Perspiration!".

@clanmills clanmills added this to the v1.00 milestone Apr 12, 2021
@clanmills
Copy link
Collaborator

I'm going to assign this to v1.00. One of the top priority matters for v1.00 is the Unified Metadata Container. It has been studied: #1505

Although I'm assigning this to Milestone: v1.00 as commitment to give this thought. We'll have to see what time contributors have available to actually work on this matter.

@clanmills clanmills added the refactoring Cleanup / Simplify code -> more readable / robust label Apr 13, 2021
@kevinbackhouse kevinbackhouse modified the milestones: v0.28.0, Backlog Nov 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactoring Cleanup / Simplify code -> more readable / robust
Projects
None yet
Development

No branches or pull requests

3 participants