Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem setting repeatable/non-repeatable Iptc tags #1395

Closed
postscript-dev opened this issue Nov 17, 2020 · 10 comments
Closed

Problem setting repeatable/non-repeatable Iptc tags #1395

postscript-dev opened this issue Nov 17, 2020 · 10 comments
Assignees
Labels
support Anything related to a user needing help with exiv2
Milestone

Comments

@postscript-dev
Copy link
Collaborator

Describe the bug
Iptc provides both repeatable and non-repeatable tags to record values and this presents a problem when trying to make a general method for updating a list of tags.

Programmers using exiv2 may be unaware of this issue and some data could be silently lost.

To Reproduce
This is observed in the master branch but expected in the maintenance one also.

  • Repeatable tag:
// No Iptc data set
iptcData["Iptc.Application2.Keywords"] = "Keyword 1";
iptcData["Iptc.Application2.Keywords"] = "Keyword 2";     // Overrides "Keyword 1" value

This will set Iptc.Application2.Keywords = "Keyword 2" (expected "Keyword 1" and "Keyword 2" to be set?) and treats the repeatable tag as single tag. If Iptc.Application2.Keywords already has multiple entries, then the first one is overwritten with "Keyword 2" while the rest remain untouched. To create the correct effect instead you would need

// No Iptc data set
Exiv2::IptcKey iptc_key("Iptc.Application2.Keywords");
auto iptc_value = Exiv2::Value::create(Exiv2::string);

iptc_value->read("Keyword 1");
iptcData.add(iptc_key, iptc_value.get());
iptc_value->read("Keyword 2");
iptcData.add(iptc_key, iptc_value.get());
  • Single Tag:
// No Iptc data set
iptcData["Iptc.Application2.EditStatus"] = "Status 1";
iptcData["Iptc.Application2.EditStatus"] = "Status 2";

will set Iptc.Application2.EditStatus = "Status 2". This is correct. However when using add()

// No Iptc data set
Exiv2::IptcKey iptc_key("Iptc.Application2.EditStatus");
auto iptc_value = Exiv2::Value::create(Exiv2::string);

iptc_value->read("Status 1");
iptcData.add(iptc_key, iptc_value.get());
iptc_value->read("Status 2");
iptcData.add(iptc_key, iptc_value.get());        // Fail, returns non-zero

This will result in Iptc.Application2.EditStatus = "Status 1" as you can only add one value to an unset single tag.

This means that you cannot use = or add() to update all tags as both fail in their own circumstance.

Expected behavior

  • Both = and add() methods to be used to update Iptc values irrespect of whether a multiple or single tag.
  • If the current behaviour is already correct, then a change in the documentation and examples to reflect that.

Desktop (please complete the following information):

  • OS: Windows (expected to be all OS)
  • Compiler & Version: gcc.exe (Rev5, Built by MSYS2 project) 10.2.0

Additional context

@clanmills
Copy link
Collaborator

clanmills commented Nov 17, 2020

This is interesting. The tag definition is src/datasets.cpp has a "repeatable" field.

    DataSet::DataSet(
        uint16_t number,
        const char* name,
        const char* title,
        const char* desc,
        bool mandatory,
        bool repeatable,
        uint32_t minbytes,
        uint32_t maxbytes,
        TypeId type,
        uint16_t recordId,
        const char* photoshop
    )
        : number_(number), name_(name), title_(title), desc_(desc),
          mandatory_(mandatory), repeatable_(repeatable), minbytes_(minbytes),
          maxbytes_(maxbytes), type_(type), recordId_(recordId),
          photoshop_(photoshop)
    {
    }

It's documented on the website under R. https://exiv2.org/iptc.html

screenshot_33

Can we summarise the desired behaviour to be:

  1. Assignment will assign a single value (whether it is repeatable or not)
    iptcData["Iptc.Application2.EditStatus"] = "Status 1";

  2. add() will assign if the key is not in use and add an additional value if repeatable is set.

@clanmills
Copy link
Collaborator

clanmills commented Nov 17, 2020

The API for DataSet is very limited:

632 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ finder "*.hpp" | xargs grep DataSet
./include/exiv2/iptc.hpp:                        uint16_t record = IptcDataSets::application2);
./include/exiv2/iptc.hpp:                              uint16_t record = IptcDataSets::application2) const;
./include/exiv2/datasets.hpp:    struct EXIV2API DataSet {
./include/exiv2/datasets.hpp:        DataSet(
./include/exiv2/datasets.hpp:    }; // struct DataSet
./include/exiv2/datasets.hpp:    class EXIV2API IptcDataSets {
./include/exiv2/datasets.hpp:        IptcDataSets() {}
./include/exiv2/datasets.hpp:        IptcDataSets(const IptcDataSets& rhs);
./include/exiv2/datasets.hpp:        IptcDataSets& operator=(const IptcDataSets& rhs);
./include/exiv2/datasets.hpp:        static const DataSet* envelopeRecordList();
./include/exiv2/datasets.hpp:        static const DataSet* application2RecordList();
./include/exiv2/datasets.hpp:        static const DataSet* records_[];
./include/exiv2/datasets.hpp:    }; // class IptcDataSets
./include/exiv2/datasets.hpp:    EXIV2API std::ostream& operator<<(std::ostream& os, const DataSet& dataSet);
633 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance $ 

To know if a key is repeatable, you have to find the data set and examine repeatable_. Here are the relevant APIs:

bool IptcDataSets::dataSetRepeatable(uint16_t number, uint16_t recordId)
uint16_t IptcDataSets::recordId(const std::string& recordName);
uint16_t record = IptcDataSets::application2;
const DataSet* IptcDataSets::envelopeRecordList()
const DataSet* IptcDataSets::application2RecordList()

@postscript-dev
Copy link
Collaborator Author

postscript-dev commented Nov 27, 2020

I'm sorry for not replying @clanmills, I'm still new to metadata and thought that you wouldn't need my input. I had few thoughts at the time and have looked into this more tonight after your message.

  • I think that the use of operator=() and add() should change and probably work in a similar way to setting values for an Xmp structure. This will help to make a consistant interface althought there are still differences between Xmp's and Iptc's handling of multiple values - namely Xmp structure Vs duplicate Iptc tag entries. Grouping the same Iptc repeatable tags together under a single 'Iptc structure' would be one possible solution. From the user's point of view this would simplify things and apart from backwards compatibility, I can't think of a reason to keep the duplicate Iptc tag entries.

  • I agree with you that it would be useful to have a simple way of finding out if an Iptc tag can be repeatable.

  • Regarding a general use of add(), I found something troubling in the documentation. ExifData::add() states that "No duplicate checks are performed". I am unsure about the use of duplicates in Exif metadata so don't know if this is acceptable or not. Also ExifData:add() has a return type of void but the equivalent XmpData::add() and IptcData::add() return an int error code. Lastly, Exiv2::ExifData in the documentation only contains a reference to one of the add() functions. Xmp and Iptc equivalents contain two, maybe Exif should also?

  • I have just experimented using ExifData::add() and XmpData::add() and I managed to add 2 identically named tags to the respective metadata. There isn't any checking for either function and this is something that I think will need to be addressed - the XmpData::add() documentation doesn't mention duplicates . I also tried adding 2 non-repeatable tags to IptcData::add() and this did check and prevent duplicates - the IptcData::add() documentation does explain this.

This is a lot of infomation in one go and I hope it is of some use. I'll have another look at this more tommorrow.

@clanmills clanmills removed this from the v0.27.4 milestone Nov 28, 2020
@clanmills
Copy link
Collaborator

Paul @postscript-dev Thank You very much for this very insightful analysis. I didn't write Exiv2, although I've been working on it since 2008. I've never thought about the metadata add() functions because nobody has ever brought up this topic. The usual way to modify metadata is something like this (from samples/addmoddel.cpp)

    // This is the quickest way to add (simple) Exif data. If a metadatum for
    // a given key already exists, its value is overwritten. Otherwise a new
    // tag is added.
    exifData["Exif.Image.Model"] = "Test 1";                     // AsciiValue
    exifData["Exif.Image.SamplesPerPixel"] = uint16_t(162);      // UShortValue
    exifData["Exif.Image.XResolution"] = -2;            // LongValue
    exifData["Exif.Image.YResolution"] = Exiv2::Rational(-2, 3); // RationalValue

As you have observed, add() is a weapon that must be treated with caution.

Here's the project status in the RoadMap: #1018 (comment) I've removed the 0.27.4 milestone from this issue as I think this is going to be a considerable undertaking for the 'master' branch and not something for a "dot" release.

@postscript-dev
Copy link
Collaborator Author

To know if a key is repeatable, you have to find the data set and examine repeatable_.

I didn't know about bool IptcDataSets::dataSetRepeatable(uint16_t number, uint16_t recordId) as I haven't used that area of the code before.

I've removed the 0.27.4 milestone from this issue as I think this is going to be a considerable undertaking for the 'master' branch and not something for a "dot" release.

If grouping the repeatable tags into a structure is a good idea, then it would involve a much bigger change.

Can we summarise the desired behaviour to be:

  1. Assignment will assign a single value (whether it is repeatable or not)
    iptcData["Iptc.Application2.EditStatus"] = "Status 1";
  2. add() will assign if the key is not in use and add an additional value if repeatable is set.

Maybe I have misunderstood, but I think this is what currently happens.

I mentioned several things last night and some are still possible to implement.

  • The checks for duplicates in ExifData::add() and XmpData::add() could still be included as this change prevents the user from causing problems.
  • I don't think that changing the return value for ExifData::add() to an int (to provide an error code) will cause problems.
  • Adding references to the missing second ExifData::add() function at the bottom of the ExifData Member Function Documentation is a small cosmetic change.

@clanmills
Copy link
Collaborator

You're right. We could return an error code from the add() functions:

989 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/build $ nm -g --demangle lib/libexiv2.dylib | grep Exiv2::.*Data::add
0000000000147530 T Exiv2::XmpData::add(Exiv2::XmpKey const&, Exiv2::Value const*)
00000000001474e0 T Exiv2::XmpData::add(Exiv2::Xmpdatum const&)
000000000005ba00 T Exiv2::ExifData::add(Exiv2::ExifKey const&, Exiv2::Value const*)
000000000005b9a0 T Exiv2::ExifData::add(Exiv2::Exifdatum const&)
0000000000097a60 T Exiv2::IptcData::add(Exiv2::IptcKey const&, Exiv2::Value*)
0000000000097750 T Exiv2::IptcData::add(Exiv2::Iptcdatum const&)
990 rmills@rmillsmbp:~/gnu/github/exiv2/0.27-maintenance/build $ 

I'll look again at this next week and give you an update. Thank You for your thoughtful insights into these matters.

@clanmills clanmills added support Anything related to a user needing help with exiv2 and removed bug labels Nov 29, 2020
@clanmills clanmills added this to the v0.27.4 milestone Nov 29, 2020
@clanmills
Copy link
Collaborator

I will investigate this matter this week as I think there are useful discoveries to made that I can document in my book. So, I've assigned this the milestone v0.27.4 to keep the issue visible in my book-mark.

However, all I will do is to investigate and document this issue. I will not be making changes to Exiv2 as a consequence of my investigation.

At a team meeting in May 2018, it was agreed that the team would I would release v0.27 and a series of "dot" releases while the others would modernise the code to C++17. I've done as agreed. I want to finish the book by the end of 2020 and retire on my 70th birthday on 2021-01-18.

@postscript-dev
Copy link
Collaborator Author

Can we summarise the desired behaviour to be:

  1. Assignment will assign a single value (whether it is repeatable or not)
    iptcData["Iptc.Application2.EditStatus"] = "Status 1";
  2. add() will assign if the key is not in use and add an additional value if repeatable is set.

Maybe I have misunderstood, but I think this is what currently happens.

I was thinking about this in the morning and realised that I had been using the TypeId to determin if a tag was multiple - which doesn't work for Iptc tags. This is why I couldn't work out which method to use in the FP. Now I know about bool IptcDataSets::dataSetRepeatable(uint16_t number, uint16_t recordId) the FP is redundant and your summary above does still work. Thank you for your patience @clanmills you have been very helpful. I hope the issues that were found along the way make this not entirely wasted time and sorry for being slow to realise.

I thought of looking into the ExifData::add() and XmpData::add() check problem mentioned above, but I can't commit any time before Christmas. I can maybe reply to some posts but that's about it. If someone else would like to do this instead then that is fine with me.

@clanmills
Copy link
Collaborator

clanmills commented Nov 29, 2020

It's never a waste of time to investigate puzzles. I didn't write Exiv2, so there are still many mysteries. It's remarkable how seldom IPTC is discussed. I will investigate the matters we have been discussing and add something to the book.

However, after 12+ years, 5000+ contributions, about 1000 issues and PRs, 15,000 hours, 6 or 7 releases that I managed alone, and a 200 page book, I've reached the end of my metadata journey and will retire. The scope of the project is considerable and there's a never ending stream of puzzles and challenges. It's been mostly fun.

@clanmills
Copy link
Collaborator

I've investigated this and added the following text to my book:

IPTC Support in Exiv2

Exif provides support for Section 1. "Envelope" and Section 2. "Application". Tags are report as follows:

518 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ exiv2 -pi ~/Stonehenge.jpg 
Iptc.Envelope.ModelVersion                   Short       1  4
Iptc.Envelope.CharacterSet                   String      3  G
Iptc.Application2.RecordVersion              Short       1  4
Iptc.Application2.Caption                    String     12  Classic View
519 rmills@rmillsmm-local:~/gnu/exiv2/team/book/build $ 

You can list the available tags with samples/taglist.cpp

$ taglist Iptc | csv -
[ModelVersion]	[ 0]	[ 0x0000]	[ Envelope]	[ true]	[ false]	[ 2]	[ 2]	[ Iptc.Envelope.ModelVersion]	[ Short]	[ A binary ...]	
[Destination]	[ 5]	[ 0x0005]	[ Envelope]	[ false]	[ true]	[ 0]	[ 1024]	[ Iptc.Envelope.Destination]	[ String]	[ This DataSet ...]	
[FileFormat]	[ 20]	[ 0x0014]	[ Envelope]	[ true]	[ false]	[ 2]	[ 2]	[ Iptc.Envelope.FileFormat]	[ Short]	[ A binary number ...]	
[FileVersion]	[ 22]	[ 0x0016]	[ Envelope]	[ true]	[ false]	[ 2]	[ 2]	[ Iptc.Envelope.FileVersion]	[ Short]	[ A binary number ...]	
...
[RecordVersion]	[ 0]	[ 0x0000]	[ Application2]	[ true]	[ false]	[ 2]	[ 2]	[ Iptc.Application2.RecordVersion]	[ Short]	[ A binary number ...]	
[ObjectType]	[ 3]	[ 0x0003]	[ Application2]	[ false]	[ false]	[ 3]	[ 67]	[ Iptc.Application2.ObjectType]	[ String]	[ The Object Type is ...]	
[ObjectAttribute]	[ 4]	[ 0x0004]	[ Application2]	[ false]	[ true]	[ 4]	[ 68]	[ Iptc.Application2.ObjectAttribute]	[ String]	[ The Object ...]	
...
[Preview]	[ 202]	[ 0x00ca]	[ Application2]	[ false]	[ false]	[ 0]	[ 256000]	[ Iptc.Application2.Preview]	[ Undefined]	[ Binary image preview data.]	

This is documented here: https://exiv2.org/iptc.html

iptc_tables.png

IPTC API

This is documented here: https://exiv2.org/doc/classExiv2_1_1IptcDataSets.html

iptc_api.png

In C++, you could enumerate the supported tags as follows:

    const Exiv2::DataSet* env = Exiv2::IptcDataSets::envelopeRecordList();
    do {
        std::string desc(env->desc_);
        if ( desc.size() > 20 ) desc = desc.substr(0,18) + " ...";
        std::cout << env->number_ << " : " << env->name_ << " = \"" << desc << "\""   << std::endl;
    } while (++env && env->number_ != 0xffff);

You can print properties of a tag with code such as:

    std::cout << "Properties of Application2.Caption" << std::endl;
    std::cout << "number, recordId: " << Exiv2::IptcDataSets::Caption <<","<< Exiv2::IptcDataSets::application2 <<std::endl;
    std::cout << "Description:      " << Exiv2::IptcDataSets::dataSetDesc
                                        (Exiv2::IptcDataSets::Caption, Exiv2::IptcDataSets::application2) << std::endl;
    std::cout << "Repeatable:       "  << Exiv2::IptcDataSets::dataSetRepeatable
                                      (Exiv2::IptcDataSets::Caption, Exiv2::IptcDataSets::application2) << std::endl;

Which would result in the following output:

Properties of Application2.Caption
number, recordId: 120,2
Description:      A textual description of the object data.
Repeatable:       0

You can define and enumerate tags in C++ with code such as:

    Exiv2::IptcData iptcData;

    iptcData["Iptc.Envelope.ModelVersion"] = 42;
    iptcData["Iptc.Envelope.TimeSent"    ] = "14:41:0-05:00";
    iptcData["Iptc.Application2.Headline"] = "I am a headline";
    iptcData["Iptc.Application2.Keywords"] = "First Keyword";
    iptcData["Iptc.Application2.Keywords"] = "Another Keyword";
    iptcData["Iptc.Application2.Caption" ] = "First Caption";
    iptcData["Iptc.Application2.Caption" ] = "Another Caption";
    iptcData["Iptc.Application2.DateCreated"] = "2004-8-3";
    iptcData["Iptc.0x0009.0x0001"        ] = "Who am I?";

Which results in:

Iptc.Envelope.ModelVersion 42
Iptc.Envelope.TimeSent 14:41:00-05:00
Iptc.Application2.Headline I am a headline
Iptc.Application2.Keywords Another Keyword
Iptc.Application2.Caption Another Caption
Iptc.Application2.DateCreated 2004-08-03
Iptc.0x0009.0x0001 Who am I?

The API enables you to use IPTC Sections and tags for which Exiv2 has no symbolic name.

You will observe that redefining Keywords and Caption overwrites the stored value in iptcData. To write multiple keywords, use add() as follows:

    Exiv2::IptcData iptcData;
    Exiv2::Value::AutoPtr iptc_value = Exiv2::Value::create(Exiv2::string);

    Exiv2::IptcKey        iptc_keywords("Iptc.Application2.Keywords");
    iptc_value->read("Keyword 1"); iptcData.add(iptc_keywords, iptc_value.get());
    iptc_value->read("Keyword 2"); iptcData.add(iptc_keywords, iptc_value.get());

    Exiv2::IptcKey        iptc_caption("Iptc.Application2.Caption");
    iptc_value->read("Caption 1"); iptcData.add(iptc_caption, iptc_value.get());
    iptc_value->read("Caption 2"); iptcData.add(iptc_caption, iptc_value.get());

    for (Exiv2::IptcData::iterator md = iptcData.begin(); md != iptcData.end(); ++md) {
        std::cout << md->key() << " " << md->value() << std::endl;
    }

Which results as follows, with caption having a single value because it is not repeatable.

Iptc.Application2.Keywords Keyword 1
Iptc.Application2.Keywords Keyword 2
Iptc.Application2.Caption Caption 1

https://clanmills.com/exiv2/book/#IPTC-Exiv2

Comments Welcome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
support Anything related to a user needing help with exiv2
Projects
None yet
Development

No branches or pull requests

2 participants