How can I retrieve additional metadata information from the Spatial HEIC #1164

Open
bigcat88 opened this issue Apr 25, 2024 · 25 comments

@bigcat88
Contributor

Original issue with file example is here:

bigcat88/pillow_heif#234

Is there a way to get this information from an image and save the file so it contains it?

image

I tried to find where it is stored using heif-info -d, but I got completely lost in the number of different boxes the file contains.

I would be grateful for any help.

@farindk
Contributor

farindk commented Apr 25, 2024

There are four properties in the file that might hold this information:

| | | Box: 4363e914-5b7d-4aab-97ae-bea69803b434 -----
| | | size: 28   (header size: 24)
| | | 
| | | Box: 22cc04c7-d6d9-4e07-9d90-4eb6ecbaf3a3 -----
| | | size: 40   (header size: 24)
...
| | | Box: 4363e914-5b7d-4aab-97ae-bea69803b434 -----
| | | size: 32   (header size: 24)
| | | 
| | | Box: de225085-36cb-4365-8743-2f8705e7c78a -----
| | | size: 28   (header size: 24)

But we have no specification of those. They are proprietary. We would need a couple of images with known intrinsic parameters (covering several different values) to be able to reverse engineer their content.
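The "header size: 24" in the dump is consistent with the usual ISO BMFF layout of a `uuid` box: a 4-byte size, the 4-byte type `uuid`, then a 16-byte extended type. A minimal sketch of reading such a header, assuming a 32-bit size field (no 64-bit `largesize`). The function name and the synthetic box below are made up for illustration and are not part of libheif:

```python
import struct
import uuid

def parse_uuid_box(buf: bytes, offset: int = 0):
    """Return (box_size, header_size, usertype, payload) for a 'uuid' box.

    Assumes a 32-bit size field; boxes using the 64-bit largesize
    form are not handled in this sketch.
    """
    size, box_type = struct.unpack_from(">I4s", buf, offset)
    if box_type != b"uuid":
        raise ValueError(f"not a uuid box: {box_type!r}")
    # The 16-byte extended type follows the 8-byte size/type header.
    usertype = uuid.UUID(bytes=buf[offset + 8 : offset + 24])
    header_size = 24
    payload = buf[offset + header_size : offset + size]
    return size, header_size, usertype, payload

# Synthetic box: 24-byte header + 4-byte zero payload = 28 bytes.
usertype_bytes = bytes.fromhex("de22508536cb436587432f8705e7c78a")
box = struct.pack(">I4s", 28, b"uuid") + usertype_bytes + bytes(4)

size, header_size, usertype, payload = parse_uuid_box(box)
print(size, header_size, usertype, payload.hex())
```

Whatever follows the 24-byte header is the proprietary payload that still needs to be reverse engineered.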

@jwheeler-work

jwheeler-work commented Apr 26, 2024

There should be additional metadata for each picture. Using the Image I/O framework, they'd be defined like this:

    let properties = [
        kCGImagePropertyGroups: [
            kCGImagePropertyGroupIndex: 0,
            kCGImagePropertyGroupType: kCGImagePropertyGroupTypeStereoPair,
            kCGImagePropertyGroupImageIndexLeft: 0,
            kCGImagePropertyGroupImageIndexRight: 1,
        ],
        kCGImagePropertyHEIFDictionary: [
            kIIOMetadata_CameraModelKey: [
                kIIOCameraModel_Intrinsics: cameraIntrinsics as CFArray
            ]
        ]
    ]

@jwheeler-work

IMG_0050.zip
Does this help? 3 images from the Apple Vision Pro.

@farindk
Copy link
Contributor

farindk commented Apr 26, 2024

Thanks. Could you please also send me the decoded metadata values that are stored in there? I don't have a Mac to read them out.

@jwheeler-work

IMG_0049
IMG_0050
IMG_0051

This good?

@bradh
Contributor

bradh commented Apr 26, 2024

Looks like it's the same in each case. Do they ever vary?

@jwheeler-work

They can. In this case, because they were all shot on the Vision Pro in the same area, the values are the same.

I have panoramic photos that will have different intrinsics, but the format is the same.

I suspect there are additional tags that Apple uses to determine that this is a stereo pair. That's what the code I posted before sets.

@bradh
Contributor

bradh commented Apr 27, 2024

So to work out which values correspond to which bytes (or bits) in those UUID fields, we need to see variations. Ideally, changing one parameter at a time would change only a small part of the stored values.
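A minimal sketch of that comparison: given two property payloads that differ in one known parameter, list the byte offsets that changed. The helper name and the two payloads are hypothetical, chosen only to show the idea:

```python
def changed_offsets(a: bytes, b: bytes):
    """Return the byte offsets at which two payloads differ.

    Offsets that exist in only one payload (different lengths)
    are reported as changed too.
    """
    n = max(len(a), len(b))
    # Slicing past the end yields b"", so length differences compare unequal.
    return [i for i in range(n) if a[i:i + 1] != b[i:i + 1]]

# Hypothetical payloads from two files differing in one parameter.
base = bytes.fromhex("0000001000000000")
variant = bytes.fromhex("0000001100000001")

print(changed_offsets(base, variant))
```

If each single-parameter change touches only a few offsets, those offsets are good candidates for where that parameter is stored.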

@jwheeler-work

Hmm... I don't think I can provide that. The best I can do is more samples of photos that work on the Vision Pro by either taking photos with an iPhone or the AVP.

@JoanCharmant

JoanCharmant commented Apr 29, 2024

Hi, (I posted the report in the other repo)

Here is a set of 3 files focused on the "Camera model" field with the intrinsics matrix.

intrinsics.zip

Example:

heic-intrinsics-1

They are based on a similar code snippet as posted above with only variations in the kIIOCameraModel_Intrinsics field.
It’s not possible to create the files with all zeros or just changing one value as the encoder tests that the matrix is valid.

Files and corresponding kIIOCameraModel_Intrinsics value:

  • 0-intrinsics-identity.heic : [1, 0, 0, 0, 1, 0, 0, 0, 1]
  • 1-intrinsics-2.heic: [2, 0, 2, 0, 2, 2, 0, 0, 1]
  • 2-intrinsics-100.heic: [100, 0, 100, 0, 100, 100, 0, 0, 1]

Note: I only used integers but this is an array of floats.

@JoanCharmant

JoanCharmant commented Apr 29, 2024

And here is a set of files focusing on the camera extrinsics key.

exitrinsics.zip

Example:

camera-extrinsics

Files:

  • 5-extrinsics-identity.heic: default values corresponding to the screenshot above. In other words, CoordinateSystemID: 0, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].
  • 6-extrinsics-position-x1.heic: same as default except for Position: [1, 0, 0]
  • 7-extrinsics-position-y1.heic: same as default except for Position: [0, 1, 0]
  • 8-extrinsics-position-z1.heic: same as default except for Position: [0, 0, 1]
  • 9-extrinsics-coordinatesystemid1.heic: same as default except CoordinateSystemID: 1, however it still says 0 in the infobox. I don’t know if it’s the encoder or the decoder that’s not picking it up. I don’t know what this value represents.
  • 10-extrinsics-intrinsics.heic: same as default but also has the default intrinsics key from my previous post.

@bradh
Contributor

bradh commented May 2, 2024

Intrinsics analysis

File 0 - two UUID properties, and each of the two items is associated with both.

Intrinsics matrix [1, 0, 0, 0, 1, 0, 0, 0, 1]
uuid: 22cc04c7d6d94e079d904eb6ecbaf3a3
value: 00001e00 0010624e 00000000 00000000

uuid: de22508536cb436587432f8705e7c78a
value: 00000000

File 1 - same two UUID properties, same association

Intrinsics matrix [2, 0, 2, 0, 2, 2, 0, 0, 1]
uuid: 22cc04c7d6d94e079d904eb6ecbaf3a3
value: 00001e00 0020c49c 0020c49c 0020c49c

uuid: de22508536cb436587432f8705e7c78a
value: 00000000

File 2 - same two UUID properties, same association

Intrinsics matrix [100, 0, 100, 0, 100, 100, 0, 0, 1]
uuid: 22cc04c7d6d94e079d904eb6ecbaf3a3
value: 00001e00 06666666 06666666 06666666

uuid: de22508536cb436587432f8705e7c78a
value: 00000000

Assume the 22cc04c7d6d94e079d904eb6ecbaf3a3 is the identifier for the intrinsics

So we have
00001e00 0010624e 00000000 00000000 for [1, 0, 0, 0, 1, 0, 0, 0, 1]
00001e00 0020c49c 0020c49c 0020c49c for [2, 0, 2, 0, 2, 2, 0, 0, 1]
00001e00 06666666 06666666 06666666 for [100, 0, 100, 0, 100, 100, 0, 0, 1]

It's not clear to me how the 9 values could fit into 16 bytes unless there is some kind of encoding, possibly omitting some values that are defined as 0 (e.g. 4th, 7th and 8th values).

Possibly the 0x1e relates to a signature or encoded length (0x1e = 30, the number of bytes is 16).
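To make the three payloads easier to compare, here is a small sketch that splits each 16-byte value into four 32-bit words. Reading them as big-endian unsigned integers is an assumption at this point, not something the files confirm; the `words` helper is made up for this example:

```python
import struct

def words(hex_value: str):
    """Split a hex dump into big-endian unsigned 32-bit words (assumed layout)."""
    raw = bytes.fromhex(hex_value.replace(" ", ""))
    return struct.unpack(f">{len(raw) // 4}I", raw)

# Payloads of uuid 22cc04c7d6d94e079d904eb6ecbaf3a3 from the three test files.
samples = {
    "identity": "00001e00 0010624e 00000000 00000000",
    "2": "00001e00 0020c49c 0020c49c 0020c49c",
    "100": "00001e00 06666666 06666666 06666666",
}

for name, value in samples.items():
    print(name, words(value))
```

The first word is the same in all three files, while the remaining three words track the non-zero, non-constant entries of the matrix.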

@farindk
Contributor

farindk commented May 2, 2024

A general intrinsic matrix usually looks like this:

f s x
0 f y
0 0 1

One can also assume that the skew s = 0.
That would leave us with just the three parameters f, x, y.

It is also nice to see that the encoding of 2=0x20c49c is exactly two times 1=0x10624e.
And if we divide 0x06666666 / 0x10624e, we also get decimal 100 (almost). Seems to fit nicely.

Thus, first four bytes unknown, maybe some flags (e.g. for "ModelType = Simplified Pinhole")
Second four bytes: f
Third / fourth four bytes: x/y, but we need more data to differentiate that.
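The ratio check above can be sketched as follows. The divisor is simply the word observed for the value 1 in these three files, so treating it as a fixed scale is an assumption; the helper names are made up, and because the test files use x = y, the order of x and y in the payload remains undetermined:

```python
F_UNIT = 0x0010624E  # word observed for the value 1 (assumed scale)

def decode(word: int) -> float:
    """Convert a stored word to a matrix value under the assumed scale."""
    return word / F_UNIT

def intrinsic_matrix(f: float, x: float, y: float, s: float = 0.0):
    """Build the 3x3 pinhole matrix [[f, s, x], [0, f, y], [0, 0, 1]]."""
    return [[f, s, x], [0.0, f, y], [0.0, 0.0, 1.0]]

f2 = decode(0x0020C49C)     # file with intrinsics [2, 0, 2, 0, 2, 2, 0, 0, 1]
f100 = decode(0x06666666)   # file with intrinsics [100, 0, 100, ...]

print(f2, f100)
print(intrinsic_matrix(f2, f2, f2))
```

Under this reading the second file decodes to exactly 2 and the third to just under 100, which matches the "almost" in the comment above.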

@bradh
Contributor

bradh commented May 3, 2024

Extrinsics data extraction

Each file has two uuid properties (boxes).
Both images in each file are associated with both uuid properties.

One is
uuid: de22508536cb436587432f8705e7c78a
value: 00000000
as described above.

The other is more interesting. It is
uuid: 4363e9145b7d4aab97aebea69803b434
and the property value changes byte values and length. See below

5-Extrinsics

value: 00000010
CoordinateSystemID: 0, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].

6-Extrinsics

value: 00000011 000f4240
CoordinateSystemID: 0, Position: [1, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].

7-Extrinsics

value: 00000012 000f4240
CoordinateSystemID: 0, Position: [0, 1, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].

8-Extrinsics

value: 00000014 000f4240
CoordinateSystemID: 0, Position: [0, 0, 1], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].

9-Extrinsics

value: 00000010
CoordinateSystemID: 0, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].

10-Extrinsics

value: 00000010
CoordinateSystemID: 1, Position: [0, 0, 0], Rotation: [1, 0, 0, 0, 1, 0, 0, 0, 1].

File 10 also has an extra uuid box:
It's the same as the assumed intrinsics box above.
value: 00001e00 0010624e 00000000 00000000

So it looks like coordinate system id may be by position (not coded).

byte [7] is clearly changing as we step through the position changes.

000f4240 is 3.03216553 as a little endian float. Can't make that fit particularly well though.
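One alternative reading, offered as an observation on the numbers rather than a confirmed layout: as a big-endian unsigned integer, 000f4240 is exactly 1000000, and the low three bits of the first word line up with which position component is non-zero in files 6, 7 and 8. A small sketch comparing the interpretations (the `position_flags` helper and the bit assignment are assumptions):

```python
import struct

raw = bytes.fromhex("000f4240")

as_le_float = struct.unpack("<f", raw)[0]  # the reading tried above
as_be_uint = struct.unpack(">I", raw)[0]   # alternative: big-endian integer

print(as_le_float)
print(as_be_uint)

def position_flags(first_word: int):
    """Guess which position components are present from the low three bits."""
    return {axis: bool(first_word & bit)
            for axis, bit in (("x", 0x1), ("y", 0x2), ("z", 0x4))}

# First words of files 5 (or 9/10), 6, 7 and 8.
for word in (0x10, 0x11, 0x12, 0x14):
    print(hex(word), position_flags(word))
```

If that holds, a position of 1 would be stored as 1000000 in some fixed-point unit, and zero components would be omitted entirely, which would also explain the changing payload length.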

@TimYao18

Also, if the photo is a spatial photo, there will be a 'grpl' (GroupsListBox) box under the 'meta' box, and that 'grpl' box will contain a 'ster' box describing the stereoscopic pair.
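A rough sketch of reading such a 'ster' box, assuming it follows the generic EntityToGroupBox layout from the ISO base media file format (FullBox header, then group_id, num_entities_in_group and the entity IDs, all 32-bit big-endian). Whether Apple's files match this exactly is not confirmed in this thread; the function name and the sample bytes are synthetic:

```python
import struct

def parse_entity_to_group(box: bytes):
    """Parse one EntityToGroupBox (e.g. 'ster') under the assumed layout."""
    size, grouping_type = struct.unpack_from(">I4s", box, 0)
    # FullBox: 1 byte version + 3 bytes flags, then group_id and entity count.
    version_flags, group_id, count = struct.unpack_from(">III", box, 8)
    entity_ids = struct.unpack_from(f">{count}I", box, 20)
    return {
        "grouping_type": grouping_type.decode("ascii"),
        "version": version_flags >> 24,
        "group_id": group_id,
        "entity_ids": list(entity_ids),
    }

# Synthetic 'ster' box grouping two made-up item IDs (1 and 2).
body = struct.pack(">IIIII", 0, 1000, 2, 1, 2)
ster = struct.pack(">I4s", 8 + len(body), b"ster") + body

group = parse_entity_to_group(ster)
print(group)
```

The two entity IDs would then be the left and right images of the pair.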

@TimYao18

test_images.zip

I have attached two identical images, except that one of them, spatial.HEIC, is a Spatial Photo that has the key metadata added to it.
I think these can be compared easily.
I would like to fix this issue myself, but I'm not good at C++ and don't know where to start.

@bradh
Contributor

bradh commented May 17, 2024

I have attached two identical images, but one of them, spatial.HEIC, is a Spatial Photo that add the key metadata in it.

Can you show the associated key metadata (i.e. as apple displays it)?

@TimYao18

TimYao18 commented May 17, 2024

There should be additional metadata for each picture. Using the Image I/O framework, they'd be defined like this:

    let properties = [
        kCGImagePropertyGroups: [
            kCGImagePropertyGroupIndex: 0,
            kCGImagePropertyGroupType: kCGImagePropertyGroupTypeStereoPair,
            kCGImagePropertyGroupImageIndexLeft: 0,
            kCGImagePropertyGroupImageIndexRight: 1,
        ],
        kCGImagePropertyHEIFDictionary: [
            kIIOMetadata_CameraModelKey: [
                kIIOCameraModel_Intrinsics: cameraIntrinsics as CFArray
            ]
        ]
    ]

What I did with these 2 images: one was generated with the code above, the other without it.
So we can compare the file structure of the two images using heif-info.exe -d.
Or we can use an ISOBMFF tool like pyisobmff.

I attached two files generated by pyisobmff: pyisobmff_decode.zip

Just use a text comparison tool to check the differences. This is all I can do so far.
Also, if you know which boxes are specific to spatial images, you can use a hex editor to search for the box's four-character code and see what value it has.
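For the text comparison, Python's standard `difflib` is enough. A small sketch; the two dump excerpts are made up for illustration and are not the real pyisobmff output:

```python
import difflib

def dump_diff(a_text: str, b_text: str,
              a_name: str = "non_spatial", b_name: str = "spatial"):
    """Return unified-diff lines between two box-structure dumps."""
    return list(difflib.unified_diff(
        a_text.splitlines(), b_text.splitlines(),
        fromfile=a_name, tofile=b_name, lineterm=""))

# Made-up dump excerpts standing in for the two decoded files.
non_spatial = "meta\n  iprp\n    ipco\n      hvcC\n"
spatial = "meta\n  iprp\n    ipco\n      hvcC\n      uuid\n  grpl\n"

# Keep only lines present in the spatial dump, skipping the '+++' file header.
added = [line[1:] for line in dump_diff(non_spatial, spatial)
         if line.startswith("+") and not line.startswith("+++")]
print(added)
```

The added lines are the boxes that exist only in the spatial file.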

Below are screenshots of what the spatial image contains beyond the non_spatial image:

"uuid"

Screenshot 2024-05-17 194953

"uuid2"

Screenshot 2024-05-17 195011

"grpl", which contains a 'ster' box that pyisobmff cannot decode.

Screenshot 2024-05-17 195029

Can you show the associated key metadata (i.e. as apple displays it)?

The difference is the same as in the images @JWheeler and @JoanCharmant posted: the Preview app in macOS adds a "HEIC" tab that the non-spatial image doesn't have.

@TimYao18

Hi all, I found that the UUID value is also related to the image resolution. If I change the image resolution, the value changes too.

@farindk
Contributor

farindk commented Jun 16, 2024

Reading and writing of the camera intrinsic matrix should be working now in branch develop-v1.18.0. Extrinsic matrix will follow shortly.

@farindk
Contributor

farindk commented Jun 16, 2024

Is there some test data for the extrinsic camera matrix? Especially with camera orientation once specified as a quaternion and once with rotation angles?

@bradh
Contributor

bradh commented Jun 16, 2024

Assuming it's the same as cmin and cmex, there are test examples at

MPEGGroup/FileFormatConformance#86

and

MPEGGroup/FileFormatConformance#85

@farindk
Contributor

farindk commented Jun 17, 2024

Assuming it's the same as cmin and cmex, there are test examples at

Thank you. That helped to confirm that intrinsics and quaternion-based rotation are read correctly. I have also chosen the rotation sequence order to match the output described in the rotation.txt file at that repository.

@takeru

takeru commented Nov 1, 2024

While there is a function in heif.h for obtaining the rotation of the extrinsic_matrix (namely, heif_camera_extrinsic_matrix_get_rotation_matrix), there is no function for obtaining the position. Why is that?

In heif_experimental.h, both heif_property_camera_extrinsic_matrix_get_rotation_matrix and heif_property_camera_extrinsic_matrix_get_position_vector are available.

@farindk
Contributor

farindk commented Nov 1, 2024

There is still some discussion about the exact semantics (MPEGGroup/FileFormat#102).
When this has been finally decided, the implementation will be moved to the stable API.


7 participants