Reading and Writing Tiled Images
When reading or writing high-resolution images, it may not be feasible to keep the whole image in memory. For those situations, libheif supports reading and writing images as individual tiles. When reading, only the accessed tiles are read from the file and decoded. Similarly, when writing, each tile can be encoded separately and added to the image file. It is even possible to write the image tiles in arbitrary order.
libheif supports three methods to store tiled images:
- grid images - This format saves a tiled image as a collection of small images. Each tile is stored in the HEIF file as a separate image, and these images are then combined into a large grid image. It has a rather large overhead because metadata has to be stored for each tile image. The format is also limited to 65535 tiles. This is the format used by most mobile phone cameras, which usually split the image into 512x512 tiles.
- unci images - These are "uncompressed" images according to ISO 23001-17. This format has tiling built in with very little overhead, but it can only store images compressed with lossless compression algorithms (deflate, brotli).
- tili images - This is a currently proprietary HEIF extension that, similarly to grid, compresses each tile independently, but it avoids the metadata overhead and works for practically unlimited image sizes. It also supports skipped (no-data) tiles and higher-dimensional images (like multi-spectral images or 3D volumes). It is optimized for efficient streaming of the image data over networks. More information about tili images can be found here.
Reading any of these tiling formats with libheif uses the same API, so you do not have to care which format the HEIF file uses internally.
Note: the functions for writing 'unci' and 'tili' images are still in the heif_experimental.h header until the API is considered stable. If you installed libheif through your distribution's package manager, this file might not be installed. In that case, compile libheif from source and enable INSTALL_EXPERIMENTAL_HEADERS in your cmake config.
To write a grid image, first write all tile images to the file using the normal API:
struct heif_error heif_context_encode_image(struct heif_context*,
const struct heif_image* image,
struct heif_encoder* encoder,
const struct heif_encoding_options* options,
struct heif_image_handle** out_image_handle);
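Collect all heif_image_handle pointers returned by the encoding function in an array. The function below takes the tiles as item IDs, which you can obtain from each handle with heif_image_handle_get_item_id(). Then combine all of these tile images into a grid image with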
struct heif_error heif_context_add_grid_image(struct heif_context* ctx,
uint32_t image_width,
uint32_t image_height,
uint32_t tile_columns,
uint32_t tile_rows,
const heif_item_id* image_ids,
struct heif_image_handle** out_grid_image_handle);
Note that image_width and image_height do not have to be integer multiples of the tile size. If they are smaller than the area covered by the tiles, the extra data at the right and bottom border is removed from the decoded image.
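As a rough illustration of the sizing rule above, the following sketch computes how many tile columns and rows are needed to cover an image and how many pixels of the last column or row fall outside it. All three helper names are hypothetical and not part of libheif. The index helper additionally assumes that the tiles in image_ids are listed in row-major order (top row first, left to right), which this page does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of libheif): number of tiles needed to
 * cover image_size pixels with tiles of tile_size pixels, rounding up. */
static uint32_t grid_tile_count(uint32_t image_size, uint32_t tile_size)
{
  assert(tile_size > 0);
  return image_size / tile_size + (image_size % tile_size != 0 ? 1u : 0u);
}

/* Hypothetical helper: pixels of the last tile column (or row) that lie
 * outside the image and are removed from the decoded image. */
static uint32_t grid_excess_pixels(uint32_t image_size, uint32_t tile_size)
{
  assert(tile_size > 0);
  uint32_t remainder = image_size % tile_size;
  return remainder != 0 ? tile_size - remainder : 0;
}

/* Hypothetical helper: position of tile (tile_x, tile_y) in the image_ids
 * array, ASSUMING row-major order (top row first, left to right). */
static uint32_t grid_item_index(uint32_t tile_x, uint32_t tile_y,
                                uint32_t tile_columns)
{
  assert(tile_x < tile_columns);
  return tile_y * tile_columns + tile_x;
}
```

For example, a 4000x3000 image with 512x512 tiles needs 8 columns and 6 rows, and 96 pixels on the right and 72 pixels at the bottom are cut away.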
For tiled unci images, you first create the unci image and then add the individual tiles to it.
struct heif_unci_image_parameters {
int version;
// --- version 1
uint32_t image_width;
uint32_t image_height;
uint32_t tile_width;
uint32_t tile_height;
enum heif_metadata_compression compression;
// ...
};
struct heif_error heif_context_add_unci_image(struct heif_context* ctx,
const struct heif_unci_image_parameters* parameters,
const struct heif_encoding_options* encoding_options,
const struct heif_image* prototype,
struct heif_image_handle** out_unci_image_handle);
The prototype parameter is an image with the same color channels and settings as you will use for the individual tiles.
This dummy image is not encoded; it is only used to specify the image format. You can use very small image planes (1x1) in the prototype image because their size is not used.
Unlike grid images, unci images require image_width and image_height to be integer multiples of the tile size.
The compression parameter selects the lossless compression algorithm.
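The size constraint above can be checked before creating the image. The sketch below is only an illustration: both helper names are hypothetical, and padding the image up to the next multiple of the tile size is one possible application-side choice, not something libheif is stated to do for you.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of libheif): true if both image dimensions
 * are exact multiples of the tile dimensions, as required for unci images. */
static bool unci_size_is_valid(uint32_t image_width, uint32_t image_height,
                               uint32_t tile_width, uint32_t tile_height)
{
  assert(tile_width > 0 && tile_height > 0);
  return image_width % tile_width == 0 && image_height % tile_height == 0;
}

/* Hypothetical helper: smallest multiple of tile_size that is >= size.
 * Assumes the result still fits into 32 bits. */
static uint32_t round_up_to_multiple(uint32_t size, uint32_t tile_size)
{
  assert(tile_size > 0);
  uint32_t remainder = size % tile_size;
  return remainder != 0 ? size + (tile_size - remainder) : size;
}
```

With 256x256 tiles, a 1000x768 image would fail the check, while padding the width to round_up_to_multiple(1000, 256), that is 1024, would satisfy it.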
Now, you can add tiles to the unci image with
struct heif_error heif_context_add_image_tile(struct heif_context* ctx,
struct heif_image_handle* tiled_image,
uint32_t tile_x, uint32_t tile_y,
const struct heif_image* image,
struct heif_encoder* encoder);
The tile_x and tile_y parameters specify the tile position as tile indices such as (0;0), (0;1), (0;2). These are not pixel coordinates.
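To make the difference between tile indices and pixel coordinates concrete, here is a minimal sketch of the conversion in both directions. The helper names are hypothetical and not part of libheif.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of libheif): index of the tile that
 * contains the given pixel coordinate along one axis. */
static uint32_t tile_index_for_pixel(uint32_t pixel_pos, uint32_t tile_size)
{
  assert(tile_size > 0);
  return pixel_pos / tile_size;
}

/* Hypothetical helper: first pixel coordinate covered by a tile index
 * along one axis. Assumes the product fits into 32 bits. */
static uint32_t tile_origin_pixel(uint32_t tile_index, uint32_t tile_size)
{
  return tile_index * tile_size;
}
```

With 512x512 tiles, the pixel (1300; 700) lies in tile (2; 1), and that tile starts at pixel (1024; 512).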
For tiled tili images, you likewise first create the tili image and then add the individual tiles to it.
The image is created with:
struct heif_tiled_image_parameters {
int version;
// --- version 1
uint32_t image_width;
uint32_t image_height;
uint32_t tile_width;
uint32_t tile_height;
uint32_t compression_type_fourcc;
uint8_t offset_field_length; // one of: 32, 40, 48, 64 (bits)
uint8_t size_field_length; // one of: 0, 24, 32, 64 (bits)
uint8_t number_of_extra_dimensions; // 0 for normal images, 1 for volumetric (3D), ...
uint32_t extra_dimensions[8]; // size of extra dimensions (first 8 dimensions)
uint8_t tiles_are_sequential; // (bool) hint whether all tiles are added in sequential order
};
struct heif_error heif_context_add_tiled_image(struct heif_context* ctx,
const struct heif_tiled_image_parameters* parameters,
const struct heif_encoding_options* options,
struct heif_image_handle** out_tiled_image_handle);
The compression_type_fourcc corresponds to the image item type usually stored in the HEIF file, e.g. hvc1 for H.265 or av01 for AV1 (as used in AVIF).
The tili image contains a table with offset pointers to the individual tiles in the file.
You can choose the bit length of the offset fields and of the tile size fields.
When size_field_length is set to 0, no tile sizes are stored.
Note that omitting the tile sizes will force the decoder to load the whole offset table when parsing the file, which may be undesirable when the file should be streamed over the network. The size of the table is the combined bit length of the two fields times the number of tiles.
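The table size rule above can be turned into a small calculation to help choose the field lengths. This is only a sketch of the stated arithmetic: the helper name is hypothetical, and the result covers the table entries only, not any other header data.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of libheif): size in bytes of the tili
 * offset table, computed as (offset bits + size bits) * number of tiles. */
static uint64_t tili_table_size_bytes(uint64_t num_tiles,
                                      uint8_t offset_field_length,
                                      uint8_t size_field_length)
{
  uint64_t bits_per_tile = (uint64_t)offset_field_length + size_field_length;
  /* All field lengths listed in heif_tiled_image_parameters
   * (32/40/48/64 and 0/24/32/64) are multiples of 8 bits. */
  assert(bits_per_tile % 8 == 0);
  return num_tiles * (bits_per_tile / 8);
}
```

For an image with 100x100 tiles, 40-bit offsets together with 24-bit sizes give a table of 80000 bytes, while 32-bit offsets without sizes give 40000 bytes.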
Now, using the same function as for unci images, you can add tiles:
struct heif_error heif_context_add_image_tile(struct heif_context* ctx,
struct heif_image_handle* tiled_image,
uint32_t tile_x, uint32_t tile_y,
const struct heif_image* image,
struct heif_encoder* encoder);
You have to use the same heif_encoder with the same settings for all tiles.
The tile_x and tile_y parameters specify the tile position as tile indices such as (0;0), (0;1), (0;2). These are not pixel coordinates.
Reading tiled images works the same for all image types. The same code will work with all tiling schemes and even with non-tiled images, which will appear like images consisting of a single tile.
Before decoding a tile, the first step should be to get the tiling information with
struct heif_error heif_image_handle_get_image_tiling(const struct heif_image_handle* handle,
int process_image_transformations,
struct heif_image_tiling* out_tiling);
The boolean parameter process_image_transformations indicates whether libheif should take care of all image transformations (rotation, mirroring, cropping) internally, or whether you want to handle them yourself. If this is enabled, libheif will also convert the tile coordinates so that, to the client application, the image geometry looks as if it were not transformed.
The above function returns the following tiling information:
struct heif_image_tiling
{
int version;
// --- version 1
uint32_t num_columns;
uint32_t num_rows;
uint32_t tile_width;
uint32_t tile_height;
uint32_t image_width;
uint32_t image_height;
// Position of the top left tile.
// Usually, this is (0;0), but if a tiled image is rotated or cropped, it may be that the top left tile should be placed at a negative position.
// The offsets define this negative shift.
uint32_t top_offset;
uint32_t left_offset;
uint8_t number_of_extra_dimensions; // 0 for normal images, 1 for volumetric (3D), ...
uint32_t extra_dimension_size[8]; // size of extra dimensions (first 8 dimensions)
};
In general, you should not assume that image_width and image_height are integer multiples of the tile size. This constraint holds for unci images, but not for the other tiling types. When a tile overlaps the image border, you should not draw the part that extends beyond the border. libheif will not crop the border tiles.
Internally, tiles may extend beyond the right and bottom borders, but when process_image_transformations is turned on, the image may be rotated and cropped, and tiles may extend beyond all four borders. For this reason, the fields top_offset and left_offset indicate how much of the top and left border should be removed when displaying the image.
Figure: Tiled image with ignored right and bottom border.
Figure: Image has been rotated by 90 degrees. Now the `left_offset` is greater than 0.
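The border handling described above can be sketched as a clipping step that an application might run before drawing a decoded tile. This is only an illustration under explicit assumptions: the two structs are local stand-ins (tiling_info mirrors a few heif_image_tiling fields), the function names are hypothetical, and the code assumes that image_width and image_height describe the visible image and that top_offset and left_offset are measured in pixels.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in holding only the heif_image_tiling fields used here. */
struct tiling_info {
  uint32_t tile_width, tile_height;
  uint32_t image_width, image_height;
  uint32_t top_offset, left_offset;
};

/* Visible part of one tile (hypothetical type, not part of libheif). */
struct visible_rect {
  uint32_t x, y;          /* position in the displayed image */
  uint32_t width, height; /* 0 if nothing of the tile is visible */
  uint32_t src_x, src_y;  /* first visible pixel inside the decoded tile */
};

/* Clip one axis of a tile against [0, image_size). ASSUMES the offset is
 * the number of pixels to remove at the top/left of the tile grid. */
static void clip_axis(uint32_t tile_index, uint32_t tile_size,
                      uint32_t offset, uint32_t image_size,
                      uint32_t* out_pos, uint32_t* out_len, uint32_t* out_src)
{
  /* 64-bit signed arithmetic: the tile may start at a negative position. */
  int64_t start = (int64_t)tile_index * tile_size - offset;
  int64_t end = start + tile_size;
  int64_t vis_start = start < 0 ? 0 : start;
  int64_t vis_end = end > image_size ? image_size : end;

  if (vis_end <= vis_start) { /* tile lies completely outside the image */
    *out_pos = 0; *out_len = 0; *out_src = 0;
    return;
  }
  *out_pos = (uint32_t)vis_start;
  *out_len = (uint32_t)(vis_end - vis_start);
  *out_src = (uint32_t)(vis_start - start);
}

/* Hypothetical helper: visible rectangle of tile (tile_x, tile_y). */
static struct visible_rect visible_tile_rect(const struct tiling_info* t,
                                             uint32_t tile_x, uint32_t tile_y)
{
  struct visible_rect r;
  assert(t != 0 && t->tile_width > 0 && t->tile_height > 0);
  clip_axis(tile_x, t->tile_width, t->left_offset, t->image_width,
            &r.x, &r.width, &r.src_x);
  clip_axis(tile_y, t->tile_height, t->top_offset, t->image_height,
            &r.y, &r.height, &r.src_y);
  return r;
}
```

For a 1000x700 image with 512x512 tiles and a left_offset of 24, tile (0; 0) contributes 488 visible columns starting at column 24 of the decoded tile, and tile (1; 1) contributes only its top 188 rows.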
Now you can decode individual tiles with
struct heif_error heif_image_handle_decode_image_tile(const struct heif_image_handle* in_handle,
struct heif_image** out_img,
enum heif_colorspace colorspace,
enum heif_chroma chroma,
const struct heif_decoding_options* options,
uint32_t tile_x, uint32_t tile_y);
As with encoding, tile_x and tile_y specify the tile position index, not the pixel coordinate. The other parameters are the same as for the usual heif_image_handle_decode_image() function.
Make sure that the ignore_transformations field of heif_decoding_options is set to !process_image_transformations, so that the decoded tiles match the tiling information you requested.
When opening a file, libheif will parse the file structure and read the meta box. When you access a single image tile, it loads only the data of that tile from the file.
This is particularly important when you want to stream the image over a network without first downloading the whole image.
In that case, you can implement the heif_reader interface so that it downloads the data from the network.
Apart from the usual read and seek functions, version 2 of this interface also includes the function heif_reader_range_request_result request_range(uint64_t start_pos, uint64_t end_pos, void* userdata).
libheif will call this function to tell the reader which file range it is going to read next. This should be a blocking call in which you can download the data from the network. The advantage over downloading the data in the read() function is that read() may be called for many small chunks, while request_range() requests larger file ranges that are more efficient to download.
You may even download more data than requested and let libheif know in the result that this data is available. libheif may decide to use that extra data if it can make use of it.
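One way to download more data than requested is to widen each requested range to whole download chunks. The sketch below shows only that arithmetic and is not a heif_reader implementation. The struct and function names are hypothetical, and it assumes that end_pos is exclusive and that the total file size is already known to the application.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte range; 'end' is ASSUMED to be exclusive. */
struct byte_range {
  uint64_t start;
  uint64_t end;
};

/* Hypothetical helper (not part of libheif): widen [start_pos, end_pos)
 * to whole chunks of chunk_size bytes, clamped to the file size. */
static struct byte_range widen_to_chunks(uint64_t start_pos, uint64_t end_pos,
                                         uint64_t chunk_size,
                                         uint64_t file_size)
{
  struct byte_range r;
  uint64_t remainder;

  assert(chunk_size > 0);
  assert(start_pos <= end_pos && end_pos <= file_size);

  /* Round the start down and the end up to chunk boundaries. */
  r.start = start_pos - start_pos % chunk_size;
  remainder = end_pos % chunk_size;
  r.end = remainder != 0 ? end_pos + (chunk_size - remainder) : end_pos;

  /* Never ask for data beyond the end of the file. */
  if (r.end > file_size) {
    r.end = file_size;
  }
  return r;
}
```

With 64 KiB chunks, a request for bytes 70000 to 70100 would be widened to the range 65536 to 131072, and the extra data could then be reported back as available.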
Another, optional, function is void preload_range_hint(uint64_t start_pos, uint64_t end_pos, void* userdata).
With this function, libheif lets you know of a file range that it may need in the future.
Contrary to the above, this function should be non-blocking and return immediately. You may want to start a download in the background so that it is ready in case the range is requested later on.
If you are caching network data, you might also be interested in the callback void release_file_range(uint64_t start_pos, uint64_t end_pos, void* userdata), which libheif uses to let you know that it no longer needs a specific file range and that you can remove it from the cache.
Note that you can remove any data from the cache whenever you want, but you may have to reload it if libheif requests it again.
Closely related to reading high-resolution images is the ability to store lower-resolution overview images in the same file. These make it easier to display zoomed-out views of the image without having to read large areas of the image at the highest resolution and scale them down.
Multiresolution image pyramids are stored as a set of images, one for each resolution layer. These layer images are combined into a pyramid that is stored as a pymd entity group.
You can get the pymd entity group with heif_entity_group* heif_context_get_entity_groups(const struct heif_context*, uint32_t type_filter, uint32_t item_filter, int* out_num_groups). More specifically, you can set the type_filter to heif_fourcc('p','y','m','d') to get the pyramid directly if it is present. The image item IDs in the entity group are ordered from lowest resolution to highest resolution.
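Because the layers are ordered from lowest to highest resolution, a viewer can pick the smallest layer that is still large enough for the current display size. The following sketch shows one such selection rule. The function name is hypothetical, and it assumes the application has already collected the width of each layer into an array in the same order as the item IDs in the entity group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of libheif): index of the lowest-resolution
 * layer whose width is at least target_width. layer_widths must be ordered
 * from lowest to highest resolution, like the item IDs in the pymd group.
 * Falls back to the highest-resolution layer if none is wide enough. */
static size_t choose_pyramid_layer(const uint32_t* layer_widths,
                                   size_t num_layers,
                                   uint32_t target_width)
{
  size_t i;
  assert(layer_widths != NULL && num_layers > 0);
  for (i = 0; i < num_layers; i++) {
    if (layer_widths[i] >= target_width) {
      return i;
    }
  }
  return num_layers - 1;
}
```

For layers that are 512, 2048 and 8192 pixels wide, a 1000 pixel wide view would use layer 1, while a 300 pixel wide thumbnail would use layer 0.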
Each layer image in the pyramid can be a tiled image, and you can also mix the image types. For example, the highest-resolution layer could be an unci image that stores the losslessly compressed data, while the overview images use tili or grid with an efficient image codec. This also means that software that cannot read unci images can at least show the overview images.
There is an example viewer application. On that page you will also find links to example images.