DALI 2021 roadmap #2978

JanuszL · 2021-05-19T16:03:12Z

The following represents a high-level overview of our 2021 plan. You should be aware that this roadmap may change at any time and the order below does not reflect any type of priority.

We strongly encourage you to comment on our roadmap and provide us feedback on this issue here.

Improving Usability:

[Done] improving functional API - improving user experience by simplification of DALI functional API, grouping operations into modules, and extending examples to show how to use it - Rework frameworks notebooks to fn API #2761, Convert notebooks to fn API: sequence_processing #2748, Rewrite image processing examples to fn api. #2745, Convert notebooks to fn API: audio_processing, custom_operator, serialization #2744, Adjust notebooks to new decoder module #2743, Rework getting started #2729, Move decoders to decoders module #2725, Move tfrecord reader to readers module #2722, Adjust examples to new readers module #2721, Update augmentation gallery #2716, Documentation home update #2713, Documentation: New layout of Examples and Tutorials section #2710, Add documentation to functional API (all fn.*) + New documentation layout #2653, Pipeline decorator #2629, Move examples to fn api #2566, Rework custom operator docs #2568
[In progress] eager mode - introducing DALI operators callable as standalone entities to simplify debugging and prototyping - Prototype of the debug mode #3531
[In progress] conditional execution - ability to conditionally apply each operation, providing auto Augmentor style capabilities - Make Workspace::Input return const reference #3452, Rename InputRef/OutputRef to Input/Output in workspace API #3451, Reduce number of Workspace Input/Output APIs #3446, Reduce the TensorList and TensorVector API scope #3403, Remove access to contiguous TL buffer from Coco Reader tests #3351, Remove access to contiguous TL buffer from BoxEncoder, Resize, Shapes and Warp #3339, Remove access to underlying contiguous TL buffer from tests #3319, Rework diplacement filter to sample-based approach #3311, Remove access to underlying contiguous TL buffer in bb_flip op #3283, Remove access to underlying contiguous TL buffer in Flip op #3280, Remove access to underlying contiguous TL buffer in Normalize op #3281, Remove access to underlying contiguous TL buffer in Constant op #3276, Remove Buffer inheritence from TensorList #3576, Ensure sample encapsulation in Tensor Vector #3701
intra-pipeline batch size variability - providing an ability to change batch size from operator to operator inside the execution graph
- [Done] Iteration-to-iteration batch size variability with non-parallel External Source - Add FW iterators handling of variable batch size and improve ES examples #2641
[Done] tensor indexing - Tensor indexing #3195
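
To make the conditional execution item above more concrete, here is a minimal plain-Python sketch of applying an operation to only some samples of a batch. It only illustrates the idea and is not DALI's API or implementation; `apply_if`, `random_apply`, and `flip` are hypothetical helper names.

```python
import random


def apply_if(batch, predicate, op):
    """Apply `op` only to samples for which `predicate` is true, keeping batch order."""
    return [op(sample) if predicate(sample) else sample for sample in batch]


def random_apply(batch, op, probability, rng):
    """Apply `op` to each sample independently with the given probability."""
    return [op(sample) if rng.random() < probability else sample for sample in batch]


def flip(sample):
    # Hypothetical stand-in for an augmentation: reverse a 1-D "image".
    return sample[::-1]


batch = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# Deterministic condition: flip only samples with at least three elements.
long_only = apply_if(batch, lambda s: len(s) >= 3, flip)
print(long_only)

# Random condition: each sample is flipped with probability 0.5.
augmented = random_apply(batch, flip, probability=0.5, rng=random.Random(0))
print(augmented)
```

The point is that the decision is made per sample, so one batch can contain both processed and untouched samples.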

Extending input format support:

enabling support of variable frame rate videos
- [In progress] Add frames decoder #3362, Add VideoReaderDecoder #3391
enabling additional video formats:
- [Done] VP8 and MJPEG - Add support for VP8 and MJPEG videos #3045
parallelization of external source operators to scale to multiple CPU cores
- [Done] Parallel External Source for per-sample callbacks - Run external source callback in parallel #2543
- [Done] Support for lambdas and nested functions in Parallel External Source - Support lambdas and local functions as callbacks in parallel ExternalSource #3269
external source support for TensorFlow - Add experimental input support to TF DALIDataset #2997, Enable no_copy mode handling in TF DALI Dataset #3058, Add batch support to DALI Dataset #3089
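
As a rough illustration of the per-sample callbacks mentioned above, the sketch below shows a stateless callable whose output depends only on the sample descriptor it receives. The `SampleInfo` namedtuple is a stand-in written for this example (assumed to mirror the fields of `nvidia.dali.types.SampleInfo`), and `SquaresSource` is a hypothetical name. The example returns plain lists so it runs without DALI; a real callback would return NumPy arrays.

```python
from collections import namedtuple

# Stand-in for the descriptor passed to per-sample callbacks
# (assumption: field names mirror nvidia.dali.types.SampleInfo).
SampleInfo = namedtuple("SampleInfo", ["idx_in_epoch", "idx_in_batch", "iteration"])


class SquaresSource:
    """Stateless per-sample callback: the result depends only on `sample_info`."""

    def __init__(self, num_samples):
        self.num_samples = num_samples

    def __call__(self, sample_info):
        if sample_info.idx_in_epoch >= self.num_samples:
            # Signal that the epoch has ended.
            raise StopIteration
        idx = sample_info.idx_in_epoch
        return [float(idx), float(idx * idx)]


# Simulate what a worker would do for the second batch (iteration 1) of size 4.
source = SquaresSource(num_samples=10)
batch_size = 4
iteration = 1
batch = [
    source(SampleInfo(iteration * batch_size + i, i, iteration))
    for i in range(batch_size)
]
print(batch)
```

Keeping the callback free of mutable state is what allows copies of it to run in separate worker processes. In DALI this would presumably be wired up with `fn.external_source(source=source, batch=False, parallel=True)`.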

Performance:

reducing memory consumption and allocation overhead by introducing custom, pooled memory allocator - DALI core allocation functions #2930, Default memory resources #2890, Composite resource + renaming. #2891, Pinned async resource #2858, Asynchronous pool memory resource #2814, Update memory resource interfaces. #2742, Integrate RMM #2609, Use default resources for allocating tensors #2948, Make data objects stream-aware #3536
acceleration for audio decoding - providing acceleration for decoding various audio formats
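
The idea behind a pooled allocator can be sketched in a few lines of plain Python: released buffers are kept on a free list and handed out again instead of being allocated anew. This is a conceptual toy, not DALI's memory resource code. `BufferPool` is a hypothetical name, and matching on exact size is a simplification made for the example.

```python
class BufferPool:
    """Toy pool that reuses released buffers of the same size."""

    def __init__(self):
        self._free = {}        # size -> list of released buffers
        self.allocations = 0   # number of real allocations performed

    def acquire(self, size):
        free_list = self._free.get(size)
        if free_list:
            # Reuse a previously released buffer: no new allocation.
            return free_list.pop()
        self.allocations += 1
        return bytearray(size)

    def release(self, buf):
        # Return the buffer to the pool instead of freeing it.
        self._free.setdefault(len(buf), []).append(buf)


pool = BufferPool()
for _ in range(5):
    buf = pool.acquire(1024)
    pool.release(buf)

print(pool.allocations)  # prints 1: the same buffer was reused five times
```

After the first iteration, every `acquire` is served from the free list, which is where the reduction in allocation overhead comes from.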

New transformations:

We are constantly extending the set of operations supported by DALI. Currently, this section lists the most notable additions to our areas of interest that were done this year. This list is not exhaustive and we plan on expanding the set of operators as the needs or requests arise.

new transformations for image processing
- JpegCompressionDistortion operator - Add JpegCompressionDistortion CPU and GPU operators #2823
- MultiPaste operator - GPU MultiPaste #2681, MultiPaste operator #2583
- Gridmask operator - Gridmask Gpu #2652, Gridmask Cpu #2582
- RandomObjectBBox operator - Operator RandomObjectBBox #2657
- ROIRandomCrop operator - Add ROIRandomCrop operator #2638
new transformations for audio processing
- time-major layout in the Spectrogram operator - Enable time-major layout in Spectrogram CPU #2619, Time major Spectrogram (GPU-only) #2617
- time-major layout in the MelFilterBank GPU operator - Enable support for different layouts in the MelFilterBank GPU Op #2620
new transformations for 3D/volumetric data processing
- RandomObjectBBox operator - Operator RandomObjectBBox #2657
- ROIRandomCrop operator - Add ROIRandomCrop operator #2638
new transformations for video processing
new generic operations
- SaltAndPepper operator - Add SaltAndPepper GPU operator #2956, Add Salt and Pepper noise CPU operator #2889
- ShotNoise operator - Add ShotNoise CPU and GPU operators #2861
- more mathematical operations - Add more mathematical operations #2853
- squeeze operator - Add squeeze operator #2792
- NumbaFunc operator - Add NumbaFunc operator #2804
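
For readers unfamiliar with salt-and-pepper noise, here is a minimal plain-Python sketch of the technique: each pixel is replaced, with some probability, by either the maximum ("salt") or the minimum ("pepper") value. It illustrates the noise model only and is not the DALI operator; the function and parameter names are chosen for this example.

```python
import random


def salt_and_pepper(pixels, prob, salt_vs_pepper=0.5, low=0, high=255, rng=None):
    """Replace each pixel with `high` or `low` with probability `prob`."""
    rng = rng or random.Random()
    out = []
    for value in pixels:
        if rng.random() < prob:
            # Noisy pixel: choose salt (high) or pepper (low).
            out.append(high if rng.random() < salt_vs_pepper else low)
        else:
            out.append(value)
    return out


image = [128] * 1000  # flat gray 1-D "image"
noisy = salt_and_pepper(image, prob=0.1, rng=random.Random(42))
changed = sum(1 for v in noisy if v != 128)
print(f"{changed} of {len(image)} pixels were replaced")
```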

zhimengf · 2021-06-04T01:37:54Z

I'm really looking forward to support for YOLO v3/v4, including:

- Multiscale training. The input images' height and width are randomly generated and change every N training steps. Currently DALI can only resize images to a fixed size, so this is not supported out of the box.
- YOLO augmentations: HSV augmentation (YOLO style) and jitter (YOLO style, not color jitter).
- YOLO label encoder.

JanuszL · 2021-06-07T09:04:26Z

Hi @zhimengf,

> I'm really looking forward to support for YOLO v3/v4, including:
>
> Multiscale training. The input images' height and width are randomly generated and change every N training steps. Currently DALI can only resize images to a fixed size, so this is not supported out of the box.

In this case you can use the external source operator to load the parameters and pass them to other operators. The resize operator accepts the outputs of other operators, including external source, as arguments.

> YOLO augmentations: HSV augmentation (YOLO style) and jitter (YOLO style, not color jitter).

Could you be more specific about what "YOLO style" means? What does DALI do now, and what would you expect instead?

> YOLO label encoder.

We don't plan to add such an encoder, as it is specific to YOLO. We have an encoder for SSD because it was the very first detection network DALI supported; we would prefer to rely on the community to add encoders for other networks. Moreover, some implementations make the encoding part of the network itself rather than part of the data processing.

zhimengf · 2021-06-09T07:58:44Z

Hi @JanuszL

Could you give a short example of how to achieve this with the DALI API?
I just mean that the HSV augmentation in YOLO v3 is not the same as the HSV augmentation already implemented in DALI, so I would like DALI to provide the exact HSV augmentation used in YOLO v3.
That makes sense to me. Thanks!

JanuszL · 2021-06-10T08:31:11Z

Hi @zhimengf,

This simple example shows how you can use an external source to feed parameters to resize. You can read more about the external_source usage in this example.

import os
import numpy as np
import nvidia.dali.fn as fn
import nvidia.dali.types as types
from nvidia.dali import pipeline_def

batch_size = 64

def get_data():
    # One target size per sample (one value per spatial dimension),
    # drawn uniformly from [30, 90).
    return [np.random.uniform(30, 90, size=2).astype(np.float32) for _ in range(batch_size)]

@pipeline_def
def simple_pipeline():
    jpegs, _ = fn.readers.file(file_root=os.environ['DALI_EXTRA_PATH'] + "/db/single/jpeg")
    images = fn.decoders.image(jpegs)
    size = fn.external_source(source=get_data)
    images = fn.resize(images, size=size)
    return images

pipe = simple_pipeline(batch_size=batch_size, num_threads=4, prefetch_queue_depth=2, device_id=0)
pipe.build()
# Run a few iterations; the printed shape of the first sample changes
# because external_source supplies new sizes on every run.
for _ in range(3):
    out = pipe.run()[0]
    print(np.array(out[0]).shape)
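
The example above draws a new size for every sample. For the multiscale training described earlier in the thread, where the size changes only every N steps, the callback could keep a small amount of state. Below is a minimal plain-Python sketch of such a schedule. `MultiScaleSizes` is a hypothetical helper, and the 320 to 608 range with a step of 32 is an assumption, not something specified in this thread.

```python
import random


class MultiScaleSizes:
    """Returns one [height, width] pair per sample; redrawn every `period` calls."""

    def __init__(self, batch_size, period=10, min_size=320, max_size=608,
                 stride=32, seed=0):
        self.batch_size = batch_size
        self.period = period
        # Candidate side lengths, e.g. 320, 352, ..., 608 (assumed values).
        self.choices = list(range(min_size, max_size + 1, stride))
        self.rng = random.Random(seed)
        self.calls = 0
        self.current = None

    def __call__(self):
        # Pick a new square size at the start of every `period`-call window.
        if self.calls % self.period == 0:
            side = float(self.rng.choice(self.choices))
            self.current = [side, side]
        self.calls += 1
        return [list(self.current) for _ in range(self.batch_size)]


sizes = MultiScaleSizes(batch_size=4, period=3)
history = [sizes() for _ in range(6)]
for step, batch in enumerate(history):
    print(step, batch[0])
```

To pass this to `fn.external_source` in place of `get_data`, each pair would likely need to be converted to a `np.float32` array, as `get_data` does.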

I'm sorry, but based on your description I still don't understand where the difference is (I'm not familiar with how HSV/jitter works or is defined in YOLO v3). Could you describe the behavior you expect? Maybe the API is just different, and you can still use DALI but need to express the transformation differently.

JanuszL · 2022-03-30T12:54:56Z

We have 2022 now. Closing this in favor of #3774.

carmocca · 2022-05-19T17:04:13Z

The README roadmap section still points to this outdated issue: https://github.com/NVIDIA/DALI#dali-roadmap

JanuszL · 2022-05-19T18:36:29Z

@carmocca - good catch. #3918 should fix that.

JanuszL pinned this issue May 19, 2021

awolant unpinned this issue Jun 24, 2021

awolant pinned this issue Jun 24, 2021

JanuszL mentioned this issue Sep 9, 2021

Pythonic FITS reader #1273

Open

JanuszL mentioned this issue Sep 17, 2021

Can DALI read from HDF5 file? #1252

Open

gau-nernst mentioned this issue Feb 6, 2022

checklist gau-nernst/vision-toolbox#1

Open

12 tasks

jantonguirao assigned JanuszL Mar 30, 2022

JanuszL mentioned this issue Mar 30, 2022

DALI 2022 roadmap #3774

Closed

JanuszL closed this as completed Mar 30, 2022

klecki unpinned this issue Mar 31, 2022
