DALI 2021 roadmap #2978

Closed
JanuszL opened this issue May 19, 2021 · 7 comments

JanuszL commented May 19, 2021

The following represents a high-level overview of our 2021 plan. You should be aware that this roadmap may change at any time and the order below does not reflect any type of priority.

We strongly encourage you to comment on our roadmap and provide us feedback on this issue here.

Improving Usability:

Extending input format support:

Performance:

New transformations:

We are constantly extending the set of operations supported by DALI. Currently, this section lists the most notable additions to our areas of interest that were done this year. This list is not exhaustive and we plan on expanding the set of operators as the needs or requests arise.

JanuszL pinned this issue May 19, 2021

zhimengf commented Jun 4, 2021

I'm really looking forward to support for YOLO v3/v4, including:

  1. Multiscale training. The input images' height and width are randomly generated and change every N training steps. Currently DALI can only resize images to a fixed size, so this is not supported out of the box.
  2. YOLO augmentations: HSV augmentation (YOLO style) and jitter (YOLO style, not color jitter).
  3. YOLO label encoder.

JanuszL commented Jun 7, 2021

Hi @zhimengf,

> I'm really looking forward to support for YOLO v3/v4, including:
>
>   1. Multiscale training. The input images' height and width are randomly generated and change every N training steps. Currently DALI can only resize images to a fixed size, so this is not supported out of the box.
>   2. YOLO augmentations: HSV augmentation (YOLO style) and jitter (YOLO style, not color jitter).
>   3. YOLO label encoder.

  1. In this case you can use the external source operator to load the parameters and pass them to the other operators. The resize operator accepts the outputs of other operators, including the external source, as its arguments.
  2. Could you be more specific about what "YOLO style" means: what does DALI do now, and what would you expect instead?
  3. We don't plan to add such an encoder, as it is specific to YOLO. We have an encoder for SSD because that was the very first detection network DALI supported; we would prefer to rely on the community to add encoders for other networks. Moreover, some implementations make the encoding part of the network itself rather than part of the data processing.

zhimengf commented Jun 9, 2021

Hi @JanuszL

  1. Could you give a short example of how to achieve this with the DALI API?
  2. I just mean that the HSV augmentation in YOLO v3 is not the same as the HSV augmentation already implemented in DALI, so I'm expecting DALI to provide the exact HSV augmentation used in YOLO v3.
  3. That makes sense to me. Thanks!

JanuszL commented Jun 10, 2021

Hi @zhimengf,

  1. This simple example shows how you can use an external source to feed parameters to resize. You can read more about the external_source usage in this example.
import os
import numpy as np
import nvidia.dali.fn as fn
from nvidia.dali import pipeline_def

batch_size = 64

def get_data():
    # One 2-element size per sample (one entry per spatial dimension),
    # drawn uniformly from [30, 90).
    return [np.random.uniform(30, 90, size=2).astype(np.float32) for _ in range(batch_size)]

@pipeline_def
def simple_pipeline():
    jpegs, _ = fn.readers.file(file_root=os.path.join(os.environ['DALI_EXTRA_PATH'], "db/single/jpeg"))
    images = fn.decoders.image(jpegs)
    # external_source calls get_data once per iteration to obtain a batch of sizes.
    size = fn.external_source(source=get_data)
    # The per-sample sizes are passed to resize as an argument input.
    images = fn.resize(images, size=size)
    return images

pipe = simple_pipeline(batch_size=batch_size, num_threads=4, prefetch_queue_depth=2, device_id=0)
pipe.build()
# Each run draws new sizes, so the shape of the first sample changes between iterations.
for _ in range(3):
    out = pipe.run()[0]
    print(np.array(out[0]).shape)
  2. I'm sorry, but I still don't understand where the difference is based on your description (I'm not familiar with how HSV/jitter works or is defined in YOLO v3). Could you describe the behavior you expect? Maybe it is just a different API, and you can still use DALI but need to express the transformation differently?
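
For the multiscale-training case from item 1 (one size shared by the whole batch, changing every N steps), the size source could be sketched roughly as below. This is only an illustration under assumptions: `MultiScaleSizes` is a hypothetical helper, not part of DALI, and the size list and the interval are made-up example values.

```python
import random


class MultiScaleSizes:
    """Hypothetical helper (not a DALI API): returns one [height, width] pair
    per sample and picks a new square size every `every_n` calls."""

    def __init__(self, batch_size, every_n, choices, seed=None):
        self.batch_size = batch_size
        self.every_n = every_n
        self.choices = list(choices)
        self._rng = random.Random(seed)
        self._calls = 0
        self._current = None

    def __call__(self):
        # Draw a new size at the start of every block of `every_n` batches.
        if self._calls % self.every_n == 0:
            self._current = float(self._rng.choice(self.choices))
        self._calls += 1
        # Every sample in the batch shares the same target size.
        return [[self._current, self._current] for _ in range(self.batch_size)]


# Illustrative values only: multiples of 32 from 320 to 608, changing every 10 batches.
sizes = MultiScaleSizes(batch_size=64, every_n=10, choices=range(320, 609, 32))
first = sizes()
print(len(first), first[0])
```

To feed this into the pipeline, each pair would still need to be converted to a float32 array, for example `fn.external_source(source=lambda: [np.array(s, dtype=np.float32) for s in sizes()])` in place of `get_data`, with the helper's `batch_size` matching the pipeline's.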

JanuszL commented Mar 30, 2022

It is 2022 now. Closing this in favor of #3774.

JanuszL closed this as completed Mar 30, 2022
klecki unpinned this issue Mar 31, 2022
@carmocca

The README roadmap section still points to this outdated issue: https://github.com/NVIDIA/DALI#dali-roadmap

JanuszL commented May 19, 2022

@carmocca - good catch. #3918 should fix that.
