diff --git a/nemo/collections/diffusion/readme.rst b/nemo/collections/diffusion/readme.rst
index 13627ac468c7..38df88c13955 100644
--- a/nemo/collections/diffusion/readme.rst
+++ b/nemo/collections/diffusion/readme.rst
@@ -124,7 +124,7 @@ Parallel Configuration
 Energon's architecture allows it to efficiently distribute data across multiple processing units, ensuring that each GPU or node receives a balanced workload. This parallelization not only increases the overall throughput of data processing but also helps in maintaining high utilization of available computational resources.
 
-Mixed Image-Video Training (comming soon)
+Mixed Image-Video Training
 ------------------------------
 
 Our dataloader provides support for mixed image-video training by using the NeMo packed sequence feature to pack together images and videos of varying length into the same microbatch. The sequence packing mechanism uses the THD attention kernel, which allows us to increase the model FLOPs utilization (MFU) and efficiently process data with varying length.
 
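
The packed-sequence paragraph in this hunk can be made concrete with a minimal sketch of the bookkeeping involved. This is not the actual NeMo packed sequence API: the `pack_sequences` helper, the flat `[tokens, hidden]` layout, and the sample sizes are illustrative assumptions. The point is the `cu_seqlens` convention that THD-style variable-length attention kernels consume, which lets images and videos of different lengths share one microbatch without padding.

```python
# Hypothetical sketch of THD-style sequence packing (names are illustrative,
# not the NeMo packed sequence API). Variable-length token sequences are
# concatenated along the token axis and described by cumulative sequence
# lengths (cu_seqlens), the metadata a THD attention kernel consumes.
import torch

def pack_sequences(samples: list[torch.Tensor]) -> tuple[torch.Tensor, torch.Tensor]:
    """Pack [s_i, h] token tensors into one [sum(s_i), h] tensor plus cu_seqlens."""
    seq_lens = torch.tensor([s.shape[0] for s in samples], dtype=torch.int32)
    # cu_seqlens marks each sequence boundary: [0, s_0, s_0 + s_1, ...]
    cu_seqlens = torch.cat(
        [torch.zeros(1, dtype=torch.int32), seq_lens.cumsum(0, dtype=torch.int32)]
    )
    packed = torch.cat(samples, dim=0)  # flat layout: total tokens x hidden
    return packed, cu_seqlens

# Example: a short image sample (256 patch tokens, size assumed) and a longer
# video sample (2048 tokens, size assumed) share the same microbatch.
hidden = 64
image_tokens = torch.randn(256, hidden)
video_tokens = torch.randn(2048, hidden)
packed, cu_seqlens = pack_sequences([image_tokens, video_tokens])
print(packed.shape)  # torch.Size([2304, 64])
print(cu_seqlens)    # tensor([   0,  256, 2304], dtype=torch.int32)
```

Because attention is computed per segment delimited by `cu_seqlens` rather than over a padded rectangular batch, no FLOPs are spent on padding tokens, which is the MFU gain the README paragraph refers to.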