Merge pull request #101 from okotaku/feat/docs_aspect_ratio_bucketing
[Docs] Aspect Ratio Bucketing
okotaku authored Nov 29, 2023
2 parents c85520a + 329b71e commit 1854dac
Showing 2 changed files with 78 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -14,6 +14,7 @@ Welcome to diffengine's documentation!

user_guides/config.md
user_guides/dataset_prepare.md
user_guides/aspect_ratio_bucketing.md


.. _RunGuides:
77 changes: 77 additions & 0 deletions docs/source/user_guides/aspect_ratio_bucketing.md
@@ -0,0 +1,77 @@
# Aspect Ratio Bucketing

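Aspect ratio bucketing groups training images into buckets of similar aspect ratio, so each batch shares one resolution instead of every image being cropped to a single square size. Training this way can greatly improve the quality of outputs.
For more details, see [NovelAI Aspect Ratio Bucketing](https://github.com/NovelAI/novelai-aspect-ratio-bucketing).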
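
As a rough illustration of the idea, the sketch below assigns an image to the bucket whose aspect ratio is closest to its own. It is not the diffengine implementation: `nearest_bucket` is a hypothetical helper, the bucket list simply copies the `sizes` used in the configs in this guide, and reading each entry as `[height, width]` is an assumption.

```python
# Illustrative sketch only; not diffengine's implementation.
# Assumption: each entry is [height, width].
SIZES = [
    [640, 1536], [768, 1344], [832, 1216], [896, 1152],
    [1024, 1024], [1152, 896], [1216, 832], [1344, 768], [1536, 640],
]


def nearest_bucket(height, width, sizes=SIZES):
    """Return the bucket whose aspect ratio is closest to the image's."""
    ratio = height / width
    return min(sizes, key=lambda size: abs(size[0] / size[1] - ratio))


print(nearest_bucket(1080, 1920))  # landscape photo
print(nearest_bucket(512, 512))    # square image
```

An image is then resized and center-cropped to its bucket's resolution, so only a small part of the picture is lost compared with a square crop.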

## Finetune

To use aspect ratio bucketing for fine-tuning, follow these steps:

1. Update the dataset config.

In `train_pipeline`, replace `torchvision/Resize` and `RandomCrop` with `MultiAspectRatioResizeCenterCrop`. In `train_dataloader`, set `batch_sampler` to `AspectRatioBatchSampler`.

```
train_pipeline = [
    dict(type="SaveImageShape"),
    dict(type="MultiAspectRatioResizeCenterCrop",
         sizes=[
             [640, 1536], [768, 1344], [832, 1216], [896, 1152],
             [1024, 1024], [1152, 896], [1216, 832], [1344, 768], [1536, 640]
         ],
         interpolation="bilinear"),
    dict(type="RandomHorizontalFlip", p=0.5),
    dict(type="ComputeTimeIds"),
    dict(type="torchvision/ToTensor"),
    dict(type="torchvision/Normalize", mean=[0.5], std=[0.5]),
    dict(type="PackInputs", input_keys=["img", "text", "time_ids"]),
]
train_dataloader = dict(
    ...
    dataset=dict(
        ...
        pipeline=train_pipeline),
    sampler=dict(type="DefaultSampler", shuffle=True),
    batch_sampler=dict(type="AspectRatioBatchSampler"),
)
```

2. Run training.
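
The batch sampler matters because images in one batch must share a shape to be stacked into a tensor. The sketch below shows the general grouping idea only: `bucketed_batches` is a hypothetical helper, the bucket labels are made up, and dropping leftover partial batches is a simplification of this sketch, not a statement about how `AspectRatioBatchSampler` behaves.

```python
from collections import defaultdict


def bucketed_batches(bucket_ids, batch_size):
    """Yield lists of sample indices where each batch shares one bucket id."""
    pending = defaultdict(list)
    for index, bucket in enumerate(bucket_ids):
        pending[bucket].append(index)
        if len(pending[bucket]) == batch_size:
            # The bucket is full: emit it as a batch and start a new one.
            yield pending.pop(bucket)
    # Leftover partial batches are dropped in this sketch.


bucket_ids = ["wide", "tall", "wide", "square", "tall", "wide"]
batches = list(bucketed_batches(bucket_ids, batch_size=2))
print(batches)
```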

## ControlNet

To use aspect ratio bucketing with ControlNet, follow these steps:

1. Update the dataset config.

```
train_pipeline = [
    dict(type="SaveImageShape"),
    dict(
        type="MultiAspectRatioResizeCenterCrop",
        sizes=[
            [640, 1536], [768, 1344], [832, 1216], [896, 1152],
            [1024, 1024], [1152, 896], [1216, 832], [1344, 768], [1536, 640]
        ],
        interpolation="bilinear",
        keys=["img", "condition_img"]),
    dict(type="RandomHorizontalFlip", p=0.5, keys=["img", "condition_img"]),
    dict(type="ComputeTimeIds"),
    dict(type="torchvision/ToTensor", keys=["img", "condition_img"]),
    dict(type="DumpImage", max_imgs=10, dump_dir="work_dirs/dump"),
    dict(type="torchvision/Normalize", mean=[0.5], std=[0.5]),
    dict(
        type="PackInputs",
        input_keys=["img", "condition_img", "text", "time_ids"]),
]
train_dataloader = dict(
    ...
    dataset=dict(
        ...
        pipeline=train_pipeline),
    sampler=dict(type="DefaultSampler", shuffle=True),
    batch_sampler=dict(type="AspectRatioBatchSampler"),
)
```

2. Run training.
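
The `keys=["img", "condition_img"]` arguments in the config above keep the image and its condition image spatially aligned. The sketch below illustrates that idea with a horizontal flip on nested lists: one random decision is made and applied to every listed key. `random_hflip` is a hypothetical stand-in written for this guide, not the diffengine transform, and sharing a single decision across keys is an assumption about the intended behavior.

```python
import random


def random_hflip(results, keys, p=0.5, rng=random):
    """Flip every image listed in `keys` together, or flip none of them."""
    if rng.random() < p:  # one decision shared by all keys
        for key in keys:
            # Each image is a list of rows; reversing a row mirrors it.
            results[key] = [row[::-1] for row in results[key]]
    return results


sample = {"img": [[1, 2, 3]], "condition_img": [[4, 5, 6]], "text": "a cat"}
flipped = random_hflip(sample, keys=["img", "condition_img"], p=1.0)
print(flipped["img"], flipped["condition_img"])
```

Flipping only `img` would leave the condition image describing a mirrored scene, which is why both keys are listed on each spatial transform.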
