Tiled Diffusion

Kahsolt edited this page Mar 30, 2024 · 1 revision

TiledDiffusion

  • From the illustration, you can see how an image is split into tiles.
    • In each step, each tile in the latent space will be sent to Stable Diffusion UNet.
    • The tiles are split and fused over and over again until all steps are completed.
  • What is a good tile size?
    • A larger tile size will increase the speed because it produces fewer tiles.
    • However, the optimal size depends on your checkpoint. The basic SD1.4 is only good at drawing 512 * 512 images (768 * 768 for SD2.1), and most checkpoints cannot generate good pictures larger than 1280 * 1280. The latent space is 8 times smaller per side, so dividing these sizes by 8 gives a range of 64 to 160.
    • Hence, you should pick a value between 64 and 160.
    • Personally, I recommend 96 or 128 for fast speed.
  • What is a good overlap?
    • The overlap reduces seams in fusion. A larger overlap means fewer seams, but it significantly reduces the speed because many more tiles have to be redrawn.
    • Compared to MultiDiffusion, Mixture of Diffusers requires less overlap because it uses Gaussian smoothing (and therefore can be faster).
    • Personally, I recommend 32 or 48 for MultiDiffusion, and 16 or 32 for Mixture of Diffusers.
  • An Upscaler option will appear in img2img. You can select one to upscale your image in advance.
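
The split-and-fuse idea above can be sketched along a single axis in plain Python. This is only a minimal illustration, not the extension's actual code: the helper names (`tile_starts`, `gaussian_weights`, `fuse`), the even spacing of tiles, and the `sigma` value are all assumptions made for the example. In a real run the UNet would modify each tile between the split and the fusion; here the tiles are left untouched, so fusing simply reproduces the input.

```python
import math

def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering `length` latent units on one axis."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be smaller than the tile size")
    if tile >= length:
        return [0]  # a single tile already covers the axis
    stride = tile - overlap
    count = math.ceil((length - overlap) / stride)
    # Spread the tiles evenly so the last one ends exactly at the edge.
    step = (length - tile) / (count - 1)
    return [round(i * step) for i in range(count)]

def gaussian_weights(tile, sigma=0.25):
    """Bell-shaped weights per position; sigma is a fraction of the tile size."""
    center = (tile - 1) / 2
    return [math.exp(-((i - center) ** 2) / (2 * (sigma * tile) ** 2))
            for i in range(tile)]

def fuse(tiles, starts, length, weights=None):
    """Weighted average of overlapping 1-D tiles back into one row."""
    total = [0.0] * length
    norm = [0.0] * length
    for values, start in zip(tiles, starts):
        for i, value in enumerate(values):
            w = 1.0 if weights is None else weights[i]
            total[start + i] += w * value
            norm[start + i] += w
    return [t / n for t, n in zip(total, norm)]

# A 4096 px wide image is 512 latent units wide (4096 / 8).
starts = tile_starts(4096 // 8, tile=96, overlap=32)
row = [float(i) for i in range(512)]
tiles = [row[s:s + 96] for s in starts]                   # split
uniform = fuse(tiles, starts, 512)                        # plain averaging
smooth = fuse(tiles, starts, 512, gaussian_weights(96))   # Gaussian-weighted
```

Uniform averaging treats every overlapping pixel equally, while the Gaussian weights favor the center of each tile. That is one way to see why a smoothed fusion can get away with a smaller overlap.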

txt2img: generating ultra-large images

ℹ Please use simple positive prompts at the top of the page, as they will be applied to each tile.

ℹ If you want to add objects at a specific position, use regional prompt control and enable "draw full canvas background".

Example 1: masterpiece, best quality, highres, city skyline, night.

panorama

Example 2: working with ControlNet to convert ancient wide paintings

  • 22020 x 1080 ultra-wide image conversion

Example 3: 2560 * 1280 large image drawing

  • ControlNet (canny edge)

Your Name yourname


img2img: upscaling for details

Leverage Tiled Diffusion to upscale & redraw large images

Example: 1024 * 800 -> 4096 * 3200 image, with default params

  • Params:

    • denoise = 0.4, steps = 20, Sampler = Euler a, Upscaler = RealESRGAN++, Negative Prompts = EasyNegative
    • Ckpt: Gf-style2 (4GB version), CFG Scale = 14, Clip Skip = 2
    • method = MultiDiffusion, tile batch size = 8, tile size height = 96, tile size width = 96, overlap = 32
    • Prompt = masterpiece, best quality, highres, extremely detailed 8k wallpaper, very clear
  • Before upscaling: lowres

  • After 4x upscale, no cherry-picking, 1min12s on an NVIDIA Tesla V100 (a 2x upscale completes in 10s): highres
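
To get a feel for the cost of these parameters, here is a rough tile-count estimate. It is a sketch under assumptions: the per-axis count formula and the helper names (`tiles_along`, `tile_batches`) are illustrative, and the extension's real tile layout may differ slightly.

```python
import math

def tiles_along(length, tile, overlap):
    """Assumed number of tiles needed to cover one latent axis."""
    if tile >= length:
        return 1
    return math.ceil((length - overlap) / (tile - overlap))

def tile_batches(width_px, height_px, tile, overlap, batch_size, scale=8):
    """Latent size, tile count and UNet batches per sampling step."""
    lat_w, lat_h = width_px // scale, height_px // scale
    total = tiles_along(lat_w, tile, overlap) * tiles_along(lat_h, tile, overlap)
    return {
        "latent": (lat_w, lat_h),
        "tiles": total,
        "batches_per_step": math.ceil(total / batch_size),
    }

# The 4096 * 3200 example above: tile 96, overlap 32, tile batch size 8.
example = tile_batches(4096, 3200, tile=96, overlap=32, batch_size=8)
print(example)  # {'latent': (512, 400), 'tiles': 48, 'batches_per_step': 6}

# Same image with a larger tile and a smaller overlap.
faster = tile_batches(4096, 3200, tile=128, overlap=16, batch_size=8)
print(faster)   # {'latent': (512, 400), 'tiles': 20, 'batches_per_step': 3}
```

Under these assumptions, moving from tile 96 / overlap 32 to tile 128 / overlap 16 halves the number of batches per step, which is the speed trade-off discussed for tile size and overlap.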

Special tips for Upscaling

  • Recommended parameters for efficient upscaling:
    • Sampler = Euler a, steps = 20, denoise = 0.35, method = Mixture of Diffusers, Latent tile height & width = 128, overlap = 16, tile batch size = 8 (reduce the tile batch size if you see CUDA out of memory).
  • Tiled Diffusion is compatible with masked inpainting.
    • If you want to keep some parts, or Tiled Diffusion gives you weird results, just mask these areas.
  • The checkpoint is crucial.
    • MultiDiffusion works very similarly to highres.fix, so it relies heavily on your checkpoint.
    • A checkpoint that is good at drawing details can add amazing details to your image.
    • A full checkpoint instead of a pruned one can yield much finer results.
  • Don't include any concrete objects in your main prompt; otherwise, the results will be ruined.
    • Just use something like "highres, masterpiece, best quality, ultra-detailed 8k wallpaper, extremely clear".
    • And use regional prompt control for concrete objects if you like.
  • You don't need a very large tile size, a large overlap, or many denoising steps; any of these can make generation very slow.
  • CFG scale can significantly affect the details.
    • A large CFG scale (e.g., 14) gives you much more detail.
  • You can control how much the original image changes with a denoising strength between 0.1 and 0.6.
  • If your results are still not as satisfying as mine, see our discussions here.
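
As a quick sanity check, the ranges mentioned on this page can be gathered into a small helper. This is a hypothetical sketch: `UpscaleSettings` and `check` are made-up names for illustration, not part of the extension or the WebUI, and the defaults simply mirror the recommended upscaling parameters above.

```python
from dataclasses import dataclass

@dataclass
class UpscaleSettings:
    """Hypothetical container for the settings discussed on this page."""
    method: str = "Mixture of Diffusers"
    sampler: str = "Euler a"
    steps: int = 20
    denoise: float = 0.35
    tile_size: int = 128      # latent tile height & width
    overlap: int = 16
    tile_batch_size: int = 8

def check(settings):
    """Return warnings for values outside the ranges suggested above."""
    warnings = []
    if not 64 <= settings.tile_size <= 160:
        warnings.append("tile size should be between 64 and 160")
    if settings.overlap >= settings.tile_size:
        warnings.append("overlap must be smaller than the tile size")
    if not 0.1 <= settings.denoise <= 0.6:
        warnings.append("denoising strength should be between 0.1 and 0.6")
    return warnings

print(check(UpscaleSettings()))                             # no warnings
print(check(UpscaleSettings(tile_size=256, denoise=0.9)))   # two warnings
```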