Diffusion models

Also: Stable Diffusion and 3D

Tutorials

Papers

Our neural network architecture follows the backbone of PixelCNN++, which is a U-Net based on a Wide ResNet

Denoising Diffusion Probabilistic Models

condition on whole pixels, rather than R/G/B sub-pixels

PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
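A minimal sketch (PyTorch) of the kind of Wide-ResNet-style residual block the DDPM U-Net backbone is built from, with the timestep embedding injected between the two convolutions. Layer names and normalization choices here are my assumptions, not the paper's actual code.

```python
# A rough Wide-ResNet-style residual block with a timestep embedding,
# approximating the building block of the DDPM U-Net backbone.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, time_dim: int):
        super().__init__()
        self.norm1 = nn.GroupNorm(8, in_ch)
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.time_proj = nn.Linear(time_dim, out_ch)  # inject timestep embedding
        self.norm2 = nn.GroupNorm(8, out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.skip = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x: torch.Tensor, t_emb: torch.Tensor) -> torch.Tensor:
        h = self.conv1(torch.relu(self.norm1(x)))
        h = h + self.time_proj(t_emb)[:, :, None, None]  # broadcast over H, W
        h = self.conv2(torch.relu(self.norm2(h)))
        return h + self.skip(x)
```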

Extra

It's possible to get access to a TPU v3-8 (roughly comparable to 8 V100 GPUs) through the Google TRC program.

On top of that you need super-resolution models, for example from IF, plus a tutorial on running them with a small memory footprint.
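A minimal sketch (diffusers) of loading a text-to-image pipeline with a reduced memory footprint; the model id and exact savings are assumptions to check against your own setup.

```python
# Rough low-memory setup for a diffusers pipeline: half precision, CPU offload
# (needs accelerate installed), and attention slicing to lower peak VRAM.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",   # illustrative model id
    torch_dtype=torch.float16,            # half precision halves weight memory
)
pipe.enable_model_cpu_offload()           # keep submodules on CPU until needed
pipe.enable_attention_slicing()           # trade speed for lower peak VRAM

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("out.png")
```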

Lessons

Diffusion model stalls and the loss is not improving

This is normal. Diffusion model losses often plateau around 0.1 or so; it's better not to fixate on that.

Why doesn't a larger batch size reduce training time?

TL;DR from this tweet:

  • Maximum speed: use the largest batch_size possible.
  • Maximum generalization: use a small batch_size and increase it throughout training.

There is a lot of confusion about neural network batch size during training; here is what I know. The batch_size is a balance between training speed and generalization performance. Generally, up to a certain limit (around 8-9 samples), the smaller the batch, the better the generalization performance on the validation set. In addition, increasing the batch_size throughout training also helps validation performance.

If you change your batch_size, it is important to change the learning_rate as well. A good rule of thumb is to scale it by the same ratio as the batch_size change: larger batches need a larger learning_rate, smaller batches need a smaller learning_rate. A sketch of both heuristics follows.
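A minimal sketch of the two heuristics above: scale the learning rate linearly with the batch size, and grow the batch size in stages during training. The tiny model and random data are placeholders, not a real setup.

```python
# Linear LR scaling + staged batch-size growth, on placeholder model/data.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

base_batch_size = 64
base_lr = 1e-4

def scaled_lr(batch_size: int) -> float:
    """Linear scaling rule: change lr by the same ratio as batch_size."""
    return base_lr * batch_size / base_batch_size

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder model
data = TensorDataset(torch.randn(512, 3, 32, 32), torch.randint(0, 10, (512,)))
loss_fn = nn.CrossEntropyLoss()

for batch_size in [64, 128, 256]:             # grow the batch size in stages
    loader = DataLoader(data, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=scaled_lr(batch_size))
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
```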
