Analysis Depot

The Analysis Depot is a repository of Jupyter notebooks that we initially used to analyze subtasks and to debug the code that was later run on the HPC cluster. It now houses helpful, in-depth descriptions, analyses, and visualizations of our implementations, training, sampling, experiments, and evaluation of our diffusion models and their neural backbone.

Background

Diffusion models (DMs) are a class of generative models that capture complex data distributions by simulating a stochastic process, known as a diffusion process, which gradually transforms samples from a simple initial distribution into samples from the data distribution. Concretely, the simple distribution is Gaussian noise, which is iteratively denoised into coherent images by modeling the data distribution of the training set.
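To make the diffusion process concrete, here is a minimal PyTorch sketch of the forward (noising) direction: the closed-form marginal q(x_t | x_0) under a linear β_t schedule. The β range (1e-4 to 0.02) is the common DDPM default and an assumption here; the chain length T = 1000 matches our models below. All names are illustrative and do not mirror the repository's code.

```python
import torch

T = 1000                                  # length of the Markov chain
betas = torch.linspace(1e-4, 0.02, T)     # linear beta_t schedule (assumed range)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)  # abar_t = prod_{s<=t} alpha_s

def q_sample(x0: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    noise = torch.randn_like(x0)
    ab = alpha_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise

# After T steps, a batch of images is statistically indistinguishable from pure noise:
x0 = torch.rand(4, 3, 128, 128) * 2 - 1                       # images scaled to [-1, 1]
xT = q_sample(x0, torch.full((4,), T - 1, dtype=torch.long))
```

The reverse (denoising) direction, which the neural backbone learns, inverts these steps one t at a time.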

Overview

The repository is divided into three sections according to the type of diffusion model: the unconditional, conditional, and latent DM. We train them on a variety of datasets to perform various generative tasks: unconditional and class-conditional image generation, inpainting, and latent super-resolution.

Each section contains notebooks with explanations and equations for the respective class of DM, the UNet architecture serving as the neural-network backbone, data loading, sampling, evaluation, etc. They document our thought process and provide further analysis of our implementation.

Trained Models

All our models share AdamW as the optimizer and CosineAnnealingLR as the scheduler (starting at a learning rate of 1e-4 and decaying to an eta_min of 1e-10), and each is trained jointly with an EMA model with a decay of 0.9999, as sketched below. The diffusion process is performed on a Markov chain of length T = 1000.
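Below is a minimal sketch of this shared setup. The `model` stand-in, `num_iters`, and the single annealing cycle are illustrative assumptions; the actual training loop is part of the code run on the cluster, not this repository.

```python
import copy
import torch

model = torch.nn.Conv2d(3, 3, 3)  # stand-in for the UNet backbone
num_iters = 450_000               # e.g. LHQ-UDM (see table below)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=num_iters, eta_min=1e-10)

ema_model = copy.deepcopy(model)  # EMA copy, updated after every optimizer step
ema_decay = 0.9999

@torch.no_grad()
def ema_update() -> None:
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(ema_decay).add_(p, alpha=1.0 - ema_decay)
```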

| Hyperparam. | LHQ-UDM | Bottleneck | No Attention | Celeb-UDM | Cosine Noise | True Variance | Class-CDM | No CFG | Inpaint-CDM | Small-CDM | LDM-16 | LDM-8 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Task | Uncond. | Uncond. | Uncond. | Uncond. | Uncond. | Uncond. | Class Cond. | Class Cond. | Inpainting | Inpainting | SuperRes | SuperRes |
| Dataset | LHQ | LHQ | LHQ | CelebAHQ | CelebAHQ | CelebAHQ | AFHQ | AFHQ | LHQ | LHQ | LHQ | LHQ |
| Split | 80-20 | 80-20 | 80-20 | 90-10 | 90-10 | 90-10 | 90-10 | 90-10 | 90-10 | 90-10 | 90-10 | 90-10 |
| Resolution | 128² px | 128² px | 128² px | 128² px | 128² px | 128² px | 128² px | 128² px | 128² px | 128² px | 512² px | 512² px |
| Noise β_t | linear | linear | linear | linear | cosine | linear | linear | linear | linear | linear | linear | linear |
| Variance σ² | same | same | same | same | same | true | same | same | same | same | same | same |
| CFG | - | - | - | - | - | - | yes | no | - | - | - | - |
| VQGAN-f | - | - | - | - | - | - | - | - | - | - | 16 | 8 |
| z-shape | - | - | - | - | - | - | - | - | - | - | (256,32,32) | (256,64,64) |
| Parameters | 37M | 37M | 34M | 37M | 37M | 37M | 37M | 37M | 37M | 11M | 496M | 137M |
| Channel Mults. | [1,2,4,4,8] | [1,2,4,8,10] | [1,2,4,4,8] | [1,2,4,4,8] | [1,2,4,4,8] | [1,2,4,4,8] | [1,2,4,4,8] | [1,2,4,4,8] | [1,2,4,4,8] | [1,2,2,2,4] | [1,2,4,4,8] | [1,2,2,4,0] |
| Attention | yes | yes | no | yes | yes | yes | yes | yes | yes | yes | no | yes |
| RF* per Block | 7x7 | 3x3 | 7x7 | 7x7 | 7x7 | 7x7 | 7x7 | 7x7 | 7x7 | 7x7 | 7x7 | 7x7 |
| Batch Size | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 6 | 8 |
| Iters. | 450K | 225K | 225K | 506K | 506K | 506K | 445K | 468K | 427K | 225K | 1.75M | 1M |
| Epochs | 200 | 100 | 100 | 600 | 600 | 600 | 950 | 1000 | 190 | 100 | 130 | 100 |
| Cosine Steps^ | 2 | 1 | 1 | 3 | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 1 |

*RF denotes the receptive field per block; ^Cosine Steps is the number of times the learning rate undergoes gradual reduction through cosine annealing.
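With T_max equal to the full training length, CosineAnnealingLR yields a single gradual reduction, so several Cosine Steps imply repeated annealing cycles. One hypothetical way to wire this up is with warm restarts; this is an illustrative assumption, not necessarily how the repository schedules it.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Conv2d(3, 3, 3)      # stand-in module, as above
num_iters, cosine_steps = 450_000, 2  # e.g. LHQ-UDM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One full anneal from 1e-4 down to eta_min every num_iters // cosine_steps iterations.
scheduler = CosineAnnealingWarmRestarts(
    optimizer, T_0=num_iters // cosine_steps, eta_min=1e-10)
```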