Diffusion & Large Vision Models cheatsheet for Stanford's CME 296

Goal

This repository gathers in one place the key concepts covered in Stanford's CME 296 Diffusion & Large Vision Models course. It includes:

Generation paradigms: diffusion, score matching, convergence of diffusion methods with SDEs, flow matching
Multimodal guided generation: latent diffusion models with VAEs, Transformer-based representations, contrastive learning, self-supervised learning, guidance
Image generation architectures: convolutions, U-Net, attention mechanism, DiT, MM-DiT
Model training: pre-training, post-training, distillation, evaluation with feature-based metrics and MLLM-as-a-Judge

Content

VIP Cheatsheet

Class website

cme296.stanford.edu

Authors

Afshine Amidi (Ecole Centrale Paris, MIT) and Shervine Amidi (Ecole Centrale Paris, Stanford University)

