How does diffusion generate images? — An iterated blending view of diffusion
In this article, we aim to develop a simpler view of diffusion as a form of blending.
Diffusion-based image generation is a class of generative modeling. Famous systems such as DALL-E 2, Imagen, and Stable Diffusion are all based on this technique. An example is shown below.
These systems let us generate a wide variety of images from a textual description. Arthur Brisbane said 'a picture is worth a thousand words' in the early twentieth century. With diffusion models, we can now achieve the reverse: from a few words of textual description, we can generate an image.
Scott Reed et al. first showed how to generate realistic images from a textual description in their work 'Generative Adversarial Text to Image Synthesis' (ICML 2016).
Yet diffusion now achieves this far better: the generated images show more variety, higher resolution, and greater photorealism. Diffusion-based models are now widely regarded as stronger than other generative techniques. Through this article, we intend to better understand diffusion as iterated blending.
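Before diving in, here is a minimal sketch of what "blending" means in this view, assuming the standard DDPM forward process: each noising step is simply a weighted mix of an image and Gaussian noise. The function name, array shapes, and blend weights below are illustrative choices, not taken from any specific system.

```python
import numpy as np

def blend(image, noise, alpha_bar_t):
    """One diffusion 'blending' step: a weighted mix of image and noise.

    This is the standard DDPM forward process,
        x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
    read as blending the clean image with Gaussian noise.
    """
    return np.sqrt(alpha_bar_t) * image + np.sqrt(1.0 - alpha_bar_t) * noise

# Toy usage: blend a stand-in "image" progressively toward pure noise.
rng = np.random.default_rng(0)
x0 = rng.uniform(0, 1, size=(64, 64, 3))   # stand-in for a real image
eps = rng.standard_normal(x0.shape)        # Gaussian noise
for alpha_bar in (0.9, 0.5, 0.1):          # less image, more noise each step
    xt = blend(x0, eps, alpha_bar)
```

As the blend weight on the image shrinks, the result drifts toward pure noise; generation, as we will see, amounts to running such blends in reverse.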