this post was submitted on 24 Oct 2023
29 points (100.0% liked)

Stable Diffusion


Preface

Since the introduction of Stable Diffusion 1.5 by StabilityAI, the ML community has eagerly embraced the open-source model. In August, we introduced the 'Segmind Distilled Stable Diffusion' series with the compact SD-Small and SD-Tiny models. We open-sourced the weights and the distillation training code. The models were inspired by the research presented in the paper "On Architectural Compression of Text-to-Image Diffusion Models". They had 35% and 55% fewer parameters than the base model, respectively, while maintaining comparable image fidelity.

With the introduction of SDXL 1.0 in July, we saw the community moving to the new architecture due to its superior image quality and better prompt coherence. In our effort to make generative AI models faster and more affordable, we began working on a distilled version of SDXL 1.0. We succeeded in distilling SDXL 1.0 to half its size. Read on to learn more about our SSD-1B model.

Blog post: https://blog.segmind.com/introducing-segmind-ssd-1b/

Model: https://huggingface.co/segmind/SSD-1B

Demo: https://huggingface.co/spaces/segmind/Segmind-Stable-Diffusion
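
For anyone who wants to try the linked model locally, here is a minimal sketch, assuming the `diffusers` and `torch` packages are installed and a CUDA GPU is available. The helper names `load_ssd_1b` and `generate`, the `low_vram` flag, and the default step count are illustrative assumptions, not something from the post. The `low_vram` branch uses two general-purpose diffusers memory options (model CPU offload, which also needs `accelerate`, and VAE slicing); how much VRAM they actually save on a given card is not something this sketch measures.

```python
def load_ssd_1b(device: str = "cuda", low_vram: bool = False):
    """Load segmind/SSD-1B as an SDXL pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers can be defined without torch/diffusers present.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "segmind/SSD-1B",
        torch_dtype=torch.float16,  # half precision roughly halves weight memory
        use_safetensors=True,
        variant="fp16",
    )
    if low_vram:
        # Assumption: trade speed for memory. Sub-models stay on the CPU and are
        # moved to the GPU only while they run (defaults to the "cuda" device).
        pipe.enable_model_cpu_offload()
        # Decode the latent batch one image at a time to lower peak VAE memory.
        pipe.vae.enable_slicing()
    else:
        pipe.to(device)
    return pipe


def generate(pipe, prompt: str, negative_prompt: str = "", steps: int = 30):
    """Run one text-to-image generation and return the first image."""
    result = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=steps,
    )
    return result.images[0]
```

Usage would look like `pipe = load_ssd_1b()` followed by `generate(pipe, "An astronaut riding a green horse").save("out.png")`; the prompt and file name here are placeholders.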

top 5 comments
[–] poVoq@slrpnk.net 3 points 1 year ago (2 children)

Bit of a shame that they didn't manage to fit it into 12GB vRAM, so you still need a 16GB vRAM GPU.

[–] Zarxrax@lemmy.world 4 points 1 year ago (2 children)

What do you mean? I run the normal SDXL on 12gb vram.

[–] poVoq@slrpnk.net 2 points 1 year ago

Hmm, odd. The linked explanation says that in operation SDXL needs 15GB or so of vRAM (and this slimmed-down version just above 12GB). Maybe 12GB is only possible at lower resolutions?
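
A rough back-of-the-envelope sketch of why the numbers in this exchange can both be plausible: the weights alone take far less than 12GB, so the rest of the footprint comes from things like the text encoders, the VAE, and activations, which grow with resolution. The UNet parameter counts below (2.6B for SDXL 1.0, 1.3B for SSD-1B) are assumptions not stated in this thread, and the function name is illustrative.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GiB (2 bytes = fp16, 4 = fp32)."""
    return n_params * bytes_per_param / 2**30


# Assumed UNet sizes; they are not given anywhere in this thread.
UNET_PARAMS = {"SDXL 1.0": 2.6e9, "SSD-1B": 1.3e9}

for name, n_params in UNET_PARAMS.items():
    fp16 = weight_memory_gib(n_params, bytes_per_param=2)
    fp32 = weight_memory_gib(n_params, bytes_per_param=4)
    print(f"{name}: UNet weights ~{fp16:.1f} GiB in fp16, ~{fp32:.1f} GiB in fp32")
```

This only counts UNet weights. It says nothing about peak usage during sampling or decoding, which is what the 12GB and 15GB figures refer to.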

[–] vluz@kbin.social 2 points 1 year ago

I do SDXL generation in 4GB at the extreme expense of speed, using a number of memory optimizations.
I've done this kind of stuff since SD 1.4, for the fun of it. I like to see how low I can push vram use.

SDXL takes around 3 to 4 minutes per generation including refiner but it works within constraints.
Graphics cards used are hilariously bad for the task, a 1050ti with 4GB and a 1060 with 3GB vram.

Have an implementation running on the 3GB card, inside a podman container, with no ram offloading, 1 vcpu and 4GB ram.
The graphical UI (Streamlit) runs on a laptop, outside the server, to save resources.

Working on an example implementation of SDXL as we speak and also working on SDXL generation on mobile.
That is the reason I've looked into this news, SSD-1B might be a good candidate for my dumb experiments.