Stable Diffusion

4309 readers

4 users here now

Discuss matters related to our favourite AI Art generation technology

Also see

Other communities

founded 1 year ago

MODERATORS

db0@lemmy.dbzer0.com

Ertugrul/Qwen2-VL-7B-Captioner-Relaxed (huggingface.co)

submitted 1 month ago by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

1 comments fedilink hide all child comments

Qwen2-VL-7B-Captioner-Relaxed is an instruction-tuned version of Qwen2-VL-7B-Instruct, an advanced multimodal large language model. This fine-tuned version is based on a hand-curated dataset for text-to-image models, providing significantly more detailed descriptions of given images.

top 1 comments

sorted by: hot top controversial new old

[–] j4k3@lemmy.world 3 points 1 month ago* (last edited 1 month ago)

You using this in a toolchain? I haven't tried any of the Qwen models yet, or Yi for that matter. I tried at one point early on, but they were not working well with my stuff and I had no complaints with Mistral stuff. I like some underlying things with a MoE for speed and underlying entity/realm stuff I can access in my favorite.

I'm curious if anyone has constructive contextual feedback about what makes these unique or worth exploring.