I’ve used image upscaling on a 3060 Ti with 8 GB of VRAM. It works OK, but it does limit how much it can do at one time. As long as you’re OK with letting it run longer, I’d imagine it would work on a text model as well.
So I’m no expert at running local LLMs, but I did download one (the 7B Vicuna model recommended by the LocalLLM subreddit wiki) and tried my hand at training a LoRA on some structured data I have.
Based on my experience, the VRAM available to you is going to be way more of a bottleneck than PCIe speeds.
I could barely hold a 7B model in 10 GB of VRAM on my 3080, so 8 GB might be impossible or very tight. IMO, to get good results with local models you really need large quantities of VRAM and to be running 13B or larger models.
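To put rough numbers on that, here’s a back-of-envelope sketch (weights only; it ignores activations, the KV cache, and any training overhead, and assumes a generic 7B-parameter model):

```python
# Rough VRAM needed just to hold the weights of a 7B-parameter model
# at different precisions. This excludes activations, KV cache, and
# optimizer state, so real usage is higher.
params = 7e9
for precision, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{precision}: ~{gib:.1f} GiB")
# fp16: ~13.0 GiB, int8: ~6.5 GiB, int4: ~3.3 GiB
```

So an unquantized 7B model already roughly fills a 10 GB card before you do anything else with it.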
Additionally, when you’re training a LoRA the model + training data gets loaded into VRAM. My training dataset wasn’t very large, and even so, I kept running into VRAM constraints with training.
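For a sense of what that setup looks like, here is a minimal sketch assuming Hugging Face transformers, peft, and bitsandbytes are installed; the model name and LoRA hyperparameters are illustrative placeholders, not my exact configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_name = "lmsys/vicuna-7b-v1.5"  # illustrative 7B model choice

# 4-bit quantization keeps the base weights small enough to fit on a
# consumer card with ~8-10 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA trains small adapter matrices instead of the full weights,
# which is what makes fine-tuning feasible on consumer hardware at all.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total params
```

Even with 4-bit weights and small adapters, the activations and optimizer state during training still eat into whatever VRAM is left, which is where I kept hitting the wall.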
In the end I concluded that, in its current state, running a local LLM is an interesting exercise but only really works well on enthusiast-level hardware with loads of VRAM (4090s, etc.).
I see. I don’t want to spend $1000+ on a GPU, so I suppose this will have to wait a few years.
Yup; hopefully there are some advances in the training space, but I’d guess that having large quantities of VRAM is always going to be necessary in some capacity for training specifically.
I hope some GPU manufacturer starts allowing removable VRAM. 4 x 8 GB DDR5 might not be too bad, given PCIe speeds aren't a bottleneck. If I could upgrade the RAM to 64 GB later, I'd be ready to give $10k for 3080-level perf. Intel Arc people, I hope you are already doing this!
I don’t know anything about GPU design, but expandable VRAM is a really interesting idea. It feels too consumer-friendly for Nvidia, and maybe even for AMD, though.