gh0stcassette

joined 1 year ago
[–] gh0stcassette@lemmy.world 8 points 1 year ago (1 children)

Maybe, but I think people overstate this. Reddit's desktop UI and official app still confuse and upset me. Frankly, the onboarding to Lemmy is easier, if anything.

[–] gh0stcassette@lemmy.world 13 points 1 year ago (3 children)

I would expect the big jump to come when people who are barely engaged with this whole thing try to open Apollo or Sync or whatever in a few weeks, see that it doesn't work, then spend 5 minutes trying to use the official app before getting frustrated and googling "reddit alternative".

[–] gh0stcassette@lemmy.world 2 points 1 year ago

You can use GPT-3.5 Turbo via poe.com through SillyTavern for free.

[–] gh0stcassette@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

What use case would that be?

I can get like 8 tokens/s running 13B models in q3_K_L quantization on my laptop, about 2.2 for 33B, and 1.5 for 65B (I bought 64GB of RAM to be able to run larger models lol). 7B was STUPID fast because the entire model fits inside my (8GB) GPU, but 7B models mostly suck (wizard-vicuna-uncensored is decent; every other one I've tried was not).

[–] gh0stcassette@lemmy.world 2 points 1 year ago

Adding to this: text-generation-webui (https://github.com/oobabooga/text-generation-webui) works with the latest bleeding-edge llama.cpp via llama-cpp-python, and it has a nice graphical front end. You do have to manually tell pip to install llama-cpp-python with the right compiler flags to get GPU acceleration working, but the llama-cpp-python GitHub and the ooba GitHub explain how to do this.

You can even set up GPU acceleration through Metal on M1 Macs. I've seen some fucking INSANE performance numbers online for the higher-RAM MacBook Pros (20+ tokens/sec, I think with a 33B model, but it might have been 13B; either way, impressive).
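
If anyone wants to poke at the llama-cpp-python side directly, here's a minimal sketch of what the GPU offload looks like once the library is built with the cuBLAS/Metal flags (the model path and layer count are placeholders, tune them for your own hardware):

```python
from llama_cpp import Llama  # llama-cpp-python, built with GPU support

# Placeholder model path/settings: any quantized GGML model you've downloaded.
llm = Llama(
    model_path="models/wizard-vicuna-13b.ggmlv3.q3_K_L.bin",
    n_gpu_layers=32,  # how many transformer layers to push onto the GPU
    n_ctx=2048,       # context window size
)

out = llm("### Instruction:\nSay hello.\n\n### Response:\n", max_tokens=64)
print(out["choices"][0]["text"])
```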

[–] gh0stcassette@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (2 children)

Llama.cpp recently added CUDA acceleration for generation (previously only prompt ingestion was GPU-accelerated), and in my experience it's faster than GPTQ unless you can fit absolutely 100% of the model in VRAM. If even a single layer is offloaded to the CPU, GPTQ performance immediately becomes like 30-40% worse than an equivalent CPU offload with llama.cpp.
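
To make that concrete, here's a rough sketch of how you could measure the difference yourself with llama-cpp-python (the model path and layer counts are made up for illustration; the point is full offload vs. spilling layers to the CPU):

```python
import time

from llama_cpp import Llama


def tokens_per_sec(model_path: str, n_gpu_layers: int) -> float:
    # Load the model with the given number of layers offloaded to the GPU,
    # generate a fixed number of tokens, and report throughput.
    llm = Llama(model_path=model_path, n_gpu_layers=n_gpu_layers)
    start = time.time()
    out = llm("Q: What is the fediverse?\nA:", max_tokens=128)
    return out["usage"]["completion_tokens"] / (time.time() - start)


# Placeholder values: n_gpu_layers high enough to cover every layer vs. a partial split.
print("full offload:   ", tokens_per_sec("models/llama-13b.ggmlv3.q4_K_M.bin", 43))
print("partial offload:", tokens_per_sec("models/llama-13b.ggmlv3.q4_K_M.bin", 20))
```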

[–] gh0stcassette@lemmy.world 0 points 1 year ago (1 children)

Before Elon, it was about half a million; now it's about 4.5 million, though about a million of the new users made an account and then immediately went back to Twitter, so it's more like 3.5 million.

[–] gh0stcassette@lemmy.world 2 points 1 year ago

Yeah, it's basically like email. Though I imagine an instance like that would get defedded pretty quick

[–] gh0stcassette@lemmy.world 2 points 1 year ago

Yeah, it's a beautiful system. When all the banned Twitter Nazis moved to Gab and then Gab moved to Mastodon, everyone immediately defedded them. It's like having a pre-curated blocklist of most of the worst people on the platform.

[–] gh0stcassette@lemmy.world 3 points 1 year ago

Almost definitely, though there's no guarantee a new instance will have the same communities 1:1. It would still be really useful for resubbing to non-local communities, though.

[–] gh0stcassette@lemmy.world 4 points 1 year ago (1 children)

Self-hosting might be the only way to do this; I imagine any instance with enough users will have people wanting to post locally.
