this post was submitted on 08 Jun 2023
LocalLLaMA
Community for discussing LLaMA, the large language model created by Meta AI.
This is intended to be a replacement for r/LocalLLaMA on Reddit.
Adding to this: text-generation-webui (https://github.com/oobabooga/text-generation-webui) works with the latest bleeding-edge llama.cpp via llama-cpp-python, and it has a nice graphical front-end. You do have to manually tell pip to install llama-cpp-python with the right compiler flags to get GPU acceleration working, but the llama-cpp-python GitHub and the ooba GitHub explain how to do this.
You can even set up GPU acceleration through Metal on M1 Macs. I've seen some fucking INSANE performance numbers online for the higher-RAM MacBook Pros (20+ tokens/sec, I think with a 33B model, but it might have been 13B; either way, impressive).
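For anyone wondering what "telling pip the right compiler flags" looks like in practice, here's a rough sketch. It just builds the environment and pip command for a from-source reinstall of llama-cpp-python. The `CMAKE_ARGS` values are an assumption based on what the llama-cpp-python README listed around mid-2023 (`-DLLAMA_CUBLAS=on` for NVIDIA, `-DLLAMA_METAL=on` for Apple Silicon). These flags have been renamed between versions, so check the current README before copying them. `build_install` and `BACKEND_CMAKE_ARGS` are names I made up for illustration.

```python
import os
import subprocess
import sys

# ASSUMPTION: flag names as listed in the llama-cpp-python README around
# mid-2023. They have changed across versions; verify against the README.
BACKEND_CMAKE_ARGS = {
    "cublas": "-DLLAMA_CUBLAS=on",  # NVIDIA GPUs
    "metal": "-DLLAMA_METAL=on",    # Apple Silicon (M1/M2)
}


def build_install(backend):
    """Return (env, cmd) for rebuilding llama-cpp-python with GPU support."""
    env = dict(os.environ)
    env["CMAKE_ARGS"] = BACKEND_CMAKE_ARGS[backend]
    env["FORCE_CMAKE"] = "1"  # force a source build instead of a prebuilt wheel
    cmd = [
        sys.executable, "-m", "pip", "install",
        "--force-reinstall", "--no-cache-dir",
        "llama-cpp-python",
    ]
    return env, cmd


if __name__ == "__main__":
    args = sys.argv[1:]
    # Pick the first recognized backend name from the arguments, else cublas.
    backend = next((a for a in args if a in BACKEND_CMAKE_ARGS), "cublas")
    env, cmd = build_install(backend)
    # Dry run by default: print the equivalent shell command.
    print("CMAKE_ARGS=" + env["CMAKE_ARGS"], "FORCE_CMAKE=1", " ".join(cmd))
    if "--run" in args:
        subprocess.run(cmd, env=env, check=True)
```

Run it with `metal` or `cublas` as the argument to see the command, and add `--run` to actually execute the install. Do this inside the same Python environment text-generation-webui uses, otherwise the rebuilt package lands somewhere the web UI never looks.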