this post was submitted on 22 Oct 2023 to the Linux community
As long as this allows running local, free-software models, I don't see a drawback to including this.
My main issue with ChatGPT and similar products is that they use my data to train their models. Running a model locally (like Llama) solves this problem, but running LLMs requires extremely powerful GPUs, especially for the bigger ones like Llama 70B.
So dedicated hardware for this is a nice thing for those who want it.
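To see why the bigger models need serious hardware, here's a rough back-of-envelope sketch of how much memory the weights alone take (the parameter count and byte widths are nominal assumptions; KV cache and activation overhead are ignored):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

llama_70b = 70e9  # nominal parameter count of a 70B model

fp16 = weight_memory_gb(llama_70b, 2.0)  # 16-bit floats
q4 = weight_memory_gb(llama_70b, 0.5)    # 4-bit quantization

print(f"fp16:  ~{fp16:.0f} GB")  # ~140 GB: far beyond any consumer GPU
print(f"4-bit: ~{q4:.0f} GB")    # ~35 GB: still multiple GPUs or a lot of RAM
```

Even aggressively quantized, a 70B model doesn't fit on a typical consumer card, which is exactly where dedicated hardware becomes attractive.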
It requires powerful GPUs, yes, but not always; it depends a lot on how fast you want it to run. Microsoft and OpenAI need powerful AI GPUs because they handle huge volumes of requests and data and want responses to be fast. The model weights may also need to be kept in RAM or GPU memory for fast access during inference.
As for Llama, it has been released as open source, and what is amazing about open source is the community: an implementation of Llama inference written entirely in C++ already exists: https://github.com/ggerganov/llama.cpp .
And someone even managed to make it run fast enough on a phone with 8 GB of available RAM ( https://github.com/ggerganov/llama.cpp/discussions/750 ), though with a smaller, quantized model.
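The same back-of-envelope math explains why that works on a phone: a 7B-class model (the size is an assumption for illustration) quantized to 4 bits is small enough to leave room for the OS in 8 GB of RAM:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

llama_7b = 7e9  # nominal parameter count of a 7B model
q4 = weight_memory_gb(llama_7b, 0.5)  # 4-bit quantization, llama.cpp-style

print(f"4-bit 7B: ~{q4:.1f} GB")  # ~3.5 GB of weights, well under 8 GB of RAM
```

Quantization trades a little output quality for a large drop in memory use, which is what makes phone-sized inference feasible at all.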
I honestly don't see the problem with them using your data for training.
You help them with training and you get to use their service for free.
Btw, at least with ChatGPT you can turn that off.
Sure, a company that has used petabytes of data it doesn't own any rights to in order to train its models is totally going to exclude its own customers' data just because they flipped a switch off.
Yeah, I totally trust OpenAI and Microsoft with my data. It's not like Microsoft keeps spying on me after I turn off Windows telemetry, either.
I usually don't care about copyright so maybe that's why I don't care about this.
I don't see the problem. I'm happy to help them train, as long as it's not used for marketing.
They tell you not to give them private data, so don't.