Okay, explain. What kinds of low-hanging fruit?
quants are pretty basic. switching from floats to ints (faster instruction sets) is the other well-known one. both of those are related to information theory, but there are other things I legally can't mention. shrug. suffice to say, model sizes are going to decrease dramatically.
edit: the first two points require reworking the base infrastructure to support them, which is why they haven't hit widespread adoption. but the research showing that 3 bits is as good as 64 is intuitive once you tie it back to the original inspiration for some of the AI designs. that reduction alone gets you a 21x cut in model size, which is pretty solid.
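(For concreteness: the "21x" figure is just the bit-width ratio, 64 / 3 ≈ 21.3. Below is a minimal sketch of what naive symmetric int8 quantization looks like, assuming NumPy; the scale-and-round scheme is illustrative, not any particular inference stack's method.)

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Naive symmetric quantization: map float32 weights onto int8.

    Toy sketch only -- no per-channel scales, no zero point, no calibration.
    Returns the int8 tensor plus the scale needed to dequantize.
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(1024, 1024)).astype(np.float32)
q, scale = quantize_int8(w)

print(f"float32: {w.nbytes / 1e6:.1f} MB")                       # ~4.2 MB
print(f"int8:    {q.nbytes / 1e6:.1f} MB (4x smaller)")          # ~1.0 MB
print(f"max abs round-trip error: {np.abs(w - dequantize(q, scale)).max():.2e}")

# the "21x" number is just the bit-width ratio: 64-bit weights cut to 3-bit
print(f"64 / 3 = {64 / 3:.1f}x")
```

(This only shrinks storage; whether accuracy survives the cut is exactly what the reply below disputes.)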
hahahaha fuck off with this. no, the horseshit you’re fetishizing doesn’t fix LLMs. here’s what quantization gets you:
anyway speaking of basic information theory:
lol
It's actually super easy to increase the accuracy of LLMs.
I left out all the other details because it's pretty intuitive why it works if you understand why floats have precision issues.
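(The concrete version of "floats have precision issues", in plain Python, with nothing model-specific assumed:)

```python
# the usual demonstrations that binary floats are inexact
print(0.1 + 0.2 == 0.3)        # False -- the sum is 0.30000000000000004

# limited precision: adding a small value to a huge one can do nothing at all
print(1e16 + 1.0 == 1e16)      # True -- the 1.0 is absorbed

# and the errors accumulate when you sum many of them
print(sum([0.1] * 10) == 1.0)  # False -- the sum is 0.9999999999999999
```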
decimal is a severely underappreciated library
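(Assuming that's a nod to Python's standard-library decimal module: it does give exact decimal arithmetic, at a substantial speed cost. A quick illustration:)

```python
from decimal import Decimal, getcontext

# binary floats round; Decimal arithmetic on decimal strings does not
print(0.1 + 0.2)                        # 0.30000000000000004
print(Decimal("0.1") + Decimal("0.2"))  # 0.3

# precision is configurable (the default context is 28 significant digits)
getcontext().prec = 50
print(Decimal(1) / Decimal(7))          # 0.142857142857... out to 50 digits
```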