this post was submitted on 27 Dec 2023
126 points (68.3% liked)

Technology


I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.

This one is about why it was a mistake to call 1024 bytes a kilobyte. It's about a 20min read so thank you very much in advance if you find the time to read it.

Feedback is very much welcome. Thank you.

(page 2) 50 comments
[–] smokin_shinobi@lemmy.world 7 points 8 months ago (18 children)

I was taught 1024 in my tech school. So I won't ever refer to it as 1000 instead of 1024. Not that it seems even remotely relevant though.

[–] smo@lemmy.sdf.org 7 points 8 months ago (3 children)

This has been my pet rant for a long time, but I usually explain it almost exactly the other way around from you.

You can essentially start off with nothing using binary prefixes. IBM's first magnetic hard drive (the IBM 350 - you've probably seen it in the famous "forklifting it into a plane" photo) stored 5 million characters. Not 5*1024*1024 characters, 5,000,000 characters. This isn't some consumer-era marketing trick - this is 1956, when companies were paying half a million dollars a year (inflation-adjusted to 2023) to lease a computer. I keep getting told this is some modern trick - doesn't it blow your mind to realise HDD manufacturers have been using base 10 for nearly 70 years? Line speed was always ~~a lie~~ base 10, where 1200 baud laughs at your 2^n fetish (and for that matter, baud comes from telegraphs, and was defined before computers existed), 100Mbit ethernet runs on a 25MHz clock, and speaking of clocks - kHz, MHz, MT/s, GT/s etc. are always specified in base 10. For some reason no-one asks how we got 3GHz in between 2 & 4GHz CPUs.
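To put numbers on the gap, here is a minimal Python sketch. The 5,000,000 figure is the one quoted above; the variable names are made up for illustration.

```python
# Capacity of the IBM 350 as quoted above: 5 million characters.
ibm_350_chars = 5_000_000

# What "5 M" would be if the prefix were read as binary (5 * 1024 * 1024).
binary_reading = 5 * 1024 * 1024

print(ibm_350_chars)                    # 5000000
print(binary_reading)                   # 5242880
print(binary_reading - ibm_350_chars)   # 242880

# Line rates are decimal as well: 100 Mbit/s is 100 * 10**6 bits per second.
fast_ethernet_bps = 100 * 10**6
print(fast_ethernet_bps)                # 100000000
```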

As you say, memory is the trouble-maker. RAM has two interesting properties for this discussion. One is that it heavily favours binary-prefixed "round numbers", traditionally because no-one wanted RAM with unused addresses because it made address decoding nightmarish (tl;dr: when 8k of RAM was usually 8x1k chips, you'd use the first 3 bits of the address to select the chip, and the other 10 bits as the address on the chip - if chips didn't use their entire address space you'd need to actually calculate the address map, and this calculation would have to run several times faster than the CPU itself). The second is that RAM was the first place non-CSy types saw numbers big enough for k to start becoming useful. So for the entire generation that started on microcomputers rather than big iron, memory-flavoured k's were the first k's they ever tasted.
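The 8x1k decoding described above can be sketched in a few lines of Python. This is only an illustration: it assumes "first 3 bits" means the three most significant bits of a 13-bit address, and the function names and the 1000-address chip are hypothetical.

```python
CHIP_COUNT = 8      # eight chips...
CHIP_SIZE = 1024    # ...of 1k addresses each
OFFSET_BITS = 10    # 2**10 == 1024, so 10 bits address one chip

def decode(address: int) -> tuple[int, int]:
    """Split a 13-bit address into (chip index, offset within the chip)."""
    if not 0 <= address < CHIP_COUNT * CHIP_SIZE:
        raise ValueError("address outside the 8k range")
    chip = address >> OFFSET_BITS        # top 3 bits select the chip
    offset = address & (CHIP_SIZE - 1)   # low 10 bits address within it
    return chip, offset

def decode_decimal(address: int, chip_size: int = 1000) -> tuple[int, int]:
    """Hypothetical chip with 1000 addresses: decoding now needs a division."""
    return divmod(address, chip_size)

print(decode(1023))          # (0, 1023)
print(decode(1024))          # (1, 0)
print(decode(8191))          # (7, 1023)
print(decode_decimal(1024))  # (1, 24)
```

With power-of-two chips the split is pure wiring (a shift and a mask); with any other size it turns into real arithmetic on every access.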

I mean, hands up who had a computer with 8-64k of RAM and a cassette deck. You didn't measure the size of your stored program in kB, but in seconds of tape.

This shortcut then leaked into filesystems purely as an implementation detail - reading disk blocks into memory is much easier if you're putting square pegs into square holes. So disk sectors are specified in binary sizes to enable them to fit efficiently into memory regions/pages. For example, CP/M has a 128-byte disk buffer between 0x080 and 0x100 - and its filesystem uses 128-byte sectors. Not a coincidence.
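A quick check of the CP/M numbers, as a rough Python sketch. The buffer bounds and the sector size come from the paragraph above; the 4096-byte page and the 100-byte sector are hypothetical values picked only to show the contrast.

```python
BUFFER_START = 0x080
BUFFER_END = 0x100    # exclusive upper bound
SECTOR_SIZE = 128

buffer_size = BUFFER_END - BUFFER_START
print(buffer_size)                     # 128
print(buffer_size == SECTOR_SIZE)      # True

# Hypothetical 4096-byte memory page: binary-sized sectors tile it exactly.
PAGE_SIZE = 4096
print(divmod(PAGE_SIZE, SECTOR_SIZE))  # (32, 0)

# A hypothetical 100-byte sector would leave 96 bytes unused per page.
print(divmod(PAGE_SIZE, 100))          # (40, 96)
```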

This is where we start getting into stuff like floppy disk sizes being utter madness. 360k & 720k were 720 and 1440 512-byte sectors. When they doubled up again, 2880 512-byte sectors gave us 1440k - and because nothing is ever allowed to make sense (or because 1.40625M looks stupid), we used base 10 to call this 1.44M.
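The floppy arithmetic is easy to reproduce. The sketch below assumes 2880 sectors for the "1.44M" format, which is the count that yields 1440k at 512 bytes per sector; the variable names are made up.

```python
SECTOR_BYTES = 512

for sectors in (720, 1440, 2880):
    total = sectors * SECTOR_BYTES
    print(sectors, total, total // 1024)
# 720 368640 360
# 1440 737280 720
# 2880 1474560 1440

hd_bytes = 2880 * SECTOR_BYTES
print(hd_bytes / (1024 * 1024))   # 1.40625  (binary M)
print(hd_bytes / (1000 * 1000))   # 1.47456  (decimal M)
print(hd_bytes / (1024 * 1000))   # 1.44     (the mixed "M" on the label)
```

So "1.44M" is neither the binary nor the decimal figure: it is 1440 binary k divided by a decimal thousand.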

So it's never been that computers used 1024-shaped-k's. It should be a simple story of "everything uses 1,000s, except memory because reasons". But once we started dividing base10-flavoured storage devices into base2-flavoured sectors, we lost any hope of this ever looking logical.

[–] smo@lemmy.sdf.org 6 points 8 months ago* (last edited 8 months ago)

Aside: the little-k thing. SI has a beautifully simple rule: capital letters for prefixes >1, small letters for prefixes <1. So this disambiguates between millivolts (mV) and megavolts (MV).

But, and there's always a but. The kilogram was the first SI unit, before they'd really thought it through. So we got both a lower-case k breaking such a beautifully simple rule, and the kilogram as a base unit instead of a gram. The Kilogram is metric's "screw it, we'll do it live".

Luckily this is almost a non-issue in computing as a fraction of a bit never shows up in practice. But! If you had a system that took 1000 seconds to transfer one bit, you could call that a millibit per second, or mbps, and really mess things up.
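For a sense of how badly that mix-up would go, a tiny Python sketch using exact fractions (the variable names are illustrative only):

```python
from fractions import Fraction

millibit_per_s = Fraction(1, 1000)   # 1 mbit/s: one bit every 1000 seconds
megabit_per_s = Fraction(10**6)      # 1 Mbit/s: a million bits every second

print(1 / millibit_per_s)               # 1000 (seconds per bit)
print(megabit_per_s / millibit_per_s)   # 1000000000
```

Confusing the lower-case and upper-case prefix here is an error of nine orders of magnitude.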

[–] corsicanguppy@lemmy.ca 4 points 8 months ago

WD needed to sell a drive with more advertised space than real space.

[–] TigrisMorte@kbin.social 3 points 8 months ago (2 children)

It is only a mistake from a Human PoV. It is more efficient for the chip since 1000 bytes and 1024 bytes take up the same space. But Humans find anything not base 10 difficult.

[–] ElderWendigo@sh.itjust.works 3 points 8 months ago* (last edited 8 months ago) (3 children)
  • Kilobyte is 2^10 bytes or about a thousand bytes within a few reasonably significant digits.
  • Megabyte is 2^20 bytes or about a thousand kilobytes within a few reasonably significant digits.
  • Gigabyte is 2^30 bytes or about a thousand megabytes within a few reasonably significant digits.

Binary storage is always going to be a translation from a binary base to a decimal equivalent. So the shorthand terms used to refer to a specific and long integer number should come as absolutely no surprise. And that's just it; they're just shorthand, slang jargon that caught on because it made sense to anyone who was using it.

Your whole article just makes it sound like you don't actually understand the math, the way computers actually work, linguistics, or etymology very well. But you're not really here for feedback, are you? The whole rant sounds like a reaction to a bad grade in a computer science 101 course.
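How close "about a thousand" really is can be checked directly. This is a minimal Python sketch; the helper name `excess_percent` is made up for illustration.

```python
def excess_percent(n: int) -> float:
    """How much larger 2**(10*n) is than 10**(3*n), in percent."""
    binary = 2 ** (10 * n)
    decimal = 10 ** (3 * n)
    return (binary - decimal) / decimal * 100

for n, name in enumerate(["kilo", "mega", "giga", "tera"], start=1):
    print(f"{name}: 2^{10 * n} is {excess_percent(n):.1f}% larger than 10^{3 * n}")
# kilo: 2^10 is 2.4% larger than 10^3
# mega: 2^20 is 4.9% larger than 10^6
# giga: 2^30 is 7.4% larger than 10^9
# tera: 2^40 is 10.0% larger than 10^12
```

The approximation holds to within a few percent at the kilo level, but the gap widens with every prefix step.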

[–] 56_@lemmy.ml 3 points 8 months ago (2 children)

Unlike many comments here, I enjoyed reading the article, especially the parts in the "I don’t want to use gibibyte!" chapter, where you explain that this (the pedantry) is important in technical and formal situations (such as documentation). Seeing some of the comments here, I think it would have helped to focus on this aspect a bit more.

I also liked the extra part explaining the reasoning for using the Nokia E60.

I don't quite agree with the recommendation to use base 10 SI units where neither KiB nor kB would result in nice numbers. I don't see why base 10 should have an influence on computers, and I think it makes more sense to stick to a single unit, such as KiB.

The reasons I have this opinion are probably to do with:

  • My computer has shown me values using KiB, GiB, etc. for years - I think it's a KDE default - so I'm already used to the concept of KiB being different from kB.
  • I dislike the concept of base 10 in general. I like the idea of using base 16 universally (because computers. Base 12 is also valid in a less computer-dominant society). I therefore also think 1024 is a silly number to use, and we should measure memory in multiples of 2^8 or 2^16...
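To illustrate the base 16 point, here is how these sizes look when written in hex (a small Python sketch; the choice of values is mine):

```python
for value in (1000, 1024, 2**8, 2**16):
    print(value, hex(value))
# 1000 0x3e8
# 1024 0x400
# 256 0x100
# 65536 0x10000
```

In hex, 2^8 and 2^16 are a 1 followed by zeros, while 1024 (0x400) is no more "round" than 400 is in decimal.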

P.S. I agree with other commenters that your comments starting with "Pretty obvious that you didn't read the article." or similar are probably not helping your case... I understand that some comments here have been quite frustrating though.

[–] chitak166@lemmy.world 3 points 8 months ago (1 children)

> I dislike the concept of base 10 in general.

You're not human.
