Re: Mass storage sizes

David Wright Thu, 09 Jan 2025 20:42:02 -0800

On Fri 10 Jan 2025 at 02:46:13 (+0100), Urs Thuermann wrote:
> Michael Stone <mst...@debian.org> writes:
> > On Tue, Jan 07, 2025 at 02:59:47PM -0700, Charles Curley wrote:
> > >Mr. Tarsnap forgets something. The reason disks are addressed in powers
> > >of two has to do with mathematics. Every hard and floppy disk out there
> > >has flaws. To get around that, data is divided into sectors, and
> > >checksums calculated. Done right, this allows for error correction for
> > >small flaws. The math works out better if you do it in chunks that are
> > >integer powers of two. So floppy disks have sectors of 256 octets, and
> > >their attendant checksums. Modern hard drives schlep data in chunks of
> > >4096 (2^12) octets. And bytes these days are eight bits.
> > 
> > The thing is, nobody cares about all that. It's an implementation
> > detail that matters not to any normal person. Normal people care about
> > things like "when I just look at the first couple of numbers of the
> > size in bytes, is it the same thing as the size in <insert large unit>
> > or do I need to do a bunch of math to answer a simple question?"
> > 
> > >GB or GiB? I don't care, just be clear which one you are using.
> > 
> > Which nobody is. The right answer is to stop using power of two units
> > because they are pointless.
> 
> No matter how often you repeat this, it's still wrong.  Using power of
> two units is quite useful.  For example, my computers had 5.12 kB,
> 65.536 kB, 16.777216 MB, 67.108864 MB, 268.435456 MB, 1.073741824 GB,
> and 8.589934592 GB of RAM.  Perfectly correct, but I prefer to say
> they had 5 kiB, 64 kiB, 16 MiB, 64 MiB, 256 MiB, 1 GiB, and 8 GiB of
> RAM.  Because of technical reasons, rows and columns with m RAS and n
> CAS select lines, sizes of RAM chips are almost always 2**m * 2**n
> bytes, i.e. powers of 2.  RAM modules for the RAM slots of your
> mainboard also are sized in powers of 2, because otherwise it would be
> extremely hard to decode a memory address from the CPU into module
> slot number and offset inside the module.
> 
> Block devices like floppy disks, hard disks, SSDs, etc. also have
> block sizes which are powers of 2, like 256, 512, and 4096 bytes.
> This is not because of checksumming as was suggested in this thread,
> but because it makes it easier (e.g. for DMA controllers) to copy
> from/to pages of RAM.


I wouldn't argue with any of that, except to say that no one I knew
used any of those numbers like 16.777216MB above, but only "16MB" etc.
The problem was the lack of a distinct prefix until around the turn
of the century. But now there's no reason not to write 16MiB of
memory, even though it's acceptable to talk informally of "16 megs".
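
For anyone who wants to check the figures quoted above, here is a minimal
Python sketch that renders a byte count in both decimal (kB, MB, ...) and
binary (KiB, MiB, ...) units. The names both_units and _scale are
illustrative only and are not taken from any tool mentioned in this thread.

```python
DECIMAL_UNITS = ["B", "kB", "MB", "GB", "TB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def _scale(nbytes, base, units):
    """Format nbytes using the largest unit that keeps the value >= 1."""
    i = 0
    while nbytes >= base ** (i + 1) and i < len(units) - 1:
        i += 1
    # .10g keeps up to 10 significant digits and drops trailing zeros.
    return f"{nbytes / base ** i:.10g} {units[i]}"


def both_units(nbytes):
    """Return (decimal, binary) renderings of the same byte count."""
    return (_scale(nbytes, 1000, DECIMAL_UNITS),
            _scale(nbytes, 1024, BINARY_UNITS))


if __name__ == "__main__":
    # RAM sizes are powers of two, so only the binary form comes out round.
    for size in (5 * 2**10, 2**16, 2**24, 2**30, 2**33):
        print(both_units(size))
```

Running it on 2**24 bytes gives "16.777216 MB" and "16 MiB", which is the
pair of figures under discussion.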

> Therefore, my floppy disks had exactly 170.75 kiB, 720 kiB, and 1.44
> MiB --- or as you would like to put it --- 174.848 kB, 737.280 kB, and
> 1.47456 MB.  So again: Are kiB, MiB, GiB, and TiB really pointless?

That's where you lose me. Blocksizes in binary powers, yes, but
there's no reason to /count blocks/ in binary. Your 1.44MiB floppy
has 1440KiB; when was 1440, or 0b10110100000, a convenient number
in binary? It's merely the product of 80 cylinders, two heads and
18 sectors per track, giving 2880 sectors of 512 bytes.
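
That arithmetic can be checked with a short Python sketch (the constant
names are illustrative). It also shows that the familiar "1.44" figure is
neither a MB nor a MiB value: it only appears when the byte count is
divided by 1000 * 1024.

```python
# Geometry of the 3.5" high-density floppy as described above.
CYLINDERS = 80
HEADS = 2
SECTORS_PER_TRACK = 18
SECTOR_BYTES = 512

sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK
total_bytes = sectors * SECTOR_BYTES

print(sectors)                       # 2880
print(total_bytes)                   # 1474560
print(total_bytes / 1024)            # 1440.0   (KiB)
print(total_bytes / 1024**2)         # 1.40625  (MiB)
print(total_bytes / 1000**2)         # 1.47456  (MB)
print(total_bytes / (1000 * 1024))   # 1.44     (mixed "kilo-kibi" unit)
```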

Once you move on to non-removable disks, where the disk geometry
doesn't have to be standardised for interoperability, all bets are
off. At least with CHS, you got some of the factorisation done for
you, like 14655/64/32 and 9729/255/63, and you can see they're
not based on binary powers.
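
As a rough illustration, the following Python sketch multiplies out those
two geometries and tests whether the totals are powers of two. The 512-byte
sector size is an assumption (it is not stated for these drives), and the
function names are illustrative.

```python
def chs_capacity(cylinders, heads, sectors, sector_bytes=512):
    """Capacity in bytes for a C/H/S geometry (512-byte sectors assumed)."""
    return cylinders * heads * sectors * sector_bytes


def is_power_of_two(n):
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and n & (n - 1) == 0


if __name__ == "__main__":
    for geometry in [(14655, 64, 32), (9729, 255, 63)]:
        total = chs_capacity(*geometry)
        print(geometry, total, is_power_of_two(total))
```

Under that assumption, one cylinder of the first geometry is 64 * 32 * 512
bytes, i.e. exactly 1 MiB, but the cylinder count 14655 is not a power of
two, so neither is the total.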

> This is not really confusing, except for people who are too dumb to
> understand units and their conversions.  Granted, some confusion came
> from using k, M, G for the power-of-2 based units.  Back in the days
> when we had only kilobytes this didn't matter too much since 2**10 is
> so close to 1000 (only 2.4% more) that it was just practical to use
> uppercase K to mean "a little more than k", i.e. 2**10, but still
> speak of kilobytes.

That's news to me. Where have you seen that?

> When we reached sizes of megabytes, we couldn't
> use a "larger than M letter", so M was simply used for both, 10**6 and
> 2**20, but it was usually clear from the context, what was meant.
> Some confusion started, when some marketing people switched from using
> the then common binary units to using decimal units because it made
> their drives look larger than the competitor's drives.

I've never seen convincing evidence for this. If there is any, it's
likely to be in the filings for lawsuits—go find it.

> Only much much later someone invented the ki, Mi, Gi, ... prefixes
> to fix this and which I think helped a lot.  Only, when talking I
> never hear anyone saying kibi, mebi, gibi, tebi, so you still need to
> take the context into account to understand.

Naturally, because speech is generally bidirectional—if it's unclear,
you ask. Not so for print. (BTW, kibi is abbreviated as Ki.)

Cheers,
David.
