Multiples of bytes
A byte is a byte, but a megabyte may not be the size you think it is. Historically there have been different standards: a megabyte could be 1 000 000 bytes (10⁶ or 1000² B) or 1 048 576 bytes (2²⁰ or 1024² B). The modern IEC standard refers to 1 048 576 bytes as a "mebibyte" (MiB) to differentiate it from the decimal 1 000 000 byte (10⁶ B) megabyte (MB). The IEC definitions are widely accepted as modern standards. Linux tools use K, M and G to specify what IEC defines as kibibytes, mebibytes and gibibytes.
Historical confusion
Operating systems began describing 1024 bytes as a kilobyte in the 1970s. Computers were 8-bit at the time and there was no simple term to describe 1024 bytes. Kilo was picked because it was close enough. The prefix kilo means 1000 (a kilogram is 1000 grams), not 1024, but 1K or KB was used to describe 1024 bytes anyway. The terms megabyte and gigabyte became commonly used to describe larger sizes. JEDEC formalized these commonly used terms in standards like 100B.01.
Current Standards
The International Electrotechnical Commission (IEC) re-defined kilobytes, megabytes and gigabytes in "Amendment 2 to IEC International Standard IEC 60027-2" in 1998. Their standard requires one kilobyte to strictly mean 1000 bytes. They introduced new words to describe the old kilobyte, megabyte and gigabyte as well as new terms to describe larger sizes. The traditional kilobyte (KB) became a kibibyte (KiB), a megabyte (MB) became a mebibyte (MiB) and a gigabyte (GB) became a gibibyte (GiB). A vast majority of the corporations in the computer industry went along with IEC's standard.
GNU/Linux
GNU/Linux tools are typically able to accept powers of both 1000 (KB) and 1024 (K or KiB) when sizes are specified.
fallocate, from util-linux, can be used to preallocate space for a file (create a file of a given size). The fallocate manual[1] describes its length argument as:
"The length and offset arguments may be followed by the multiplicative suffixes KiB (=1024), MiB (=1024*1024), and so on for GiB, TiB, PiB, EiB, ZiB, and YiB (the "iB" is optional, e.g., "K" has the same meaning as "KiB") or the suffixes KB (=1000), MB (=1000*1000), and so on for GB, TB, PB, EB, ZB, and YB."
fallocate test.file -l 1K will create a 1024 byte file (specifying with lowercase, as in fallocate test.file -l 1k, will also work).
fallocate test.file -l 1KB will create a 1000 byte file.
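The suffix rule quoted from the fallocate manual can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the parser util-linux actually uses; the function name parse_size and the decision to accept any letter case are assumptions made for this example.

```python
import re

# Prefix letters in increasing order: K is the 1st power, M the 2nd, and so on.
PREFIXES = "KMGTPEZY"


def parse_size(text):
    """Return the number of bytes described by a string such as '1K' or '1KB'.

    Hypothetical helper following the rule quoted above: a bare letter or
    '<letter>iB' means a power of 1024, '<letter>B' means a power of 1000.
    """
    match = re.fullmatch(
        r"(\d+)(?:([KMGTPEZY])(iB|B)?)?", text.strip(), re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, prefix, tail = match.groups()
    if prefix is None:
        # No suffix at all: the number is already a byte count.
        return int(number)
    exponent = PREFIXES.index(prefix.upper()) + 1
    # Only a plain trailing 'B' selects powers of 1000.
    base = 1000 if tail is not None and tail.upper() == "B" else 1024
    return int(number) * base ** exponent


print(parse_size("1K"))   # 1024
print(parse_size("1KB"))  # 1000
```

The only thing separating the two interpretations is the trailing B, which is why the two fallocate commands above produce files of different sizes.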
du, from coreutils, is commonly used to show a close estimate of how much space files or directories use. du will show powers of 1024 when the -h option is used. --si can be used to show powers of 1000. Running du -h on a file which is 8689264 bytes will show 8.3M while du --si on the same file will claim it is 8.7M. It is interesting to note that du will use the M suffix in both cases.
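The two figures can be reproduced with a little arithmetic. The sketch below is a simplified stand-in for du's formatting, not its real implementation: the function name human_readable is made up for this example, it always prints one decimal, and it assumes the value is rounded up, which is our understanding of the GNU coreutils default.

```python
import math


def human_readable(size_bytes, base):
    """Scale a byte count by `base` and round up to one decimal place.

    Simplified illustration only; both bases share the same suffix letters,
    as in the du output described above.
    """
    suffixes = ["", "K", "M", "G", "T"]
    value = float(size_bytes)
    index = 0
    # Divide by the base until the value fits under it.
    while value >= base and index < len(suffixes) - 1:
        value /= base
        index += 1
    rounded = math.ceil(value * 10) / 10
    return f"{rounded:.1f}{suffixes[index]}"


print(human_readable(8689264, 1024))  # 8.3M, like du -h
print(human_readable(8689264, 1000))  # 8.7M, like du --si
```

8689264 / 1024² is roughly 8.29 and 8689264 / 1000² is roughly 8.69, so the same file yields two different numbers that carry the same M suffix.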
Beware Of And Be Prepared For Some Confusion
mdadm, which is used to manage Linux software RAID arrays, has a manual which, as of v4.1-rc2, states[2]:
"-z, --size=
A suffix of 'K', 'M' or 'G' can be given to indicate Kilobytes, Megabytes or Gigabytes respectively."
K, M and G will always (in any Linux tool we checked, anyway) use sizes in powers of 1024 and mdadm is no exception. The manual describes sizes specified with M (1024²) as sizes in "Megabytes". This is correct according to the historical JEDEC standard, where one megabyte is 1048576 (1024²) bytes, but it is incorrect according to the IEC standard, where a megabyte is defined as 1000² bytes and 1024² bytes is called a mebibyte.
There is a lot of confusion like that in GNU/Linux manual pages as well as HOWTOs and documentation. As a general rule of thumb: K, M, G, T and so on will always mean kibibyte (1024), mebibyte (1024²), gibibyte (1024³), tebibyte (1024⁴) and so on even if the manual says M = megabyte. Adding a B as in KB, MB, GB, TB will make command line tools use IEC power of ten sizes, specifying an IEC kilobyte (1000 B), megabyte (1000² B), gigabyte (1000³ B), terabyte (1000⁴ B) and so on.
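To see why the distinction matters more as sizes grow, the rule of thumb can be tabulated. The following is a small sketch whose function name suffix_table is invented for this example; it uses only the powers of 1024 and 1000 given above.

```python
# Suffix letters covered by the rule of thumb, in increasing order.
PREFIXES = ["K", "M", "G", "T"]


def suffix_table():
    """Return (binary suffix, bytes, decimal suffix, bytes, gap in percent)."""
    rows = []
    for power, prefix in enumerate(PREFIXES, start=1):
        binary = 1024 ** power    # what K, M, G, T mean
        decimal = 1000 ** power   # what KB, MB, GB, TB mean
        gap_percent = (binary - decimal) / decimal * 100
        rows.append((prefix, binary, prefix + "B", decimal, round(gap_percent, 1)))
    return rows


for row in suffix_table():
    print(row)
```

The gap between the two interpretations is about 2.4% for K versus KB and grows to roughly 10% for T versus TB, so mixing them up costs more the larger the size being specified.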