Multiples of bytes
A byte is a byte, but a megabyte may not be the size you think it is. Historically there have been different standards: a megabyte could be
1 000 000 bytes (10^6 or 1000^2 B) or
1 048 576 bytes (2^20 or 1024^2 B).
Countries that take science seriously use SI units (Système international d'unités), while backwater places like the UK and the US stick with an inferior, outdated imperial system of units dating back to their long-gone heyday.
The modern IEC standard refers to 1 048 576 bytes as a "mebibyte" (MiB) to differentiate it from the decimal megabyte (MB) of 1 000 000 bytes (10^6 B). The IEC definitions are widely accepted as modern standards. Linux tools use K, M and G to specify what IEC defines as kibibytes, mebibytes and gibibytes.
Operating systems began describing 1024 bytes as a kilobyte in the 1970s. Computers were 8-bit at the time and there was no simple term to describe 1024 bytes. Kilo was picked because it was close enough. A kilo is 1000 grams, not 1024, but 1K or KB was used to describe 1024 bytes anyway. The terms megabyte and gigabyte became commonly used to describe larger sizes. JEDEC formalized these commonly used terms in its standards.
The International Electrotechnical Commission (IEC) re-defined kilobytes, megabytes and gigabytes in "Amendment 2 to IEC International Standard IEC 60027-2" in 1998. Their standard requires that one kilobyte strictly mean 1000 bytes. They introduced new words to describe the old kilobyte, megabyte and gigabyte as well as new terms to describe larger sizes. The traditional kilobyte (KB) became a kibibyte (KiB), a megabyte (MB) became a mebibyte (MiB) and a gigabyte (GB) became a gibibyte (GiB). The vast majority of corporations in the computer industry went along with the IEC's standard.
|Quantities of bytes|
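As a rough illustration of the quantities involved, the following Python sketch tabulates the decimal and binary value of each prefix and how far apart they are. The names `SYMBOLS` and `prefix_table` are made up for this example and do not come from any tool or standard.

```python
# Symbols for each step, in increasing order of size.
# "KB" follows the spelling used by the command line tools discussed here.
SYMBOLS = [("KB", "KiB"), ("MB", "MiB"), ("GB", "GiB"), ("TB", "TiB")]


def prefix_table():
    """Return rows of (decimal symbol, decimal bytes, binary symbol, binary bytes)."""
    rows = []
    for power, (dec_sym, bin_sym) in enumerate(SYMBOLS, start=1):
        rows.append((dec_sym, 1000 ** power, bin_sym, 1024 ** power))
    return rows


for dec_sym, dec_bytes, bin_sym, bin_bytes in prefix_table():
    # The gap between the two definitions widens with every prefix.
    gap = (bin_bytes - dec_bytes) / dec_bytes * 100
    print(f"1 {dec_sym} = {dec_bytes:>17,} B    1 {bin_sym} = {bin_bytes:>17,} B    (+{gap:.1f}%)")
```

The difference is about 2.4% for kilobyte versus kibibyte and grows to roughly 10% for terabyte versus tebibyte, which is why the distinction matters more as sizes get larger.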
GNU/Linux tools typically accept both powers of 1000 (KB) and powers of 1024 (KiB) to specify sizes. The fallocate manual page, for example, states:
"The length and offset arguments may be followed by the multiplicative suffixes KiB (=1024), MiB (=1024*1024), and so on for GiB, TiB, PiB, EiB, ZiB, and YiB (the "iB" is optional, e.g., "K" has the same meaning as "KiB") or the suffixes KB (=1000), MB (=1000*1000), and so on for GB, TB, PB, EB, ZB, and YB."
fallocate test.file -l 1K will create a 1024-byte file (specifying it with a lowercase letter, as in fallocate test.file -l 1k, will also work). fallocate test.file -l 1KB will create a 1000-byte file.
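The suffix rule quoted from the manual can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not how fallocate itself is implemented. The function name `parse_size` is hypothetical, and treating every suffix case-insensitively is an assumption based on the lowercase 1k example.

```python
import re

# Suffix letters in increasing order; the position gives the exponent.
_UNITS = "KMGTPEZY"


def parse_size(text):
    """Convert a size such as '1K', '1KiB' or '1KB' to a number of bytes.

    A bare letter or an 'iB' suffix means powers of 1024,
    a plain 'B' after the letter means powers of 1000.
    """
    # Assumption: suffixes are matched case-insensitively.
    match = re.fullmatch(r"(\d+)(?:([KMGTPEZY])(iB|B)?)?", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, letter, tail = match.groups()
    if letter is None:
        return int(number)  # no suffix: plain bytes
    base = 1000 if tail is not None and tail.upper() == "B" else 1024
    exponent = _UNITS.index(letter.upper()) + 1
    return int(number) * base ** exponent


for example in ("1K", "1k", "1KiB", "1KB", "1M", "1MB"):
    print(example, "=", parse_size(example), "bytes")
```

With this rule, 1K, 1k and 1KiB all come out as 1024 bytes while 1KB comes out as 1000 bytes, matching the fallocate examples.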
du, from coreutils, is commonly used to show a close estimate of how much space files or directories use. du will show powers of 1024 when the -h option is used; --si can be used to show powers of 1000. Running du -h on a file which is 8689264 bytes will show 8.3M (8689264 / 1024^2 ≈ 8.29, rounded up), while du --si on the same file will claim it is 8.7M (8689264 / 1000^2 ≈ 8.69, rounded up). It is interesting to note that du will use the M suffix in both cases.
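The arithmetic behind the two outputs for the 8689264-byte file can be reproduced with a small Python sketch. It is not a reimplementation of du: the function name `human_readable` is hypothetical, and the rounding rule (always round up, one decimal below 10) is an assumption that happens to match the figures in this example.

```python
def human_readable(size_bytes, base=1024):
    """Format a byte count roughly like du -h (base 1024) or du --si (base 1000)."""
    suffixes = ["", "K", "M", "G", "T", "P", "E"]
    power = 0
    # Pick the largest unit that keeps the value at 1 or more.
    while power < len(suffixes) - 1 and size_bytes >= base ** (power + 1):
        power += 1
    if power == 0:
        return str(size_bytes)
    unit = base ** power
    # Integer ceiling division avoids floating-point rounding surprises.
    tenths = -(-size_bytes * 10 // unit)
    if tenths < 100:
        # Assumption: one decimal place for values below 10, rounded up.
        return f"{tenths // 10}.{tenths % 10}{suffixes[power]}"
    whole = -(-size_bytes // unit)
    return f"{whole}{suffixes[power]}"


size = 8689264
print("powers of 1024:", human_readable(size))             # 8.3M
print("powers of 1000:", human_readable(size, base=1000))  # 8.7M
```

Both results carry the same M suffix, so the output alone does not tell you which base was used.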
Beware Of And Be Prepared For Some Confusion
mdadm, which is used to manage Linux software RAID arrays, has a manual which, as of v4.1-rc2, states:
"A suffix of 'K', 'M' or 'G' can be given to indicate Kilobytes, Megabytes or Gigabytes respectively."
K, M and G will always (in any Linux tool we checked, anyway) use sizes in powers of 1024, and mdadm is no exception. The manual describes sizes specified by M (1024^2) as a switch using sizes in "Megabytes". This is correct according to the historical JEDEC standard, where one megabyte is 1048576 (1024^2) bytes, but it is incorrect according to the IEC standard, where a megabyte is defined as 1000^2 bytes and 1024^2 bytes is called a mebibyte.
There is a lot of confusion like that in GNU/Linux manual pages as well as HOWTOs and documentation. As a general rule of thumb: K, M, G, T and so on will always mean kibibyte (1024), mebibyte (1024^2), gibibyte (1024^3), tebibyte (1024^4) and so on, even if the manual says M = megabyte. Adding a B, as in KB, MB, GB or TB, will make command line tools use IEC power-of-ten sizes, specifying an IEC kilobyte (1000 B), megabyte (1000^2 B), gigabyte (1000^3 B), terabyte (1000^4 B) and so on.