What are the file size options for "find . -size" command?
I found out that to look for file size in bytes, I use ‘c’.
So I can look for file size with 1000 bytes by using: find . -size 1000c
But what about other units, such as MB, GB or even bits?
What character or letters do I need to use?
POSIX only specifies no suffix or a c
suffix. With no suffix, values are interpreted as 512-byte blocks; with a c
suffix, values are interpreted as byte counts, as you’ve determined.
Some implementations support more suffixes; for example GNU find
supports
b for 512-byte blocks
c for bytes
w for 2-byte words
k for kibibytes
M for mebibytes
G for gibibytes
To add to what Stephen Kitt mentioned, beware that GNU find rounds the size up to the specified granularity before comparing!
If you do
truncate --size=1000 dummy_file_1000
truncate --size=1024 dummy_file_1024
then
find . -size 1k
find . -size 1024c
will not give the same result!
See find command: -size behavior
In short – find . -size 1k
will list every file with size∈[1,1024], whereas find . -size 1024c
will only list files where the actual size is exactly 1024 bytes.
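The rounding described above can be sketched with plain shell arithmetic. This is only a model of the comparison as explained in this answer, not code taken from find; the helper name units_rounded_up is made up for the illustration.

```shell
# units_rounded_up SIZE_IN_BYTES UNIT_IN_BYTES
# Print how many whole units a file of that size counts as once the size
# is rounded up, which is the value compared against N in `-size N<unit>`.
# (Hypothetical helper, for illustration only.)
units_rounded_up() {
  echo "$(( ($1 + $2 - 1) / $2 ))"
}

units_rounded_up 1000 1024   # 1: a 1000-byte file matches -size 1k
units_rounded_up 1024 1024   # 1: a 1024-byte file matches -size 1k
units_rounded_up 1025 1024   # 2: a 1025-byte file does not
units_rounded_up 0 1024      # 0: an empty file does not match -size 1k
```

With a unit of 1 (the c suffix), the result is always the exact byte count, which is why -size 1024c only matches files of exactly 1024 bytes.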
find . -size 1000c # files whose size¹ is exactly 1000 bytes (not characters)
find . -size -1000c # strictly less than 1000 bytes (0 - 999)
find . -size +1000c # strictly more than 1000 bytes (1001 - ∞)
Then, using POSIX sh
syntax, you can do:
EiB=$((1024*(PiB=1024*(TiB=1024*(GiB=1024*(MiB=1024*(KiB=1024)))))))
EB=$((1000*( PB=1000*( TB=1000*( GB=1000*( MB=1000*( kB=1000)))))))
find . -size "$(( 12 * GiB ))c" # exactly 12GiB (12,884,901,888 bytes)
find . -size "$(( 12 * GB ))c" # exactly 12GB (12,000,000,000 bytes)
find . -size "-$(( 12 * GB ))c" # 0 - 11,999,999,999 bytes
...
Without the c
suffix, beware the behaviour can be surprising:
find . -size 1000 # files whose size, in number of 512-byte units (rounded *up*)
# is 1000. So, that's files whose size in bytes ranges from
# 1000*512-511 (999*512+1) to 512*1000
find . -size -1000 # files whose size is 999*512 bytes or less
find . -size +1000 # files whose size is 1000*512+1 bytes or more
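As a rough check of the ranges in those comments, the byte range matched by a suffix-less -size N can be computed directly. The function name posix_size_range is hypothetical, and the sketch assumes N is a decimal integer of at least 1.

```shell
# posix_size_range N
# Print the smallest and largest size in bytes matched by `find -size N`
# (no suffix), given that sizes are rounded up to 512-byte units.
# Assumes N >= 1. (Hypothetical helper, for illustration only.)
posix_size_range() {
  echo "$(( ($1 - 1) * 512 + 1 )) $(( $1 * 512 ))"
}

posix_size_range 1000   # 511489 512000
posix_size_range 1      # 1 512
```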
That’s it for the POSIX specification of the find
utility.
Now, various find
implementations support additional suffixes but beware the same suffixes can be interpreted differently by different implementations.
As noted by @StephenKitt, GNU find
supports cwbkMG
for byte, word, 512-byte unit, kibibyte, mebibyte, gibibyte, but it behaves like POSIX find
requires in that find . -size -12G
for instance is not the same as our find . -size "-$((12 * GiB))c"
from above, as it matches files whose size in gibibytes (rounded up) is strictly less than 12, so files that are 11GiB or less.
For instance, find . -size -1G
only finds empty files (files of size 0). A one byte file is considered to be 1GiB as sizes are rounded up to the next GiB.
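The arithmetic behind that can be sketched as follows, assuming the round-up behaviour described above. The helper name gnu_max_matched and the GiB variable are introduced here for illustration only.

```shell
# gnu_max_matched N UNIT_IN_BYTES
# Print the largest size in bytes matched by GNU find's `-size -N<unit>`,
# assuming sizes are rounded up to whole units before being compared.
# (Hypothetical helper, for illustration only.)
gnu_max_matched() {
  echo "$(( ($1 - 1) * $2 ))"
}

GiB=$((1024 * 1024 * 1024))
gnu_max_matched 12 "$GiB"   # 11811160064, that is 11 GiB
gnu_max_matched 1 "$GiB"    # 0, so only empty files match -size -1G
```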
busybox find
supports cwbk
suffixes but interprets them differently from GNU find
. It’s also currently not POSIX compliant for its handling of sizes without suffixes.
For busybox find
, find . -size -12G
is like find . -size "-$(( 12 * GiB ))c"
, and find . -size -1
is for sizes ranging from 0 to 511 instead of just 0.
toybox find
(as found on Android for instance) behaves like busybox find
in that regard (and is also not POSIX compliant). Another difference is that suffixes are case insensitive there and TPE
for tebibyte, pebibyte and exbibyte are also supported and a d
(decimal) additional suffix can be used to specify that the units are powers of 1000 rather than 1024. For instance -size 1kd
finds files that are exactly 1000 bytes (1 kilobyte) instead of 1024 bytes (1 kibibyte) for -size 1k
.
In toybox find
, the suffix handling is done as part of its atolx()
function which is not only used for find
. Note however that since that supports 0xffff
hexadecimal numbers, there’s a conflict for cbedCBED
that are also hexadecimal digits. -size -0x2c
is not for less than 0x2 bytes, but for less than 0x2c (44) 512-byte units. And -size 010c
is treated as -size 8c
(octal), another POSIX non-conformance.
FreeBSD/DragonFly BSD find
supports ckMGTP
(not bwE
) but while it behaves as required by POSIX without suffix, it behaves like busybox/toybox and not GNU find
when there’s a suffix².
sfind
or the find
builtin of the bosh shell behave like FreeBSD’s except suffixes are case insensitive and bwE
are supported and octal/decimal numbers and some product arithmetic expressions (such as 6x12x8k
) are accepted.
As far as I can tell, OpenBSD, NetBSD, Illumos, Solaris, AIX and HP-UX all support only no suffix for 512-byte units or c
for bytes, as POSIX requires.
A summary table:
| | Traditional/POSIX | GNU | FreeBSD | sfind | busybox | toybox |
|---|---|---|---|---|---|---|
| suffixes | c | cwbkMG | ckMGTP | cwbkmgtpeCWBKMGTPE | cwbk | cwbkmgtpeCWBKMGTPE (+d) |
| number format | decimal | decimal | decimal | dec/oct/hex/expr | decimal | dec/oct/hex |
| -size $n | ($n-1)*512+1 .. $n*512 | ($n-1)*512+1 .. $n*512 | ($n-1)*512+1 .. $n*512 | ($n-1)*512+1 .. $n*512 | $n*512 | $n*512 |
| -size -$n | 0 .. ($n-1)*512 | 0 .. ($n-1)*512 | 0 .. ($n-1)*512 | 0 .. ($n-1)*512 | 0 .. $n*512-1 | 0 .. $n*512-1 |
| -size +$n | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ | ($n*512)+1 .. ∞ |
| -size ${n}c | $n | $n | $n | $n | $n | $n |
| -size -${n}c | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 | 0 .. $n-1 |
| -size +${n}c | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ | $n+1 .. ∞ |
| -size $n$unit | N/A | ($n-1)*$unit+1 .. $n*$unit | $n*$unit | $n*$unit | $n*$unit | $n*$unit |
| -size -$n$unit | N/A | 0 .. ($n-1)*$unit | 0 .. $n*$unit-1 | 0 .. $n*$unit-1 | 0 .. $n*$unit-1 | 0 .. $n*$unit-1 |
| -size +$n$unit | N/A | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ | $n*$unit+1 .. ∞ |
So, in short, for portability, your best bet is to use the c
suffix, decimal only numbers without leading zeros and compute the units manually.
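One way to follow that advice is a small wrapper that does the multiplication and emits a byte count with the c suffix. This is a minimal sketch: the function name size_in_bytes and its unit spellings are choices made for this example, and it relies on the shell having 64-bit arithmetic for large values.

```shell
# size_in_bytes COUNT UNIT
# Print COUNT*UNIT as a decimal byte count followed by `c`, suitable for
# `find -size`. UNIT is one of KiB MiB GiB TiB (powers of 1024) or
# kB MB GB TB (powers of 1000). (Hypothetical helper, for illustration.)
size_in_bytes() {
  # Reject anything but plain decimal without leading zeros, since shell
  # arithmetic would otherwise read 010 as octal.
  case $1 in
    '' | *[!0-9]* | 0?*)
      echo "count must be decimal without leading zeros: $1" >&2
      return 1 ;;
  esac
  case $2 in
    KiB) factor=$((1024)) ;;
    MiB) factor=$((1024 * 1024)) ;;
    GiB) factor=$((1024 * 1024 * 1024)) ;;
    TiB) factor=$((1024 * 1024 * 1024 * 1024)) ;;
    kB)  factor=$((1000)) ;;
    MB)  factor=$((1000 * 1000)) ;;
    GB)  factor=$((1000 * 1000 * 1000)) ;;
    TB)  factor=$((1000 * 1000 * 1000 * 1000)) ;;
    *)
      echo "unknown unit: $2" >&2
      return 1 ;;
  esac
  echo "$(( $1 * factor ))c"
}

size_in_bytes 12 GiB   # 12884901888c
size_in_bytes 12 GB    # 12000000000c
```

It could then be used as find . -size "+$(size_in_bytes 12 MiB)" for files strictly larger than 12 MiB, or with a leading - for strictly smaller ones.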
For completeness, the L
glob qualifier of zsh
(with kmgt
case insensitive, but pP
is for 512-byte unit, not pebibyte) behaves like POSIX/GNU find
(*(LM-12)
expands to files whose size is in-between 0 and 11 mebibytes for instance).
¹ That’s the size as reported in the st_size
attribute of the structure returned by lstat()
whose meaning for non-regular files can vary between systems.
² There’s the same kind of distinction in FreeBSD find
/sfind
for the -Xtime
predicates where for instance -mtime +1
matches on files that are 2 days old or older (age 86400*2 – ∞) while -mtime +1d
matches on files that are more than one day old (age 86400.000000001 – ∞). With GNU find
, see also ! -newermt -1day
(or 1 day ago
or yesterday
).