What are the file size options for "find . -size" command?

Question

I found out that to search by file size in bytes, I use the c suffix.

So I can look for files of exactly 1000 bytes with: find . -size 1000c

But what about other units such as MB, GB, or even bits?
What character or letters do I need to use?

Asked By: Mardia

Answer 1

POSIX only specifies no suffix or a c suffix. With no suffix, values are interpreted as 512-byte blocks; with a c suffix, values are interpreted as byte counts, as you’ve determined.

Some implementations support more suffixes; for example, GNU find supports:

b for 512-byte blocks
c for bytes
w for 2-byte words
k for kibibytes
M for mebibytes
G for gibibytes

Answered By: Stephen Kitt

Answer 2

To add to what Stephen Kitt mentioned, beware that GNU find rounds the size up to the specified granularity before comparing!

If you do

truncate --size=1000 dummy_file_1000
truncate --size=1024 dummy_file_1024

then

find . -size 1k 
find . -size 1024c

will not give the same result!

See find command: -size behavior

In short – find . -size 1k will list every file with size∈[1,1024], whereas find . -size 1024c will only list files where the actual size is exactly 1024 bytes.
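As a rough illustration of that range, the sketch below recreates the two files with portable tools and selects the same 1 to 1024 byte range using only the c suffix, which should not depend on how a given find rounds k. The scratch directory (created with mktemp -d, which is widely available but not POSIX) and the extra empty file are assumptions of this example.

```shell
# Work in a scratch directory so the example does not touch real files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# printf pads an empty string to the requested width, giving files of known size.
printf '%1000s' '' > dummy_file_1000
printf '%1024s' '' > dummy_file_1024
: > dummy_file_empty

# More than 0 bytes and fewer than 1025 bytes, i.e. 1..1024 bytes:
# the range described above for GNU "find . -size 1k".
range_matches=$(find . -type f -size +0c -size -1025c | LC_ALL=C sort)

# Exactly 1024 bytes.
exact_matches=$(find . -type f -size 1024c | LC_ALL=C sort)

printf 'range:\n%s\nexact:\n%s\n' "$range_matches" "$exact_matches"

# Clean up the scratch directory.
cd / && rm -rf -- "$dir"
```

The range query lists both dummy_file_1000 and dummy_file_1024, while the exact query lists only dummy_file_1024.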

Answered By: Popup

Answer 3

POSIXly:

find . -size  1000c # files whose size¹ is exactly 1000 bytes (not characters)
find . -size -1000c # strictly less than 1000 bytes (0 - 999)
find . -size +1000c # strictly more than 1000 bytes (1001 - ∞)

Then, using POSIX sh syntax, you can do:

EiB=$((1024*(PiB=1024*(TiB=1024*(GiB=1024*(MiB=1024*(KiB=1024)))))))
 EB=$((1000*( PB=1000*( TB=1000*( GB=1000*( MB=1000*( kB=1000)))))))

find . -size "$(( 12 * GiB ))c" # exactly 12GiB (12,884,901,888 bytes)
find . -size "$(( 12 * GB  ))c" # exactly 12GB (12,000,000,000 bytes)
find . -size "-$(( 12 * GB ))c" # 0 - 11,999,999,999 bytes
...

Without the c suffix, beware the behaviour can be surprising:

find . -size  1000 # files whose size, in number of 512-byte units (rounded *up*),
                   # is 1000. So, that's files whose size in bytes ranges from
                   # 1000*512-511 (999*512+1) to 512*1000
find . -size -1000 # files whose size is 999*512 bytes or less
find . -size +1000 # files whose size is 1000*512+1 bytes or more
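The arithmetic behind those comments can be checked with a few lines of POSIX sh. This is only a sketch of the rounding rule as described above; the function name blocks_to_byte_range is invented for the illustration and is not part of any find implementation.

```shell
# Print the byte range matched by "find . -size N" (no suffix), assuming the
# POSIX rule described above: sizes are rounded up to whole 512-byte units.
blocks_to_byte_range() {
  n=$1
  if [ "$n" -eq 0 ]; then
    echo "0 .. 0"   # only empty files occupy zero units
  else
    echo "$(( ($n - 1) * 512 + 1 )) .. $(( $n * 512 ))"
  fi
}

blocks_to_byte_range 1000   # prints: 511489 .. 512000
blocks_to_byte_range 1      # prints: 1 .. 512
```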

That’s it for the POSIX specification of the find utility.

Now, various find implementations support additional suffixes but beware the same suffixes can be interpreted differently by different implementations.

As noted by @StephenKitt, GNU find supports cwbkMG for byte, word, 512-byte unit, kibibyte, mebibyte and gibibyte. With those suffixes it rounds sizes up in the same way POSIX requires for 512-byte units, so find . -size -12G is not the same as our find . -size "-$((12 * GiB))c" from above: it matches files whose size in gibibytes (rounded up) is strictly less than 12, that is, files that are 11GiB or less.

For instance, find . -size -1G only finds empty files (files of size 0). A one byte file is considered to be 1GiB as sizes are rounded up to the next GiB.
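To express "strictly smaller than one unit" without relying on how a given find rounds, the byte count can be spelled out with the c suffix. The sketch below uses 1 KiB rather than 1 GiB so the test files stay small; the scratch directory and file names are assumptions of the example.

```shell
# Work in a scratch directory (mktemp -d is widely available but not POSIX).
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

: > empty                         # 0 bytes
printf 'x' > one_byte             # 1 byte
printf '%1023s' '' > just_under   # 1023 bytes
printf '%1024s' '' > exactly_1k   # 1024 bytes

KiB=1024
# Strictly fewer than 1024 bytes (0..1023), using only the c suffix.
under_1k=$(find . -type f -size "-${KiB}c" | LC_ALL=C sort)
printf '%s\n' "$under_1k"

# Clean up the scratch directory.
cd / && rm -rf -- "$dir"
```

This lists empty, just_under and one_byte. Going by the description above, GNU find . -size -1k would list only the empty file.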

busybox find supports cwbk suffixes but interprets them differently from GNU find. It’s also currently not POSIX compliant for its handling of sizes without suffixes.

For busybox find, find . -size -12G is like find . -size "-$(( 12 * GiB ))c", and find . -size -1 is for sizes ranging from 0 to 511 instead of just 0.

toybox find (as found on Android for instance) behaves like busybox find in that regard (and is also not POSIX compliant). It differs in that suffixes are case insensitive, TPE for tebibyte, pebibyte and exbibyte are also supported, and an additional d (decimal) suffix can be used to specify that the units are powers of 1000 rather than 1024. For instance, -size 1kd finds files that are exactly 1000 bytes (1 kilobyte), whereas -size 1k finds files that are exactly 1024 bytes (1 kibibyte).

In toybox find, the suffix handling is done as part of its atolx() function which is not only used for find. Note however that since that supports 0xffff hexadecimal numbers, there’s a conflict for cbedCBED that are also hexadecimal digits. -size -0x2c is not for less than 0x2 bytes, but for less than 0x2c (44) 512-byte units. And -size 010c is treated as -size 8c (octal), another POSIX non-conformance.

FreeBSD/DragonFly BSD find supports ckMGTP (not bwE). While it behaves as required by POSIX without a suffix, it behaves like busybox/toybox and not GNU find when there’s a suffix².

sfind or the find builtin of the bosh shell behave like FreeBSD’s, except that suffixes are case insensitive, bwE are supported, and octal/decimal numbers and some product arithmetic expressions (such as 6x12x8k) are accepted.

As far as I can tell, OpenBSD, NetBSD, Illumos, Solaris, AIX and HP-UX all only support no suffix for 512-byte units or c for bytes, as POSIX requires.

A summary table:

	Traditional/POSIX	GNU	FreeBSD	sfind	busybox	toybox
suffixes	c	cwbkMG	ckMGTP	cwbkmgtpeCWBKMGTPE	cwbk	cwbkmgtpeCWBKMGTPE (+d)
number format	decimal	decimal	decimal	dec/oct/hex/expr	decimal	dec/oct/hex
`-size $n`	($n-1)*512+1 .. $n*512	($n-1)*512+1 .. $n*512	($n-1)*512+1 .. $n*512	($n-1)*512+1 .. $n*512	$n*512	$n*512
`-size -$n`	0 .. ($n-1)*512	0 .. ($n-1)*512	0 .. ($n-1)*512	0 .. ($n-1)*512	0 .. $n*512-1	0 .. $n*512-1
`-size +$n`	($n*512)+1 .. ∞	($n*512)+1 .. ∞	($n*512)+1 .. ∞	($n*512)+1 .. ∞	($n*512)+1 .. ∞	($n*512)+1 .. ∞
`-size ${n}c`	$n	$n	$n	$n	$n	$n
`-size -${n}c`	0 .. $n-1	0 .. $n-1	0 .. $n-1	0 .. $n-1	0 .. $n-1	0 .. $n-1
`-size +${n}c`	$n+1 .. ∞	$n+1 .. ∞	$n+1 .. ∞	$n+1 .. ∞	$n+1 .. ∞	$n+1 .. ∞
`-size $n$unit`	N/A	($n-1)*$unit+1 .. $n*$unit	$n*$unit	$n*$unit	$n*$unit	$n*$unit
`-size -$n$unit`	N/A	0 .. ($n-1)*$unit	0 .. $n*$unit-1	0 .. $n*$unit-1	0 .. $n*$unit-1	0 .. $n*$unit-1
`-size +$n$unit`	N/A	$n*$unit+1 .. ∞	$n*$unit+1 .. ∞	$n*$unit+1 .. ∞	$n*$unit+1 .. ∞	$n*$unit+1 .. ∞

So, in short, for portability, your best bet is to use the c suffix, decimal only numbers without leading zeros and compute the units manually.
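One way to follow that advice is a small wrapper that does the unit arithmetic in the shell and always emits a c-suffixed byte count. The helper below is a hypothetical sketch: the name size_c and its unit labels are this example's own convention, and it assumes the shell's arithmetic is at least 64 bits wide.

```shell
# Hypothetical helper: turn "<count> <unit>" into a byte count with a c suffix.
# The unit names are this sketch's own convention, not something find understands.
size_c() {
  count=$1
  unit=$2
  case $unit in
    B)   factor=1 ;;
    kB)  factor=1000 ;;
    MB)  factor=$((1000 * 1000)) ;;
    GB)  factor=$((1000 * 1000 * 1000)) ;;
    KiB) factor=1024 ;;
    MiB) factor=$((1024 * 1024)) ;;
    GiB) factor=$((1024 * 1024 * 1024)) ;;
    *)   echo "size_c: unknown unit: $unit" >&2; return 1 ;;
  esac
  # Accept only plain decimal digits without a leading zero, so the number
  # cannot be read as octal or hexadecimal by the shell or by find.
  case $count in
    0) ;;
    ''|0*|*[!0-9]*) echo "size_c: bad count: $count" >&2; return 1 ;;
  esac
  echo "$(( $count * $factor ))c"
}

size_c 12 GiB   # prints: 12884901888c
size_c 12 GB    # prints: 12000000000c
# Possible use: find . -size "+$(size_c 12 GiB)"
```

Rejecting leading zeros up front means an argument such as 010 is refused rather than silently treated as octal by either the shell or a find that accepts octal.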

For completeness, the L glob qualifier of zsh (with kmgt case insensitive, but pP is for 512-byte unit, not pebibyte) behaves like POSIX/GNU find (*(LM-12) expands to files whose size is in-between 0 and 11 mebibytes for instance).

¹ That’s the size as reported in the st_size attribute of the structure returned by lstat(), whose meaning for non-regular files can vary between systems.

² There’s the same kind of distinction in FreeBSD find/sfind for the -Xtime predicates where for instance -mtime +1 matches on files that are 2 days old or older (age 86400*2 – ∞) while -mtime +1d matches on files that are more than one day old (age 86400.000000001 – ∞). With GNU find, see also ! -newermt -1day (or 1 day ago or yesterday).

Answered By: Stéphane Chazelas