Bash: how to verify that all folders under a specific path end with a number

I want to check that all the folders under /var/kafka have names ending with a number; otherwise the script should exit with an error.

ls -ltr  /var/kafka
drwxr-xr-x 399 kafka kafka 28672 Nov  9 13:10 data6
drwxr-xr-x 392 kafka kafka 28672 Nov  9 13:10 data5
drwxr-xr-x 391 kafka kafka 28672 Nov  9 13:10 data1
drwxr-xr-x 405 kafka kafka 28672 Nov  9 13:10 data2
drwxr-xr-x 386 kafka kafka 28672 Nov  9 13:10 data4
drwxr-xr-x 389 kafka kafka 28672 Nov  9 13:10 data8
drwxr-xr-x 406 kafka kafka 28672 Nov  9 13:10 data7
drwxr-xr-x 397 kafka kafka 28672 Nov  9 13:10 data3

So I wrote the following simple code (it is meant to go in a bash script):

for i in `  ls -ltr  /var/kafka | awk '{print $NF}' `; do  [[ ! $i  = *?[0-9] ]] && echo "ERROR folder/s not ended with number" && exit 1 ; done

But this example is too long, and I would like a shorter, more elegant command that simply verifies whether the folders under /var/kafka end with a number or not.

An awk, sed or Perl one-liner would also be fine.

Asked By: yael

||

With zsh:

files=( /var/kafka/*[^0-9](ND) )
if (( $#files )); then
  print -rlu2 -- "There are files whose name doesn't end in an ASCII digit:" ' - '$^files
  exit 1
fi

Same with bash:

shopt -s nullglob dotglob
shopt -u failglob
files=( /var/kafka/*[^0123456789] )
if (( ${#files[@]} )); then
  echo>&2 "There are files whose name doesn't end in an ASCII digit:"
  printf>&2 ' - %s\n' "${files[@]}"
  exit 1
fi

To show only the basename / tail of those files (foo instead of /var/kafka/foo), replace $^files with $^files:t and "${files[@]}" with "${files[@]##*/}".
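
As a minimal sketch of the bash substitution mentioned above (the sample paths are hypothetical and only serve as illustration), stripping everything up to the last slash from each array element looks like this:

```shell
#!/usr/bin/env bash
# Hypothetical sample paths, used only for illustration.
files=( /var/kafka/logs /var/kafka/tmp.old )

# ${files[@]##*/} removes the longest prefix matching "*/" from each element,
# which leaves only the last path component.
names=( "${files[@]##*/}" )
printf ' - %s\n' "${names[@]}"
```

The expansion is applied element by element, so it works the same whether the array holds one path or many.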

Note that, strictly speaking, *[^0-9] (or *[^0123456789] in bash) matches file names that end in a character other than a digit. The pattern for file names that don't end in a digit is ^*[0-9] in zsh (for which you need set -o extendedglob) or !(*[0-9]) in bash (for which you need shopt -s extglob). But given that a file name cannot be empty, the two should be equivalent.
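
For illustration, here is a sketch of the extglob form in bash. The function name and the temporary demo directory are assumptions made for this example, and dotglob is left out to keep it short, so hidden entries are ignored here:

```shell
#!/usr/bin/env bash
# extglob has to be enabled before the function below is parsed, because
# !(...) is only recognised when the option is already set.
shopt -s extglob nullglob

# Hypothetical helper: fills the global array "bad" with the names found in
# directory "$1" that do not end in an ASCII digit.
names_not_ending_in_digit() {
  bad=( "$1"/!(*[0123456789]) )
  bad=( "${bad[@]##*/}" )   # keep only the last path component
}

# Demo on a throwaway directory (assumption: mktemp -d is available).
demo=$(mktemp -d)
mkdir "$demo"/data1 "$demo"/data2 "$demo"/logs
names_not_ending_in_digit "$demo"
printf '%s\n' "${bad[@]}"
rm -rf "$demo"
```

With nullglob set, the array is simply empty when every name ends in a digit, so (( ${#bad[@]} )) can be used as the error test.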

If you don’t need to list those files in the error message, in zsh, that can be shortened to:

if () (( $# )) /var/kafka/*[^0-9](NDY1); then
  print -ru2 "There are files whose name doesn't end in an ASCII digit"
  exit 1
fi

Here, Y1 stops the search at the first match, and the list of matches is passed to an anonymous function that returns true if passed at least one argument ($# is not zero).

If you need to consider only the files of type directory, still in zsh, add / to the glob qualifier:

dirs=( /var/kafka/*[^0-9](ND/) )
if (( $#dirs )); then
  print -rlu2 -- "There are directories whose name doesn't end in an ASCII digit:" ' - '$^dirs
  exit 1
fi

bash doesn’t have glob qualifiers, but you could resort to find to report the files of type directory whose name doesn’t end in digits:

With bash4.4 or newer and a find that supports -print0:

readarray -td '' dirs < <(
  cd /var/kafka &&
    LC_ALL=C find . ! -name . -prune -type d ! -name '*[0-9]' -print0
)
if (( ${#dirs[@]} )); then
  echo>&2 "There are dirs whose name doesn't end in an ASCII digit:"
  printf>&2 ' - %s\n' "${dirs[@]}"
  exit 1
fi

With older versions of bash, you can always fill up the array in a loop:

dirs=()
while IFS= read -rd '' dir; do
  dirs=("${dirs[@]}" "$dir")
done < <(
  cd /var/kafka &&
    LC_ALL=C find . ! -name . -prune -type d ! -name '*[0-9]' -print0
)

We use LC_ALL=C so that [0-9] only matches 0123456789 and not the thousands of other characters that often happen to sort between 0 and 9 in other locales, and also so that the * in *[0-9], which normally matches 0 or more characters, matches 0 or more bytes instead.

Now beware that some locales use a character encoding where some additional characters, other than 0123456789, have an encoding that ends in the same byte value as one of 0123456789. For instance, in the GB18030 charset as used in China:

$ LC_ALL=zh_CN.gb18030 luit
$ locale charmap
GB18030
$ printf %s '¾' | LC_ALL=C od -tx1 -tc
0000000  81  30  86  36
        201   0 206   6
0000004

The ¾ character is often matched by [0-9] regardless of how it’s encoded because it sorts between 0 and 9 for obvious reasons, but also, its GB18030 encoding ends in byte 0x36 which happens to also be the encoding of the 6 ASCII digit character.

So in the C locale, the file path that is made of the GB18030 encoding of /var/kafka/¾¾¾ will be considered to end in an ASCII digit and won’t be reported. Whether it should or not is another matter.
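
The byte-level effect can be reproduced without a GB18030 locale. The sketch below is only an illustration: it builds the four bytes shown in the od output above with octal escapes and matches them against *[0-9] in the C locale. The variable names are made up for the example.

```shell
# Bytes 0x81 0x30 0x86 0x36, written as octal escapes: the GB18030
# encoding of the character shown in the od output above.
name=$(printf '\201\060\206\066')

# In the C locale, patterns work on bytes and [0-9] is just 0123456789.
LC_ALL=C
export LC_ALL

case $name in
  *[0-9]) result="ends in an ASCII digit byte" ;;
  *)      result="does not end in an ASCII digit byte" ;;
esac
printf '%s\n' "$result"
```

Because the last byte is 0x36, the same value as the ASCII digit 6, the first branch is taken.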

Answered By: Stéphane Chazelas

Using find:

LC_ALL=C find /var/kafka -mindepth 1 -maxdepth 1 -type d -regex '.*[^0-9]$' | grep '^' \
&& echo "Error in folder name"

I’m "abusing" grep to get a sensible return code, if someone knows of a better way, please let me know.

Answered By: Panki

Another variant:

[ -z "$(LC_ALL=C find /var/kafka -mindepth 1 -maxdepth 1 -type d \( -name '*[0-9]' -o -print -quit \))" ]

This returns true ($? is 0) if all directory names end in a digit.
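
One possible way to use that test in a script is to wrap it in a function and branch on its exit status. This is a sketch only: the function name and the demo directory are hypothetical, and -mindepth, -maxdepth and -quit are extensions found in GNU find and some other implementations rather than POSIX options.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: returns 0 when every directory directly under "$1"
# has a name ending in an ASCII digit, non-zero otherwise.
all_dirs_end_in_digit() {
  [ -z "$(LC_ALL=C find "$1" -mindepth 1 -maxdepth 1 -type d \
            \( -name '*[0-9]' -o -print -quit \))" ]
}

# Demo on a throwaway directory (assumption: mktemp -d is available).
demo=$(mktemp -d)
mkdir "$demo"/data1 "$demo"/data2
if all_dirs_end_in_digit "$demo"; then
  echo "all directory names end in a digit"
fi

mkdir "$demo"/logs
if ! all_dirs_end_in_digit "$demo"; then
  echo "ERROR: at least one directory name does not end in a digit" >&2
fi
rm -rf "$demo"
```

In a real script the second branch would typically be followed by exit 1, with /var/kafka passed as the argument.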

Answered By: Chris Davies