SED and REGEX extraction and rejecting if pattern not found

In the sed command, I want to remove everything that comes after 2000_SOMENAME. Also, if possible, I want to give an error if that format is not found.

For example, if the file name is ITALY_2022_BEST1FRIENDS2_ROME.txt, I only want 2022_BEST1FRIENDS2.
If that pattern is not found, I want to give an error in the shell script.

username=$(find . -iname '*.txt' | sed -e 's/.*_\([0-9]\{4\}_[0-9|A-z]*\).*/\1/i' | sort - | uniq -ui | tr -d '\n')

The previous question and more info here: Extracting a partial part from filename using SED
Thank you!!

Asked By: TigerStrips

||

In this particular case, it probably makes more sense to use grep -o rather than sed, because grep will exit with an error if it finds no results. (The -o makes it only return the matched portion and not the whole line.)
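As a minimal sketch of that behaviour, the snippet below runs the same grep on two made-up file names (fed in with printf rather than find, purely for illustration) and records the exit status each time. The variable names are hypothetical.

```shell
# Pattern: four digits, an underscore, then letters/digits (case-insensitive via -i).
pattern='[0-9]\{4\}_[0-9A-Z]*'

# A name that contains the pattern: grep prints the match and exits with 0.
match_status=0
match=$(printf '%s\n' './ITALY_2022_BEST1FRIENDS2_ROME.txt' | grep -o -i "$pattern") || match_status=$?

# A name without the pattern: grep prints nothing and exits with 1.
nomatch_status=0
nomatch=$(printf '%s\n' './notes.txt' | grep -o -i "$pattern") || nomatch_status=$?

echo "match='$match' status=$match_status"
echo "nomatch='$nomatch' status=$nomatch_status"
```

The `|| ..._status=$?` form is used only so the sketch also behaves under `set -e`.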

The tricky thing is that you’re piping to other commands, and you want to preserve that exit status in the case of an error.

If using bash, you could get the pipe to fail if any component fails with set -o pipefail (set it back with set +o pipefail afterwards to reset it if you want).

There may be similar methods for other shells.

set -o pipefail
username="$(find . -iname '*.txt' | grep -o -i '[0-9]\{4\}_[0-9A-Z]*' | sort - | uniq -ui | tr -d '\n')"
# get the exit status of the previous command
pipeexit="$?"
set +o pipefail
if [[ "$pipeexit" != 0 ]] ; then
    echo "username not found" >&2
    # line below quits the script; remove if you don't want that
    exit "$pipeexit"
fi
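To see what pipefail changes, here is a small sketch assuming bash. It runs a shortened version of the pipeline on a made-up name that does not contain the pattern, once with the default setting and once with pipefail; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Sample input with no YYYY_NAME part, so grep finds nothing and exits with 1.
sample='./notes.txt'

# Default behaviour: the pipeline reports the status of its last command (sort).
without=0
printf '%s\n' "$sample" | grep -o -i '[0-9]\{4\}_[0-9A-Z]*' | sort >/dev/null || without=$?

# With pipefail: the pipeline reports the last non-zero status, here grep's.
set -o pipefail
with=0
printf '%s\n' "$sample" | grep -o -i '[0-9]\{4\}_[0-9A-Z]*' | sort >/dev/null || with=$?
set +o pipefail

echo "without pipefail: $without"
echo "with pipefail: $with"
```

Without pipefail, grep's failure is hidden behind sort's success, which is why the check on `$?` needs the option switched on.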

I followed your lead in making the pattern case insensitive (-i to grep; you had i in your sed command), and left the rest of the commands in the pipe unmodified: I assume you had a reason for them. (Though the tr command seems suspect; why mash all the results together on one line?)

You might also consider a simpler method to check for an "error": just checking if the $username variable is empty, which it would be if there were no grep results (but of course also if no .txt files were found by find, etc.; not sure if you wanted that …).

username="$(find . -iname '*.txt' | grep -o -i '[0-9]\{4\}_[0-9A-Z]*' | sort - | uniq -ui | tr -d '\n')"
if [ -z "$username" ] ; then
    echo "username not found" >&2
    # no pipeline status is captured here, so exit with a generic failure code
    exit 1
fi

This version would most likely work in other shells too, since it does not rely on pipefail.
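If the check is needed for one name at a time, the empty-string test can be wrapped in a small function. This is only a sketch: `extract_name` is a hypothetical helper, and taking just the first match with `head -n 1` is an assumption about what is wanted.

```shell
# extract_name is a hypothetical helper, not part of the commands above.
# It prints the YYYY_NAME portion of a file name, or returns 1 if there is none.
extract_name() {
    result=$(printf '%s\n' "$1" | grep -o -i '[0-9]\{4\}_[0-9A-Z]*' | head -n 1)
    if [ -z "$result" ]; then
        echo "pattern not found in: $1" >&2
        return 1
    fi
    printf '%s\n' "$result"
}

# A name that contains the pattern.
good=$(extract_name 'ITALY_2022_BEST1FRIENDS2_ROME.txt')

# A name that does not: capture the failure status instead of exiting.
bad_status=0
bad=$(extract_name 'README.txt' 2>/dev/null) || bad_status=$?

echo "good=$good bad_status=$bad_status"
```

A caller can then write `name=$(extract_name "$file") || exit 1` to reject files that do not follow the naming scheme.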

Answered By: frabjous