Unable to grep foreign language in shell script

I am new to shell scripting. I have a text file that contains lines in the following format:

"some foreign language",'corresponding ID to text'

for example:-


I need to find the text related to an ID and save it in a text file.

Here is my sample script:

translation=$(cat $target_file | grep $translationID)
translationValue=$(echo "$translation" | awk -F',' '{print $1}')
translationValueFinal=$(echo "$translationValue" | tr -d '"')
echo "$translationValueFinal" >> $output

While running this script I get the error: grep: (standard input): binary file matches

Please suggest a way to grep and save foreign-language text in a shell script. Thanks

Asked By: tabish


If you use GNU grep, you can tell it to treat the input as text no matter what characters it encounters:

grep -a

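But the message suggests the input contains some non-textual bytes (for example NUL bytes, or bytes that are invalid in the current locale's encoding), so it is worth checking the input file.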
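As a rough sketch of both suggestions, assuming GNU grep: the sample line below is made up for illustration (the translation text "Zurück", the stray NUL byte, and the temp file are not from the question). It counts the NUL bytes in the input and then greps it with -a.

```shell
# Hypothetical sample: one line with a UTF-8 "ü" (\303\274) and a stray NUL byte.
tmp=$(mktemp)
printf '"Zur\303\274ck\000",IDC_SSB_DLG_BACK_BTN\n' > "$tmp"

# Check the input: delete everything except NUL bytes and count what is left.
nul_count=$(tr -cd '\000' < "$tmp" | wc -c)

# -a makes GNU grep print the matching line as text instead of reporting
# that a binary file matches. The NUL is stripped afterwards because shell
# variables cannot hold it.
match=$(grep -a 'IDC_SSB_DLG_BACK_BTN' "$tmp" | tr -d '\000')

printf 'NUL bytes found: %s\n' "$nul_count"
printf '%s\n' "$match"
rm -f "$tmp"
```

If the count is non-zero, cleaning the file once (for instance with tr -d '\000') may be preferable to passing -a on every call.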

Answered By: choroba

Don’t use grep plus a bunch of extra code for this. You want to do a literal string match on a specific field, which grep cannot do on its own, and the tools that can do it don’t require help from other tools.

Your existing command:

grep $translationID

even if we added the missing double quotes to make it grep "$translationID", would fail if any of these conditions were true:

  1. The string in the first field matched the ID, e.g. IDC_SSB_DLG_BACK_BTN,any, or
  2. The string in either field was a longer string that contained the ID as a substring, e.g. any,FOOIDC_SSB_DLG_BACK_BTNBAR or FOOIDC_SSB_DLG_BACK_BTNBAR,any, or
  3. The string in the second field and the ID variable contained a regexp metachar, e.g. any,foo.bar and any,foodbar would both match translationID=foo.bar.

and probably others. See how-do-i-find-the-text-that-matches-a-pattern for more info on some of these types of issue.

Using this input file, for example:

$ cat file

where we want to print the value from the first field when the second field is the string foo.bar (i.e. just the last line above):

$ translationID=foo.bar

here’s your grep command finding the expected line but also making many false matches and so outputting undesirable lines:

$ grep "$translationID" file

vs this awk command only matching the correct line (as well as outputting just the desired field):

$ awk -F',' -v id="$translationID" '$2==id{print $1}' file

or if you want the quotes removed there are lots of options including:

$ awk -F'[,"]+' -v id="$translationID" '$3==id{print $2}' file

That awk command is doing a full-field literal* string comparison of just the target field, so it will be accurate. Your grep command is doing a partial-line regexp comparison, which will fail sooner or later unless you’re lucky with your input values.
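To make the difference concrete, here is a small sketch using a hypothetical four-line file (invented for illustration, not the input file referred to above) that covers each of the three failure conditions plus the one correct line:

```shell
# Hypothetical sample data: only the last line has foo.bar as its second field.
file=$(mktemp)
cat > "$file" <<'EOF'
"foo.bar",any
"one",FOOfoo.barBAR
"two",foodbar
"three",foo.bar
EOF

translationID=foo.bar

# Partial-line regexp match: counts every line that matches anywhere.
grep_hits=$(grep -c "$translationID" "$file")

# Whole-field literal match on field 2 only.
awk_out=$(awk -F',' -v id="$translationID" '$2==id{print $1}' "$file")

# Same comparison, with the quotes consumed by the field separator.
awk_unquoted=$(awk -F'[,"]+' -v id="$translationID" '$3==id{print $2}' "$file")

printf 'grep matched %s lines\n' "$grep_hits"
printf 'awk printed: %s and %s\n' "$awk_out" "$awk_unquoted"
rm -f "$file"
```

With this data grep matches all four lines, while both awk commands pick out only the last one.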

*slight caveat – if translationID contains backslashes that you want treated literally then you need to do:

$ id="$translationID" awk -F',' '$2==ENVIRON["id"]{print $1}' file

or similar instead, see how-do-i-use-shell-variables-in-an-awk-script.
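A short sketch of why that matters, using a made-up ID containing a backslash (the value C:\temp and the sample line are assumptions for illustration only). awk expands escape sequences in -v assignments, so \t becomes a tab there, whereas values read through ENVIRON are left untouched:

```shell
# Hypothetical sample line whose second field contains a literal backslash.
file=$(mktemp)
printf '%s\n' '"uno",C:\temp' > "$file"
translationID='C:\temp'

# -v turns the \t in the value into a tab, so the comparison does not match.
with_v=$(awk -F',' -v id="$translationID" '$2==id{print $1}' "$file")

# ENVIRON passes the value through unchanged, so the literal match works.
with_env=$(id="$translationID" awk -F',' '$2==ENVIRON["id"]{print $1}' "$file")

printf 'with -v:      [%s]\n' "$with_v"
printf 'with ENVIRON: [%s]\n' "$with_env"
rm -f "$file"
```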

If your input file can contain NUL chars then use GNU awk or some other awk that documents support for them. awk is a text processing tool and so is only required to work with text files as input, and per the POSIX definition a text file cannot contain NUL chars. With GNU awk you MAY need to set BINMODE, e.g.:

awk -v BINMODE=3 -F',' -v id="$translationID" '$2==id{print $1}' file

Answered By: Ed Morton
