How to match words and ignore multiple spaces?
The following command matches “Ambari Server running”, but how can I match when there are multiple spaces between the words? How can I ignore the amount of whitespace between words?
echo "Ambari Server running" | grep -i "Ambari Server running"
echo "Ambari   Server    running" | grep -i "Ambari Server running"
echo "  Ambari   Server    running" | grep -i "Ambari Server running"
The expected results should be:
Ambari Server running
Ambari Server running
Ambari Server running
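Use the regex operator + to match one or more of the preceding token, a space in this case. The pattern between the words is therefore a space followed by +. Because + is only an operator in extended regular expressions, pass -E to grep (GNU grep also accepts \+ in basic mode):
echo "Ambari   Server   running" | grep -Ei "Ambari +Server +running"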
I would suggest using the character class [:blank:] to match any horizontal whitespace (spaces and tabs), not just a plain space, if you are unsure what separates the words:
echo "Ambari   Server   running" | grep -Ei "Ambari[[:blank:]]+Server[[:blank:]]+running"
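As a minimal sketch of the approach above, the pattern can be wrapped in a small helper and tried against several spacings. The function names has_status and check are hypothetical, chosen only for this illustration.

```shell
# Hypothetical helper: succeeds if stdin contains the three words separated
# by one or more blanks (spaces or tabs), ignoring case.
has_status() {
    grep -Eiq 'Ambari[[:blank:]]+Server[[:blank:]]+running'
}

# Hypothetical helper: report whether a single line matches.
check() {
    if printf '%s\n' "$1" | has_status; then
        printf 'match:    [%s]\n' "$1"
    else
        printf 'no match: [%s]\n' "$1"
    fi
}

check 'Ambari Server running'
check '  ambari    SERVER   running'
check "$(printf 'Ambari\tServer\trunning')"   # tab-separated words
check 'AmbariServer running'                  # no blank between the first two words
```

The first three lines should be reported as matches and the last one as a non-match, because + requires at least one blank between the words.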
On the other hand, if you want to keep just one space between words, use awk:
echo "Ambari Server running" |
awk '$1=="Ambari" && $2=="Server" && $3=="running" {$1=$1; print}'
- $1=="Ambari" && $2=="Server" && $3=="running" matches the desired three fields
- {$1=$1} rebuilds the record with a single space as the new separator
- {print} prints the record
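The awk approach can be sketched as a filter over several input lines. The function name normalize_status is hypothetical, and the tolower() calls are an addition that is not in the answer above; they are assumed here only to mirror the -i option used in the question.

```shell
# Hypothetical filter: print matching lines with their whitespace normalized.
normalize_status() {
    awk 'tolower($1) == "ambari" && tolower($2) == "server" && tolower($3) == "running" {
        $1 = $1   # assigning to a field rebuilds the record using single spaces
        print
    }'
}

printf '%s\n' \
    'Ambari   Server    running' \
    '   ambari Server   RUNNING' \
    'Something else entirely' |
    normalize_status
```

Like the original command, this only checks the first three fields, so a line with extra words after "running" would also be printed.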
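If you just want to ignore all whitespace between the words, you can use echo your text | tr -d '[:space:]' | grep "yourtext", but the output will not contain any whitespace. Quote the character class so the shell does not treat it as a glob, and note that [:space:] also includes the newline, so multi-line input is joined into one line.
Example:
echo "Hi This Is Test" | tr -d '[:space:]' | grep HiThisIsTest
Output:
HiThisIsTest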
Use tr with its -s option to compress consecutive spaces into single spaces, and then grep the result:
$ echo 'Some    spacious   string' | tr -s ' ' | grep 'Some spacious string'
Some spacious string
This would however not remove flanking spaces completely, only compress them into a single space at either end.
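A short sketch of that caveat, using made-up input with leading and trailing spaces. The variable name squeezed is only illustrative.

```shell
# Squeeze runs of spaces; flanking runs shrink to one space but do not vanish.
squeezed=$(printf '   Some    spacious   string   \n' | tr -s ' ')

# Brackets make the remaining leading and trailing space visible.
printf '[%s]\n' "$squeezed"

# A whole-line match (-x) therefore still fails on the squeezed text.
if printf '%s\n' "$squeezed" | grep -qx 'Some spacious string'; then
    echo 'whole-line match'
else
    echo 'no whole-line match'
fi
```

This only matters for anchored or whole-line matching; a plain substring grep, as in the command above, still finds the text.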
Use sed to remove the flanking blanks as well as compress the internal blanks to single spaces:
echo '   Some    spacious   string   ' |
sed 's/^[[:blank:]]*//; s/[[:blank:]]*$//; s/[[:blank:]]\{1,\}/ /g'
The result can then be passed to grep.
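Put together, the sed normalization followed by grep might look like the sketch below. The function name squeeze_blanks is hypothetical, and grep -x (match the whole line) is an assumption about how strict the final check should be.

```shell
# Hypothetical filter: trim flanking blanks and squeeze inner runs to one space.
squeeze_blanks() {
    sed 's/^[[:blank:]]*//; s/[[:blank:]]*$//; s/[[:blank:]]\{1,\}/ /g'
}

# After normalization a whole-line match succeeds despite the messy input.
printf '   Some    spacious   string   \n' |
    squeeze_blanks |
    grep -x 'Some spacious string'
```

Because [:blank:] covers tabs as well as spaces, tab-separated words are normalized in the same way.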
To answer the main question, "How to match words and ignore multiple spaces?", something like the following will help you get what you need:
echo "Ambari Server running" | tr '[:upper:]' '[:lower:]' | grep -E '\s*ambari\s+server\s+running\s*'
It converts the input to lower case and then searches for a lower-case match. \s* matches zero or more whitespace characters (so it includes tabs etc.) and \s+ matches one or more. Note that \s is a GNU grep extension; [[:space:]] is the portable equivalent.
If your input is in a file such as foo2.txt below:
Ambari Server running
Ambari Server running
Ambari Server running
Then you could do something like:
cat foo2.txt | tr '[:upper:]' '[:lower:]' | grep -E '\s*ambari\s+server\s+running\s*'
ambari server running
ambari server running
ambari server running
If you are only interested in the count, you can modify it slightly:
cat foo2.txt | tr '[:upper:]' '[:lower:]' | grep -E '\s*ambari\s+server\s+running\s*' | wc -l
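As a possible variation, grep can do the case folding and the counting itself with -i and -c, which avoids cat, tr and wc. This is a sketch rather than the answer's own method: the function name count_status_lines and the temporary sample file are made up, and [[:space:]] is used in place of \s for portability.

```shell
# Hypothetical helper: count lines in file $1 that contain the three words
# separated by any amount of whitespace, ignoring case.
count_status_lines() {
    grep -Eic 'ambari[[:space:]]+server[[:space:]]+running' "$1"
}

# Sample data standing in for foo2.txt (contents assumed for illustration).
sample=$(mktemp)
printf '%s\n' \
    'Ambari Server running' \
    'Ambari   Server    running' \
    '  ambari server   RUNNING' \
    'Ambari Server stopped' > "$sample"

count_status_lines "$sample"
rm -f "$sample"
```

The leading and trailing whitespace terms are dropped here because an unanchored pattern already matches anywhere in the line.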