Remove most but not all lines containing a carriage return (\r)
I have a process which outputs too many status lines ending in a carriage return (\r). I can filter out all of those status lines by piping the output through
sed '/\r/d'
Instead, I would like to drop all of these lines except, e.g., every third one.
Is this possible with standard Unix tools (awk?) or do I need a script for that? Lines without a CR should be left untouched.
Given Output:
$ (printf '%s\n' {1..10}; printf '%s\r\n' {1..10}; printf '%s\n' {1..10};) | cat -v
1
2
3
4
5
6
7
8
9
10
1^M
2^M
3^M
4^M
5^M
6^M
7^M
8^M
9^M
10^M
1
2
3
4
5
6
7
8
9
10
Wanted output (or any other pattern):
1
2
3
4
5
6
7
8
9
10
1^M
4^M
7^M
10^M
1
2
3
4
5
6
7
8
9
10
$ awk '!(/\r$/ && ((++c)%3 != 1))' file | cat -v
1
2
3
4
5
6
7
8
9
10
1^M
4^M
7^M
10^M
1
2
3
4
5
6
7
8
9
10
Original answer:
Sounds like all you need is this, using any awk:
awk -v RS='\r' '{ORS=(NR%10000 ? "" : RS)} 1'
e.g. using this as input:
$ printf '%s\r\n' {1..10} | cat -v
1^M
2^M
3^M
4^M
5^M
6^M
7^M
8^M
9^M
10^M
Removing all but every 3rd \r:
$ printf '%s\r\n' {1..10} | awk -v RS='\r' '{ORS=(NR%3 ? "" : RS)} 1' | cat -v
1
2
3^M
4
5
6^M
7
8
9^M
10
Using GNU sed, we use the hold space for counting.
sed -E '
/\r$/{
G;/\n$/P
s/.*\n/./
/.{3}/z;x;d
}
' file
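The hold-space counter can be read step by step in an annotated form. The following is a minimal sketch, assuming GNU sed (the `z` command and the `\r` escape are GNU extensions); the function name `keep_every_nth_cr_sed` and the parameter `n` are illustrative and not part of the answer above.

```shell
# keep_every_nth_cr_sed N [FILE...]   (hypothetical helper, GNU sed assumed)
# Of the lines ending in CR, print the 1st, (N+1)th, (2N+1)th, ...;
# lines without a trailing CR pass through untouched.
keep_every_nth_cr_sed() {
    n=$1
    shift
    sed -E '
    /\r$/{
        # Append a newline plus the hold space, which holds one dot
        # for each CR line already seen in the current group.
        G
        # An empty hold space leaves a trailing newline: this is the
        # first CR line of a group, so print it (up to the newline).
        /\n$/P
        # Swap the line and the newline for one dot: counter plus one.
        s/.*\n/./
        # Once the counter holds N dots, empty it to start a new group.
        /.{'"$n"'}/z
        # Store the counter in the hold space and discard the rest.
        x
        d
    }
    ' "$@"
}

# Usage: keep every 3rd CR line of the sample input.
printf '%s\r\n' 1 2 3 4 5 6 7 | keep_every_nth_cr_sed 3 | cat -v
```

Counting with dots in the hold space avoids arithmetic, which sed lacks; the cost is that the group size is baked into the regular expression.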
Using awk, we use the variable c as a circular counter that gets reset whenever it reaches 3.
awk '
!/\r$/ || !c++
c==3{c=0}
' file
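The same circular counter works for any group size. Below is a minimal sketch with the group size passed in as an awk variable; the function name `keep_every_nth_cr` and the variable `n` are assumptions made for illustration, not part of the answer above.

```shell
# keep_every_nth_cr N [FILE...]   (hypothetical helper, any POSIX awk)
# Of the lines ending in CR, print the 1st, (N+1)th, (2N+1)th, ...;
# lines without a trailing CR are always printed.
keep_every_nth_cr() {
    n=$1
    shift
    awk -v n="$n" '
        # Rule 1: print lines without a trailing CR. For CR lines the
        # right-hand side runs: print only while c is 0, then bump c.
        !/\r$/ || !c++
        # Rule 2: wrap the counter once N CR lines have been counted.
        c == n { c = 0 }
    ' "$@"
}

# Usage: keep every 3rd CR line of the sample input.
printf '%s\r\n' 1 2 3 4 5 6 7 | keep_every_nth_cr 3 | cat -v
```

Because `||` short-circuits, `c++` is evaluated only for CR lines, so ordinary lines never advance the counter.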
Assuming the carriage returns (\r), whenever they occur, occur at the end of a line feed (\n) delimited record.
Here's a fringe way to do it in awk. Note that it keys on each line's numeric value rather than on a running count, so it only suits input like the sample:
{m,g}awk '((+$_ % 3) % NF)~(!_<NF)' FS='\r$' # yes that's a tilde ~ not a minus -
1
2
3
4
5
6
7
8
9
10
1^M
4^M
7^M
10^M
1
2
3
4
5
6
7
8
9
10
Other ways of saying the same thing:
mawk 'NF-!_== (+$+_ % 3 ) % NF' FS='\r$'
gawk 'NF-!_== ( $(_++)%(_+_+_--)) % NF' FS='\r$'