How can I use sed to replace a multi-line string?

I’ve noticed that, if I add \n to a pattern for substituting using sed, it does not match. Example:

$ cat > alpha.txt
This is
a test
Please do not
be alarmed

$ sed -i'.original' 's/a test\nPlease do not/not a test\nBe/' alpha.txt

$ diff alpha.txt{,.original}

$ # No differences printed out

How can I get this to work?

Asked By: Belmin Fernandez

||

sed has three commands to manage multi-line operations: N, D and P (compare them to normal n, d and p).

In this case, you can match the first line of your pattern, use N to append the second line to pattern space and then use s to do your substitution.

Something like:

/a test$/{
  N
  s/a test\nPlease do not/not a test\nBe/
}
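
As a rough end-to-end check of the script above, the sketch below runs it against a scratch copy of the sample input. It assumes GNU sed (the \n in the replacement is a GNU extension); the temporary file and the result variable are only there for illustration.

```shell
# Build a scratch copy of the sample input (hypothetical temp file).
tmp=$(mktemp)
printf '%s\n' 'This is' 'a test' 'Please do not' 'be alarmed' > "$tmp"

# On a line ending in "a test", append the next line with N, then
# substitute across the embedded newline.
result=$(sed '/a test$/{
  N
  s/a test\nPlease do not/not a test\nBe/
}' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

The script prints rather than edits in place, so the substitution can be inspected before adding -i.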
Answered By: andcoz

Use perl instead of sed:

$ perl -0777 -i.original -pe 's/a test\nPlease do not/not a test\nBe/igs' alpha.txt
$ diff alpha.txt{,.original}
2,3c2,3
< not a test
< Be
---
> a test
> Please do not

-pi -e is your standard “replace in place” command-line sequence, and -0777 causes perl to slurp files whole. See perldoc perlrun to find out more about it.

Answered By: codehead

In the simplest invocation of sed, the pattern space holds one line of text, i.e. one line of \n-delimited text from the input. That single line in the pattern space contains no \n, which is why your regex is not finding anything.
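
The point about the pattern space can be seen directly with a small sketch (the variable names are just for illustration): a regex containing \n matches nothing until N has pulled a second line in.

```shell
# Without N: each cycle sees one line with its newline stripped,
# so /\n/ can never match and nothing is printed.
without_n=$(printf 'a test\nPlease do not\n' | sed -n '/\n/p')

# With N: the next line is appended, so the pattern space now holds
# "a test<newline>Please do not" and /\n/ matches.
with_n=$(printf 'a test\nPlease do not\n' | sed -n 'N;/\n/p')

printf 'without N: [%s]\n' "$without_n"
printf 'with N:    [%s]\n' "$with_n"
```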

You can read multiple lines into the pattern space and manipulate them surprisingly well, but it takes more effort than usual. Sed has a set of commands which allow this type of thing… Here is a link to a Command Summary for sed. It is the best one I’ve found, and got me rolling.

However forget the “one-liner” idea once you start using sed’s micro-commands. It is useful to lay it out like a structured program until you get the feel of it… It is surprisingly simple, and equally unusual. You could think of it as the “assembler language” of text editing.

Summary: Use sed for simple things, and maybe a bit more, but in general, when it gets beyond working with a single line, most people prefer something else…
I’ll let someone else suggest something else. I’m really not sure what the best choice would be (I’d use sed, but that’s because I don’t know perl well enough.)


sed '/^a test$/{
       $!{ N        # append the next line when not on the last line
         s/^a test\nPlease do not$/not a test\nBe/
                    # now test for a successful substitution, otherwise
                    #+  unpaired "a test" lines would be mis-handled
         t sub-yes  # branch_on_substitute (goto label :sub-yes)
         :sub-not   # a label (not essential; here to self document)
                    # if no substitution, print only the first line
         P          # pattern_first_line_print
         D          # pattern_ltrunc(line+nl)_top/cycle
         :sub-yes   # a label (the goto target of the 't' branch)
                    # fall through to final auto-pattern_print (2 lines)
       }    
     }' alpha.txt  

Here is the same script, condensed into what is obviously harder to read and work with, but which some would dubiously call a one-liner:

sed '/^a test$/{$!{N;s/^a test\nPlease do not$/not a test\nBe/;ty;P;D;:y}}' alpha.txt
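
To illustrate why the P and D steps matter, here is a sketch that compares a plain N version with the P/D version on a hypothetical input where an unpaired "a test" line sits directly before the real match. It assumes GNU sed; the variable names and the label "done" are made up for the example.

```shell
# Hypothetical input: an unpaired "a test" right before the real two-line match.
input=$(printf 'a test\na test\nPlease do not\nend')

# Plain N: the first "a test" swallows the second one, so the real
# match is never tried and the input comes out unchanged.
naive=$(printf '%s\n' "$input" | sed '/^a test$/{
  N
  s/^a test\nPlease do not$/not a test\nBe/
}')

# P/D version: if the substitution fails, print and drop only the
# first line, then retry with the remaining line.
careful=$(printf '%s\n' "$input" | sed '/^a test$/{
  $!{
    N
    s/^a test\nPlease do not$/not a test\nBe/
    t done
    P
    D
    :done
  }
}')

printf 'naive:\n%s\n\ncareful:\n%s\n' "$naive" "$careful"
```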

Here is my command “cheat-sheet”

:  # label
=  # line_number
a  # append_text_to_stdout_after_flush
b  # branch_unconditional             
c  # range_change                     
d  # pattern_delete_top/cycle          
D  # pattern_ltrunc(line+nl)_top/cycle 
g  # pattern=hold                      
G  # pattern+=nl+hold                  
h  # hold=pattern                      
H  # hold+=nl+pattern                  
i  # insert_text_to_stdout_now         
l  # pattern_list                       
n  # pattern_flush=nextline_continue   
N  # pattern+=nl+nextline              
p  # pattern_print                     
P  # pattern_first_line_print          
q  # flush_quit                        
r  # append_file_to_stdout_after_flush 
s  # substitute                                          
t  # branch_on_substitute              
w  # append_pattern_to_file_now         
x  # swap_pattern_and_hold             
y  # transform_chars                   
Answered By: Peter.O

You can but it’s difficult. I recommend switching to a different tool. If there’s a regular expression that never matches any part of the text you want to replace, you can use it as an awk record separator in GNU awk.

awk -v RS='a' '{gsub(/hello/, "world"); print}'

If there are never two consecutive newlines in your search string, you can use awk’s “paragraph mode” (one or more blank lines separate records).

awk -v RS='' '{gsub(/hello/, "world"); print}'

An easy solution is to use Perl and load the file fully into memory.

perl -0777 -pe 's/hello/world/g'
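
A small sketch of the paragraph-mode variant applied to the question’s text, assuming an awk that accepts escape sequences in -v assignments (gawk and mawk do). ORS is set to a blank line here so paragraphs stay separated on output; that setting and the sample input are assumptions of this example.

```shell
# RS='' puts awk in paragraph mode: each record is a block of lines,
# so the regex can span a newline inside that block.
result=$(printf 'This is\na test\nPlease do not\nbe alarmed\n\nsecond paragraph\n' |
  awk -v RS='' -v ORS='\n\n' \
    '{ gsub(/a test\nPlease do not/, "not a test\nBe"); print }')

printf '%s\n' "$result"
```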

I think it’s better to replace the \n symbol with some other symbol, and then work as usual:

e.g. this command does not work:

cat alpha.txt | sed -e 's/a test\nPlease do not/not a test\nBe/'

but it can be changed to:

cat alpha.txt | tr '\n' '\r' | sed -e 's/a test\rPlease do not/not a test\rBe/'  | tr '\r' '\n'

In case anybody doesn’t know, \n is the UNIX line ending, \r\n is the Windows one, and \r is the classic Mac OS one. Normal UNIX text doesn’t use the \r symbol, so it’s safe to use it for this case.

You can also use some exotic symbol to temporarily replace \n, for example \f (the form feed symbol). You can find more symbols here.

cat alpha.txt | tr '\n' '\f' | sed -e 's/a test\fPlease do not/not a test\fBe/'  | tr '\f' '\n'
Answered By: xara

All things considered, gobbling the entire file may be the fastest way to go.

Basic syntax is as follows:

sed -e '1h;2,$H;$!d;g' -e 's/__YOUR_REGEX_GOES_HERE__...'

Mind you, gobbling the entire file may not be an option if the file is tremendously large. For such cases, other answers provided here offer customized solutions that are guaranteed to work on a small memory footprint.

For all other hack and slash situations, merely prepending -e '1h;2,$H;$!d;g' followed by your original sed regex argument pretty much gets the job done.

e.g.

$ echo -e "Dog\nFox\nCat\nSnake\n" | sed -e '1h;2,$H;$!d;g' -re 's/([^\n]*)\n([^\n]*)\n/Quick \2\nLazy \1\n/g'
Quick Fox
Lazy Dog
Quick Snake
Lazy Cat

What does -e '1h;2,$H;$!d;g' do?

The 1, 2,$, $! parts are line specifiers that limit which lines the directly following command runs on.

  • 1: First line only
  • 2,$: All lines starting from the second
  • $!: Every line other than last

So expanded, this is what happens on each line of an N line input.

  1: h, d
  2: H, d
  3: H, d
  .
  .
N-2: H, d
N-1: H, d
  N: H, g

The g command is not given a line specifier, but the preceding d command has a special clause “Start next cycle.“, and this prevents g from running on all lines except the last.

As for the meaning of each command:

  • The first h followed by Hs on each line copies said lines of input into sed’s hold space. (Think arbitrary text buffer.)
  • Afterwards, d discards each line to prevent these lines from being written to the output. The hold space, however, is preserved.
  • Finally, on the very last line, g restores the accumulation of every line from the hold space so that sed is able to run its regex on the whole input (rather than in a line-at-a-time fashion), and hence is able to match on \ns.
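
Applied to the text from the question, the hold-space prefix might be used as in the sketch below. It assumes GNU sed for the \n in the replacement; the result variable is only for illustration.

```shell
# 1h;2,$H;$!d;g collects every line in the hold space and, on the
# last line, copies the whole input back into the pattern space.
# The second -e then sees all lines at once, newlines included.
result=$(printf 'This is\na test\nPlease do not\nbe alarmed\n' |
  sed -e '1h;2,$H;$!d;g' -e 's/a test\nPlease do not/not a test\nBe/')

printf '%s\n' "$result"
```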
Answered By: antak
sed -e'$!N;s/^\(a test\n\)Please do not be$/not \1Be/;P;D' <in >out

Just widen your window on input a little bit.

It’s pretty easy. Besides the standard substitution; you need only $!N, P, and D here.
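
Here is a sketch of the same two-line sliding window run on the question’s sample text, with the pattern adjusted so that "Please do not" ends the second line. It assumes GNU sed; the adjusted pattern and the result variable are this example’s own choices.

```shell
# $!N keeps two lines in the pattern space; P prints the first of
# them and D drops it, then restarts the cycle with the second line
# still in place, so every adjacent pair of lines gets tested.
result=$(printf 'This is\na test\nPlease do not\nbe alarmed\n' |
  sed -e '$!N;s/^a test\nPlease do not$/not a test\nBe/;P;D')

printf '%s\n' "$result"
```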

Answered By: mikeserv

This is a small modification of xara’s clever answer to make it work on OS X (I’m using 10.10):

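cat alpha.txt | tr '\n' '\r' | sed -e "s/a test$(printf '\r')Please do not/not a test$(printf '\r')Be/"  | tr '\r' '\n'

Instead of explicitly using \r, you have to use $(printf '\r'). Note that the sed expression must be in double quotes so that the shell expands the command substitution.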
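
One way to keep this readable is to store the carriage return in a variable first. The sketch below assumes that approach; the names cr and result are made up for the example.

```shell
# Hold a literal carriage return in a variable, so the sed expression
# does not rely on sed itself understanding \r.
cr=$(printf '\r')

result=$(printf 'This is\na test\nPlease do not\nbe alarmed\n' |
  tr '\n' '\r' |
  sed -e "s/a test${cr}Please do not/not a test${cr}Be/" |
  tr '\r' '\n')

printf '%s\n' "$result"
```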

Answered By: abeboparebop
sed -i'.original' '/a test/,/Please do not/c not a test \nBe' alpha.txt

Here /a test/,/Please do not/ is treated as a block of (multi-line) text, and c is the change command, followed by the new text not a test \nBe.

If the text to be replaced is very long, I would suggest ex syntax.

Answered By: gibies

I wanted to add a few lines of HTML to a file using sed (and ended up here). Normally I’d just use perl, but I was on a box that had sed, bash and not much else. I found that if I changed the string to a single line and let bash/sed interpolate the \t and \n, everything worked out:

HTML_FILE='a.html' #contains an anchor in the form <a name="nchor" />
BASH_STRING_A='apples'
BASH_STRING_B='bananas'
INSERT="\t<li>$BASH_STRING_A<\/li>\n\t<li>$BASH_STRING_B<\/li>\n<a name=\"nchor\"\/>"
sed -i "s/<a name=\"nchor\"\/>/$INSERT/" $HTML_FILE

It would be cleaner to have a function to escape the double quotes and forward-slashes, but sometimes abstraction is the thief of time.

Answered By: Alexx Roche

I think this is the sed solution for 2 lines matching.

sed -n '$!N;s@a test\nPlease do not@not a test\nBe@;P;D' alpha.txt

If you want 3 lines matching then …

sed -n '1{$!N};$!N;s@aaa\nbbb\nccc@xxx\nyyy\nzzz@;P;D'

If you want 4 lines matching then …

sed -n '1{$!N;$!N};$!N;s@ ... @ ... @;P;D'
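
The three-line window can be tried on a small made-up input, as in this sketch. It assumes GNU sed; the sample lines 000 and 111 and the result variable are only for illustration.

```shell
# 1{$!N} primes the window with one extra line on the first cycle;
# after that each cycle tops it up to three lines, tries the
# substitution, then prints and drops the first line.
result=$(printf '000\naaa\nbbb\nccc\n111\n' |
  sed -n '1{$!N};$!N;s@aaa\nbbb\nccc@xxx\nyyy\nzzz@;P;D')

printf '%s\n' "$result"
```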

If the replacement in the “s” command shrinks the number of lines, it gets a bit more complicated, like this:

# aaa\nbbb\nccc shrink to one line "xxx"

sed -n '1{$!N};$!N;/aaa\nbbb\nccc/{s@@xxx@;$!N;$!N};P;D'

If the replacement grows the number of lines, it also gets a bit more complicated, like this:

# aaa\nbbb\nccc grow to five lines vvv\nwww\nxxx\nyyy\nzzz

sed -n '1{$!N};$!N;/aaa\nbbb\nccc/{s@@vvv\nwww\nxxx\nyyy\nzzz@;P;s/.*\n//M;P;s/.*\n//M};P;D'

This second method is a simple copy & paste verbatim substitution for the usual small text files (it needs a shell script file):

#!/bin/bash

# copy & paste content that you want to substitute

AA=$( cat <<'EOF' | sed -z -e 's#\([][^$*.\#]\)#\\\1#g' -e 's#\n#\\n#g'
a test
Please do not
EOF
)

BB=$( cat <<'EOF' | sed -z -e 's#\([&\#]\)#\\\1#g' -e 's#\n#\\n#g'
not a test
Be
EOF
)

sed -z -i 's#'"${AA}"'#'"${BB}"'#g' *.txt   # apply to all *.txt files
Answered By: mug896

Apart from Perl, a general and handy approach for multiline editing for streams (and files too) is:

First create some new UNIQUE line separator as you like, for instance

$ S=__ABC__                     # simple
$ S=__${RANDOM}${RANDOM}${RANDOM}__   # better
$ S=$(openssl rand -hex 16)     # ultimate

Then in your sed command (or any other tool) you replace \n by ${S}, like

$ cat file.txt | awk 1 ORS=$S |  sed -e "s/a test${S}Please do not/not a test\nBe/" | awk 1 RS=$S > file_new.txt

( awk replaces ASCII line separator with yours and vice versa. )
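
A self-contained sketch of the round trip with the simple separator, assuming GNU sed and an awk that accepts a multi-character RS (gawk and mawk do; strict POSIX awk uses only the first character). The result variable is only for illustration.

```shell
S=__ABC__   # must not occur anywhere in the input

# First awk joins the lines with $S, sed edits the single joined
# line, and the second awk splits it back into real lines.
result=$(printf 'This is\na test\nPlease do not\nbe alarmed\n' |
  awk 1 ORS="$S" |
  sed -e "s/a test${S}Please do not/not a test\nBe/" |
  awk 1 RS="$S")

printf '%s\n' "$result"
```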

Answered By: guest

GNU sed has a -z option that allows you to use the syntax the OP attempted to apply. (man page)

Example:

$ cat alpha.txt
This is
a test
Please do not
be alarmed
$ sed -z 's/a test\nPlease do not\nbe/not a test\nBe/' -i alpha.txt
$ cat alpha.txt
This is
not a test
Be alarmed

Be aware: If you use ^ and $ they now match the beginning and end of lines delimited with a NUL character (not \n). And, to ensure matches on all your (\n-separated) lines are substituted, don’t forget to use the g flag for global substitutions (e.g. s/.../.../g).
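
The effect of the g flag under -z can be seen in a short sketch. It assumes GNU sed; the two-match sample input and the variable names are made up for the example.

```shell
# Hypothetical input with two occurrences of the two-line pattern.
sample='a test\nPlease do not\nx\na test\nPlease do not\n'

# With -z the whole input is one record, so without g only the
# first occurrence is replaced.
first_only=$(printf "$sample" | sed -z 's/a test\nPlease do not/not a test\nBe/')
all=$(printf "$sample" | sed -z 's/a test\nPlease do not/not a test\nBe/g')

printf 'without g:\n%s\n\nwith g:\n%s\n' "$first_only" "$all"
```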


Credits: @stéphane-chazelas first mentioned -z in a comment above.

Answered By: Peterino

Sed breaks input on newlines. It keeps only one line per loop.
Therefore there is no way to match a \n (newline) if the pattern space doesn’t contain it.

There is a way, though, you can make sed keep two consecutive lines in the pattern space by using the loop:

sed 'N;l;P;D' alpha.txt

Add any processing needed between the N and the P (replacing the l).

In this case (2 lines):

$ sed 'N;s/a test\nPlease do not/not a test\nBe/;P;D' alpha.txt
This is
not a test
Be
be alarmed

Or, for three lines:

$ sed -n '1{$!N};$!N;s@a test\nPlease do not\nbe@not a test\nDo\nBe@;P;D' alpha.txt
This is
not a test
Do
Be alarmed

That’s assuming the same number of lines gets replaced.

Answered By: user232326

While ripgrep specifically doesn’t support inline replacement, I’ve found that its current --replace functionality is already useful for this use case and is preferable to using sed, e.g.:

rg --replace $'not a test\nBe' --passthru --no-line-number \
--multiline 'a test\nPlease do not' alpha.txt > output.txt

Explanation:

  • --replace 'string' enables replacement mode and sets the replacement string. Can include captured regex groups by using $1 etc.
  • $'string' is a Bash expansion so that \n becomes a newline for a multiline string.
  • --passthru is needed since ripgrep usually only shows the lines matching the regex pattern. With this option it also shows all lines from the file that don’t match.
  • --no-line-number / -N is because by default ripgrep includes the line numbers in the output (useful when only the matching lines are shown).
  • --multiline / -U enables multiline processing which is disabled by default.
  • > output.txt, with the --passthru and --no-line-number options the standard output matches the desired new file with replacements and can be saved as usual.
  • --multiline-dotall can optionally be added if you want the dot (‘.’) regex pattern to match newlines (\n).

However, this command isn’t as useful for processing multiple files, as it needs to be run separately per file.

Answered By: Silveri

Since this already explains most sed operations, I’ll add how you can search within a block.

Suppose you want to change x within padding but not offset:

{
  padding: {
    x: 2,
    y: 0
  },
  offset: {
    x: 0,
    y: 1
  }
}

You first select the block from padding: { to }

sed -r '/padding: {/,/}/ {
    # and inside the block you replace the value of x:
    s/^( +x:).*/\1 1,/
}'

This also works to answer the question though it is not as elegant as the JSON sample:

echo -e 'This is\na test\nPlease do not\nbe alarmed' | sed -r '
  /a test/,/Please do not/ {
    s/a test/not a test/
    s/Please do not/Be/
  }'
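
For the padding example further up, a self-contained sketch might look like the following. It uses -E instead of -r and escapes the braces in the range addresses, which is a choice made for this example rather than something the answer requires; the config and result variables are hypothetical.

```shell
config='{
  padding: {
    x: 2,
    y: 0
  },
  offset: {
    x: 0,
    y: 1
  }
}'

# The range runs from the "padding: {" line to the next line that
# contains "}", so the x under offset is left alone.
result=$(printf '%s\n' "$config" | sed -E '/padding: \{/,/\}/ {
    s/^( +x:).*/\1 1,/
}')

printf '%s\n' "$result"
```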
Answered By: laktak

Expanding on Peter.O’s brilliant accepted answer, if you are like me and need a solution to replace more than 2 lines in 1 go, try this:

#!/bin/bash

pattern_1="<Directory \"\/var\/www\/cgi-bin\">"
pattern_2="[ ]*AllowOverride None\n"
pattern_3="[ ]*Options +ExecCGI\n"
pattern_4="[ ]*AddHandler cgi-script .cgi .pl\n"
pattern_5="[ ]*Require all granted\n"
pattern_6="<\/Directory>"

complete_pattern="$pattern_1\n$pattern_2$pattern_3$pattern_4$pattern_5$pattern_6"

replacement_1="#<Directory \"\/var\/www\/cgi-bin\">\n"
replacement_2="    #AllowOverride None\n"
replacement_3="    #Options +ExecCGI\n"
replacement_4="    #AddHandler cgi-script .cgi .pl\n"
replacement_5="    #Require all granted\n"
replacement_6="#<\/Directory>"

complete_replacement="$replacement_1$replacement_2$replacement_3$replacement_4$replacement_5$replacement_6"

filename="test.txt"

echo ""
echo "SEDding"
sed -i "/$pattern_1/{
    N;N;N;N;N
    s/$complete_pattern/$complete_replacement/
}" $filename

Let your input file be:

#
#This is some test comments
#    Skip this
#

<Directory "/var/www/cgi-bin">
    AllowOverride None
    Options +ExecCGI
    AddHandler cgi-script .cgi .pl
    Require all granted
</Directory>

After running the sed script, the file will be replaced in-place to:

#
#This is some test comments
#    Skip this
#

#<Directory "/var/www/cgi-bin">
    #AllowOverride None
    #Options +ExecCGI
    #AddHandler cgi-script .cgi .pl
    #Require all granted
#</Directory>

Explanation

  • pattern_1="<Directory \"\/var\/www\/cgi-bin\">" - Special characters must be escaped with a backslash \.

  • [ ]* - This will match zero or more spaces. Standard RegEx notation.

  • sed -i "/$pattern_1/{ - This will search the file, line by line, for pattern_1 [<Directory "/var/www/cgi-bin">]. Note that the search pattern MUST NOT CONTAIN A NEWLINE.

    If, and only if, sed finds $pattern_1 in the file, it will proceed with executing the sub-code within the curly braces {}. It will start from the line of the file that matches the pattern.

  • N;N;N;N;N - N tells sed to read the next line after the pattern and attach it to the current line. It is important to understand that sed is designed to process only 1 line at a time, so N will basically force sed to read 2 lines and consider them as a single line with a single newline \n between the two. The newline at the end of the second line will be ignored. By chaining 5 Ns, we are instructing sed to read 6 lines of the file starting with the pattern-matching line.

  • s/$complete_pattern/$complete_replacement/ - Replace $complete_pattern with $complete_replacement. Note the presence of newlines in the variables. Understanding this part will take some trial and error.
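
The same chaining idea on a shorter, three-line block might look like the sketch below. It assumes GNU sed and writes the pattern inline rather than through shell variables; the shortened input and the result variable are made up for the example.

```shell
# Once the opening line matches, N;N pulls in the next two lines so
# the substitution can see all three at once.
result=$(printf '%s\n' '# header' '<Directory "/var/www/cgi-bin">' \
                       '    Options +ExecCGI' '</Directory>' |
  sed '/<Directory "\/var\/www\/cgi-bin">/{
    N;N
    s/^<Directory "\/var\/www\/cgi-bin">\n[ ]*Options +ExecCGI\n<\/Directory>$/#<Directory "\/var\/www\/cgi-bin">\n    #Options +ExecCGI\n#<\/Directory>/
}')

printf '%s\n' "$result"
```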

Answered By: Adhitya Ganesan

Another option is sed ranges. I’m not sure it’s applicable to this specific question, but since I landed here and found sed ranges useful in my case, I see value in sharing the following answer here: Using sed to replace multiline

Answered By: MaMazav