Using sed to find and replace complex string (preferably with regex)
I have a file with the following contents:
<username><![CDATA[name]]></username>
<password><![CDATA[password]]></password>
<dbname><![CDATA[name]]></dbname>
and I need to make a script that changes the “name” in the first line to “something”, the “password” on the second line to “somethingelse”, and the “name” in the third line to “somethingdifferent”. I can’t rely on the order of these occurring in the file, so I can’t simply replace the first occurrence of “name” with “something” and the second occurrence of “name” with “somethingdifferent”. I actually need to do a search for the surrounding strings to make sure I’m finding and replacing the correct thing.
So far I have tried this command to find and replace the first “name” occurrence:
sed -i "s/<username><![CDATA[name]]></username>/something/g" file.xml
however it’s not working so I’m thinking some of these characters might need escaping, etc.
Ideally, I’d love to be able to use regex to just match the two “username” occurrences and replace only the “name”. Something like this, but with sed:
<username>.+?(name).+?</username>
and replace the contents in the brackets with “something”.
Is this possible?
sed -i -E "s/(<username>.+)name(.+<\/username>)/\1something\2/" file.xml
This is, I think, what you’re looking for.
Explanation:
- parentheses in the first part define groups (strings in fact) that can be reused in the second part
- \1, \2, etc. in the second part are references to the i-th group captured in the first part (the numbering starts with 1)
- -E enables extended regular expressions (needed for + and grouping)
- -i enables "in-place" file edit mode
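As a quick check of the approach above, here is a minimal sketch that applies the same kind of substitution to all three tags from the question. It prints to standard output rather than using -i, and the temporary file and the variable names (sample, out) are made up for illustration.

```shell
# Throwaway sample file holding the three lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<username><![CDATA[name]]></username>
<password><![CDATA[password]]></password>
<dbname><![CDATA[name]]></dbname>
EOF

# One substitution per tag. The "/" in each closing tag is escaped because
# "/" is also the delimiter of the s command.
out=$(sed -E \
  -e 's/(<username>.+)name(.+<\/username>)/\1something\2/' \
  -e 's/(<password>.+)password(.+<\/password>)/\1somethingelse\2/' \
  -e 's/(<dbname>.+)name(.+<\/dbname>)/\1somethingdifferent\2/' \
  "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

Because each expression names its own tag, the result does not depend on the order of the lines in the file.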
To replace the word “name” with the word “something”, use:
sed 's/\(<username><!\[[A-Z]*\[\)name]/\1something]/g' file.xml
This replaces every occurrence of the specified word that sits inside a <username> CDATA section.
So far, everything is written to standard output. To save the changes to another file, use:
sed 's/\(<username><!\[[A-Z]*\[\)name]/\1something]/g' file.xml > anotherfile.xml
sed -e '/username/s/CDATA\[name]/CDATA[something]/' \
    -e '/password/s/CDATA\[password]/CDATA[somethingelse]/' \
    -e '/dbname/s/CDATA\[name]/CDATA[somethingdifferent]/' file.txt
The /username/ before the s tells sed to only work on lines containing the string ‘username’.
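To illustrate why an address helps when the order of the lines is unknown, here is a small sketch that feeds the tags in a different order than in the question. The variable names (input, out) are only for illustration, and the addresses include the angle brackets to make the match a little stricter, which is an assumption on top of the command above.

```shell
# Two of the tags from the question, deliberately in a different order.
input='<dbname><![CDATA[name]]></dbname>
<username><![CDATA[name]]></username>'

# Each s command only runs on lines matched by the address in front of it.
out=$(printf '%s\n' "$input" | sed \
  -e '/<username>/s/CDATA\[name]/CDATA[something]/' \
  -e '/<dbname>/s/CDATA\[name]/CDATA[somethingdifferent]/')

printf '%s\n' "$out"
```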
You need to quote \[.*^$/ in the regular expression part of the s command and \&/ in the replacement part, plus newlines. The regular expression is a basic regular expression, and in addition you need to quote the delimiter for the s command.
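The quoting rules above can be automated. The following is a rough sketch, not a complete solution: escape_pattern and escape_replacement are hypothetical helper names, they assume "/" is the delimiter of the final s command, and they do not handle embedded newlines.

```shell
# Hypothetical helper: backslash-quote the characters \ [ . * ^ $ / so that a
# literal string can be used as a basic regular expression.
escape_pattern() {
  printf '%s\n' "$1" | sed 's,[[\.*^$/],\\&,g'
}

# Hypothetical helper: backslash-quote \ & / for the replacement part.
escape_replacement() {
  printf '%s\n' "$1" | sed 's,[\&/],\\&,g'
}

old='<![CDATA[name]]>'
new='<![CDATA[a/b & c]]>'

out=$(printf '%s\n' '<username><![CDATA[name]]></username>' |
  sed "s/$(escape_pattern "$old")/$(escape_replacement "$new")/")

printf '%s\n' "$out"
```

The helpers themselves use "," as the delimiter so that "/" can appear unescaped inside their bracket expressions.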
You can pick a different delimiter to avoid having to quote /. You’ll have to quote that character instead, but usually the point of changing the delimiter is to pick one that doesn’t occur in either the text to replace or the replacement text.
sed -e 's~<username><!\[CDATA\[name]]></username>~<username><![CDATA[something]]></username>~'
You can use groups to avoid repeating some parts in the replacement text, and accommodate variation on these parts.
sed -e 's~\(<username><!\[[A-Z]*\[\)name\(]]></username>\)~\1something\2~'
sed -e 's~\(<username>.*[^A-Za-z]\)name\([^A-Za-z].*</username>\)~\1something\2~'
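To see how the two group-based commands differ, here is a small sketch that runs both, written with backslash-escaped groups as basic regular expressions require, against a CDATA value and a plain space-padded value. The second input line and the variable names (input, strict, loose) are invented for the comparison.

```shell
# Line 1 is the CDATA form from the question; line 2 is a made-up variation.
input='<username><![CDATA[name]]></username>
<username> name </username>'

# Strict variant: only matches the value inside a CDATA-style wrapper.
strict=$(printf '%s\n' "$input" |
  sed -e 's~\(<username><!\[[A-Z]*\[\)name\(]]></username>\)~\1something\2~')

# Loose variant: matches "name" whenever it is surrounded by non-letters.
loose=$(printf '%s\n' "$input" |
  sed -e 's~\(<username>.*[^A-Za-z]\)name\([^A-Za-z].*</username>\)~\1something\2~')

printf 'strict:\n%s\nloose:\n%s\n' "$strict" "$loose"
```

The strict variant leaves the second line untouched, while the loose one rewrites both.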
If sed is not a hard requirement, better use a dedicated tool instead.
If your file is valid XML (not just those 3 XML-looking tags), then you can use XMLStarlet:
xml ed -P -O -L \
  -u '//username/text()' -v 'something' \
  -u '//password/text()' -v 'somethingelse' \
  -u '//dbname/text()' -v 'somethingdifferent' file.xml
The above will also work in situations which would be difficult to solve with regular expressions:
- Can replace the values of the tags without specifying their current values.
- Can replace the values even if they are just escaped and not enclosed in CDATA.
- Can replace the values even if the tags have attributes.
- Can easily replace just specific occurrences of tags, if there are multiple with the same name.
- Can format the modified XML by indenting it.
Brief demonstration of the above:
bash-4.2$ cat file.xml
<sith>
<master>
<username><![CDATA[name]]></username>
</master>
<apprentice>
<username><![CDATA[name]]></username>
<password>password</password>
<dbname foo="bar"><![CDATA[name]]></dbname>
</apprentice>
</sith>
bash-4.2$ xml ed -O -u '//apprentice/username/text()' -v 'something' -u '//password/text()' -v 'somethingelse' -u '//dbname/text()' -v 'somethingdifferent' file.xml
<sith>
<master>
<username><![CDATA[name]]></username>
</master>
<apprentice>
<username><![CDATA[something]]></username>
<password>somethingelse</password>
<dbname foo="bar"><![CDATA[somethingdifferent]]></dbname>
</apprentice>
</sith>
$ sed -e '1s/name/something/2' \
      -e '3s/name/somethingdifferent/2' \
      -e 's/password/somethingelse/2' sample.xml
You can simply use addresses: the number preceding “s” is the line number the command applies to.
Also, the number at the end tells sed to replace the second match on the line instead of the first.
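The occurrence number can be tried in isolation. In this sketch the first command shows the flag on a trivial string. The second swaps the line-number address for a pattern address, which is an adaptation (not part of the command above) for files where the line order is not fixed. Variable names are illustrative.

```shell
# The trailing 2 means "replace only the second match on the line".
demo=$(printf '%s\n' 'name name name' | sed 's/name/X/2')
printf '%s\n' "$demo"

# On the username line, the first "name" is inside the opening tag,
# the second one is the CDATA value.
line='<username><![CDATA[name]]></username>'
out=$(printf '%s\n' "$line" | sed -e '/<username>/s/name/something/2')
printf '%s\n' "$out"
```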
Usage: sed [OPTION]... {script-only-if-no-other-script} [input-file]...
-r, --regexp-extended
use extended regular expressions in the script.
So, to replace a value in a properties file:
sed -i -r 's/MAIL=(.+)/MAIL=user@mymail.com/' etc/service.properties
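For trying this out without touching a real file, here is a minimal sketch that works on a temporary properties file and prints the result instead of editing in place. The file contents, the HOST key and the ^ anchor are assumptions added for the example, and -E is used as the more widely supported spelling of -r.

```shell
# Throwaway properties file with made-up contents.
props=$(mktemp)
printf '%s\n' 'HOST=localhost' 'MAIL=old@example.com' > "$props"

# Replace whatever follows MAIL= at the start of a line.
out=$(sed -E 's/^MAIL=.+/MAIL=user@mymail.com/' "$props")

printf '%s\n' "$out"
rm -f "$props"
```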