sed: update 2 similar variables in a file but keep the upper and lowercase
I have two variables in a file, shown below, that I need to set to a new value entered by the user with: read -p "Enter CName Name : " CName
sid=C02SBX
SID=C02SBX
When I run the following GNU sed statement:
sed -i "s/sid=.*/sid=$CName/gI" dbca2.rsp
it updates the variables as follows:
sid=C03SBX
sid=C03SBX
Question: How can I make sure that the second sid before the = sign always remains uppercase, e.g. SID=C03SBX, while the first sid stays lowercase? Also, the value after the = sign should always be uppercase for both sid and SID (e.g. =C03SBX), no matter whether the user enters it in lowercase or uppercase.
Use grouping \(...\) and referencing (\1 for the first group, \2 for the second etc.):
sed -i "s/\(sid=\).*/\1${CName^^}/gI" dbca2.rsp
For turning the user input to uppercase, the shell is used: ${var^^}
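A minimal sketch of that command in action, assuming GNU sed (for -i and the I flag), bash 4 or later (for ${CName^^}) and an input value that contains no characters special to sed. The temporary file stands in for dbca2.rsp and the typed value is made up:

```bash
# Scratch file standing in for dbca2.rsp (hypothetical).
file=$(mktemp)
printf '%s\n' 'sid=C02SBX' 'SID=C02SBX' > "$file"

CName=c03sbx   # pretend the user typed the value in lowercase

# \(sid=\) captures the name exactly as it appears in the file (sid= or SID=),
# \1 puts it back unchanged, and ${CName^^} is the uppercased user input.
sed -i "s/\(sid=\).*/\1${CName^^}/gI" "$file"

result=$(cat "$file")
rm -f "$file"
printf '%s\n' "$result"
```

Because \1 recalls the text that was actually matched, each line keeps the case of its own name, and only the value after the = sign changes.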
As a rule, you should not embed data in the code arguments of language interpreters, whether they’re shells, sed, awk, perl, python, etc.
Doing so invariably introduces command injection vulnerabilities.
So things like perl -e "code $data", sed -e "code $data", sh -c "code $data", eval "code $data" (note the double quotes, inside which the $vars are expanded to the contents of the shell variables) should be removed from your vocabulary.
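As a harmless illustration of data being interpreted as code, here is a sketch in which the made-up input contains &, which sed treats as "the matched text" in a replacement. A value containing / could in the same way end the replacement early and append further sed flags or commands.

```bash
line='sid=C02SBX'
CName='X&Y'   # hypothetical user input containing sed's "matched text" operator

# The shell pastes the value into the sed code, so sed interprets the &
# instead of taking it literally.
result=$(printf '%s\n' "$line" | sed "s/\(sid=\).*/\1$CName/")
printf '%s\n' "$result"
```

The value written is not X&Y: the & is replaced by the whole matched line, which is not what the user typed.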
The data should be passed via channels intended for data, not code. sed (contrary to sh, awk and any modern or less modern programming language) doesn’t have such channels, so the only option there is to sanitise that data, but then again there have been better alternatives to sed since the 80s.
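If you do have to stay with sed, sanitising could look like the sketch below. It assumes a single-line value used in the replacement part of an s command with / as the delimiter; the variable names and the sample value are made up, and values containing newlines are not handled.

```bash
value='A/B&C\D'   # hypothetical single-line value with characters special to sed

# Put a backslash in front of every \, / and & so that sed takes them
# literally in the replacement part of s/regex/replacement/.
escaped=$(printf '%s\n' "$value" | sed 's,[\/&],\\&,g')

result=$(printf '%s\n' 'sid=C02SBX' | sed "s/\(sid=\).*/\1$escaped/")
printf '%s\n' "$result"
```

This kind of escaping is easy to get subtly wrong, since the set of characters to escape depends on where the value ends up (regex or replacement) and on the delimiter in use.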
Here, you can use:
perl -pi -s -e 's/\bsid=\K.*/\U$value/i' -- -value="$CName" -- "$file"
Note how the code argument is fixed: s/\bsid=\K.*/\U$value/i, and the data (the contents of the $CName shell variable) is passed as a separate non-code argument, one intended to pass values (by assigning to the $value perl variable).
Also note:
- -i (copied from perl) is a non-standard option in sed.
- same for the I flag to the sed s command.
- no need for g as the .* will match to the end of the line so there can only be one substitution per line. It could make sense in s/\bsid=\K\S*/\U$value/gi where we match sid= followed by any number of non-whitespace characters, where it would then change sid=foo SID=bar on a single line to sid=new-value SID=new-value.
- perl (contrary to sed which interprets the input as text encoded in the user’s locale) by default considers each byte as a character, so it will still work correctly if the file contains bytes not forming valid characters in the user’s locale.
- \U (from ex/vi) turns what follows to uppercase. Note that by default, it only works for ASCII letters. Also supported by some sed implementations but not all and not standard. Alternatively, you could have the shell convert the value to uppercase with typeset -u CName in ksh/zsh, or use $CName:u in tcsh/zsh or ${(U)CName} in zsh, or ${CName^^} in bash. Those would generally interpret the text in the locale’s encoding and change the case according to the locale’s rules. So for instance, i could be changed to I or İ depending on the locale, and that İ could be encoded as a 0xdd byte or the bytes 0xc4 0xb0 depending on whether the locale uses iso8859-9 or UTF-8 as the charmap for instance.
- note the \b for word boundary, so it doesn’t match sid=foo in setsid=foo for instance. Also supported by some sed implementations while some others have \< / \> (from ex/vi) or [[:<:]] / [[:>:]] for that instead.
- \K specifies the start of what is to be Kept in the match. With old versions of perl, you’d use s/(sid=).*/$1\U$value/i, that is capture the sid= part with parentheses and recall it in the replacement with $1. Or use a look-behind operator, s/(?<=sid=).*/\U$value/i, though that makes it look a bit too complex IMO.
- as an alternative to -s with -var=value, you can use environment variables to pass the data: VALUE=$CName perl -pi -e 's/\bsid=\K.*/\U$ENV{VALUE}/i' -- "$file". That would be preferable if the value is sensitive and should not be exposed in the output of ps -Af. That’s also the approach you’d use in awk (ENVIRON["VALUE"] there) as the other value passing channels there mangle data containing backslashes.
- to read an arbitrary string from the user, the syntax in bash is IFS= read -rp 'Prompt: ' var (IFS= read -r 'var?Prompt: ' in other Korn-like shells). But prompting the user is generally not the best approach as it makes your script unnecessarily harder to automate or reuse with the same parameters. It’s generally better to take input as command line arguments, such as your-script thenewCN where the argument is available in $1, or do proper standard command line parsing with the standard getopts builtin.