# Making single digit numbers between 1 and 9 into double digits, inside CSV file

I have a CSV file with thousands of lines like these

``````1664;4;5;35;37;43;5;6
1663;21;23;32;40;49;8;11
1662;16;17;34;35;44;5;10
1661;2;9;23;32;40;6;7
1660;23;25;30;44;47;9;12
1659;3;5;9;32;43;6;10
1658;4;6;10;13;34;3;5
1657;8;9;33;35;40;3;6
1656;15;20;31;44;48;1;3
1655;25;27;35;40;45;7;11
1654;7;32;33;34;38;6;9
1653;5;7;11;27;37;6;12
1652;7;31;33;35;36;7;10
1651;4;12;34;35;45;1;9
1650;5;8;29;35;48;5;6
1649;2;11;28;42;48;4;9
1648;2;11;12;19;38;4;8
``````

You can see that the numbers between 1 and 9 are single digit.

How can I use `sed` or something to covert these number into double digit, by preceding a zero to them, such as

``````01 02 03 04 05 06 07 08 09
``````

``````1 2 3 4 5 6 7 8 9
``````

Given your input sample, it looks like

``````perl -pi -e 's/bdb/0\$&/g' file.csv
``````

Should do it.

That prepends a `0` to all digits that are both preceded and followed by a word `b`oundary. Replace `d` with `[1-9]` if you want `0` to be left alone as the wording of your question suggests.

The same with the GNU implementation of `sed`

``````sed -i 's/b[[:digit:]]b/0&/g' file.csv
``````

With recent versions of `mlr` (miller), you can also do:

``````mlr --csv -N --fs ';' put '\$* = fmtifnum(\$*, "%02d")'
``````

To format all numeric fields as integers 0-padded to a length of 2.

That will leave numbers in non-numeric fields (like in `"foo 1 and 2";x-2`) untouched, but note that it will also reformat non-integer numbers (`1e0`, `10e-1`, `1.123`, `0x1` would all be changed to `01` for instance).

To only pad the fields that consist of only one decimal digit (quoted or not):

``````mlr --csv -N --fs ';' put 'for (k in \$*) {\$[k] = sub(string(\$[k]),"^d\$","0")}'
``````

You can use `sed` to pick out a "word" that’s a single digit, and prefix it with zero:

``````sed -E 's/<([0-9])>/01/g' {file}
``````

Overall:

• The `-E` flag ensures that `sed` will use an Extended Regular Expression (ERE) rather than a standard/ordinary Regular Expression.
• `s/ …pattern… / …replacement… /g` defines a substitution that applies an ERE match on `…pattern…` and corresponding replacement with `…replacement…`. Because of the `g` ("global") flag the match/replace is applied as many times as possible for a line.
• There is no line selection so implicitly it applies this rule to every line in the input.

In the ERE match pattern:

• `<` and `>` represent a word boundary
• `(` and `)` identify a grouped expression that can be reused later
• `[0-9]` is any single digit

In the ERE substitution:

• `0` is a literal `0` character
• `1` interpolates the text captured by the first grouped expression, i.e. an instance of a single digit

Output:

``````1664;04;05;35;37;43;05;06
1663;21;23;32;40;49;08;11
1662;16;17;34;35;44;05;10
1661;02;09;23;32;40;06;07
1660;23;25;30;44;47;09;12
1659;03;05;09;32;43;06;10
1658;04;06;10;13;34;03;05
1657;08;09;33;35;40;03;06
1656;15;20;31;44;48;01;03
1655;25;27;35;40;45;07;11
1654;07;32;33;34;38;06;09
1653;05;07;11;27;37;06;12
1652;07;31;33;35;36;07;10
1651;04;12;34;35;45;01;09
1650;05;08;29;35;48;05;06
1649;02;11;28;42;48;04;09
1648;02;11;12;19;38;04;08
``````

In this case it doesn’t matter whether the file really is CSV (fields separated by comma, "`,`"), or if it has fields separated by semicolon ("`;`"), or even fields separated by whitespace.

If your `sed` supports `-i` you can edit the file "in place", but note that in common with most other utilities the "in place" effect is actually implemented behind the scenes as "write to a temporary file, and then replace the original file with that temporary one":

``````sed -E 's/<([0-9])>/01/g' file >file.\$\$.tmp &&
mv -f file.\$\$.tmp file
rm -f file.\$\$.tmp
``````

Here is a solution using `awk`:

``````awk 'BEGIN{FS=OFS=";"} {for (i=1;i<=NF;i++) \$i=sprintf("%02d",\$i)} 1' file.csv
``````

This will first set the input and output field separator to `;`. It will then iterate over all fields and normalize the field width to at least two digits with left zero-padding by using `sprintf()` (which effectively won’t change the field content if it is already a two-digit number).

The seemingly stray `1` outside of the action blocks instructs `awk` to print the current line including all changes.

Note that `awk` will not by default modify the file, but only print to `stdout`, so you will either

• have to redirect output to a temporary file and rename, as in
``````awk ' ... ' file.csv > output.csv && mv output.csv file.csv
``````
• or use an `awk` version that supports the `-i inplace` extension:
``````awk -i inplace ' ... ' file.csv
``````

Also note that this particular case is tailored to your example input which only comprises single- or double-digit numbers. For the more general case of padding to `n` digits, you would use

``````awk -v n=3 'BEGIN{FS=OFS=";"} {for (i=1;i<=NF;i++) \$i=sprintf("%0*d",n,\$i)} 1' file.csv
``````

Using Miller with type inference (`-A`) and outputing integer values with zero padding and minimum field width of 2:

``````mlr -A --nidx --fs ';' --ofmt '%02d' cat file
``````

Note that based on your input sample I’ve assumed the data is actually index-numbered rather than CSV.

Using Raku (formerly known as Perl_6)

``````~\$ raku -MText::CSV -e 'my @a = csv(in => \$*IN, sep => ";");
@a = @a>>.map( {sprintf "%.2d", \$_ } );
csv(in => @a, out => \$*OUT, sep => ";");'  file
``````

The code above uses Raku’s `Text::CSV` module. In the first statement the CSV file is read-in, setting the `sep` parameter to accept `;` semicolons as column separators. In the second statement the `sprintf` directive `"%.2d"` sets each column to a minimum width of 2, zero-padded. In the output `;` semicolons are restored as column separators, otherwise the default will be used (comma).

The above answer holds for CSV files with every column being integer. So far unsigned integers are `NYI` (not-yet-implemented). It might be preferable to use the string form of `sprintf`, taking care to add `0` to the formatting code, which directs padding with zeros insted of blank spaces. Alternate second statement below:

`@a = @a>>.map( {sprintf "%02s", \$_ } );`

Sample Input:

``````1664;4;5;35;37;43;5;6
1663;21;23;32;40;49;8;11
1662;16;17;34;35;44;5;10
1661;2;9;23;32;40;6;7
1660;23;25;30;44;47;9;12
1659;3;5;9;32;43;6;10
1658;4;6;10;13;34;3;5
1657;8;9;33;35;40;3;6
1656;15;20;31;44;48;1;3
1655;25;27;35;40;45;7;11
1654;7;32;33;34;38;6;9
1653;5;7;11;27;37;6;12
1652;7;31;33;35;36;7;10
1651;4;12;34;35;45;1;9
1650;5;8;29;35;48;5;6
1649;2;11;28;42;48;4;9
1648;2;11;12;19;38;4;8
``````

Sample Output (either example above):

``````1664;04;05;35;37;43;05;06
1663;21;23;32;40;49;08;11
1662;16;17;34;35;44;05;10
1661;02;09;23;32;40;06;07
1660;23;25;30;44;47;09;12
1659;03;05;09;32;43;06;10
1658;04;06;10;13;34;03;05
1657;08;09;33;35;40;03;06
1656;15;20;31;44;48;01;03
1655;25;27;35;40;45;07;11
1654;07;32;33;34;38;06;09
1653;05;07;11;27;37;06;12
1652;07;31;33;35;36;07;10
1651;04;12;34;35;45;01;09
1650;05;08;29;35;48;05;06
1649;02;11;28;42;48;04;09
1648;02;11;12;19;38;04;08
``````

Input/Output: You can redirect output to a new file taking the excellent code from either @AdminBee or @roiama (or both). If you want to direct Input/Output from within-the-code itself, the `\$*IN` and `\$*OUT` dynamic values can be replaced by PATHs (see second URL below for details).

Super crude.
Super slow.
May be useful.

Using any text editor –
replace all occurrences of ",1," with ",01," etc

Single digits at the end of line need to eg
replace all occurrences of ",3[EOL identifier]" with ",03[EOL identifier]"