how query by column number with awk
I would like to get time difference by column number 2 delimited by ,
for below command. It works fine by default for the first column.
table:
20220505,12:00:12,user,dept
20220505,12:00:25,user,dept
20220505,12:00:32,user,dept
command:
awk -F: '{t=$1*3600+$2*60+$3} NR>1{printf( "%.6fn", t-p)}{p=t}' inputfile
I tried: but failed to get correct result.
awk -F ',' $2 '{t=$1*3600+$2*60+$3} NR>1{printf( "%.6fn", t-p)}{p=t}' inputfile
You need to tell awk to use both ,
and :
as field separators. Note that this will change the numbering of the fields (because both field separators are applied to each input line). For example:
$ awk -F '[:,]' '{ t = $2*3600 + $3*60 + $4 }
NR > 1 { printf "%.6fn", t-p }
{ p=t }' inputfile
13.000000
7.000000
Note that this does not calculate time differences across different days. To do that, you’d need to parse the combined date and time into a time_t
value (i.e. seconds since the epoch, Midnight on Jan 1 1970). For example, with GNU awk, using the mktime()
function:
$ awk -F '[:,]' '{
# extract the year, month, and day from the first field
y = substr($1,1,4);
m = substr($1,5,2);
d = substr($1,7,2);
# create a space-separated datespec string, combining y,m,d with
# hours, minutes, and seconds from fields 2, 3, and 4
datespec = y " " m " " d " " $2 " " $3 " " $4;
# convert it to seconds since the epoch
t = mktime(datespec)
}
NR > 1 { printf "%.6fn", t-p }
{ p=t }' inputfile
13.000000
7.000000
Run man awk
and search for mktime
for a summary on how this function works.
Using Raku (formerly known as Perl_6)
raku -e 'my @a = lines.map: *.split(","); my @b; for ^@a.elems -> $i {
my $date = @a[$i][0].subst(/ ^ (d**4)(d**2)(d**2) $/, {"$0-$1-$2"});
@b.push: "{$date}T@a[$i][1]Z".DateTime.Instant };
loop (my $i = 0; $i < @b.elems-1; $i++) { put @b[$i+1] - @b[$i] };'
Raku handles dates and times using the ISO 8601
interchange standard, built-in. In the first statement, lines
are read in, each split
on ","
commas, and saved in @a
array. Array @b
is declared. Then lines are looped over, with elements of the first column (dates) being subst
such that a -
hyphen separated $date
is created, and this string is used to generate an ISO 8601 DateTime
like 2022-05-05T12:00:12Z
. This DateTime
is immediately converted to an Instant
object (in seconds) that is push
ed onto the @b
array. In the final statement, Instant
seconds are loop
ed over, and subtracted from each other.
Sample Input 1:
20220505,12:00:12,user,dept
20220505,12:00:25,user,dept
20220505,12:00:32,user,dept
Sample Output 1:
13
7
Below the advantage of using ISO 8601
dates and times: timestamps that cross from one date to the next are properly handled.
Sample Input 2:
20220505,23:59:45,user,dept
20220506,00:00:00,user,dept
20220506,00:00:15,user,dept
Sample Output 2:
15
15
https://en.wikipedia.org/wiki/ISO_8601
https://docs.raku.org/language/temporal#index-entry-Date_and_time_functions
https://raku.org