Convert formatted dates to seconds since the epoch

I have a file:

pablo tty8 Thu Nov 1 12:51:21 2012 still logged in 
(unknown tty8 Thu Nov 1 12:50:57 2012 - Thu Nov 1 12:51:21 2012 (00:00) 
pablo tty2 Thu Nov 1 12:50:39 2012 still logged in 
pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) 
(unknown tty7 Thu Nov 1 12:34:32 2012 - Thu Nov 1 12:49:45 2012 (00:15)

I want to replace each date in the file with the corresponding number of seconds since the epoch. The output should be:

pablo tty8 1351770681 still logged in 
(unknown tty8 1351770657 - 1351770681 (00:00) 
pablo tty2 1351770639 still logged in 
pablo tty7 1351770585 - 1351770656 (00:01) 
(unknown tty7 1351769672 - 1351770585 (00:15)

I tried this command:

gawk --posix 'function my()
{"date -d \047"$0"\047 +%s" | getline b; 
gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b );print}
{ my() }' file

The above command does not work:

$ gawk --posix 'function my()
> {"date -d \047"$0"\047 +%s" | getline b; 
> gsub( /[A-Za-z]{3} [A-Za-z]{3} [0-9] ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/,b ); print}
> { my() }' ta
date: invalid date: `pablo tty8 Thu Nov 1 12:51:21 2012 still logged in '
pablo tty8  still logged in 
(unknown tty8 1351897200 - 1351897200 (00:00) 
date: invalid date: `pablo tty2 Thu Nov 1 12:50:39 2012 still logged in '
pablo tty2 1351897200 still logged in 
date: invalid date: `pablo tty7 Thu Nov 1 12:49:45 2012 - Thu Nov 1 12:50:56 2012 (00:01) '
pablo tty7 1351897200 - 1351897200 (00:01) 
(unknown tty7 1351897200 - 1351897200 (00:15)

How can I fix the above command?

Asked By: nowy1

||

Here’s an alternate approach (using mktime):

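#!/bin/awk -f
# Requires gawk: mktime() is a gawk extension and expects
# "YYYY MM DD HH MM SS" with a numeric month.
BEGIN {
    # Map English month abbreviations (Jan..Dec) to month numbers 1..12.
    n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
    for (i = 1; i <= n; i++) M[names[i]] = i
}
{
    # First date: $4=month name, $5=day, $6=HH:MM:SS, $7=year.
    split($6,A,":");
    S1=sprintf("%d %d %d %d %d %d",$7,M[$4],$5,A[1],A[2],A[3])
    T1=mktime(S1)
    if ($8=="-") {
        # Second date: $10=month name, $11=day, $12=HH:MM:SS, $13=year.
        split($12,A,":");
        S2=sprintf("%d %d %d %d %d %d",$13,M[$10],$11,A[1],A[2],A[3])
        T2=mktime(S2)
        print $1,$2,T1,$8,T2,$14
    }
    else {
        print $1,$2,T1,$8,$9,$10
    }
}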
Answered By: JRFerguson

You could do it like this with GNU sed:

convert_date.sed

: a
s/(([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4})(.*)/\n\4\n\1/
h
s/.*\n//
s/^/date -d "/
s/$/" +%s/e
G
s/([^\n]+)\n([^\n]+)\n([^\n]+)\n.*/\2\1\3/
/([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/ta

Run it like this:

sed -rf convert_date.sed infile

Output:

pablo tty8 1351770681 still logged in 
(unknown tty8 1351770657 - 1351770681 (00:00) 
pablo tty2 1351770639 still logged in 
pablo tty7 1351770585 - 1351770656 (00:01) 
(unknown tty7 1351769672 - 1351770585 (00:15)

Explanation

This may look a bit daunting at first, but the idea is not that complicated. The regular expression ([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}, which occurs in the first substitution and in the conditional at the end, matches the date format used in the input. The first substitution captures and isolates the date. The surrounding bits are stored in hold space while date -d is run on the captured date. Finally, all the bits are collected in pattern space and reorganized into the correct order.

The conditional at the end repeats the process if any dates remain in pattern space.
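
The capture, convert and splice steps described above can also be sketched in plain bash, which may make the flow easier to follow. This is only an illustration, not the sed script itself: it assumes bash and GNU date, and convert_line is a hypothetical helper name introduced here.

```shell
#!/usr/bin/env bash
# Sketch of the capture / convert / splice loop (assumes bash and GNU date).
# convert_line is a hypothetical helper, not part of the sed answer.
convert_line() {
    local line=$1 out="" match before
    # Same date pattern as in the sed script.
    local re='([A-Za-z]{3} ){2}[0-9]{1,2} ([0-9]{2}:){2}[0-9]{2} [0-9]{4}'
    while [[ $line =~ $re ]]; do
        match=${BASH_REMATCH[0]}             # the leftmost date in the line
        before=${line%%"$match"*}            # text before that date
        out+=$before$(date -d "$match" +%s)  # splice in the epoch value
        line=${line#*"$match"}               # keep scanning the remainder
    done
    printf '%s\n' "$out$line"
}
```

It could be applied to a file with something like: while IFS= read -r l; do convert_line "$l"; done < infile. Like the sed version, it starts one date process per match, so it is not fast on large inputs.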

Answered By: Thor

With perl and its Date::Manip module:

perl -MDate::Manip -pe '
  s/\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d+/
  UnixDate ParseDate("$&"),"%s"/ge'
Answered By: Stéphane Chazelas

To do it your way, that would have to be something like:

POSIXLY_CORRECT=1 awk '
  {
    n = ""; r = $0
    while (match(r, /[[:alpha:]]{3} [[:alpha:]]{3} +[0-9]+ ([0-9]{2}:){2}[0-9]{2} [0-9]{4}/)) {
      c = "date -d\"" substr(r,RSTART,RLENGTH) "\" +%s"
      c | getline b
      close(c)
      n = n substr(r,1,RSTART-1) b
      r =  substr(r,RSTART+RLENGTH)
    }
    print n r
  }'
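
The key difference from the command in the question is that date only ever sees the matched substring, never the whole line. A small shell sketch of that point, assuming GNU date and a grep that supports -o; first_date_epoch is a hypothetical helper name used only for illustration:

```shell
#!/usr/bin/env bash
# Assumes GNU date and grep -o; first_date_epoch is a hypothetical helper.
re='[[:alpha:]]{3} [[:alpha:]]{3} +[0-9]+ ([0-9]{2}:){2}[0-9]{2} [0-9]{4}'

first_date_epoch() {
    local d
    # Keep only the first substring that looks like a date.
    d=$(printf '%s\n' "$1" | grep -oE "$re" | head -n 1)
    # Convert it; return non-zero when the line contains no date.
    [ -n "$d" ] && date -d "$d" +%s
}

line='pablo tty8 Thu Nov 1 12:51:21 2012 still logged in'
# Passing the whole line fails, as in the question:
date -d "$line" +%s 2>/dev/null || echo "whole line rejected by date"
# Passing only the matched date works:
first_date_epoch "$line"
```

The printed value depends on the local time zone, just as with the awk version.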
Answered By: Stéphane Chazelas

The Perl solution provided by Stéphane requires a non-core Perl module. One could use the core module (since 5.10), Time::Piece, similarly:

#!/usr/bin/env perl
use strict;
use warnings;
use Time::Piece;
my $t = Time::Piece->new;
while (<>) {
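    # Note: Time::Piece->strptime treats the parsed time as UTC, so the
    # result can differ from `date -d`, which uses the local time zone.
    s{\w{3}\s(\w{3}\s\d{1,2}\s\d\d:\d\d:\d\d\s\d{4})}
        {$t=Time::Piece->strptime($1,"%b %d %H:%M:%S %Y");
        sprintf "%s",$t->epoch}ge;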
    print;
}
Answered By: JRFerguson