Compare multi column files with awk and print output
File1:
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0701_DS005 2022-06-03 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0701_DS005 2022-06-04 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0701_DS005 2022-06-05 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0701_DS005 2022-06-06 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0701_DS005 2022-06-07 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS006 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0701_DS006 2022-06-01 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0702_DS006 2022-06-02 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0702_DS006 2022-06-03 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0702_DS006 2022-06-04 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0702_DS006 2022-06-05 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0702_DS006 2022-06-06 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS006 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0703_DS006 2022-06-07 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0703_DS007 2022-06-01 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0703_DS007 2022-06-02 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0703_DS007 2022-06-03 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0703_DS007 2022-06-04 00:00:00
File2:
cfpur0701 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0701 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0702 hostgroups-1d1w 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0703 hostgroups-1d1w 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0801 hostgroups-1d1w 2022-06-04 00:00:00 2022-06-11 00:00:00
DESIRED OUTPUT:
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0701_DS005 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0701_DS005 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0701_DS005 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0701_DS005 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0701_DS005 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS005 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0701_DS005 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0701 Pgroup F30109 cf3010922 F30109_FPUR0701_PRD03E_DS006 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0701_DS006 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0702_DS006 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0702_DS006 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0702_DS006 2022-06-04 00:00:00 2022-06-11 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 286 F30109_THK_T0_CFPUR0702_DS006 2022-06-05 00:00:00 2022-06-12 00:00:00
cfpur0702 Pgroup F30109 cf3010922 F30109_FPUR0702_PRD03E_DS006 hostgroups-1d1w 287 F30109_THK_T0_CFPUR0702_DS006 2022-06-06 00:00:00 2022-06-13 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS006 hostgroups-1d1w 288 F30109_THK_T0_CFPUR0703_DS006 2022-06-07 00:00:00 2022-06-14 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 282 F30109_THK_T0_CFPUR0703_DS007 2022-06-01 00:00:00 2022-06-08 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 283 F30109_THK_T0_CFPUR0703_DS007 2022-06-02 00:00:00 2022-06-09 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 284 F30109_THK_T0_CFPUR0703_DS007 2022-06-03 00:00:00 2022-06-10 00:00:00
cfpur0703 Pgroup F30109 cf3010922 F30109_FPUR0703_PRD03E_DS007 hostgroups-1d1w 285 F30109_THK_T0_CFPUR0703_DS007 2022-06-04 00:00:00 2022-06-11 00:00:00
The Number of lines in file1 will always be more than file2. I need to do the following:
Compare column 1 in both file1 & file2 and if they match then Compare column 6 in file1 with column 2 in file2 and finally compare column 9 in file1 with column3 in file 2. When all 3 conditions are met, grab the time stamp ( column 5 & 6 ) from file2 and append it to file1 resulting in the output file.
I tried couple of versions of awk with NR==FNR
and haven’t made much progress.
awk 'FNR==NR{map[$1,$2,$3,$4] = $5 FS $6; next}
($1,$6,$9,$10) in map {print $0,map[$1,$6,$9,$10]}' file2 file1
I used both date and time fields into the mapping, as I think you need it (match with both last two columns of file1)
At first file pass (FNR==NR
) we store into the usual array the fields to append (last 2) and the first 4 are used as the hash. next
is necessary to avoid running rest of the code for any line of first file.
For the second file, we append the array value and print the row, only when the selected fields form an existing array hash.