How can I optimize the transfer of files between two systems and also trim the file?

I have computer 1 logging voltage data to a file volts.json every second.

My second computer connects via ssh and grabs that file every 5 minutes. Splunk indexes that file for a dashboard.

Is scp efficient for this? If so, fine. Next, how do I manage the file and keep it small so that it does not grow past, say, 2 MB? Is there a command to roll off the earlier logs and keep the newest?

The JSON looks like this right now:

{
  "measuredatetime": "2022-06-27T18:00:10.915668",
  "voltage": 207.5,
  "current_A": 0.0,
  "power_W": 0.0,
  "energy_Wh": 2,
  "frequency_Hz": 60.0,
  "power_factor": 0.0,
  "alarm": 0
}
{
  "measuredatetime": "2022-06-27T18:00:11.991936",
  "voltage": 207.5,
  "current_A": 0.0,
  "power_W": 0.0,
  "energy_Wh": 2,
  "frequency_Hz": 59.9,
  "power_factor": 0.0,
  "alarm": 0
}
Asked By: Zippy

||
  • To keep directories synchronized over ssh, the typical tool is rsync.
  • To rotate log files and save space, logrotate is the dedicated tool.
  • To secure a simple unattended task over ssh, a forced command in .ssh/authorized_keys is good practice.

Example:

  • create an /etc/logrotate.d/volts file (modeled on the classic syslog settings)

  • create a key pair dedicated to this task with ssh-keygen; in this particular case you do not want a passphrase, because the job runs unattended; security comes from the authorized_keys restrictions

  • in .ssh/authorized_keys, set:

    command="rsync --server --sender -logDtpre.iLsf . /path/to/volts/" ssh-rsa AAAAB3NzaC1yc2E[...pubkey...] blabla
    
  • on the other side, in crontab, set the following (prefixed with a schedule, e.g. */5 * * * * for every 5 minutes)

    rsync -e "ssh -i /path/to/privatekey" -a otherhost:/path/to/volts/ /path/to/volts
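
As a rough illustration of the logrotate step above, a configuration along these lines might work. It is a sketch, not a tested setup: the file path is a placeholder built from the /path/to/volts/ directory used in this answer, and the 2M threshold and the number of kept rotations are assumptions taken from the question rather than recommendations.

```
# /etc/logrotate.d/volts (sketch; adjust the path to the real log file)
/path/to/volts/volts.json {
    # rotate once the file exceeds 2 MB (checked only when logrotate runs)
    size 2M
    # keep 4 old copies, delete anything older
    rotate 4
    # copy then truncate in place, so the logger can keep its file handle open
    copytruncate
    # do not complain if the file is absent, and skip empty files
    missingok
    notifempty
    # gzip old copies, but leave the most recent rotation uncompressed
    compress
    delaycompress
}
```

Note that logrotate only checks the size when it is invoked, which on many systems is once a day from cron, so the file can exceed the threshold between runs unless logrotate is scheduled more often. copytruncate may lose the few lines written between the copy and the truncate. Since the rotated copies live in the same directory, rsync would transfer them as well.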
    

On computer 1, you could also replace the log file with a named pipe and write a daemon script that consumes the stream and writes it safely to a file (e.g. using a semaphore to manage concurrent I/O), which gives you good control over data integrity.
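
A minimal sketch of that named-pipe idea, under several assumptions: it uses flock(1) from util-linux as the lock in place of a semaphore, it assumes each record ends with a line containing only "}" (as in the sample in the question), and the function names and all paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names and paths are illustrative, not from the original answer.

# append_records OUTFILE LOCKFILE
# Reads lines from stdin, buffers them until a line that is exactly "}",
# then appends the complete record to OUTFILE while holding an exclusive lock.
append_records() {
    out=$1 lock=$2 record=
    while IFS= read -r line; do
        record="${record}${line}
"
        if [ "$line" = "}" ]; then
            (
                flock 9                          # block until we own the lock on fd 9
                printf '%s' "$record" >> "$out"  # write the whole record at once
            ) 9>"$lock"
            record=
        fi
    done
}

# consume_fifo FIFO OUTFILE LOCKFILE
# Creates the named pipe if needed and drains it until the writer closes it.
consume_fifo() {
    [ -p "$1" ] || mkfifo "$1"
    append_records "$2" "$3" < "$1"
}
```

A daemon could then run `while true; do consume_fifo /path/to/volts/volts.fifo /path/to/volts/volts.json /path/to/volts/volts.lock; done`, with the logger writing to the pipe instead of the file. The lock only helps if every other process that touches the file (a trimming job, for example) takes the same lock, e.g. by wrapping its command in `flock /path/to/volts/volts.lock`.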

Answered By: Thibault LE PAUL