When does OFS change?

Except when set by the user or a user script, the value of OFS seems to always be a space, even when the input uses tabs:

$ printf 'one\ntwo\tthree\nfour' | awk '{NF=NF}1' | sed -n l
one$
two three$
four$

However, for a similar variable (ORS), it is sometimes said: "leave ORS alone so it retains whatever value it's supposed to have for your platform." I could imagine that on some platforms the default ORS could be \r\n, which might seem sensible there.

The question for OFS then is:

  • Does OFS change between platforms?
  • Is there some implementation where OFS is not a space?

Edit Comment: Sorry for any confusion that my question may have generated, I hope it is clear now.

Not able to up-vote (yet).

Asked By: QuartzCristal

||

As I (now) commented there, ORS always defaults to "\n", but on Windows the C implementation (which applies to many other programs in addition to awk) translates \n to and from CR LF. This happens for all \n characters, regardless of whether they are produced from ORS or matched to RS on input.
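If you want to check this on a particular system, a quick test along these lines should do. It is only a sketch, assuming a POSIX-style shell and some awk on the PATH; the variable name ors_check is just for illustration.

```shell
# Compare the default ORS against a single newline inside awk itself,
# so any later translation done by the platform's I/O layer does not matter.
ors_check=$(awk 'BEGIN { if (ORS == "\n") print "newline"; else print "other" }')
echo "$ors_check"
```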

Similarly, yes, OFS always defaults to (one) space. FS also defaults to one space, but is handled specially: when FS equals one space, either by default or by explicit setting, fields are actually split at any run of whitespace (including a tab, as in the case you posted). Any other single-character FS is treated as a literal character, and any multi-character value as a regexp.
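The three FS cases can be seen by counting fields. This is a rough illustration, assuming a POSIX-style shell and awk; the sample inputs and variable names are made up for the example.

```shell
# Default FS (one space): a run of spaces and tabs acts as one separator.
default_split=$(printf 'a \t  b\tc\n' | awk '{ print NF }')

# Single-character FS other than space: taken literally,
# so two adjacent commas enclose an empty field.
comma_split=$(printf 'a,,b,c\n' | awk -F, '{ print NF }')

# Multi-character FS: treated as a regular expression.
regex_split=$(printf 'a12b\n' | awk -F'[0-9]+' '{ print NF }')

echo "$default_split $comma_split $regex_split"
```

The field counts come out as 3, 4 and 2 respectively.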

Thus for the single-space or regexp cases, a record may be split at delimiters which vary from field to field and/or record to record, but if you rebuild $0 by assigning to either NF as you did or any field, the rebuilt line uses the fixed value of OFS between all fields (if more than one). Also, if you use print x,y,z with multiple expressions, they are separated by (fixed) OFS. And of course if you explicitly print or otherwise use a string expression containing (or consisting of) OFS, you get the value of OFS.
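To make the difference visible, here is a small sketch that sets OFS to "-" (an arbitrary choice for the demonstration) and compares a rebuilt record, an untouched record, and a print with commas. It assumes a POSIX-style shell and an awk that rebuilds $0 on assignment to NF, as the common implementations do.

```shell
# Assigning to NF (or to any field) rebuilds $0 with OFS between the fields.
rebuilt=$(printf 'one\ttwo   three\n' | awk -v OFS='-' '{ NF = NF; print }')

# Without a rebuild, $0 keeps its original separators and OFS is not used.
untouched=$(printf 'one\ttwo   three\n' | awk -v OFS='-' '{ print }')

# print with comma-separated expressions joins them with OFS.
joined=$(printf 'one\ttwo   three\n' | awk -v OFS='-' '{ print $3, $1 }')

printf '%s\n' "$rebuilt" "$untouched" "$joined"
```

The first line printed is one-two-three, the second is the input unchanged (tab and three spaces intact), and the third is three-one.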

In standard awk, RS can be set to any single character, or to the empty string to select 'paragraph' mode, where records are separated by one or more empty lines (consecutive newlines); in this mode, by default, fields are split at newline in addition to the normal separators. In GNU awk only, RS can be set to a multi-character regex, and the text that matched is available in RT. See the summary at the bottom of this page in the GNU doc.
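Paragraph mode is easy to try out. The following is a minimal sketch, assuming a POSIX-style shell and awk, with made-up sample input; it prints the record number and field count for each paragraph.

```shell
# RS = "" selects paragraph mode: blank lines separate records, and
# newline separates fields in addition to the usual blanks.
para=$(printf 'a b\nc\n\n\nd e\n' | awk 'BEGIN { RS = "" } { print NR ": " NF }')
echo "$para"
```

The first paragraph spans two lines but counts as one record with three fields; the second is one record with two fields.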

Answered By: dave_thompson_085