RE: gnubol: Record delimiter clause and parse order
Caveat: I'm not a Unix guy. I'm commenting only as a Cobol guy (IBM
Mainframe, HP 3000, TI 990, PC (Realia & Micro Focus, both under OS/2 and
Windows), System 34, System 36, System 38, Unisys, but no Unix). I'm a bit
obsessive/compulsive so I own personal copies of the 74 standard, the 85
standard and the proposed 2002 standard. Take all of these comments in that
light.
=====[begin]=====
I was toying with both ideas. How portable would using a default
delimiter, say, '\n', be? I mean, you conceptually expect records to be
on successive lines, anyhow. But it may not be portable. I can't
really find a definitive description of the structure of a sequential
file, besides Microfocus' specification for "line sequential"--which in
itself is not portable between Unix and Windows, for example. MF says a
"line sequential" file is a file which could have been created with a
text editor, with each record on a successive line. Now, if you create
the file in something like, oh, Notepad, the lines will be delimited by
"\r\n", whereas on Unix you will only have "\n". I don't know how much
of an issue, though--guess it depends on how many "\r\n" systems that
this will be used on.
My personal vote is for reading up until a "\n". I would guess that
this is pretty much implementation-dependent anyhow. The question would
remain, however--what to do with all that MF code that uses "line
sequential"--treat that the same, or what?
=====[end]=====
My suggestion:
(1) Handle fixed-length records in addition to delimited variable-length
records. Fixed-length handling permits writing/reading files without delimiters.
(2) Handle the \r\n & \n as the same. (My non-Unix background is showing
here, I'd call them CR-LF and LF). My recollection is that Micro Focus will
handle either.
In other words, my personal suggestion is to handle both. I think you need to
be able to handle either.
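
A minimal sketch of suggestion (2), written in Python purely for
illustration. The function name read_line_record is hypothetical and is not
taken from any GNU COBOL or Micro Focus code. It reads up to the next LF and
drops one CR immediately before it, so CR-LF and LF files yield the same
records:

```python
import io

def read_line_record(stream):
    """Return the next record without its terminator, or None at end of file."""
    line = stream.readline()          # reads up to and including the next LF
    if line == b"":
        return None                   # end of file
    if line.endswith(b"\n"):
        line = line[:-1]              # drop the LF
        if line.endswith(b"\r"):
            line = line[:-1]          # drop the CR of a CR-LF pair
    return line

# The same two records, once with LF and once with CR-LF terminators.
for raw in (b"ALPHA\nBETA\n", b"ALPHA\r\nBETA\r\n"):
    stream = io.BytesIO(raw)
    print(read_line_record(stream), read_line_record(stream))
```

Both passes of the loop print the same pair of records. A final record with
no terminator is returned as-is in this sketch; whether that is the right
behavior is an open choice.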
=====[begin]=====
> Now consider a "variable length" file. I use the "record length varying . . .
> depending on" clause to identify a Working Storage variable to contain the
> record length. On a read, this variable should be set by the file system,
> after the read, to contain the length of the record actually read.
This is good, for *after* a read. But how do you know how many bytes to
read *before* the read? There must be some delimiter to mark the end of
a record, otherwise there's no way to know when you've read in the
totality of the record, until you reach the max. number of bytes for the
record definition.
=====[end]=====
The "count" approach is not really suitable for a PC/Unix environment. Such
environments do not have a file system that keeps track of record lengths for
you. The delimiter is much better.
For the PC, a "line sequential" fixed file and a "line sequential" variable
file look the same physically. The difference is what happens to the "record"
during the read. If I am reading a fixed length file of 80 bytes as "line
sequential" and a line has only 20 characters, I would expect the remaining 60
bytes to be spaces in the record inside the program. If I am reading a
variable length record (of between 1 and 80 bytes) and I get only 20, I
would expect the remaining 60 bytes to be unchanged from whatever they were
before the read. It is a subtle but important difference.
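
The difference can be sketched as follows, in Python for illustration only.
The names deliver_fixed and deliver_variable are hypothetical, and the
record area is modeled as a bytearray that the runtime fills after it has
already stripped the delimiter:

```python
def deliver_fixed(record_area, data):
    """Fixed length: copy the data, then pad the rest of the area with spaces."""
    n = min(len(data), len(record_area))
    record_area[:n] = data[:n]
    record_area[n:] = b" " * (len(record_area) - n)
    return len(record_area)

def deliver_variable(record_area, data):
    """Variable length: copy the data and leave the rest of the area untouched."""
    n = min(len(data), len(record_area))
    record_area[:n] = data[:n]
    return n                          # value for the "depending on" variable

# An 80-byte record area that still holds old content ("X") before the read.
fixed_area = bytearray(b"X" * 80)
variable_area = bytearray(b"X" * 80)

deliver_fixed(fixed_area, b"A" * 20)        # 20 "A" followed by 60 spaces
deliver_variable(variable_area, b"A" * 20)  # 20 "A" followed by 60 old "X"
```

Truncating a line longer than the record area, as min() does here, is an
assumption of the sketch and not something discussed above.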
In summary, my personal suggestion would be to support a delimiter for
variable-length records, and both a delimiter and a record length for
fixed-length records.
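
For the record-length case, a fixed-length file without delimiters can be
read by byte count alone. A rough sketch in Python, where read_fixed_record
is a hypothetical name and padding a short final record with spaces is an
assumption rather than anything a standard requires:

```python
import io

def read_fixed_record(stream, record_length):
    """Return the next record_length bytes, or None at end of file."""
    chunk = stream.read(record_length)
    if chunk == b"":
        return None                   # end of file
    # Assumption: a short final record is padded with spaces.
    return chunk.ljust(record_length, b" ")

# Three 4-byte records stored back to back with no delimiters.
stream = io.BytesIO(b"AAAABBBBCC")
while (record := read_fixed_record(stream, 4)) is not None:
    print(record)
```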
James S. Huggins