
Re: gnubol: Record delimiter clause and parse order



GNUCobol@ZName.com wrote:
> 
> COBOL implementations frequently implement sequential IO in a variety of
> ways.
> 
> For example, on IBM Mainframes, "record length" is a fundamental component
> of the file specification carried within the file system. Asking the file
> system for the next record will get you the next record without you needing
> to specify length.
> 
> On PCs, there is no file system specification of record length. Rather a
> record is an arbitrary cluster of sequential bytes contained within the long
> concatenated string of bytes we call a file.
> 
> Within COBOL designed to operate in the PC environment, there are a couple
> of issues.
> 
> Consider "fixed length" files first: I'll define the file with 80 byte
> records.
> 
> When I do a sequential read, what happens?
> 
> Option one: give me the next 80 bytes.
> 
> Option two: give me bytes until you hit a delimiter, then "fill" the
> remainder of the record.

I was toying with both ideas.  How portable would a default delimiter,
say '\n', be?  You conceptually expect records to be on successive lines
anyhow, but it may not be portable.  I can't really find a definitive
description of the structure of a sequential file, besides Micro Focus's
specification for "line sequential", which is itself not portable
between Unix and Windows, for example.  MF says a "line sequential" file
is a file which could have been created with a text editor, with each
record on a successive line.  Now, if you create the file in something
like, oh, Notepad, the lines will be delimited by "\r\n", whereas on
Unix you will only have "\n".  I don't know how much of an issue that
is, though; I guess it depends on how many "\r\n" systems this will be
used on.

My personal vote is for reading up until a "\n".  I would guess that
this is pretty much implementation-dependent anyhow.  The question
remains, however, of what to do with all that MF code that uses "line
sequential": treat it the same, or what?
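
To make the two options concrete, here is a minimal sketch in Python (not
gnubol code).  The function names, the space fill character, and the
choice to truncate over-long lines are all assumptions made for
illustration; the delimiter version accepts either "\n" or "\r\n".

```python
import io

def read_fixed_record(stream, record_length=80, pad=b" "):
    """Option one: take the next record_length bytes (None at end of file)."""
    data = stream.read(record_length)
    if not data:
        return None
    # A short final record is filled out to the declared length (assumption).
    return data.ljust(record_length, pad)

def read_line_record(stream, record_length=80, pad=b" "):
    """Option two: read up to a delimiter, then fill the rest of the record."""
    line = stream.readline()          # reads through "\n" if one is present
    if not line:
        return None
    # Drop a trailing "\n", and a "\r" before it if the file used "\r\n".
    if line.endswith(b"\n"):
        line = line[:-1]
        if line.endswith(b"\r"):
            line = line[:-1]
    # Assumption: lines longer than the record are truncated, not an error.
    return line[:record_length].ljust(record_length, pad)

if __name__ == "__main__":
    mixed = io.BytesIO(b"FIRST\r\nSECOND\n")
    print(read_line_record(mixed, 10))
    print(read_line_record(mixed, 10))
```

With this approach the same file reads identically whether it was saved
on Unix or in Notepad, at the cost of never being able to store a "\n"
inside a record.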


> 
> Now consider a "variable length" file. I use the "record length varying
> ... depending on" clause to identify a Working Storage variable to contain
> the record length. On a read, this variable should be set by the file
> system, after the read, to contain the length of the record actually read.
> 

This is good, for *after* a read.  But how do you know how many bytes to
read *before* the read?  There must be some delimiter to mark the end of
a record; otherwise there's no way to know when you've read in the whole
record, short of reading the maximum number of bytes allowed by the
record definition.
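
As a rough sketch of the two ways a reader can learn the length, again
in Python: one function finds the end of the record by its delimiter, the
other reads a length count stored in front of the record.  The 2-byte
big-endian count is purely an assumption for the example (it is not meant
to match any mainframe or Micro Focus format), and all function names are
hypothetical.  Both return the record together with its length, which is
the value the "depending on" variable would receive.

```python
import io
import struct

def read_delimited(stream, max_length):
    """Delimiter scheme: the length is only known once the delimiter is hit."""
    line = stream.readline()
    if not line:
        return None
    if line.endswith(b"\r\n"):
        record = line[:-2]
    elif line.endswith(b"\n"):
        record = line[:-1]
    else:
        record = line                 # last record without a delimiter
    if len(record) > max_length:
        raise ValueError("record longer than the record definition allows")
    return record, len(record)

def write_counted(stream, record):
    """Count scheme: store a 2-byte length (assumed layout) before the data."""
    stream.write(struct.pack(">H", len(record)))
    stream.write(record)

def read_counted(stream):
    """Count scheme: read the length first, then exactly that many bytes."""
    header = stream.read(2)
    if not header:
        return None
    if len(header) < 2:
        raise EOFError("truncated length count")
    (length,) = struct.unpack(">H", header)
    record = stream.read(length)
    if len(record) < length:
        raise EOFError("truncated record")
    return record, length

if __name__ == "__main__":
    buf = io.BytesIO()
    write_counted(buf, b"line one\nline two")   # data may contain "\n"
    buf.seek(0)
    print(read_counted(buf))
```

The count scheme answers the "before the read" question directly and lets
records contain any byte, but the resulting file is no longer something
you can create or inspect with a text editor.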

> On a PC this can be accomplished in two ways:
> 
> (1) use of a delimiter,
> 
> (2) use of record length counts (similar to the IBM mainframe implementation
> of variable length records)
> 
> If all of this confuses more than it helps, my apologies.
> 

I like the default delimiter idea, myself.  Makes life much easier. 
I've a bunch of sequential files here, but unfortunately I ftp'ed them
down, which conveniently converts text files to the local format. 
Still, since they all came from the mainframe, and since they are all
currently delimited by a "\n", I think it's safe to say that we can use
"\n" as a delimiter, or maybe "\n" on *nix, and "\r\n" on systems that
use "\r\n".

ciao,
-- 
Matthew Vanecek
Visit my Website at http://mysite.directlink.net/linuxguy
For answers type: perl -e 'print
$i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);'
*****************************************************************
For 93 million miles, there is nothing between the sun and my shadow
except me. I'm always getting in the way of something...
