
Re: gnubol: procedure types




There was some chatter about ...
 > There is no design or current code to this effect, but discussion seems to
 > guess that the preprocessor could be the separator detector.

to which Tim Josling (tej@melbpc.org.au) replied, in a message dated
11/13/99 10:46:07 PM EST:

<< 
 I wouldn't see the preprocessor doing this, but an early phase of the
 compiler proper. Create all the tokens in a linked list and do a simple
 pre-scan, adding a few markers to the list to help the main parsers, and
 to separate the divisions. I am working on this at the moment.
 >>

Well that position is clear enough and sounds like design to me. Yet I would 
point out that the preprocessor is the first point at which all code is 
gathered, and it traffics somewhat in tokens to dig in after the COPY 
machinery.  Once code is expanded the text is full. 

If the DIVISION, SECTION and paragraph labels are there, then there we have 
our first opportunity.  In a way the preprocessor is two passes in one.  It 
passes the original unexpanded source (that is, before COPYing), and by 
possessing the expanded code line by line before sending to output it has the 
whole stream in hand.

So in a design sense, the first of these described passes actually can't see 
the whole program; but the output buffer sees all.  Or at least it could.  At 
that point we are past lexing it in the preprocess sense. So really it's a new 
ballgame instant by instant as the records stream out.  Why should those 
output records be sacred? I am not saying to alter them or in any way mess 
with the preprocessed code, but let's not be blind to it either.  After all it 
is really a free pass - we do not have to read it, we already have it ... 
line by line.

So imagine a snoop. It looks at the output buffer. The rest of the 
preprocessor really does not even need to know it is there; plug it in at the 
newline event on each outgoing record, one tiny fraction of a second before 
it slips away.

Could be simple, maybe. Just need logic to look for D,S,P's.
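
To make the snoop concrete, here is a minimal sketch in C of the "look for
D,S,P's" logic applied to one outgoing record. The names snoop_line and
label_kind are made up for illustration, nothing like them exists in the
current code. It assumes the sequence-number area is blank or already
stripped, and it only reports paragraph *candidates*: a one-word sentence
such as "GOBACK." would also match, so a real detector would need to know
the reserved words and which division it is in.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of one outgoing record. */
enum label_kind { LABEL_NONE, LABEL_DIVISION, LABEL_SECTION, LABEL_PARAGRAPH };

/* Case-insensitive match of a length-delimited word against an
   upper-case keyword. */
static int word_is(const char *w, size_t n, const char *kw)
{
    size_t i;
    if (strlen(kw) != n)
        return 0;
    for (i = 0; i < n; i++)
        if (toupper((unsigned char)w[i]) != kw[i])
            return 0;
    return 1;
}

/* Look at one record and say whether it looks like a division header,
   a section header, or a paragraph-name candidate. */
enum label_kind snoop_line(const char *line)
{
    const char *w[3];
    size_t n[3];
    int count = 0;          /* words seen, capped at 3 ("3 or more") */
    const char *p = line;

    assert(line != NULL);
    while (*p && count < 3) {
        while (*p && isspace((unsigned char)*p))
            p++;
        if (!*p)
            break;
        w[count] = p;
        while (*p && !isspace((unsigned char)*p))
            p++;
        n[count] = (size_t)(p - w[count]);
        count++;
    }

    if (count >= 2) {
        size_t len = n[1];
        if (len > 0 && w[1][len - 1] == '.')
            len--;          /* ignore the terminating period */
        if (word_is(w[1], len, "DIVISION"))
            return LABEL_DIVISION;
        if (word_is(w[1], len, "SECTION"))
            return LABEL_SECTION;
        return LABEL_NONE;
    }
    /* A lone word ending in a period: paragraph-name candidate. */
    if (count == 1 && n[0] > 1 && w[0][n[0] - 1] == '.')
        return LABEL_PARAGRAPH;
    return LABEL_NONE;
}
```

Called once per record at the newline event, this costs no extra I/O: the
record is already in hand.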

There is a way to do this.  It is easy really.  Real easy.  Imagine your 
current program. Now imagine it is all a submodule.  It is called by the 
thing that wants to find D,S,P's. So, piece of cake! Rather than write 
records out, return the records to this thing hungry for D,S,P's. When it has 
satiated itself from your current line, it calls you again.

So rather than think of writing as a function that calls down to a subroutine 
to write out, just do a _return_! You could change the code just that far, 
to see what I mean. Just invert the relationship between all of your code and 
the lowly physical write.  When your own code would iterate to an end, as in 
when the lexer sends you EOF, just return to the new high level module with 
EOF. Do you see how that inverted structure now looks like it might have a 
parser riding the output stream?

You can do this and just put the write statement in this high level module, 
and not theorize about it. It would have a main that calls all of the current 
preprocessor and gets control back at the point of what you currently think 
of as your write. Now the top module does the write and loops again into your 
preprocessor.
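
The inversion described above might look roughly like this in C. This is
only a sketch under assumptions: struct pp, pp_next_record, drive and the
snoop callback are hypothetical names, and the "preprocessor" here is a
stub that hands back already-expanded records from an array. The point is
the shape: the preprocessor returns each record instead of writing it, and
the top module owns the write.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the preprocessor's state. A real one would
   hold the lexer, the COPY machinery, and so on. */
struct pp {
    const char **lines;   /* expanded records, stubbed for the sketch */
    size_t count;
    size_t next;
};

/* Where the preprocessor used to call its write routine, it now returns
   the record to its caller. Returns NULL at EOF. */
const char *pp_next_record(struct pp *pp)
{
    assert(pp != NULL);
    if (pp->next >= pp->count)
        return NULL;
    return pp->lines[pp->next++];
}

/* The new top-level module: it gets a look at every record (via the
   optional snoop callback), does the physical write itself, and loops
   back into the preprocessor. Returns the number of records written. */
size_t drive(struct pp *pp, FILE *out,
             void (*snoop)(const char *rec, size_t lineno))
{
    const char *rec;
    size_t lineno = 0;

    assert(out != NULL);
    while ((rec = pp_next_record(pp)) != NULL) {
        lineno++;
        if (snoop != NULL)
            snoop(rec, lineno);   /* the record is still in hand here */
        fputs(rec, out);
        fputc('\n', out);
    }
    return lineno;
}
```

A main would just set up the preprocessor state and call
drive(&state, outfile, some_snoop). Swapping the hand-written snoop for a
lexer feeding a parser later does not change this loop.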

So then imagine you go wild and hand code a parser scanner dissector thing up 
in that module and the whole world comes to your front door and asks why are 
you doing that? You should replace your new module with a lexer (suitably 
monkeying around with its YYINPUT or whatever for the selected tool), have 
that lexer knee-jerk that record directly to disk for safekeeping, but then 
commence to lex it mildly to discover just what we need here, and pass it up, 
... now get this of all revolutionary ideas, ... token by token ... can you 
just imagine? ... to a parser, ... land sakes child! And that parser only 
needs to hear about things that could be parts of D,S,P's, and its interface 
has some line number info from the lexer.

((If you don't want to go over the deep end and build the whole compiler 
there, just percolate up what we need.  Stated differently, if you find PIC 
or maybe ASSIGN, don't pass it up. etc.))

Well there you have it. If the lexeme flow conforms to a few simple rules 
seeking division, section and paragraph labels, then so note.   Now: 'so 
note' actually needs to be clarified.  'So note' is the interface to the 
parser. It's a work sheet.  More anon, if you like.
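
Purely as a guess at what a work sheet entry could hold, going only on what
is said above (the kind of label, its name, and line number info from the
lexer), here is a small C sketch. Every name in it (ws_kind, ws_entry,
worksheet, ws_note, WS_MAX) is hypothetical, and the fixed-size table is
just to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum ws_kind { WS_DIVISION, WS_SECTION, WS_PARAGRAPH };

/* One "so note": what was seen, what it was called, and where. */
struct ws_entry {
    enum ws_kind kind;
    char name[32];        /* assumed large enough for a COBOL word */
    unsigned long line;   /* line number reported by the lexer */
};

#define WS_MAX 256        /* arbitrary limit for the sketch */

struct worksheet {
    struct ws_entry entries[WS_MAX];
    size_t used;
};

/* Append one entry; returns 1 on success, 0 if the sheet is full.
   Over-long names are truncated to fit. */
int ws_note(struct worksheet *ws, enum ws_kind kind,
            const char *name, unsigned long line)
{
    struct ws_entry *e;

    assert(ws != NULL && name != NULL);
    if (ws->used >= WS_MAX)
        return 0;
    e = &ws->entries[ws->used++];
    e->kind = kind;
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';
    e->line = line;
    return 1;
}
```

The parser(s) would then read the sheet to find where each division,
section and paragraph starts, without reading the records again.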

From my first glance at the Preprocessor, it is not so extensive as to 
preclude the feasibility of inversion at this stage. Early experiments can 
just get the I/O action up top. Once you see how easy it would be to do that, 
... actually this is neat, ... the ease of this situation is categorically 
due to the functional characteristics of C, ... so as much as we all love 
COBOL don't ever let anyone get away with dissing C, ... (I need a brand new 
punctuation mark to designate asides, can you code that too?), .. so anyway 
once you see how easy it is to invert the I/O function, it will be obvious to 
you that there is definitely no need to read those records back in, we do not 
need a pass! That is a point worth getting to. No matter how urgently we 
desire to stamp the preprocess as EOJ, it ain't EOJ.

I see no reason to let go of a perfectly good record before we are through 
with it.

I respect your well considered posts, yet I maintain that the function of 
'separator detector' is high priority in the sequence of events even with a 
flatlander parse. Parallel parse certainly can't get started without the work 
sheet. 

Where you say "I wouldn't see the preprocessor doing this, but an early 
phase of the compiler proper," I would say that the preprocessor IS the 
early phase of the compiler proper. I/O is extremely expensive; why let go of 
a record you need? Gee, after all the hard work!

What further design ideas do you have in this area? How would you list the 
pieces of the interface from the preprocessor to the parser(s)?

-Bob Rayhawk