
Re: gnubol: Re: which parser tool



In a message dated 11/24/99 1:02:13 PM EST,
the ever arduous Tim Josling, TIMJOSLING@prodigy.net, writes:

<< 
 The problem with shuffling the tokens around, as with renaming tokens and
 adding extra fake tokens (eg to delimit statements) is that it messes up the
 error messages. You can do this better IMHO by changing the token types.
  >>

This was in response to a suggestion of reversing the SECTION token and a 
section-name-declarative token.

So now for a tangent. Error processing is asynchronous, IMHO. We stitch line 
numbers onto emissions sent in through the error process API.  We emit errors 
in any order, no problem. Some magic thing somewhere straightens it all out: 
a linked list, ordered by line number within a single listing or within a 
file ID, or a sort. Syntax and semantics care not about the eventual 
sequence. Do any parse in any order, upside down if you like, inside out if 
that suits your fancy. The error process API does not care.
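
As a rough illustration of that idea, here is a minimal Python sketch of an
error collector that accepts emissions in any order and sorts them only when
the listing is produced. The names ErrorLog, emit and listing are hypothetical,
made up for this sketch; they are not part of any existing gnubol API.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Diagnostic:
    file_id: int                  # which source file or listing
    line: int                     # line number stamped on the emission
    seq: int                      # arrival order, used only to break ties
    message: str = field(compare=False)


class ErrorLog:
    """Hypothetical error process API: emit in any order, sort later."""

    def __init__(self):
        self._items = []

    def emit(self, file_id, line, message):
        self._items.append(Diagnostic(file_id, line, len(self._items), message))

    def listing(self):
        # Order by file ID, then line number, then arrival order.
        return [f"{d.file_id}:{d.line}: {d.message}" for d in sorted(self._items)]


log = ErrorLog()
log.emit(1, 40, "period missing before paragraph name")
log.emit(1, 12, "unknown section name")
log.emit(0, 7, "copybook not found")
report = log.listing()
```

The parser(s) never see the ordering problem; only the final listing step does.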

In the possible case where the SECTION token and the section name are on 
different lines, we are not concerned that the tokens bubble up transposed 
(assuming we do consider that possibility).  The tokens need to be line 
number stamped. If we choose not to burden the lexer/parser interface with 
cycle-costly data structures, as has been discussed, the proposed hack then 
needs to juggle any line number data item and an alias, which juggling 
only takes effect in limited cases.
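
To make the transposition concrete, here is a small Python sketch, assuming
each token carries its own line number stamp. The Token type and the function
name are invented for illustration. Each token keeps its original line number
after the swap, so error messages can still point at the right line.

```python
from typing import NamedTuple


class Token(NamedTuple):
    kind: str    # token type as seen by the parser
    text: str    # source text
    line: int    # line number stamp


def transpose_section_headers(tokens):
    """Swap each (PROG-NAME, SECTION) pair so SECTION arrives first."""
    out = list(tokens)
    i = 0
    while i + 1 < len(out):
        if out[i].kind == "PROG-NAME" and out[i + 1].kind == "SECTION":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2          # skip past the pair just swapped
        else:
            i += 1
    return out


stream = [
    Token("PERIOD", ".", 10),
    Token("PROG-NAME", "MAIN-LOGIC", 11),   # section name on one line
    Token("SECTION", "SECTION", 12),        # SECTION keyword on the next
    Token("PERIOD", ".", 12),
]
swapped = transpose_section_headers(stream)
```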

The reason for this whole transposition suggestion is that so much of the 
topology of the language has leading delimiters (mostly the verbs), and that 
helps. These drive lookahead triggering of reductions (where we can use 
lookahead).

The SECTION headers are reversed.  Actually in a certain sense so are the 
DIVISION headers, but they each start with a unique token, which hurries 
reductions along just as well. The section names are not as obviously unique 
tokens ... but actually only if other recommendations are not heeded.

We could really use this hack if we insist on the lexer talking merely 
PROG_NAME to the parser(s).  That basic idea is still in place, as 
interactions veer back and forth about how big a role syntax plays in 
validation.  If it is minor, and semantics' role is big, I will not be able 
to justify manifesting reference type with elaborate precision inbound to the 
parser(s).

If there is no buy off on this kind of precision, then PROG-NAME could be a 
declaration or a reference.  That, though it takes a minute to get to it, is 
the problem that SECTION token transposition can alleviate somewhat.

When we have a damaged paragraph ending, scope bleeds forward, possibly to 
the next section.  If the next thing is the PROG-NAME that is the beginning 
of the SECTION, it will look a lot like a data name to be glued somehow to 
what preceded. This is just a revisitation of the margin and paragraph name 
thing that has been discussed.

If a preceding paragraph does not end with a period, then the next paragraph 
name looks like a glue-on if it is nothing more precise than PROG-NAME.  So 
too for section names. But here we can reverse the tokens, possibly, and it 
looks different. The SECTION token will have high precedence and, I expect, 
right associativity, which we cannot impart to PROG-NAME.

However, if we do the work that I propose up front, the lexer, or a tiny 
filter, can intercept PROG-NAME and convert it to section-name-declarative, 
which token can have the precedence and associativity needed to break current 
scopes. That, if you think it through, is the same thing; we just borrowed 
the attribute by means of the snooping I propose in the preprocess.
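
A tiny filter of that sort might look like the following Python sketch. It
assumes a one-token peek is enough to recognize the declaration; the token
type spelling SECTION-NAME-DECLARATIVE and the function name are placeholders
of mine, and a real snoop would likely consult the symbol table as well.

```python
def retype_section_names(tokens):
    """Retype a PROG-NAME as a declaration when SECTION follows it.

    tokens: iterable of (kind, text) pairs coming from the lexer.
    """
    tokens = list(tokens)
    for i, (kind, text) in enumerate(tokens):
        # Peek one token ahead without consuming it.
        next_kind = tokens[i + 1][0] if i + 1 < len(tokens) else None
        if kind == "PROG-NAME" and next_kind == "SECTION":
            yield ("SECTION-NAME-DECLARATIVE", text)
        else:
            yield (kind, text)


stream = [
    ("VERB", "MOVE"),
    ("PROG-NAME", "A"),
    ("TO", "TO"),
    ("PROG-NAME", "B"),            # no period: the paragraph is damaged
    ("PROG-NAME", "INIT-STEP"),    # really a section name
    ("SECTION", "SECTION"),
    ("PERIOD", "."),
]
retyped = list(retype_section_names(stream))
```

With the retyped token in the stream, the grammar can give it its own
precedence and associativity instead of treating it as one more operand.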

But still, if we eventually realize that the section-name-declarative type of 
token cannot do all we need, reversing these tokens may be useful to drive 
lookahead mechanics.

I am not sure what either strategy, normal sequence or transposition, does 
for broken section declarations where the section-name-declarative is 
missing. Something like
     END-previous.
SECTION.
This too can probably be detected by the preprocessor snoop, and maybe even 
instantiated in the symbol table, quite possibly as a duplicate, if the coder 
is the habitual type.
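
For the detection half only, here is a hedged Python sketch of how a snoop
could flag a SECTION keyword that has no name in front of it. The function
name and the (kind, line) token shape are assumptions for illustration; what
the lexer should then hand to the parser is left open.

```python
def find_nameless_sections(tokens):
    """Return line numbers of SECTION tokens not preceded by a PROG-NAME.

    tokens: iterable of (kind, line) pairs in source order.
    """
    missing = []
    prev_kind = None
    for kind, line in tokens:
        if kind == "SECTION" and prev_kind != "PROG-NAME":
            missing.append(line)
        prev_kind = kind
    return missing


broken = [
    ("PROG-NAME", 20), ("PERIOD", 20),   # END-previous.
    ("SECTION", 21), ("PERIOD", 21),     # SECTION.  (name missing)
]
intact = [
    ("PROG-NAME", 30), ("SECTION", 30), ("PERIOD", 30),
]
broken_lines = find_nameless_sections(broken)
intact_lines = find_nameless_sections(intact)
```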

What do you think the lexer to parser stream should look like in that case?

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com
