Re: gnubol: Re: which parser tool
In a message dated 11/24/99 1:02:13 PM EST,
the ever arduous Tim Josling, TIMJOSLING@prodigy.net, writes:
<<
The problem with shuffling the tokens around, as with renaming tokens and
adding
extra fake tokens (eg to delimit statements) is that it messes up the error
messages. You can do this better IMHO by changing the token types.
>>
This was in response to a suggestion of reversing the SECTION token and a
section-name-declarative token.
So now for a tangent. Error processing is asynchronous, IMHO. We stitch line
numbers onto emissions sent in through the error-process API. We emit errors
in any order, no problem. Some magic thing somewhere straightens it all out:
a linked list, ordered by line number within a single listing or within a
file ID, or a sort. Syntax and semantics care not about the eventual
sequence. Do any parse in any order, upside down if you like, inside out if
that suits your fancy. The error-process API does not care.
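A minimal sketch of that idea (in Python; all names here are hypothetical,
not proposed gnubol interfaces): errors are emitted in any order, stamped
with a line number, and only sorted into source order when the listing is
produced.

```python
# Hypothetical error-process API: emissions arrive in any order during the
# parse; a sort by line-number stamp straightens them out at listing time.
from dataclasses import dataclass, field

@dataclass(order=True)
class Emission:
    line: int                              # line-number stamp
    message: str = field(compare=False)    # text plays no part in ordering

class ErrorProcess:
    def __init__(self):
        self._emissions = []

    def emit(self, line, message):
        # Callers may emit in any order -- no problem.
        self._emissions.append(Emission(line, message))

    def listing(self):
        # "Some magic thing" straightens it all out: here, a stable sort.
        return [f"line {e.line}: {e.message}" for e in sorted(self._emissions)]

errs = ErrorProcess()
errs.emit(42, "period expected")
errs.emit(7, "undefined data name")
errs.emit(42, "scope not closed")
print(errs.listing())
```

The point is only that the parser never has to care what order it reports
things in; ordering is the error process's problem.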
In the possible case where the SECTION token and the section name are on
different lines, we are not concerned that the tokens bubble up transposed
(assuming we do consider that possibility). The tokens need to be
line-number stamped. If we choose not to burden the lexer/parser interface
with cycle-costly data structures, as has been discussed, the proposed hack
then needs to juggle any line-number data item and an alias, ... which
juggling only takes effect in limited cases.
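To make the stamping concrete, here is a tiny sketch (hypothetical token
shape, not a proposed interface): every token carries its source line, so a
transposed pair still reports against the right lines.

```python
# Hypothetical line-stamped token: the error process keys off the stamp,
# not the token's position in the stream, so transposition is harmless.
from collections import namedtuple

Token = namedtuple("Token", ["type", "text", "line"])

# SECTION keyword and its name on different lines, handed up transposed:
stream = [
    Token("SECTION", "SECTION", 11),
    Token("PROG_NAME", "INIT-FILES", 10),
]
for tok in stream:
    print(tok.line, tok.type)
```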
The reason for this whole transposition suggestion is that so much of the
topology of the language has leading delimiters (mostly the verbs), and that
helps. These drive lookahead-triggered reductions (where we can use
lookahead).
The SECTION headers are reversed. Actually, in a certain sense, so are the
DIVISION headers, but those each start with a unique token, which hurries
reductions along just as well. The section names are not as obviously unique
tokens ... but actually only if other recommendations are not heeded.
We could really use this hack if we insist on the lexer talking merely
PROG_NAME to the parser(s). That basic idea is still in place, as
interactions veer back and forth about how big a role syntax plays in
validation. If its role is minor, and semantics' role is big, I will not be
able to justify manifesting reference type with elaborate precision inbound
to the parser(s).
If there is no buy-in on this kind of precision, then PROG-NAME could be a
declaration or a reference. That, though it takes a minute to get to it, is
the problem that SECTION token transposition can alleviate somewhat.
When we have a damaged paragraph ending, scope bleeds forward, possibly to
the next section. If the next thing is the PROG-NAME that is the beginning
of the SECTION, it will look a lot like a data name to be glued somehow to
what preceded. This is just a revisitation of the margin and paragraph-name
issue that has been discussed.
If a preceding paragraph does not end with a period, then the next paragraph
name looks like a glue-on if it is nothing more precise than PROG-NAME. So
too for section names. But here we can reverse the tokens, possibly, and it
looks different. The SECTION token will have high precedence, and I expect
right associativity, which we cannot impart to PROG-NAME.
However, if we do the work that I propose up front, the lexer, or a tiny
filter, can intercept PROG-NAME and convert it to section-name-declarative,
and that token can have the precedence and associativity needed to break
current scopes. That, if you think it through, is the same thing; we just
borrowed the attribute by means of the snooping I propose in the preprocess.
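Such a filter could be as small as one token of lookahead. A sketch
(hypothetical token names and tuple shape, just to show the mechanics):

```python
# Hypothetical one-token-lookahead filter between lexer and parser: a
# PROG_NAME immediately followed by the SECTION keyword is really a
# section-name declarative, so rename its token type. The renamed token
# can then carry the precedence/associativity needed to break open scopes.
def section_name_filter(tokens):
    tokens = iter(tokens)
    prev = next(tokens, None)
    for tok in tokens:
        if prev[0] == "PROG_NAME" and tok[0] == "SECTION":
            yield ("SECTION_NAME_DECLARATIVE", prev[1])
        else:
            yield prev
        prev = tok
    if prev is not None:
        yield prev

stream = [("PROG_NAME", "INIT-FILES"), ("SECTION", "SECTION"),
          ("PERIOD", "."), ("PROG_NAME", "WS-COUNT")]
print(list(section_name_filter(stream)))
```

Here the first PROG_NAME is renamed, while the trailing PROG_NAME, not
followed by SECTION, passes through as an ordinary reference.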
But still, if we eventually realize that the section-name-declarative type
of token cannot do all we need, reversing these tokens may be useful to
drive lookahead mechanics.
I am not sure what either strategy, normal sequence or transposition, does
for broken section declarations where the section-name-declarative is
missing.
Something like
END-previous.
SECTION.
This too can probably be detected by the preprocessor snoop, and maybe even
instantiated in the symbol table, quite possibly as a duplicate, if the coder
is the habitual type.
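The snoop for that case could be equally small. A sketch (again with
hypothetical token names), flagging a SECTION keyword that has no name
token in front of it:

```python
# Hypothetical preprocessor snoop for a broken section header: a SECTION
# keyword not preceded by a name token (e.g. a bare "SECTION." right after
# "END-previous.") gets flagged; a placeholder could then be dropped into
# the symbol table, quite possibly as a duplicate.
def snoop_sections(tokens):
    problems = []
    prev = None
    for tok in tokens:
        if tok[0] == "SECTION" and (prev is None or prev[0] != "PROG_NAME"):
            problems.append("section header with no section name")
        prev = tok
    return problems

stream = [("PROG_NAME", "END-PREVIOUS"), ("PERIOD", "."),
          ("SECTION", "SECTION"), ("PERIOD", ".")]
print(snoop_sections(stream))
```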
What do you think the lexer to parser stream should look like in that case?
Best Wishes
Bob Rayhawk
RKRayhawk@aol.com
--
This message was sent through the gnu-cobol mailing list. To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body. For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.