Re: gnubol: The token "IS"
In a message dated 12/7/99 8:51:17 AM EST, mck@tivoli.mv.com writes:
<< Can anyone think of what the repercussions would be if we were to lex
out the token "IS" as noise? I don't remember ever having done it,
but I can't think of anyplace that would be affected by its
absence. Getting rid of it would reduce the lookahead burden
slightly and simplify parts of the grammar.
Thanks,
Mike
>>
A thought about that. In looking at the arithmetic conditionals, it naturally
occurs to us that it would be nice to lex out the optional ON as noise. This
is very useful as a means to avoid rules with epsilons, or the dreaded
explication of alternate rules with and without the token, all to avoid r/r
conflicts. One problem that creates is that you then become blind to
duplicates, as in ON ON or IS IS, and you commence to consider yet more work
in the lexer to in fact catch those.
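To make the trade-off concrete, here is a minimal sketch of a post-lexer
filter that drops noise words but still reports an immediate repeat. It is
only an illustration, not a proposal for the actual gnubol lexer: the names
NOISE_WORDS and strip_noise, and the idea of tokens as plain strings, are
assumptions made for the example.

```python
# Hypothetical noise-word filter; names and token shape are assumptions.
NOISE_WORDS = {"ON", "IS"}


def strip_noise(tokens):
    """Drop noise words, but report a noise word repeated back to back.

    Returns (kept_tokens, diagnostics); each diagnostic is a tuple of
    (index of the repeated token, message).
    """
    kept = []
    diagnostics = []
    previous_noise = None  # noise word seen immediately before, if any
    for position, token in enumerate(tokens):
        word = token.upper()
        if word in NOISE_WORDS:
            if previous_noise == word:
                diagnostics.append(
                    (position, f"duplicate noise word {word} {word}")
                )
            previous_noise = word
            continue  # the parser never sees this token
        previous_noise = None
        kept.append(token)
    return kept, diagnostics
```

Fed the tokens of "IF A IS IS GREATER B", the parser would see only
IF A GREATER B, and the duplicate would surface as a lexer diagnostic
instead of a broken rule.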
Also I will repeat a comment someone else made when there was a brief
discussion about folding together things like SPACE/SPACES, ZERO/ZEROS/ZEROES
and, naturally, also IS/ARE.
There is a mild concern about the ungrammaticalness that then becomes
possible, but I think, more dramatically, one statement does not allow the
substitution. There the particle is maybe still optional! You just have to be
grammatical :-).
Thus, are you not then going to have to consider filtering out IS _and_ ARE?
And does not your requirement for a special lexer task become detecting
IS ARE and ARE IS, as those are not visible to the parser? A large part of
the world does not care. But standards compliance is another issue. Can you
be compliant and not detect IS IS as an error?
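As a rough sketch of what that lexer task might look like, the filter below
treats IS and ARE as members of one noise class, so that any adjacent pair
from the class (IS IS, IS ARE, ARE IS, ARE ARE) is reported. The names
NOISE_CLASS and filter_noise_classes and the class labels are hypothetical,
and the sketch deliberately ignores the statement where the substitution is
not allowed.

```python
# Hypothetical mapping from noise word to its class; labels are assumptions.
NOISE_CLASS = {"IS": "is/are", "ARE": "is/are", "ON": "on"}


def filter_noise_classes(tokens):
    """Drop noise words; flag two adjacent noise words of the same class.

    Returns (kept_tokens, errors); each error is a tuple of
    (index of the second word, the offending pair as text).
    """
    kept = []
    errors = []
    last = None  # (word, class) of the noise word just before, if any
    for index, token in enumerate(tokens):
        word = token.upper()
        noise_class = NOISE_CLASS.get(word)
        if noise_class is None:
            last = None
            kept.append(token)
            continue
        if last is not None and last[1] == noise_class:
            errors.append((index, f"{last[0]} {word}"))
        last = (word, noise_class)
    return kept, errors
```

With this shape, "VALUE IS ARE ZERO" and "VALUES ARE IS ZERO" both reach the
parser as two tokens, and each produces one lexer-side error naming the pair.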
Balanced against this is the question of what to do with it in the parser
anyway if you can see these dups. Every time I post a comment about flyouts
from lower rules to higher rules, I get clobbered by more experienced people,
to whom I am happy to defer. Nonetheless, even though at first glance you
probably don't want to see that noise suppression could raise the issue of
dup detection in the lexer, there may actually be an entirely separate
justification for it.
It would be just absolutely excellent, for example, to get the ON token
completely out of the arithmetic rules. It would be nice to get ON ON
detection out of there as well. I am thinking that the preferred method is to
not code for it: let the rule break and it will be caught nearby with ease.
Alternatively, you can subjugate it with a rule unto itself, which will be
arcane unto itself to avoid epsilon. But if you get into the business of
suppressing IS and/or ARE, and address the duplicate issue in the lexer, then
a style of doing business has been established that relates to some of the
epsilon versus alternate-rule explication up in the parser on other optional
tokens.
The problem we have is that as soon as anything looks like it is getting
bigger, it tends to look wrong to an understaffed project.
So even if you don't want to discuss it as a generalizable idea, I think you
at least need to contemplate the IS IS detection problem, and the more
isolated IS/ARE asymmetry.
Bob Rayhawk
--
This message was sent through the gnu-cobol mailing list. To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body. For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.