
Re: gnubol: problem 2.



In a message dated 12/31/99 3:45:56 AM EST, wmklein@ix.netcom.com writes:

<<  wish I really understood this "stuff" - but does
>>
snips 
 << 
 Handle reference modification using arithmetic expression, e.g.
 
     data-name (data-name2 + 2:) *> reference modification
         versus
     data-name (data-name2 +2) *> two subscripts
         versus
     data-name (index-name + 2) *> relative indexing
 
 it may be in those rules, but I didn't see it there (in a form that I could
 pick up)
  >>

No. And good point. But actually many things are missing. The code is 
intended to show how to disambiguate, using the technique of setting up rules 
that are what I call 'hardened' on the right side. The posted ref_data 
rules are not intended to be complete. The rules for arithmetic operations do 
not have an alternative literal to compete for the references to data, either.

The 'stuff' is easy. I think you know that bison grammar rules are like 
makefile rules. They are not really sequential commands, but a set of rules, 
each with a left-hand side (lhs) that gets discovered when the parts listed in 
one of its right-hand side (rhs) patterns are matched to input in the source 
code (of, say, a COBOL program).

For example,

lhs : ADD ref_data TO ref_data

Grammar rules that look for required things are usually easy to 
code and present no unusual problems. Every pattern you look for is pretty 
much unique. It is the flexibility in certain languages that creates some 
headaches. A GIVING phrase, or the ROUNDED keyword, is optional.

With these grammar tools, one way to indicate that something is optional is 
to set up competing rules; one of which is simply empty. 
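
As a minimal sketch of that idea (the names add_stmt, opt_rounded and 
IDENTIFIER are made up for illustration and are not from the posted rules), 
an optional ROUNDED could be written as a subrule with an empty alternative:

```yacc
%token ADD TO ROUNDED IDENTIFIER
%%
add_stmt    : ADD ref_data TO ref_data opt_rounded
            ;
opt_rounded : /* empty: ROUNDED was not coded */
            | ROUNDED
            ;
ref_data    : IDENTIFIER
            ;
```

On its own this small grammar has no conflict; the empty alternative only 
becomes a problem when something else can also match nothing at the same spot.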

The parser tool actually figures out which rule to 'discover' by examining the 
next token (called the lookahead).  If I had two rules

lhs_1 : ADD ref_data TO ref_data

lhs_2 : ADD ref_data TO ref_data ROUNDED

and the tool had seen the leading portion (lhs : ADD ref_data TO ref_data), 
then if the next token, the lookahead, is ROUNDED, the tool can figure out 
what to do for us.

But if I express the ROUNDED thing as a subrule, and one of its alternatives 
is an 'empty' possibility, I might have trouble if another rule had an 
epsilon in that same position. For example
 
lhs_3 : ADD ref_data TO ref_data ON SIZE ERROR

If the ON SIZE ERROR clause, which is optional, is expressed in a subrule 
that has an empty alternative, we actually have the following problem:

lhs_1 : ADD ref_data TO ref_data
lhs_2 : ADD ref_data TO ref_data (epsilon)
lhs_3 : ADD ref_data TO ref_data (epsilon)

Here I have used the tag 'epsilon', which comes from the gurus who discourse 
about this in the learned texts.  Epsilon just represents the empty 
alternative of subrules that are optional.

The grammar tool cannot distinguish epsilon from epsilon, and it is smart 
enough to realize that epsilon is equivalent to the nothingness at the end of 
lhs_1.  When the parser sees "ADD ref_data TO ref_data" it does not know 
which of the three rules to 'discover'.
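
To make the collision concrete, here is a hedged sketch of the kind of grammar 
that runs into it (opt_rounded, opt_size_error and IDENTIFIER are made-up 
names, not the posted rules). Bison would be expected to report reduce/reduce 
conflicts for it, because at the end of the statement it could reduce lhs_1 
or reduce either empty subrule on the same lookahead:

```yacc
%token ADD TO ROUNDED ON SIZE ERROR IDENTIFIER
%%
stmt           : ADD ref_data TO ref_data                  /* lhs_1 */
               | ADD ref_data TO ref_data opt_rounded      /* lhs_2 */
               | ADD ref_data TO ref_data opt_size_error   /* lhs_3 */
               ;
opt_rounded    : /* empty (epsilon) */
               | ROUNDED
               ;
opt_size_error : /* empty (epsilon) */
               | ON SIZE ERROR
               ;
ref_data       : IDENTIFIER
               ;
```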

So one solution is to elevate some part of the lower rule into the higher 
rule, which will harden the ends.

lhs_1 : ADD ref_data TO ref_data
lhs_2 : ADD ref_data TO ref_data ROUNDED
lhs_3 : ADD ref_data TO ref_data ON fragment-subrule-size-error

That gives the tool something to compare the lookahead to. (My example is not 
perfect.)
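
Written out as a sketch that bison could accept, this might look as follows. 
The name size_error_tail stands in for fragment-subrule-size-error 
(underscores are the portable spelling), and it is reduced to just SIZE ERROR 
here as an assumption to keep the example short:

```yacc
%token ADD TO ROUNDED ON SIZE ERROR IDENTIFIER
%%
stmt            : ADD ref_data TO ref_data                     /* lhs_1: ends on nothing */
                | ADD ref_data TO ref_data ROUNDED             /* lhs_2: ends on a keyword */
                | ADD ref_data TO ref_data ON size_error_tail  /* lhs_3: ON lifted up here */
                ;
size_error_tail : SIZE ERROR
                ;
ref_data        : IDENTIFIER
                ;
```

After the second ref_data, a lookahead of ROUNDED or ON is shifted, and 
anything else reduces lhs_1, so no empty alternative is left to compete.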

The optionality of ROUNDED is actually represented here by two alternative 
rules, one with the extra keyword, one without. Sometimes we are motivated to 
force one rule to yield to the other on the token; for example

lhs_1 : ADD ref_data TO ref_data
           %prec YIELD_SHIFT_ON_LOOKAHEAD_ROUNDED
lhs_2 : ADD ref_data TO ref_data ROUNDED
lhs_3 : ADD ref_data TO ref_data ON fragment-subrule-size-error

(In bison, %prec YIELD_SHIFT_ON_LOOKAHEAD_ROUNDED names a dummy token whose 
precedence is given to the rule, which lets us choose between a reduction and 
a shift.)
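
For the dummy token to have any effect it needs a declared precedence, and so 
does ROUNDED. A hedged sketch of the declarations, assuming %nonassoc is 
acceptable for both (tokens declared later bind tighter, so ROUNDED outranks 
the dummy and a shift on ROUNDED is preferred over reducing lhs_1):

```yacc
%token ADD TO ON SIZE ERROR IDENTIFIER
%nonassoc YIELD_SHIFT_ON_LOOKAHEAD_ROUNDED   /* declared first: lower precedence  */
%nonassoc ROUNDED                            /* declared later: higher precedence */
%%
stmt            : ADD ref_data TO ref_data
                      %prec YIELD_SHIFT_ON_LOOKAHEAD_ROUNDED
                | ADD ref_data TO ref_data ROUNDED
                | ADD ref_data TO ref_data ON size_error_tail
                ;
size_error_tail : SIZE ERROR
                ;
ref_data        : IDENTIFIER
                ;
```

Bison only consults these precedences when there is a real shift/reduce 
conflict on ROUNDED; in a fragment this small there is none, so the 
declarations matter only once the rules sit inside a larger grammar.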

The posted rules attempted to displace ambiguous rules by avoiding the very 
natural tendency to put optional phrases in subrules that have an empty 
alternative. Yet you really do not want to put the whole continuation up in 
the higher rule.

So I make a segment of an OFIN phrase unambiguous by specifying
   lhs :  ref_data
versus
   lhs : ref_data OF

This works well. (It requires a tidy-up phase at the end to make certain that 
we do not end on OF, which we must be robust enough to trap anyway; so, 
though the split is artificial, nothing new comes about in the rules.)
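
One possible reading of that OF split, as a sketch only (qualified, link and 
IDENTIFIER are made-up names, and the posted rules may be arranged 
differently):

```yacc
%token OF IDENTIFIER
%%
qualified : link
          | qualified link
          ;
link      : ref_data        /* plain name */
          | ref_data OF     /* name hardened with a trailing OF */
          ;
ref_data  : IDENTIFIER
          ;
```

The grammar itself will accept a chain that ends on OF, or two names with no 
OF between them, so the tidy-up check has to reject those afterwards.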

(Another post does the same thing with the FOR keyword, placed "unnaturally" 
on the right edge of competing rules.)

Hope that helps some.

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com

