[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
gnubol: Re: Magic tokens

To: gnu-cobol@lusars.net
Subject: gnubol: Re: Magic tokens
From: RKRayhawk@aol.com
Date: Fri, 3 Dec 1999 02:53:23 EST
Delivered-To: gnu-cobol-outgoing@wallace.lusars.net
Reply-To: gnu-cobol@lusars.net
Sender: owner-gnu-cobol@wallace.lusars.net
Tim Josling TIMJOSLING@prodigy.net 

Has made a bison grammar available. It definitely reveals a lot of hard work.

Thankyou.

It will take time to study and completely understand the work you did, so 
don't be offended if the following comment is not completely informed. But I 
do want to let you know that some of the forward scans you are doing are not 
necessary. 'Though I do not fully understand how you do the forwards scan I 
am real confident that you don't need to and I think I know why that is 
happening to you.

The table driven goal oriented tools only have one state at a time but that 
state can be in the middle of a lot of possible competing rules. For example,


pr_add_statement_imperative:
   tk_add  pr_from_numeric  tk_to  pr_to_numeric
 | tk_add  pr_from_numeric  tk_to  pr_to_numeric GIVING pr_giving_numeric

Involves a situation where competing rules are concurrent up to the point 
that GIVING becomes the lookahead.  It is perfectly okay that these rules 
compete, You seem to think that it is necessary to scan ahead and temporarily 
look for GIVING, and when you find it, magically push the tk_add token back 
on the stack as tk_add_M_M_has_giving_M 
and have your rules  look something  like 

pr_add_statement_imperative:
   tk_add  pr_from_numeric  tk_to  pr_to_numeric
 | tk_add_M_M_has_giving_M  pr_from_numeric  tk_to  pr_to_numeric 
pr_giving_numeric

That is not necessary. Within the ..._imperative rule, bison will allow the 
leading portion of the
various RHSs to look alike with no problem.  You may subsequently have 
trouble merging this rule into something that can have two different path 
ways leading to the point where the rule pulls in this ..._imperative. That 
will look like an r/r conflict. The problem is not down here in this multi 
branched ..._imperative.  The problem is in the plan of attack up above. An 
r/r conflict simply means two rules are trying to reduce to two continuations.

Turns out that you can have as many concurrent running competitive rules as 
you like. So it is not harmful to have two continuations from any given 
place. But at any given juncture you can not have two rules reduce 
simultaneously and then try to continue onto multiple paths.

(Conversely, you are allowed to have jillions of things end as long as there 
is only one new continuation).

From looking at this lookahead for GIVING and the magic transformation of the 
lead ADD token, I would guess that you had a rule which included the 
..._imperative. And there were two or more ways into it from rules reducing 
at that point : and from there either an ADD continue or an alternate ADD 
continue looks like an r/r.  The problem is not in the two ADD continuations. 
 The problem is either in the collection of things coming into that point, or 
in unacceptable precedence on the r/r default.

The point in these situations is that bison is not going to have trouble 
continuing. It would put all ADD statement rules concurrently into scope, 
piece of cake until it sees contrary lookahead subsequent to that.

The problem in r/r is that bison is saying it does not know which of multiple 
preceeding rules to reduce when it sees ADD as the continuation. This 
generally means it is time to change the other rules not the ADD imperative 
collection.

Your %tokens are defined in alphabetic order. That is part of the problem. 
Some things that come before an add you want to reduce, some you want to 
shift.  If the preceding precedence is less than ADD the ADD shifts on.  So 
for example make ON, SIZE, ERROR  have lower precedence than ADD, and it will 
glue onto (shift onto) that clause.

If you define ADD and SUBTRACT as equal then to get a preceding 
(unconditional) ADD to reduce when it sees SUBTRACT, you will need an 
artificial token such as
%left GREATER_THAN_ARITH_VERB  and a %prec marker. (Unless you wish to try to 
use associativity which really will not be viable in this situation.  Also 
one of your comments seems to suggest that you do not realize that 
associtivity only kicks in when precedence is equal. So,..., I mean this in a 
real friendly manner, ... using %left or %right in a bison grammr where every 
token has a distinct precedence, as in an alphabetical ordering, is a non 
sequitur).

To use the  dummy tokens for shift/reduce explication you can code, for 
example,

your_rule_imper :
   ADD simplicity_here %prec GREATER_THAN_ARITH_VERB

will reduce when it encounters SUBTRACT or even ADD.

Trying to establish the precedence of ON might be impossible, and surely we 
will find
contradictory requirements for NOT, so rather than try to get the conditional 
clauses to 
allow shifting of verbs we could set up artificial precedence tokens for them 
as well


your_rule_conditional :
   ADD complexity_here %prec LESS_THAN_ARITH_VERB

will shift when it encounters an arithmetic verb as lookahead.

Now you can recurse with your original rule

your_original_rule :
   ADD simplicity_here %prec GREATER_THAN_ARITH_VERB
   ADD complexity_here %prec LESS_THAN_ARITH_VERB


Any two ADD statements can preceed any following ADD statement this way and 
there is
no conflict.

Your reorganized tokens now _might_ be.

%nonassoc LESS_THAN_ARITH_VERB
%nonassoc ADD SUBTRACT DIVIDE MULTIPLY COMPUTE
%nonassoc GREATER_THAN_ARITH_VERB

It is the ordinal relationship of the dummy tokens to the lookahead. When the 
dummy token is less than the lookahead token, the lookahead shifts on.  When 
the dummy token is greater than the lookahead, the token precipitates a 
reduction of the preceding rule.

You can, of course generalize this.  I use dummies that have similar 
oridinality, but the different names help me clearly describe in the rules 
what I am doing

%nonassoc LESS_THAN_CONDITIONAL
%nonassoc SIZE ERROR
%nonassoc GREATER_THAN_CONDITIONAL

%nonassoc LESS_THAN_VERB
%nonassoc ADD SUBTRACT DIVIDE MULTIPLY COMPUTE
%nonassoc PERFORM MOVE
%nonassoc GREATER_THAN_VERB

here
  %nonassoc GREATER_THAN_CONDITIONAL
and
  %nonassoc LESS_THAN_VERB

are the same as long as I do not insert any tokens inbetween, so although it 
is less readable,
I could code the dummies as

%left GREATER_THAN_CONDITIONAL  LESS_THAN_VERB

But the whole matter takes some work, and I might want to move a group around 
in the order to create different relationships, so I keep them separate.  

Into this we need to weave the END-XYZs and IF/ELSE etc.

A comment in your bison script indicates you have seen the conflict reports 
about the classic dangling ELSE clause.  Again precedence can fix that, and 
you will not have to cross your fingers to hope that it keeps working.

Blocks of code in [THEN] clauses can be tagged with a dummy which is greater 
than ELSE.

so if your token sequence was

...
%nonassoc LESS_THAN_ELSE
...
%token ELSE
....
%nonassoc LESS_THAN_ELSE

and your rule looked like this

somewhere_in_IFs :
   IF reasonable_cond opt_THEN big_complex_block %prec LESS_THAN_ELSE

an encounter with token ELSE as a lookahead will shift the ELSE token onto 
the structure.

You will nolong hear complaints from the tool when you use these to resolve 
conflicts.  A portion of the bison feedback will explain what is going on in 
rules that have these marks; it is actually still necessary to review those 
thoroughly. For COBOL we will not need many of these, but we will need to use 
them in many places.

Bison actually makes a choice in r/r situations, the first rule I think.  But 
that is at best a fifty-fifty. I think s/r situations default to shift (the 
idea is to parse as much as possible successfully as a default). That too is 
not reliable.

Think of rules as parts of movies on plastic film. In our grammar world many 
movies (rules) which have small subsections that are alike.  The movie plays 
by getting everything aligned where all simillar movies have their 
identically matching frames synchronized. Bison plays each movie one frame at 
a time. Stepping each movie forward as long as its next frame matches the 
lookahead. Thousands of movies can play simultaneously this way, the state of 
the machine is the current frame (the aligned frames are all the same, but 
they are in the middle of different movies).  

Eventually a lookahead frame arrives that forces a decision. Each rule that 
might continue either continues by shifting when its continuation matches the 
lookahead, or ends by being dropped completely. Some rules end there because 
that is their end point. New rules can start at that point too commencing 
with the lookahead token. If these are bunches of concurrent movies that have 
been synchronous up to this point, the basic idea is that the camera person 
can only do simple splicing. You can not reduce two or more rules and splice 
in as many as two or more rules at one point, because the camera person is 
real stupid and can not tell which of multiple splices glue to which of 
multiple previous movies. The camera person knows that can't be right. It's 
like being in college and the rules are once enrolled you can not drop two 
courses and add two curses at the same time or your major become unknonwn.

However while continuing any number of rules you can either reduce multiple 
and add one, or reduce one and add multiple; and also drop many obsolete 
rules simultaneously. When reviewing bison output your can get so disoriented 
it is unbelievable. When you focus on individual vexatious conflicts, you can 
really get the wrong impression that the state of the machine reflects just 
one rule. That is not at all true.  You can have many many many competing 
rules going on, even the same rule can compete with itself concurrently if 
different subsections match each outer (that happens alot on things like 
statements with alternate conditional clauses).  And also less obvious, at 
any give shift point you could have some rules shift and perhaps numerous 
rules die, and still not have a reduction at that point.

Anyway, just intended as tips, hope it's useful.

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com









--
This message was sent through the gnu-cobol mailing list.  To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body.  For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.
Follow-Ups:
- Re: gnubol: Re: Magic tokens
  - From: Tim Josling <TIMJOSLING@prodigy.net>
Prev by Date: Re: gnubol: How do we parse this language, anyway?
Next by Date: Re: PERFORM token (was RE: gnubol: Hacks needed to parse COBOL
Prev by thread: Re: FW: gnubol: How do we parse this language, anyway?
Next by thread: Re: gnubol: Re: Magic tokens
Index(es):
- Date
- Thread