Re: gnubol: How do we parse this language, anyway?

To: gnu-cobol@lusars.net
Subject: Re: gnubol: How do we parse this language, anyway?
From: RKRayhawk@aol.com
Date: Fri, 3 Dec 1999 00:43:15 EST
Delivered-To: gnu-cobol-outgoing@wallace.lusars.net
Reply-To: gnu-cobol@lusars.net
Sender: owner-gnu-cobol@wallace.lusars.net

At the risk of getting people to hate me I will interact a little more on 
this conditional clause statement issue.  I value your good will, so if you 
are sick of this subject please go to the next message.


In a message dated 12/2/99 6:27:13 PM EST, wmklein@ix.netcom.com writes:

<< You are absolutely correct that the current Standard absolutely positively
 requires that the NOT ON SIZE ERROR *must* match the most recent arithmetic
 verb.  I believe that if you asked J4 (or WG4), they would indicate that what
 you have coded is a NON CONFORMING implementation and not an extension.
 
 Take a modification of your example and tell me what you do with:
 
 If A = B
   add a to b
      size error
           add c to d
       not size error
           display ...
         End-Add
 Else
    Display "other"
 End-IF
 
 In this case the source code IS conforming (as an IF can have a conditional
 statement) and the END-IF *must* terminate the 2nd ADD - and the NOT SIZE
 absolutely, positively MUST match the 2nd (not 1st) ADD.
 
 Bill Klein
   wmklein <at> ix.netcom.com
  >>

Bill Klein is a very intelligent person and most generous with his posts in a 
number of newsgroups. We would benefit from his contributions here as 
anywhere else in the gnubol project.

In your example, the NOT ON SIZE ERROR binds to the inner arithmetic because 
it is an explicitly scope-terminated 'imperative'. (That is, in some vendors' 
parlance a conditional becomes an imperative when its ends are hardened with 
the explicit scope terminator.)

There is no flyout.

When the example reverts back to previous discussions by tearing out the 
explicit scope terminator, as in 

If A = B
   add a to b
      size error
           add c to d
       not size error
           display ...
 Else
    Display "other"
 End-IF
 
The NOT ON SIZE ERROR definitely flies out!  I know people hate me. I am most 
apologetic to be the bearer of these tidings. There is nothing special about 
this situation. It is much easier to see when you conceive of reducing the 
internal arithmetic as a simple imperative arithmetic.  It gets a lot easier 
then to believe, and to see that I am not being a pain in the butt. The 
standard does say that the NOT ON SIZE ERROR clause attaches to the 
_previous_ conditional arithmetic that does not already have such a clause 
attached.

It is not trivial to find the antecedent. The notion of finding the 
antecedent is not invented by one vendor, it is in the standard.  When the 
internal arithmetic is not explicitly scope terminated, the conditional 
clause must attach back to a previous arithmetic, provided that previous 
arithmetic does not already have such a clause.  The standard does not state 
a requirement of what to do when no previous arithmetic has an available 
unoccupied alternate. You can flag the interior as wrongly conditional, or 
you can tag some exterior as having excessive NOT/ON clauses, and take your 
pick of the next previous or the highest level. 
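
The antecedent search described above can be sketched in a few lines of 
Python. This is only a minimal model of the reading argued in this message, 
not the wording of the standard and not any vendor's implementation. The names 
OpenArith, must_be_imperative and find_antecedent are hypothetical, and the 
sketch assumes the parser has already decided by lookahead whether each nested 
arithmetic is explicitly scope-terminated.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class OpenArith:
    """One arithmetic statement whose scope is still open (hypothetical model)."""
    text: str
    # True when the statement sits on a clause branch of another arithmetic
    # and has no explicit scope terminator, so it may only be imperative.
    must_be_imperative: bool = False
    clauses: Set[str] = field(default_factory=set)


def find_antecedent(open_stmts: List[OpenArith], clause: str) -> Optional[OpenArith]:
    """Search innermost-first for a statement that can still accept `clause`.

    Returns None when no open arithmetic has an unoccupied slot, which is the
    case the message says the standard leaves unspecified.
    """
    for stmt in reversed(open_stmts):
        if stmt.must_be_imperative:
            continue  # cannot carry conditional clauses: the clause flies out
        if clause in stmt.clauses:
            continue  # slot already occupied, keep looking further out
        return stmt
    return None


outer = OpenArith("add a to b", clauses={"SIZE ERROR"})

# No END-ADD: the inner ADD is reduced as a simple imperative.
inner_plain = OpenArith("add c to d", must_be_imperative=True)
flyout = find_antecedent([outer, inner_plain], "NOT SIZE ERROR")

# With END-ADD: the inner ADD is a delimited statement and takes the clause.
inner_delimited = OpenArith("add c to d")
no_flyout = find_antecedent([outer, inner_delimited], "NOT SIZE ERROR")

print(flyout.text)     # add a to b
print(no_flyout.text)  # add c to d
```

A None result corresponds to the error case: the caller would then choose 
between flagging the interior statement and flagging an exterior one, which 
this sketch deliberately leaves open.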
 

Everyone agrees the latter situation is an error, but the flyout is real and 
is not abstract, unworldly, or even an invention of the vendor.

Tim Josling was on the right path when he commented to the effect that the 
point is to preserve legacy code.  Although of late he seems not in my camp, 
which camp is lonely indeed.  Imagine old code

If A = B
   add a to b
      size error
           add c to d

Now add the newfangled NOT ON SIZE ERROR clause as the language evolved, but 
do not damage the old code.  

If A = B
   add a to b
      size error
           add c to d
      NOT ON SIZE ERROR
           display 'during temporary testing proof add a to b is cool'

* distinguished token or EOS as implicit scope terminator follows
 .
* (in the original example ELSE would be sufficiently distinguished)

Our real issue is convergence then. I am not after my point of view.  If 
there are compilers that flag this latter example as a wrongful interior 
conditional, then whether they 1) accept it anyway or 2) refuse to gen code, 
their behavior is different from what mainframe compilers are doing.  The 
owners of the code base are the ones we need to be concerned about. It is my 
assumption that we want as much of that code base as possible.

If we want code to converge on UNIX platforms, then we ought to be aware of 
the behavior difference. And then make a choice, consciously. Or contemplate 
a larger project than any of us would want. 

The mainframe compiler behavior does have a basis in the standard.  If the 
standard had any intention of linking the clause to the interior, it would 
not state that that is illegal and then contain verbiage about the "previous" 
to which the clause must link. 

This is really a design issue.  The top of the parse rules must be able to 
embed a number of different kinds of conditionals with ongoing concurrent 
scope.

This has to be done by lookahead (it does not require infinite lookahead). It 
requires a plan for the parser that contemplates _MANY_ concurrently scoped 
conditional clauses (ON SIZE ERRORs inside of INVALID KEY, and so on, with 
really no basic limit). 

We must understand from the outset that certain concurrent (competing) scopes 
will obsolesce upon the absence of certain lookaheads, and others survive or 
indeed reduce at that point.  It will have to be very recursive and we must 
have a means to have the scope of an outer arithmetic survive even when an 
inner arithmetic reduces with an END-arithmetic explicit scope terminator 
that is not quite yet the right one for the outer. 

I have experimented with this with bison and can tell you it is hard to get 
the branches to recurse, and nearly any success at having fairly ambiguous 
rules remain concurrently competing definitely tends to produce a situation 
in which the outer scope wants to end on the inner scope's explicit scope 
terminator when the two would be the same END-arith.

But these games with the arithmetics are only the beginning. All of the 
conditionals can embed within one another. The arithmetics within arithmetics 
is just a special case, just as conditional I/Os within conditional I/Os are.  
Some examples seem so ridiculous that you would wonder why anyone might even 
code them at all. But that misses the point.  To be very honest the 
arithmetics within arithmetics only illustrates the much larger problem of 
conditionals within conditionals, and the need for 'flyouts' when certain 
interior statements cannot be considered the bind points as they are not 
allowed to be unterminated conditionals.

The point to bind things back to can be a long way off in contrived 
arithmetic nestings, and in real world complexly populated conditional 
nestings. Success will come from concurrent competing scopes which at any 
given moment can reduce the interior statements as simple and allow the outer 
surviving scope to continue to hold onto the conditional that was first 
encountered in a position subsequent to the interior simple statement.

This is definitely design discussion.  The procedure division is a 
conditional. In fact the procedure division is optional. Conditional 
statements can have a header and possibly several attached (bound) clauses.  
A simple imperative is a special case conditional that is always executed.

Along the branches of conditional clauses we can have blocks composed of one 
or more other statements which are conditional (frequently they are of the 
special case variety 'unconditional').   These things can all nest. The 
procedure division is highly recursive.

The basic objective of syntax is to stay on its feet and continue to 
understand the recursive surface of embedded conditionals as it picks off 
structurally significant sequences, token by token. 

Individual sequences of tokens are important, but it is the recursive surface 
that is most important.  Parsed clauses may belong to a header a long way 
back.

Because of the history of the evolution of COBOL we will undoubtedly need 
competing rules to be concurrent, some of which are trying to bind parsed 
statements to a nearby scope header, and some trying to bind parsed 
statements to a more distant ancestor.

Even though the conditional statements have available scope terminators there 
is not a hierarchical relationship between them.  An END-ADD is not 
inherently a distinguished token capable of implicitly terminating a 
conditional READ, until such a READ is stated inside of the conditional 
branch of an ADD statement. And vice-versa.

The delimiting breadth of effect of scope terminating tokens is determined 
entirely by a given current nested structure. In COBOL scope termination is 
highly context dependent.
That requires heavy concurrency in rule scopes. This issue is introduced to 
people by looking beyond mere nested IFs.  The concurrent scope problem is 
massive, and the flyback of the conditional clauses that occur after simple 
interior arithmetics is just one small part. The problem is major.
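
The context dependence of scope terminators can be illustrated with a small 
Python sketch. It is a simplification under stated assumptions, not a parser 
design: open statements are tracked as a plain stack of verb names, an 
explicit terminator is assumed to match the nearest open statement of its own 
verb, and the function name apply_terminator is hypothetical. It does not 
address the harder case raised earlier, where an inner and an outer arithmetic 
would share the same END-arith token.

```python
from typing import List


def apply_terminator(open_stmts: List[str], terminator: str) -> List[str]:
    """Close scopes for an explicit terminator such as "END-ADD".

    `open_stmts` lists the open conditional statements, outermost first.
    Returns the verbs that were closed, innermost first; every entry except
    the last was terminated implicitly. Raises ValueError, leaving the stack
    untouched, when no open statement matches the terminator.
    """
    verb = terminator[len("END-"):]
    for i in range(len(open_stmts) - 1, -1, -1):
        if open_stmts[i] == verb:
            closed = open_stmts[i:]
            del open_stmts[i:]  # everything nested inside goes with it
            return closed[::-1]
    raise ValueError(f"{terminator} has no open {verb} to terminate")


# A conditional READ stated on a branch of a conditional ADD, inside an IF.
stack = ["IF", "ADD", "READ"]
closed = apply_terminator(stack, "END-ADD")
print(closed)  # ['READ', 'ADD']
print(stack)   # ['IF']
```

Here END-ADD implicitly terminates the READ only because the READ happens to 
be nested inside the ADD. With a stack of ["IF", "READ"] the same token 
terminates nothing and is reported as an error.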

This is not meant to be negative, but really I am sure that the fact that we 
are having this discussion at all means that we have started with the token 
list and attempted to build up from there.  That won't work. We need to start 
from the top. The nested IFs are interesting but the other conditionals are 
more challenging.  Try nesting a few combinations of different kinds of 
conditionals within conditionals, and you will find the flyback issue is 
already there before you get to the special case of the simple arithmetic 
that has visions of grandeur.

This problem is much bigger than mere flyouts. We need a system of recursion 
that deals with explicit and implicit scope termination, which itself 
requires concurrency of very ambiguous alternates.  The flyouts are really 
just attachments to one or more of the competing rules. All these things must 
be healthy even when slung onto the branches of the more traditional IF and 
EVALUATE recursions.

When you try it from the top down, you will see the embedding problem, and 
the need to be able to reduce simple arithmetics in many places.  Getting 
them to reduce on certain branches of outer arithmetics is not more difficult 
than getting them to reduce on the branches of conditional clauses of other 
outer statements. Getting subsequent conditional clauses to reduce back 
requires concurrency. Once you see how challenging that concurrency is, the 
flyback begins to look like the mere introduction.

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com
