Re: gnubol: How do we parse this language, anyway?
At the risk of getting people to hate me, I will interact a little more on
this conditional clause statement issue. I value your good will, so if you
are sick of this subject please go to the next message.
In a message dated 12/2/99 6:27:13 PM EST, wmklein@ix.netcom.com writes:
<< You are absolutely correct that the current Standard absolutely positively
requires that the NOT ON SIZE ERROR *must* match the most recent arithmetic
verb. I believe that if you asked J4 (or WG4), they would indicate that what
you have coded is a NON CONFORMING implementation and not an extension.
Take a modification of your example and tell me what you do with:
    If A = B
        add a to b
        size error
        add c to d
        not size error
        display ...
        End-Add
    Else
        Display "other"
    End-IF
In this case the source code IS conforming (as an IF can have a conditional
statement) and the END-IF *must* terminate the 2nd ADD - and the NOT SIZE
absolutely, positively MUST match the 2nd (not 1st) ADD.
Bill Klein
wmklein <at> ix.netcom.com
>>
Bill Klein is a very intelligent person and most generous with his posts in a
number of newsgroups. We would benefit from his contributions here as
anywhere else in the gnubol project.
In your example, the NOT ON SIZE ERROR binds to the inner arithmetic because
it is an explicitly scope-terminated 'imperative' (that is, in some vendors'
parlance a conditional becomes an imperative when its ends are hardened with
the explicit scope terminator).
There is no flyout.
When the example reverts back to previous discussions by tearing out the
explicit scope terminator, as in
    If A = B
        add a to b
        size error
        add c to d
        not size error
        display ...
    Else
        Display "other"
    End-IF
The NOT ON SIZE ERROR definitely flies out! I know people hate me. I am most
apologetic to be the bearer of these tidings. There is nothing special about
this situation. It is much easier to see when you conceive of reducing the
internal arithmetic as a simple imperative arithmetic. It then gets a lot
easier to believe, and to see that I am not being a pain in the butt. The
standard does say that the NOT ON SIZE ERROR clause attaches to the _previous_
conditional arithmetic that does not already have such a clause attached.
It is not trivial to find the antecedent. The notion of finding the
antecedent was not invented by one vendor; it is in the standard. When the
internal arithmetic is not explicitly scope terminated, the condition clause
must attach back to a previous arithmetic statement, provided that statement
does not already have such a clause.
The standard does not state a requirement of what to do when no previous
arithmetic has an available unoccupied alternate. You can flag the interior
as wrongly conditional, or you can tag some exterior as having excessive
NOT/ON clauses, and take your pick of the next previous or the highest level.
Everyone agrees the latter situation is an error, but the flyout is real and
is not abstract, unworldly, or even an invention of the vendor.
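The flyout rule argued for above can be sketched in a few lines of Python.
This is only an illustration of the reading given in this message, not the
text of the standard and not any compiler's actual behavior. The token names,
the Frame class and bind_size_clauses are hypothetical, the statement stream
is reduced to ADD, SIZE, NOT-SIZE, END-ADD and OTHER, and the required order
of the two clauses is not checked. It assumes that an END-ADD pairs with the
nearest preceding unpaired ADD, and that an ADD nested in a clause of another
ADD may carry clauses only if it is explicitly terminated.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    index: int       # position of the ADD token in the stream
    explicit: bool   # True if an END-ADD later in the stream pairs with it
    nested: bool     # True if opened inside a clause of an outer ADD
    used: set = field(default_factory=set)  # clauses already attached


def bind_size_clauses(tokens):
    """Return (clause_position, add_position) pairs under the flyout rule."""
    # Pass 1 (the lookahead): pair each END-ADD with the nearest unpaired ADD.
    unpaired, partner = [], {}
    for i, tok in enumerate(tokens):
        if tok == "ADD":
            unpaired.append(i)
        elif tok == "END-ADD" and unpaired:
            partner[i] = unpaired.pop()
    explicit = set(partner.values())

    # Pass 2: attach each clause to the nearest ADD that may take it.
    bindings, stack = [], []
    for i, tok in enumerate(tokens):
        if tok == "ADD":
            # A preceding ADD with no clauses was a simple imperative.
            if stack and not stack[-1].used:
                stack.pop()
            stack.append(Frame(i, i in explicit, bool(stack)))
        elif tok in ("SIZE", "NOT-SIZE"):
            while stack:
                top = stack[-1]
                may_be_conditional = top.explicit or not top.nested
                if may_be_conditional and tok not in top.used:
                    top.used.add(tok)
                    bindings.append((i, top.index))
                    break
                stack.pop()  # reduce it as simple; the clause flies out
            else:
                # The standard is silent here; this sketch just reports it.
                raise ValueError(f"no ADD available for clause at {i}")
        elif tok == "END-ADD" and i in partner:
            if any(f.index == partner[i] for f in stack):
                while stack.pop().index != partner[i]:
                    pass
    return bindings


# With END-ADD the clause stays on the inner ADD; without it, it flies out.
print(bind_size_clauses(["ADD", "SIZE", "ADD", "NOT-SIZE", "OTHER", "END-ADD"]))
print(bind_size_clauses(["ADD", "SIZE", "ADD", "NOT-SIZE", "OTHER"]))
```

The first call mirrors the example with End-Add and the second mirrors the
example without it. A stream such as ADD, SIZE, ADD, SIZE has no unoccupied
alternate anywhere and raises the error.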
Tim Josling was on the right path when he commented to the effect that the
point is to preserve legacy code, although of late he seems not to be in my
camp, which is a lonely camp indeed. Imagine old code
    If A = B
        add a to b
        size error
        add c to d
Now add the newfangled NOT ON SIZE ERROR clause as the language evolved, but
do not damage the old code.
    If A = B
        add a to b
        size error
        add c to d
        NOT ON SIZE ERROR
        display 'during temporary testing proof add a to b is cool'
    * distinguished token or EOS as implicit scope terminator follows
        .
    * (in the original example ELSE would be sufficiently distinguished.)
Our real issue is convergence, then. I am not pushing my own point of view.
If there are compilers that flag this latter example as a wrongful interior
conditional, then whether they 1) accept it anyway or 2) refuse to generate
code, their behavior is different from what mainframe compilers are doing.
The owners of the code base are the ones we need to be concerned about. It is
my assumption that we want as much of that code base as possible.
If we want code to converge on UNIX platforms, then we ought to be aware of
the behavior difference. And then make a choice, consciously. Or contemplate
a larger project than any of us would want.
The mainframe compiler behavior does have a basis in the standard. If the
standard had any intention of linking the clause to the interior, it would
not state that that is illegal and then contain verbiage about the "previous"
to which the clause must link.
This is really a design issue. The top of the parse rules must be able to
embed a number of different kinds of conditionals with ongoing concurrent
scope.
This has to be done by lookahead (it does not require infinite lookahead). It
requires a plan for the parser that contemplates _MANY_ concurrently scoped
conditional clauses (ON SIZE ERRORs inside of INVALID KEYs inside of whatever
else, with really no basic limit).
We must understand from the outset that certain concurrent (competing) scopes
will obsolesce upon the absence of certain lookaheads, and others survive or
indeed reduce at that point. It will have to be very recursive and we must
have a means to have the scope of an outer arithmetic survive even when an
inner arithmetic reduces with an END-arithmetic explicit scope terminator
that is not quite yet the right one for the outer.
I have experimented with this with bison and can tell you it is hard to get
the branches to recurse, and nearly any success at having fairly ambiguous
rules remain concurrently competing definitely tends to produce a situation
in which the outer scope wants to end on the inner scope's explicit scope
terminator when the two would be the same END-arith.
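The competing-scope idea can be made concrete with a small Python sketch that
keeps every reading alive and lets later tokens kill the ones that cannot
stand. It is a hedged illustration only: surviving_parses and the token names
are hypothetical, the stream is reduced to ADD, SIZE, NOT-SIZE, END-ADD and
OTHER, and the one legality rule assumed is that an ADD nested in a clause of
another ADD may keep clauses only if an END-ADD closes it. It is not how bison
works internally; it only shows the forking and the dying off.

```python
def surviving_parses(tokens):
    """Return the clause bindings of every reading that survives the stream.

    A hypothesis is (stack, bindings). Each stack entry is
    (add_position, attached_clauses, nested).
    """
    hyps = [((), ())]
    for i, tok in enumerate(tokens):
        nxt = []
        for stack, binds in hyps:
            if tok == "ADD":
                # A preceding ADD with no clauses was a simple imperative.
                if stack and not stack[-1][1]:
                    stack = stack[:-1]
                frame = (i, frozenset(), bool(stack))
                nxt.append((stack + (frame,), binds))
            elif tok in ("SIZE", "NOT-SIZE"):
                # Fork: one successor per open ADD still lacking this clause.
                for depth in range(len(stack) - 1, -1, -1):
                    idx, used, nested = stack[depth]
                    if tok not in used:
                        kept = stack[:depth] + ((idx, used | {tok}, nested),)
                        nxt.append((kept, binds + ((i, idx),)))
                    if nested and used:
                        break  # cannot fly past a nested ADD holding clauses
            elif tok == "END-ADD":
                # Closes the innermost ADD still open in this hypothesis;
                # a hypothesis with nothing open dies here.
                if stack:
                    nxt.append((stack[:-1], binds))
            else:
                nxt.append((stack, binds))
        hyps = nxt
    # End of stream is the implicit terminator: a reading that leaves a
    # nested ADD holding clauses without its END-ADD does not survive.
    return [
        list(binds)
        for stack, binds in hyps
        if not any(used and nested for _, used, nested in stack)
    ]


with_end = ["ADD", "SIZE", "ADD", "NOT-SIZE", "OTHER", "END-ADD"]
without_end = ["ADD", "SIZE", "ADD", "NOT-SIZE", "OTHER"]
print(len(surviving_parses(with_end)))     # two readings remain
print(len(surviving_parses(without_end)))  # one reading remains
```

With the END-ADD present two readings survive: one in which the terminator
closes the inner ADD, and one in which the inner ADD was reduced as simple
and the outer ADD swallows the terminator. That second reading is the
behavior described above, and it needs an explicit tie-break (for instance,
prefer the innermost ADD) to get rid of it. Without the END-ADD only the
flyout reading survives.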
But these games with the arithmetics are only the beginning. All of the
conditionals can embed within one another. Arithmetics within arithmetics
are just a special case, just as conditional I/Os within conditional I/Os are.
Some examples seem so ridiculous that you would wonder why anyone might even
code them at all. But that misses the point. To be very honest the
arithmetics within arithmetics only illustrate the much larger problem of
conditionals within conditionals, and the need for 'flyouts' when certain
interior statements cannot be considered the bind points as they are not
allowed to be unterminated conditionals.
The point to bind things back to can be a long way off in contrived
arithmetic nestings, and in real world complexly populated conditional
nestings. Success will come from concurrent competing scopes which at any
given moment can reduce the interior statements as simple and allow the outer
surviving scope to continue to hold onto the conditional clause that was first
encountered in a position subsequent to the interior simple statement.
This is definitely design discussion. The procedure division is a
conditional. In fact the procedure division is optional. Conditional
statements can have a header and possibly several attached (bound) clauses.
A simple imperative is a special case conditional that is always executed.
Along the branches of conditional clauses we can have blocks composed of one
or more other statements which are conditional (frequently they are of the
special case variety, 'unconditional'). These things can all nest. The
procedure division is highly recursive.
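As a rough picture of that recursive shape, here is a minimal Python sketch
of a statement as a header plus bound clauses, where each clause holds a
block of further statements and a simple imperative is just a statement with
no clauses. The Statement class, its field names and nesting_depth are
hypothetical names chosen for illustration, not part of any gnubol design.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    header: str                                   # e.g. "ADD a TO b"
    clauses: dict = field(default_factory=dict)   # clause name -> block

    def is_simple(self):
        """A simple imperative is the special case with no bound clauses."""
        return not self.clauses


def nesting_depth(stmt):
    """Depth of the deepest statement reachable through bound clauses."""
    inner = [nesting_depth(s) for block in stmt.clauses.values() for s in block]
    return 1 + max(inner, default=0)


# IF A = B / ADD ... SIZE ERROR ADD ... / ELSE DISPLAY ...
inner_add = Statement("ADD c TO d")
outer_add = Statement("ADD a TO b", {"ON SIZE ERROR": [inner_add]})
if_stmt = Statement(
    "IF A = B",
    {"THEN": [outer_add], "ELSE": [Statement('DISPLAY "other"')]},
)

print(inner_add.is_simple(), outer_add.is_simple())
print(nesting_depth(if_stmt))
```

Any clause of any statement can hold any other statement, which is why the
depth function has to recurse through every clause rather than only through
IF branches.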
The basic objective of syntax is to stay on its feet and continue to
understand the recursive surface of embedded conditionals as it picks off
structurally significant sequences, token by token.
Individual sequences of tokens are important, but it is the recursive surface
that is most important. Parsed clauses may belong to a header a long way
back.
Because of the history of the evolution of COBOL we will undoubtedly need
competing rules to be concurrent, some of which are trying to bind parsed
statements to a nearby scope header, and some trying to bind parsed
statements to a more distant ancestor.
Even though the conditional statements have available scope terminators there
is not a hierarchical relationship between them. An END-ADD is not
inherently a distinguished token capable of implicitly terminating a
conditional READ, until such a READ is stated inside of the conditional
branch of an ADD statement. And vice-versa.
The delimiting breadth of effect of scope terminating tokens is determined
entirely by a given current nested structure. In COBOL scope termination is
highly context dependent.
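A small Python sketch may help show why the reach of a terminator depends
only on the current nesting. The function close_scopes and the list-of-verbs
representation are assumptions made for this illustration: the open scopes
are listed outermost first, and an END-verb closes the innermost open scope
of that verb together with everything nested inside it.

```python
def close_scopes(open_scopes, terminator):
    """Apply an explicit scope terminator to a list of open scopes.

    Returns (still_open, explicitly_closed, implicitly_closed).
    """
    verb = terminator[len("END-"):]
    if verb not in open_scopes:
        raise ValueError(f"{terminator} has no open {verb} to terminate")
    # Position of the innermost open scope for this verb.
    pos = len(open_scopes) - 1 - open_scopes[::-1].index(verb)
    return open_scopes[:pos], open_scopes[pos], open_scopes[pos + 1:]


# READ nested inside ADD: END-ADD implicitly terminates the READ.
print(close_scopes(["IF", "ADD", "READ"], "END-ADD"))
# ADD nested inside READ: END-ADD leaves the READ open.
print(close_scopes(["IF", "READ", "ADD"], "END-ADD"))
```

The same END-ADD token ends a READ in the first call and leaves it alone in
the second. Nothing about the token itself decides that; only the stack of
open scopes at that moment does.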
That requires heavy concurrency in rule scopes. This issue is introduced to
people by looking beyond mere nested IFs. The concurrent scope problem is
massive, and the flyback of the conditional clauses that occur after simple
interior arithmetics is just one small part. The problem is major.
This is not meant to be negative, but really I am sure that the fact that we
are having this discussion at all means that we have started with the token
list and attempted to build up from there. That won't work. We need to start
from the top. The nested IFs are interesting but the other conditionals are
more challenging. Try nesting a few combinations of different kinds of
conditionals within conditionals; you will find the flyback issue is already
there before you get to the special case of the simple arithmetic that has
visions of grandeur.
This problem is much bigger than mere flyouts. We need a system of recursion
that deals with explicit and implicit scope termination, which itself requires
concurrency of very ambiguous alternatives. The flyouts are really just
attachments to one or more of the competing rules. All these things must be
healthy even when slung onto the branches of the more traditional IF and
EVALUATE recursions.
When you try it from the top down, you will see the embedding problem, and
the need to be able to reduce simple arithmetics in many places. Getting
them to reduce on certain branches of outer arithmetics is no more difficult
than getting them to reduce on the branches of conditional clauses of other
outer statements. Getting subsequent conditional clauses to reduce back
requires concurrency. Once you see how challenging that concurrency is, the
flyback begins to look like the mere introduction.
Best Wishes
Bob Rayhawk
RKRayhawk@aol.com