Re: Parsing nested statements: was Re: gnubol: subsets
In a message dated 11/21/99 1:20:19 PM EST, TIMJOSLING@prodigy.net writes:
<<
It gets better - the IF statement allows precisely one conditional statement
on the end of the list of imperative statements. The language in the standard
does not help (calling them imperative statements).
An example
IF a > b
    display ...
    add a to b
        on size error
            display ...
ELSE
    display ...
    subtract retro-adjust-factor from running-total
        on size error
            display ...
END-IF
This is an imperative statement I think.
>>
In this situation the outer statement is the IF statement, and the arithmetic
statements are the inner statements.
The outer statement is definitely an imperative statement because it has an
explicit scope terminator. The interior arithmetic statements are also
'imperative', I believe, though the reasons are not so obvious. Scope
termination can happen in several ways.
The ADD and SUBTRACT statements illustrated seem like conditional statements
because they lack explicit scope terminators. But in these situations we are
not dealing with the same issue as the restrictions on what _type_ of
statements might be coded on the conditional clauses of outer arithmetic or
I/O statements.
The situation is different.
In the IF statement, the [THEN], the ELSE and the END-IF shift due to high
precedence or specification of associativity. The ELSE clause glues to the
current (innermost) IF. In order to do that it must be able to burn through
anything in between. The conditional clauses of arithmetic and I/O verbs are
not defined that way (the trouble is that they do not exactly reduce either:
thanks to the standards committee visionaries, the 'look-ahead' for the scope
terminator leads to alternate reduce or shift requirements).
The (optional THEN and) ELSE in IF statements, and the WHEN condition-name
and WHEN OTHER in EVALUATE statements, can be handled more directly. These
clauses can include things that look like conditional statements, as far as I
understand, and mere precedence and associativity will get the job done. The
ON SIZE ERROR token(s) or NOT INVALID KEY don't bother the IF or EVALUATE
scanning; they lack the strength to interrupt that code, and such interior
statements in effect reduce and then shift onto the [THEN] or ELSE or WHEN
clause.
The deal with the ON SIZE ERROR clauses in nested arithmetic statements is
that the scan collides with the same category of token because the token
(grouping) does not have associativity by itself (it is the presence or
absence of the much more subsequent scope terminator that imparts this
quality, which does not compute with available tools). The IF and EVALUATE
interior tokens are simple. Point blank shift.
So when a scanner sees ELSE, it does not foreshorten the inner IF statement,
but instead adds the ELSE and its following retinue to the inner IF. This is
even more true for the EVALUATE, where the WHEN clauses shift and shift and
shift, creating a large code structure.
Shifting ELSE reopens the flood gate; at that point any statement _type_ is
permitted. As far as I know you are allowed to go one level deeper. Using
your example, notice now the presence of an 'imperative' inner arithmetic
statement on the conditional clause of the 'conditional' arithmetic statement
in the [THEN] clause ...
IF a > b
    display ...
    add a to b
        on size error
            ADD 1 to error-count
ELSE
    ....
That [THEN] clause block has what I think is in fact called an imperative ADD
statement (the add a to b).
But knowledgeable coders will recognize the need for bolstering code if we
want the alternative condition invoked (note the required explicit scope
terminator).
IF a > b
    display ...
    add a to b
        on size error
            ADD 1 to error-count
            END-ADD
        NOT on size error
            ADD 1 to we-actually-can-add-count
ELSE
    ....
Still, I think add a to b _is_ 'imperative'.
A super paranoid coder would probably then just do this too
IF a > b
    display ...
    add a to b
        on size error
            ADD 1 to error-count
            END-ADD
        NOT on size error
            ADD 1 to we-actually-can-add-count
            END-ADD
ELSE
    ....
because they know they will be called in at 2:00 in the morning to deal with
production down conditions. So far so good.
But, ... we must, IMHO, help the coder when they offer code that lacks the
necessary scope terminators, .... like
Situation One:
IF a > b
    display ...
    add a to b
        on size error
            ADD 1 to error-count
        NOT on size error
            ADD 1 to we-actually-can-add-count
ELSE
    ....
and of course
Situation Two:
IF a > b
    display ...
    add a to b
        on size error
            ADD 1 to error-count
        on size error
            ADD 1 to we-actually-can-add-count
ELSE
    ....
Situation Two can be handled by merely detecting duplicate ON SIZE ERROR
clauses in the outer arithmetic statement (this applies to I/O verbs as
well). ((We need to do that anyway)).
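A minimal sketch of that duplicate check, written in Python rather than as
grammar actions. It assumes the parser has already collected the names of the
conditional clauses attached to one outer statement into a list; the function
name duplicate_clauses and the list representation are illustrative
assumptions, not anything that exists in the project.

```python
from collections import Counter


def duplicate_clauses(clause_names):
    """Return the clause names that occur more than once on one statement.

    clause_names is a hypothetical list such as
    ["ON SIZE ERROR", "NOT ON SIZE ERROR"], gathered while parsing a
    single outer arithmetic or I/O statement.
    """
    counts = Counter(clause_names)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Situation Two: the same clause coded twice on the outer ADD.
print(duplicate_clauses(["ON SIZE ERROR", "ON SIZE ERROR"]))
# A legal pair, in either order, is not flagged.
print(duplicate_clauses(["NOT ON SIZE ERROR", "ON SIZE ERROR"]))
```

The same helper would serve I/O verbs if it were fed clause names such as
INVALID KEY or AT END.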
Situation One represents 'valid' code, but, IMHO, since I am a radical,
deserves a serious warning message. Again with counters we can trap that. Up
in the rule for the outer arithmetic statement we would need a count of inner
arithmetic statements, and some sense of the sequence of things which
happened. Ditto for I/O verbs.
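One way to picture the "sequence of things which happened" is as a list of
events recorded while the outer statement is parsed. The sketch below is
Python, not grammar code, and the event tuples, the ARITHMETIC set and the
function name suspect_clauses are all hypothetical. It flags a conditional
clause that lands on the outer statement right after an inner arithmetic
statement that had no explicit scope terminator.

```python
ARITHMETIC = {"ADD", "SUBTRACT", "MULTIPLY", "DIVIDE", "COMPUTE"}


def suspect_clauses(events):
    """Return warnings for clauses that may have been meant for an inner verb.

    events is a hypothetical record of what the outer statement saw, in order:
      ("clause", name)             a conditional clause of the outer statement
      ("inner", verb, terminated)  a statement nested in the current clause,
                                   terminated is True if it had END-verb
    """
    warnings = []
    previous = None
    for event in events:
        if event[0] == "clause" and previous is not None and previous[0] == "inner":
            _, verb, terminated = previous
            if verb in ARITHMETIC and not terminated:
                warnings.append(f"{event[1]} follows unterminated inner {verb}")
        previous = event
    return warnings


# Situation One: NOT ON SIZE ERROR arrives right after an inner ADD
# that has no END-ADD.
situation_one = [
    ("clause", "ON SIZE ERROR"),
    ("inner", "ADD", False),
    ("clause", "NOT ON SIZE ERROR"),
]
print(suspect_clauses(situation_one))
```

With the inner ADD marked as terminated (the END-ADD version shown earlier),
the same sequence produces no warning.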
Permit me to go back to your phrasing ...
In a message dated 11/21/99 1:20:19 PM EST, TIMJOSLING@prodigy.net writes:
<<
It gets better - the IF statement allows precisely one conditional statement
on the end of the list of imperative statements. The language in the standard
does not help (calling them imperative statements).
>>
Actually that does help. Should the [THEN] or ELSE clause have an arithmetic
statement that has no conditional clause at all, then it is an imperative
statement. If such an arithmetic statement does have a conditional clause,
then its scope 'must be delimited'. But the ELSE _does_ delimit the
arithmetic statement contained in the [THEN] clause. That is either
precedence or associativity. That same conditional clause can contain an
imperative statement itself, which, if it is arithmetic, needs to have its
very own explicit scope terminator or an absence of conditional clause within.
The exact same thing applies to the I/O verbs. The code base out there in
reality is a little different, though. All of the same convolutions apply to
the conditional clauses; we just have far fewer I/Os within I/Os (they are
still vitally important). But what we do have is _many_ I/Os inside of IFs
and EVALUATEs, and those I/Os _very_ frequently have one or both of the
conditional clauses, in either order, and it is all followed by ELSEs, ELSE
IFs, or subsequent WHEN clauses.
Much of that very code base has fluid transitions from old undelimited
statements back and forth to explicitly scope-terminated statements. Much
old code also has periods; much new code does not.
This is an opinion, but strongly felt: the trend towards periodless code
makes it much more important for the compiler to remain robust as it courses
through 'conditional' and 'imperative' statement _types_.
I think it is useful to get full visibility on the notion that the statements
have _type_. This type is based upon inclusions/exclusions of clauses and
terminator tokens. The type determines grouping into category. The category
determines _SYNTACTIC_ validity. We will see every combination! Including all
of the kinds of things we convince ourselves are 'invalid'. We must trap them
even if we do not have the resources to make the error messages pretty in
early releases. Of course, IMHO!
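To make the idea of statement _type_ concrete, here is a small Python sketch.
The Stmt record and its two flags are illustrative assumptions, and the rule
it encodes is only the simple one discussed so far: a conditional clause
without an explicit scope terminator makes a statement 'conditional', anything
else is 'imperative'. It deliberately ignores termination by a stronger token
such as ELSE.

```python
from dataclasses import dataclass


@dataclass
class Stmt:
    """Hypothetical summary of one parsed statement."""
    verb: str
    has_conditional_clause: bool = False  # e.g. ON SIZE ERROR, INVALID KEY
    has_scope_terminator: bool = False    # e.g. END-ADD, END-READ, END-IF


def stmt_type(stmt):
    """Type a statement from its clauses and terminator alone."""
    if stmt.has_conditional_clause and not stmt.has_scope_terminator:
        return "conditional"
    return "imperative"


# add a to b                                   (no clause)
print(stmt_type(Stmt("ADD")))
# add a to b on size error ...                 (clause, no END-ADD)
print(stmt_type(Stmt("ADD", has_conditional_clause=True)))
# add a to b on size error ... END-ADD         (clause and END-ADD)
print(stmt_type(Stmt("ADD", has_conditional_clause=True,
                     has_scope_terminator=True)))
```

An IF would always carry has_conditional_clause=True in this model, so it is
the END-IF alone that makes it imperative.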
Notice in these nested IF and EVALUATE statements that blocks of imperatives
become an 'imperative statement', :-) But notice that other arithmetic
statements do not terminate the scope of a currently running conditional
clause on an arithmetic statement (or I/O statement); further, MOVE does not,
and PERFORM does not. BUT ELSE (or a WHEN clause if we are in an EVALUATE)
does terminate ongoing scope.
In fact the ELSE terminates every open _clause_ right back out to the [THEN]
(and WHEN back out to the most previous WHEN). Yet these tokens have shift
significance.
So we need to recognize that in COBOL scope termination is a hierarchical
feature. Some things like ELSE and WHEN have a high position in the hierarchy
and they will delimit many currently open scopes. This kind of thing is a
little hard to do with available tools, or it is at least hard to sense when
you are doing the hierarchy of tokens. ELSE and WHEN should shift, but
obviously we do not want them to shift onto any open arithmetic or I/O
conditional clause. They must have higher priority (in bison that is done by
naming the token in a later %left, %right or %nonassoc declaration, since
later precedence declarations rank higher).
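The hierarchy can be pictured with a plain stack of open scopes, innermost
last. The following Python sketch is only a model of the behaviour described
here, not of how bison would do it; the function name shift_strong_token, the
ANTECEDENT table and the string labels on the stack are all assumptions made
for illustration. It covers ELSE and a second or later WHEN, and does not
model the first WHEN of an EVALUATE.

```python
# Each strong token closes everything back to, and including, its antecedent.
ANTECEDENT = {"ELSE": "THEN", "WHEN": "WHEN"}


def shift_strong_token(token, open_scopes):
    """Return (new_stack, closed) after ELSE or WHEN arrives.

    open_scopes lists the currently open scopes, outermost first, for
    example ["IF", "THEN", "ADD", "ON SIZE ERROR"].
    """
    stack = list(open_scopes)
    target = ANTECEDENT[token]
    closed = []
    # Burn through every weaker scope until the antecedent branch is on top.
    while stack and stack[-1] != target:
        closed.append(stack.pop())
    if not stack:
        raise ValueError(f"{token} without an open {target} branch")
    closed.append(stack.pop())  # the antecedent branch itself ends here
    stack.append(token)         # the new branch opens; IF/EVALUATE stays open
    return stack, closed


# The [THEN] branch holds an ADD whose ON SIZE ERROR clause holds another ADD.
stack, closed = shift_strong_token(
    "ELSE", ["IF", "THEN", "ADD", "ON SIZE ERROR", "ADD"])
print(stack)
print(closed)
```

Note that the IF itself is never popped by ELSE, and the EVALUATE is never
popped by WHEN; only the branches and whatever was open inside them go away.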
If we have sets of rules that distinguish scope delimited from non scope
delimited, and permit the IF and WHEN rules to reference things which are not
necessarily scope delimited, we get pretty much what we want from available
tools. Any inner statements with conditional clauses will reduce because
lookahead determines from one token (ELSE/WHEN) that the inner unterminated
statement just ended. But what is hard to perceive, and maybe mostly easy to
code, is that when we ascribe high priority to these tokens they will be able
to repeatedly terminate nested conditional statements within conditional
statements.
So what you have is that the clauses of ARITHMETIC statements must have one
set of allowable statements (a grouping) and the IF/WHEN clauses must have
another set of allowable statements. These groupings, in effect, type the
statements.
The IF statement originally posted is an imperative statement, because of its
explicit scope terminator. The enclosed arithmetic statements are conditional
statements. These are permitted because the ELSE token (or WHEN token) has no
potential alternate interpretation: so before it shifts, the inner arithmetic
statement reduces (this is triggered by mere lookahead that determines that
ELSE/WHEN cannot glue onto the current statement). And in these [THEN] and
ELSE clauses any such inner conditional arithmetic statement can actually
have arithmetic statements on its conditional clause as long as those
arithmetic statements are 'imperative' on account of the absence of their own
conditional clause or the presence of an explicit scope terminator. The whole
shooting match still gets reduced when we encounter the ELSE/WHEN or END-IF or
END-EVALUATE, which token then shifts.
When our compiler is in these thickets it will need to be able to
countenance, and successfully diagnose, multiple conditional clauses; and IMHO
also, separately note the transition of an inner second conditional
("validly") to an outer arithmetic, as suspect, as having perhaps been
intended for the inner arithmetic statement (which wasn't sticky because of
the absence of an explicit scope terminator). All while deep in IF and
EVALUATE nests. Ditto for I/O as for arithmetic.
So, ..., this is not just a bunch of words. The ELSE terminates the inner
arithmetic statement. As it would an inner I/O statement. This matter is
hierarchical. ELSE is strong; it is a 'distinguished' token, as Niklaus
Wirth calls this kind of thing in "Compiler Construction", in discussing
Oberon.
So ELSE is strong enough to close out ADD and READ, etc. This is simply
accomplished with the alternate rule sets. The allowable statement type
inclusions in the [THEN] clause are broader because the ELSE will end any
interior 'conditional' statements (I am fairly sure that the nomenclature is
then further assaulted; those become imperatives in the sense that their
scope is delimited - even though that would not be true on the branch of a
conditional arithmetic statement!)
Note that in this sense ELSE is strong enough to close these other things out
and trigger their reduction before the ELSE is shifted. You don't have to
understand that to code the rules, you just need the alternate rule
groupings, which you need any way to support the arithmetic statements and
I/O statements. But the theory helps us, because ...
ADD terminates a previous ADD unless it occurs in the scope of a current
conditional branch. An I/O verb, or a MOVE or a PERFORM, terminates, by
strength, a previous ADD (for example) unless it occurs in the scope of a
current conditional clause. This would actually be accomplished in the
grammar by structures of rules. But it is useful to understand that the
ELSE/WHEN will terminate the weaker conditional clause because it is
strong enough to do so (that will require precedence or associativity).
The IF token is not capable of terminating a currently open conditional
clause (neither an arithmetic nor an I/O one). For the IF statement itself
can be viewed as yet another imperative statement that will glue within any
current conditional clause (in this regard [THEN] and arithmetic conditional
clauses are similar in their voracious appetite). ELSE and WHEN stop the
hunger of arithmetic and I/O conditionals, but not the IF.
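The strength rules of the last two paragraphs fit in a few lines. This is a
Python sketch under assumptions: next_token is taken to be either a verb that
starts a statement (ADD, READ, MOVE, PERFORM, IF, ...) or one of ELSE and
WHEN, and the names STRONG_TOKENS and effect_on_open_statement are invented
for the illustration. Explicit scope terminators and periods are left out.

```python
STRONG_TOKENS = {"ELSE", "WHEN"}


def effect_on_open_statement(next_token, inside_conditional_clause):
    """Say what next_token does to the arithmetic or I/O statement in progress.

    inside_conditional_clause is True when the scan is currently within an
    ON SIZE ERROR, AT END, INVALID KEY or similar branch of that statement.
    """
    if next_token in STRONG_TOKENS:
        # ELSE and WHEN end the statement whether or not a clause is open.
        return "terminates"
    if inside_conditional_clause:
        # Any ordinary verb, IF included, glues into the open clause.
        return "nests"
    # With no clause open, the next verb simply ends the previous statement.
    return "terminates"


for token, in_clause in [("ADD", False), ("ADD", True),
                         ("IF", True), ("ELSE", True)]:
    print(token, in_clause, effect_on_open_statement(token, in_clause))
```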
Any ELSE/WHEN closes everything back to its antecedent. The antecedent of
ELSE is the [THEN], but ELSE is not strong enough to close the IF statement
itself; the IF keeps growing. IF will only grow to the second part, the ELSE
clause (which may be very complex). The EVALUATE keeps growing and growing.
So, an ADD that has a conditional clause without an explicit scope
termination _STILL_ becomes an imperative statement when a stronger
distinguishing token terminates it. Since the ADD token is not strong enough
to terminate an ADD statement, it cannot turn the first, second, third, or
any level of ADD statement conditional clause into an imperative form. The
ADD token lacks the strength to terminate ADD. ELSE and WHEN are strong
enough. Furthermore WHEN clauses also terminate previous WHEN clauses: but
the ELSE does not terminate the IF statement, and the WHEN does not terminate
the EVALUATE statement.
Arithmetic and I/O verbs are not strong enough to terminate their own or
each other's conditional clauses. They do each have enough strength to
terminate unconditional varieties of either statement kind. So it is the
conditional clause that is strong. IF, THEN, ELSE, EVALUATE, WHEN _clauses_
are strong clauses. Their delimiting tokens burn through other contexts even
if that other context is not explicitly scope delimited.
Most of that can be done with precedence and associativity, but it would be a
morass if we do not group statements by type based on their complete topology.
As programmers we tend to think of the procedure division as imperative code
with controlled occasional conditional sub parts. Token processing, I think,
really requires that we look at it the other way around. The procedure
division is conditional code with zero, one or maybe more imperative blocks
embedded therein.
A special case is the conditional code that is surrounded by the
unconditional token, thin air. No joke. Unconditional code is just a special
case of conditional code. Unconditional code is a DO ALWAYS construct. Every
block is like that.
Every branch of a conditional clause of any type is an inline perform that
happens to have the PERFORM ... and ... END-PERFORM delimiters absent because
in this language, BEGIN and END, or { and }, are optional. Pascal's
BEGIN and END surround conditional and unconditional code blocks alike. The
unconditional is just a variation.
COBOL, in its infinite wisdom, allows two kinds of conditionals. The first
type has ordinality: these are the IF and EVALUATE statements, the parts of
which need to be in some order (translation: compiler writer and source code
worker are conscious of associativity). The second type of conditional has no
ordinality. These tend to come in pairs hanging on some head pattern. The
lack of ordinality completely defeats any use of associativity or token
precedence to figure out intentions in nested conditions.
IMHO, the lack of ordinality is intentional; it frees the source code worker
of concerns about associativity. Some view this as part of the politics. I
don't; I think that there is a brilliance to that. The issue is the available
work force. And there is just a guess as to what is more statistically
probable: a coder who wants to transpose conditional clauses, or a worker who
wants to nest similar category statements. Obviously there is a lot of need
to nest IFs. Arithmetic statements are another matter, and more so I/O.
Making the conditional clauses transposable makes the source code worker more
productive, because there are fewer recompiles (since they do not have to
reorder those clauses). And nesting is allowed, with restrictions. They can
have their cake and eat it too.
For that difference to be possible the ELSE and WHEN clauses must burn like
acid through _all_ current conditional branches. That brings about at least
a small hierarchy in tokens. Mostly just lookahead gets this done, but
precedence and/or associativity will be involved. But most importantly, there
is no reason at all that this distinguished token cannot terminate an
erstwhile 'conditional' inner statement and its included 'imperative'
statement and thereby turn the whole preceding block into an 'imperative'
statement block. There is no confusion because ELSE does not look like ON
SIZE ERROR, or NOT INVALID KEY.
Now on the other hand, if someone proposes a NOT ELSE clause, we might have
trouble.
Notice on this that the ELSE (WHEN), though it terminates all currently open
conditional branches, does not provide an _explicit_ scope terminator for
the inner arithmetic statement on the conditional clause of an outer
arithmetic statement (in say the [THEN] clause). To be 'imperative'
statements in the block under the [THEN] clause any outer arithmetic statement
must merely be delimited (it can in fact still have a conditional clause and
lack an explicit scope terminator), but any inner arithmetic statement must,
if it is to have a conditional clause, have an explicit scope terminator in
order to become an imperative statement. But even if the inner statement does
not become imperative, its containing outer statement can become imperative
by delimiters that imply the termination of the outer arithmetic statement
such as the high precedence ELSE (WHEN) token.
Modern coders tend to go out of their way in new code to prevent any
ambiguities. And in my experience that ethic has risen up the ladder;
managers and code reviewers are much more accommodating when the code has
copious explicit delimiters. But that is not the only thing out there. There
is an awesome amount of legacy code which is a mixed bag. We will need to
keep the compiler on its feet in some tough situations.
Although it happens as a transparency when using some tools, we will be
warranted in burning through lower scopes when we encounter ELSE and WHEN; we
don't need anything else to disambiguate it.
Best Wishes
Bob Rayhawk
RKRayhawk@aol.com