
Re: Parsing nested statements: was Re: gnubol: subsets



In a message dated 11/21/99 1:20:19 PM EST, TIMJOSLING@prodigy.net writes:

<< 
 It gets better - the IF statement allows precisely one conditional statement
 on the end of the list of imperative statements. The language in the standard
 does not help (calling them imperative statements).
 
 An example
 
 IF a > b
     display ...
     add a to b
            on size error
                    display ...
 ELSE
     display ...
      subtract retro-adjust-factor from running-total
              on size error
                   display ...
 END-IF
 
 This is an imperative statement I think.
  >>

In this situation the outer statement is the IF statement, and the arithmetic 
statements are the inner statements.

The outer statement is definitely an imperative statement because it has an 
explicit scope terminator. The interior arithmetic statements are also 
'imperative', I believe, though the reasons are not so obvious.  Scope 
termination can happen in several ways.

The ADD and SUBTRACT statements illustrated seem like conditional statements 
because they lack explicit scope terminators. But in these situations we 
are not dealing with the same issue as the restrictions on what _type_ of 
statements might be coded on the conditional clauses of outer arithmetic or 
I/O statements.

The situation is different.

In the IF statement, the [THEN], the ELSE and END-IF shift due to high 
precedence or specification of associativity. The ELSE clause glues to the 
current (innermost) IF. In order to do that it must be able to burn through 
anything in between. The conditional clauses of arithmetic and I/O verbs are 
not defined that way (trouble is, they do not exactly reduce either: thanks 
to standards committee visionaries, the 'look-ahead' for the scope terminator 
leads to alternate reduce or shift requirements). 

The (optional THEN and) ELSE in IF statements, and the WHEN condition-name 
and WHEN OTHER in EVALUATEs, can be handled more directly: these clauses can 
include things that look like conditional statements, as far as I understand, 
and mere precedence and associativity will get the job done. The ON SIZE ERROR 
token(s) or NOT INVALID KEY don't bother the IF or EVALUATE scanning; they 
lack the strength to interrupt that code, and such interior statements, 
in effect, reduce and then shift onto the [THEN] or ELSE or WHEN clause.

The deal with the ON SIZE ERROR clauses in nested arithmetic statements is 
that the scan collides with the same category of token because the token 
(grouping) does not have associativity by itself (it is the presence or 
absence of the much more subsequent scope terminator that imparts this 
quality, which does not compute with available tools). The IF and EVALUATE 
interior tokens are simple. Point blank shift.

So when a scanner sees ELSE, it does not foreshorten the inner IF statement, 
but instead adds the ELSE and its following retinue to the inner IF. This is 
even more true for the EVALUATE, where the WHEN clauses shift and shift and 
shift, creating a large code structure.

Shifting ELSE reopens the flood gate; at that point any statement _type_ is 
permitted. As far as I know you are allowed to go one level deeper, ... using 
your example, ... notice now the presence of an 'imperative' inner arithmetic 
statement on the conditional clause of the 'conditional' arithmetic statement 
in the [THEN] clause ...

 IF a > b
     display ...
     add a to b
            on size error
                  ADD 1 to error-count
 ELSE
 ....

that [THEN] clause block has what I think is in fact called an imperative ADD 
statement (the add a to b).

But knowledgeable coders will recognize the need for bolstering code if we 
want the alternative condition invoked (note the required explicit scope 
terminator).


 IF a > b
     display ...
     add a to b
            on size error
                  ADD 1 to error-count
                  END-ADD
            NOT on size error
                  ADD 1 to we-actually-can-add-count
  ELSE
 ....

still, I think add a to b _is_ 'imperative'.

A super-paranoid coder would probably then just do this too


 IF a > b
     display ...
     add a to b
            on size error
                  ADD 1 to error-count
                  END-ADD
            NOT on size error
                  ADD 1 to we-actually-can-add-count
                  END-ADD
  ELSE
 ....

because they know they will be called in at 2:00 in the morning to deal with 
production down conditions. So far so good.

But, ... we must, IMHO, help the coder when they offer code that lacks the 
necessary scope terminators, .... like


Situation One:

 IF a > b
     display ...
     add a to b
            on size error
                  ADD 1 to error-count
            NOT on size error
                  ADD 1 to we-actually-can-add-count
  ELSE
 ....


and of course

Situation Two:

 IF a > b
     display ...
     add a to b
            on size error
                  ADD 1 to error-count
            on size error
                  ADD 1 to we-actually-can-add-count
  ELSE
 ....

Situation Two can be handled by merely detecting duplicate ON SIZE ERROR 
clauses in the outer arithmetic statement (this applies to I/O verbs as 
well).  ((We need to do that anyway)).

Situation One represents 'valid' code, but, IMHO, since I am a radical, 
deserves a serious warning message. Again, with counters we can trap that. Up 
in the rule for the outer arithmetic statement we would need a count of inner 
arithmetic statements, and some sense of the sequence of things which 
happened. Ditto for I/O verbs.
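
As a rough illustration of the counter idea for both situations, here is a
minimal Python sketch. It is not grammar code for any real tool; the event
names ("INNER", "END") and the function diagnose are made up for the
example. It assumes the parser hands the outer arithmetic statement an
ordered list of what it saw: each conditional clause, each inner arithmetic
statement, and each explicit terminator of an inner statement.

```python
CLAUSES = ("ON SIZE ERROR", "NOT ON SIZE ERROR")

def diagnose(events):
    """Check the clause sequence of ONE outer arithmetic statement.

    events is a hypothetical, simplified trace:
      a clause name from CLAUSES, "INNER" for an inner arithmetic
      statement, "END" for the explicit terminator of that inner statement.
    """
    clause_count = {}         # how often each clause kind was seen
    last_inner_open = False   # True if the latest inner statement has no END-xxx
    messages = []
    for ev in events:
        if ev in CLAUSES:
            clause_count[ev] = clause_count.get(ev, 0) + 1
            if clause_count[ev] > 1:
                # Situation Two: the same clause coded twice
                messages.append("error: duplicate " + ev + " clause")
            elif last_inner_open:
                # Situation One: the clause may have been meant for the inner statement
                messages.append("warning: " + ev +
                                " follows an unterminated inner arithmetic statement")
            last_inner_open = False
        elif ev == "INNER":
            last_inner_open = True
        elif ev == "END":
            last_inner_open = False
    return messages

situation_one = ["ON SIZE ERROR", "INNER", "NOT ON SIZE ERROR", "INNER"]
situation_two = ["ON SIZE ERROR", "INNER", "ON SIZE ERROR", "INNER"]
paranoid = ["ON SIZE ERROR", "INNER", "END", "NOT ON SIZE ERROR", "INNER", "END"]
```

With these inputs, situation_one yields one warning, situation_two yields
one duplicate-clause error, and the fully terminated form yields nothing.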

Permit me to go back to your phrasing ...
In a message dated 11/21/99 1:20:19 PM EST, TIMJOSLING@prodigy.net writes:

<< 
 It gets better - the IF statement allows precisely one conditional statement
 on the end of the list of imperative statements. The language in the standard
 does not help (calling them imperative statements).
>>

Actually that does help.  Should the [THEN] or ELSE clause have an arithmetic 
statement that has no conditional clause at all, then it is an imperative 
statement. If such an arithmetic statement does have a conditional clause, 
then its scope 'must be delimited'. But the ELSE _does_ delimit the arithmetic 
statement contained in the [THEN] clause. That is either precedence or 
associativity.  That same conditional clause can contain an imperative 
statement itself, which, if it is arithmetic, needs to have its very own 
explicit scope terminator or an absence of conditional clause within.

The exact same thing applies to the I/O verbs. The code base out there in 
reality is a little different though. All of the same convolutions apply to 
the conditional clauses; we just have far fewer I/Os within I/Os (they are 
still vitally important). But what we do have is _many_ I/Os inside of IFs 
and EVALUATEs, and those I/Os _very_ frequently have one or both of the 
conditional clauses in either order, and it is all followed by ELSEs, ELSE 
IFs, or subsequent WHEN clauses.   

Much of that very code base has fluid transitions from old undelimited 
statements back and forth to explicitly scope-terminated statements. Much 
old code also has periods, much new code does not.

This is an opinion, but strongly felt, the trend towards periodless code 
makes it much more important for the compiler to remain robust as it courses 
'conditional' and 'imperative' statement _types_.

I think it is useful to get full visibility on the notion that the statements 
have _type_. This type is based upon inclusions/exclusions of clauses and 
terminator tokens. The type determines grouping into category. The category 
determines _SYNTACTIC_ validity. We will see every combination! Including all 
of the kinds of things we convince ourselves are 'invalid'. We must trap them 
even if we do not have the resources to make the error messages pretty in 
early releases.  Of course, IMHO!


Notice in these nested IFs and EVALUATEs that blocks of imperatives become an 
'imperative statement', :-) But notice that other arithmetic statements do 
not terminate the scope of a currently running conditional clause on an 
arithmetic statement (or I/O statement); further, MOVE does not, and PERFORM 
does not. BUT ELSE (or a WHEN clause if we are in an EVALUATE) does terminate 
ongoing scope.

In fact the ELSE terminates every open _clause_ right back out to the [THEN] 
(and WHEN back out to the most previous WHEN). Yet these tokens have shift 
signification.  
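
A toy model may make the 'burn through' behaviour concrete. The Python
sketch below is only an illustration under simplifying assumptions, not a
real COBOL scanner: the token spellings, the scan function and the stack
entries are all invented here, every conditional clause is recorded as a
plain "CLAUSE" opened on the most recent statement, and EVALUATE/WHEN is
left out. It keeps a stack of open scopes and records which ones each ELSE
or END-xxx token closes.

```python
VERBS = {"ADD", "SUBTRACT", "READ", "MOVE", "DISPLAY"}
CLAUSES = {"ON SIZE ERROR", "NOT ON SIZE ERROR", "INVALID KEY", "NOT INVALID KEY"}

def pop_until(stack, target):
    """Pop open scopes until target is on top; return the names popped."""
    closed = []
    while stack and stack[-1] != target:
        closed.append(stack.pop())
    if not stack:
        raise ValueError("no open " + target + " to attach to")
    return closed

def scan(tokens):
    stack, log = [], []   # open scopes (innermost last), and (token, closed scopes)
    for tok in tokens:
        if tok == "IF":
            stack += ["IF", "THEN"]          # IF opens its [THEN] branch
        elif tok in VERBS:
            # A verb ends a previous statement only if that statement has no
            # open conditional clause; otherwise it nests inside the clause.
            if stack and stack[-1] in VERBS:
                stack.pop()
            stack.append(tok)
        elif tok in CLAUSES:
            stack.append("CLAUSE")
        elif tok == "ELSE":
            # ELSE closes everything back to its antecedent [THEN],
            # but the IF itself stays open and keeps growing.
            log.append((tok, pop_until(stack, "THEN")))
            stack[-1] = "ELSE"
        elif tok.startswith("END-"):
            closed = pop_until(stack, tok[4:])
            closed.append(stack.pop())       # the terminated statement itself
            log.append((tok, closed))
    return stack, log

tokens = ["IF", "DISPLAY", "ADD", "ON SIZE ERROR", "ADD",
          "ELSE", "DISPLAY", "SUBTRACT", "ON SIZE ERROR", "DISPLAY", "END-IF"]
stack, log = scan(tokens)
```

For this token list the ELSE closes the inner ADD, the open clause and the
outer ADD in one step, while the IF stays on the stack until END-IF.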

So we need to recognize that in COBOL scope termination is a hierarchical 
feature. Some things like ELSE and WHEN have a high position in the hierarchy 
and they will delimit many currently open scopes. This kind of thing is a 
little hard to do with available tools, or it is at least hard to sense when 
you are doing the hierarchy of tokens. ELSE and WHEN should shift, but 
obviously we do not want them to shift onto any open arithmetic or I/O 
conditional clause. They must have higher priority (in bison that is done by 
naming the token in a later precedence declaration, %left, %right or 
%nonassoc; later declarations bind tighter).

If we have sets of rules that distinguish scope delimited from non scope 
delimited, and permit the IF and WHEN rules to reference things which are not 
necessarily scope delimited, we get pretty much what we want from available 
tools. Any inner statements with conditional clauses will reduce because 
one token of lookahead (ELSE/WHEN) determines that that inner unterminated 
statement just ended. But what is hard to perceive, and maybe mostly easy to 
code, is that when we ascribe high priority to these tokens they will be able 
to repeatedly terminate nested conditional statements within conditional 
statements.


So what you have is that the ARITHMETIC statement clauses must have one set 
of allowable statements (a grouping) and the IF/WHEN clauses must have 
another set of allowable statements. These groupings, in effect, type the 
statements. 
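
To show what 'typing by grouping' could look like, here is a small Python
sketch. It is a simplification covering only arithmetic and I/O verbs, and
the names Stmt, stmt_type, ALLOWED and allowed_in are hypothetical. A
statement is 'conditional' when it carries a conditional clause and no
explicit scope terminator, and 'imperative' otherwise; each kind of clause
then lists the types it accepts.

```python
from dataclasses import dataclass

@dataclass
class Stmt:
    verb: str
    has_conditional_clause: bool = False   # e.g. ON SIZE ERROR, INVALID KEY
    has_scope_terminator: bool = False     # e.g. END-ADD, END-READ

def stmt_type(stmt):
    """Type an arithmetic or I/O statement from its clauses and terminator."""
    if stmt.has_conditional_clause and not stmt.has_scope_terminator:
        return "conditional"
    return "imperative"

# Which statement types each enclosing clause accepts (assumed grouping).
ALLOWED = {
    "arith_or_io_clause": {"imperative"},                 # ON SIZE ERROR etc.
    "if_or_when_branch": {"imperative", "conditional"},   # [THEN], ELSE, WHEN
}

def allowed_in(stmt, context):
    return stmt_type(stmt) in ALLOWED[context]

bare = Stmt("ADD")
unterminated = Stmt("ADD", has_conditional_clause=True)
terminated = Stmt("ADD", has_conditional_clause=True, has_scope_terminator=True)
```

Under this grouping the unterminated ADD is acceptable directly under a
[THEN] or ELSE, but not on the ON SIZE ERROR clause of another ADD.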

The IF statement originally posted is an imperative statement, because of its 
explicit scope terminator. The enclosed arithmetic statements are conditional 
statements; these are permitted because the ELSE token (or WHEN token) has no 
potential alternate interpretation: so before it shifts, the inner arithmetic 
statement reduces (this is triggered by mere lookahead that determines that 
ELSE/WHEN cannot glue onto the current statement). And in these [THEN] and 
ELSE clauses any such inner conditional arithmetic statement can actually 
have arithmetic statements on its conditional clause as long as those 
arithmetic statements are 'imperative' on account of the absence of their own 
conditional clause or the presence of an explicit scope terminator. The whole 
shooting match still gets reduced when we encounter the ELSE/WHEN or END-IF or 
END-EVALUATE, which token then shifts.

When our compiler is in these thickets it will need to be able to 
countenance, and successfully diagnose, multiple conditional clauses; and IMHO 
also separately note the transition of an inner second conditional 
("validly") to an outer arithmetic as suspect, as having perhaps been 
intended for the inner arithmetic statement (which wasn't sticky because of 
the absence of an explicit scope terminator). All while deep in IF and 
EVALUATE nests. Ditto for I/O as for arithmetic.

So, ..., this is not just a bunch of words.  The ELSE terminates the inner 
arithmetic statement.  As it would an inner I/O statement. This matter is 
hierarchical.  ELSE is strong, it is a 'distinguished' token, as Niklaus 
Wirth calls this kind of thing in "Compiler Construction", in discussing 
Oberon.


So ELSE is strong enough to close out ADD and READ, etc. This is simply 
accomplished with the alternate rule sets.  The set of allowable statement 
types in the [THEN] clause is broader because the ELSE will end any 
interior 'conditional' statements (I am fairly sure that the nomenclature is 
then further assaulted; those become imperatives in the sense that their 
scope is delimited - even though that would not be true on the branch of a 
conditional arithmetic statement!)

Note that in this sense ELSE is strong enough to close these other things out 
and trigger their reduction before the ELSE is shifted. You don't have to 
understand that to code the rules, you just need the alternate rule 
groupings, which you need any way to support the arithmetic statements and 
I/O statements. But the theory helps us, because ...

ADD terminates a previous ADD unless it occurs in the scope of a current 
condition branch.  An I/O verb, or a MOVE or a PERFORM, terminates, by 
strength, a previous ADD (for example) unless it occurs in the scope of a 
current conditional clause. This would actually be accomplished in the 
grammar by structures of rules.  But it is useful to understand that the 
ELSE/WHEN will terminate the weaker conditional clause because it is 
strong enough to do so (that will require precedence or associativity). 

The IF token is not capable of terminating a currently open conditional 
clause (neither an arithmetic nor an I/O one). For the IF statement itself can 
be viewed as yet another imperative statement that will glue within any 
current conditional clause (in this regard [THEN] and arithmetic conditional 
clauses are similar in their voracious appetite). ELSE and WHEN stop the 
hunger of arithmetic and I/O conditionals, but not the IF.

Any ELSE/WHEN closes everything back to its antecedent. The antecedent of 
ELSE is the [THEN], but ELSE is not strong enough to close the IF statement 
itself; the IF keeps growing. IF will only grow to the second part, the ELSE 
clause (which may be very complex). The EVALUATE keeps growing and growing.

So, an ADD that has a conditional clause without an explicit scope 
termination _STILL_ becomes an imperative statement when a stronger 
distinguishing token terminates it. Since the ADD token is not strong enough 
to terminate an ADD statement, it cannot turn the first, second, third, or 
any level of ADD statement conditional clause into an imperative form. The 
ADD token lacks the strength to terminate ADD.  ELSE and WHEN are strong 
enough. Furthermore WHEN clauses also terminate previous WHEN clauses: but 
the ELSE does not terminate the IF statement, and the WHEN does not terminate 
the EVALUATE statement.

Arithmetic and I/O verbs are not strong enough to terminate their own or 
each other's conditional clauses.  They do each have enough strength to 
terminate unconditional varieties of either statement kind. So it is the 
conditional clause that is strong. IF, THEN, ELSE, EVALUATE, WHEN _clauses_ 
are strong clauses. Their delimiting tokens burn through other contexts even 
if that other context is not explicitly scope delimited.

Most of that can be done with precedence and associativity, but it would be a 
morass if we do not group statements by type based on their complete topology.

As programmers we tend to think of the procedure division as imperative code 
with controlled occasional conditional sub parts. Token processing, I think, 
really requires that we look at it the other way around. The procedure 
division is conditional code with zero, one or maybe more imperative blocks 
embedded therein.  

A special case is the conditional code that is surrounded by the 
unconditional token, thin air. No joke. Unconditional code is just a special 
case of conditional code. Unconditional code is a DO ALWAYS construct.  Every 
block is like that. Every branch of a conditional clause of any type is an 
inline perform that happens to have the PERFORM ... and ... END-PERFORM 
delimiter absent because in this language, BEGIN and END, or { and }, are 
optional. In Pascal, BEGIN and END surround conditional and unconditional 
code blocks. The unconditional is just a variation.

COBOL, in its infinite wisdom, allows two kinds of conditionals. The first 
type has ordinality: these are the IF and EVALUATE statements, the parts of 
which need to be in some order (translation: compiler writer and source code 
worker are both conscious of associativity). The second type of conditional 
has no ordinality.  These tend to come in pairs hanging on some head pattern. 
The lack of ordinality completely defeats any use of associativity or token 
precedence to figure out intentions in nested conditions.

IMHO, the lack of ordinality is intentional; it frees the source code worker 
of concerns about associativity. Some view this as part of the politics. I 
don't; I think that there is a brilliance to that.  The issue is the available 
work force. And there is just a guess as to what is more statistically 
probable: a coder who wants to transpose conditional clauses, or a worker who 
wants to nest similar category statements.   Obviously there is a lot of need 
to nest IFs. Arithmetic statements are another matter, and more so I/O.  
Making the conditional clauses transposable makes the source code worker more 
productive, because there are fewer recompiles (since they do not have to 
reorder those clauses). And nesting is allowed, with restrictions. They can 
have their cake and eat it too.

For that difference to be possible the ELSE and WHEN clauses must burn like 
acid through _all_ current conditional branches.  That brings about at least 
a small hierarchy in tokens.  Mostly just lookahead gets this done, but 
precedence and/or associativity will be involved. But most importantly, there 
is no reason at all that this distinguished token cannot terminate an 
erstwhile 'conditional' inner statement and its included 'imperative' 
statement and thereby turn the whole preceding block into an 'imperative' 
statement block. There is no confusion because ELSE does not look like ON 
SIZE ERROR, or NOT ON INVALID KEY. 

Now on the other hand, if someone proposes a NOT ELSE clause, we might have 
trouble.

Notice on this that the ELSE (WHEN), though it terminates all currently open 
conditional branches, does not provide an _explicit_ scope terminator for 
the inner arithmetic statement on the conditional clause of an outer 
arithmetic statement (in say the [THEN] clause). To be an 'imperative' 
statement in the block under the [THEN] clause, any outer arithmetic statement 
must merely be delimited (it can in fact still have a conditional clause and 
lack an explicit scope terminator), but any inner arithmetic statement must, 
if it is to have a conditional clause, have an explicit scope terminator in 
order to become an imperative statement. But even if the inner statement does 
not become imperative, its containing outer statement can become imperative 
by delimiters that imply the termination of the outer arithmetic statement, 
such as the high precedence ELSE (WHEN) token.

Modern coders tend to go out of their way in new code to prevent any 
ambiguities. And in my experience that ethic has risen up the ladder; 
managers and code reviewers are much more accommodating when the code has 
copious explicit delimiters.  But that is not the only thing out there. There 
is an awesome amount of legacy code which is a mixed bag. We will need to 
keep the compiler on its feet in some tough situations.

Although it happens transparently when using some tools, we will be 
warranted in burning through lower scopes when we encounter ELSE and WHEN; we 
don't need anything else to disambiguate it.

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com










--
This message was sent through the gnu-cobol mailing list.  To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body.  For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.