gnubol: Re: ref touch points

To: gnu-cobol@lusars.net
Subject: gnubol: Re: ref touch points
From: RKRayhawk@aol.com
Date: Fri, 3 Dec 1999 10:48:21 EST
Delivered-To: gnu-cobol-outgoing@wallace.lusars.net
Reply-To: gnu-cobol@lusars.net
Sender: owner-gnu-cobol@wallace.lusars.net

In a message dated 12/3/99 8:19:26 AM EST, mck@tivoli.mv.com writes:

<< 
 procedure_ref
    : PROC_NAME { OF_IN PROC_NAME }
    ;
  >>
This is a situation in which a small increase in the size of the rule set 
can take some work out of semantics. Consider, as an alternative, a pattern like

 procedure_ref
    : procedure_ref_okay
     | procedure_ref_not_okay
    ;

 procedure_ref_okay
    : PROC_NAME { OF_IN PROC_NAME }
    ;


 procedure_ref_not_okay
    : DATA_NAME { OF_IN PROC_NAME }
    | PROC_NAME OF_IN DATA_NAME
    ; {/*set a commutable error switch, let rule parsing continue*/}
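
To make the split concrete, here is a minimal sketch in Python rather than in 
grammar rules. It assumes that { ... } in the rules above means an optional, 
repeatable group, and the function name classify_procedure_ref is hypothetical. 
It is slightly broader than the two _not_okay alternatives shown, since it 
treats any mix of DATA_NAME and PROC_NAME joined by OF_IN as _not_okay.

```python
# Token kinds named in the rules above.
PROC_NAME, DATA_NAME, OF_IN = "PROC_NAME", "DATA_NAME", "OF_IN"


def classify_procedure_ref(kinds):
    """Classify a list of token kinds as "okay", "not_okay", or None.

    okay:     PROC_NAME { OF_IN PROC_NAME }
    not_okay: same shape, but at least one name is a DATA_NAME
    None:     the tokens do not have the shape of a reference at all
    """
    # A reference is name (OF_IN name)*, so it has an odd number of tokens.
    if not kinds or len(kinds) % 2 == 0:
        return None
    names, separators = kinds[0::2], kinds[1::2]
    if any(sep != OF_IN for sep in separators):
        return None
    if any(name not in (PROC_NAME, DATA_NAME) for name in names):
        return None
    # Right shape: the only question left is whether every name is a procedure.
    if all(name == PROC_NAME for name in names):
        return "okay"
    return "not_okay"
```

The point of the sketch is that the _not_okay case is still recognized as a 
reference, so the caller can set an error switch and carry on instead of 
abandoning the statement.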
 

If we can distinguish paragraph names from section names before the procedure 
division scan (as in preprocess) then, _okay begins to look even more tightly 
constrained to perhaps,

 procedure_ref_okay
    : PARA_NAME { OF_IN SECT_NAME }
    | SECT_NAME
    ;

and the not okay rule gets some more precision as well

 procedure_ref_not_okay
    : DATA_NAME { OF_IN PARA_NAME }
    | PARA_NAME OF_IN DATA_NAME
    | SECT_NAME OF_IN SECT_NAME
    | SECT_NAME OF_IN PARA_NAME
    etc.
    ; {/*set a commutable error switch, let rule parsing continue*/}


It could also be _possible_ to note whether a paragraph is even in a section 
during the function that initializes the symbol table (that is definitely 
easy).
So if an S_PARA_NAME token indicates that it is at least possible for the 
paragraph name to be IN a section, and NOTIN_S_PARA_NAME means that the 
paragraph occurred either in the area preceding the first section name or in 
a program that has only paragraph names and no section names, then

 procedure_ref_okay
    : S_PARA_NAME { OF_IN SECT_NAME }
    | NOTIN_S_PARA_NAME
    | SECT_NAME
    ;
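
The symbol table pass that hands out these token kinds could look roughly like 
the following Python sketch. The input shape (a list of label names in source 
order, each flagged as section or not) and the function name classify_labels 
are assumptions for illustration. Keying by name alone is a simplification, 
since the same paragraph name may appear in more than one section.

```python
def classify_labels(labels):
    """Map each procedure division label to a token kind.

    labels: list of (name, is_section) pairs in source order.
    A paragraph seen before any section header cannot be IN a section.
    """
    kinds = {}
    seen_section = False
    for name, is_section in labels:
        if is_section:
            seen_section = True
            kinds[name] = "SECT_NAME"
        elif seen_section:
            kinds[name] = "S_PARA_NAME"
        else:
            kinds[name] = "NOTIN_S_PARA_NAME"
    return kinds
```

A program with no section headers at all never sets seen_section, so every 
paragraph comes out as NOTIN_S_PARA_NAME, which covers both cases described 
above.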

and among other things the not okay can now be expanded to include


 procedure_ref_not_okay
    : ...
    | NOTIN_S_PARA_NAME OF_IN SECT_NAME
    etc.
    ; {/*set a commutable error switch, let rule parsing continue*/}

We could get good at this.  Let me clarify a point that I have tried to make 
before, but which apparently seemed too abstract the way I phrased it. If we 
do not do this with syntax rules, we must do it in the actions or in 
semantics. The work must be done. If we do not allow this linear expansion of 
the syntax rules, we have merely delayed the work to a later phase of 
project gnubol.  Regrettably, if we do it in the actions we will have very 
complex code, and no matter what we extract from the symbol tree to make the 
determinations, it will be too late to affect rule selection, because we will 
already be in the action. That point is a major design consideration: how 
much of the damaged reference problem do we want to manage with rules?

The basic idea is that at any given touch point where the compiler hits 
references, we have a rule that looks like a single symbol to the verb 
construct rule, but is nearly always split down below into _okay and 
_not_okay.  That puts us in business to elaborate any useful alternatives 
that involve just sequence problems, like an index reference of the pattern
   dataref (1) (2) (3), 
where we should have
   dataref (1 2 3). 
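
As an illustration of catching that sequence problem without bailing out, here 
is a small Python sketch. The token list format and the name 
parse_subscripted_ref are hypothetical, and nested parentheses are not 
handled. It accepts both shapes, returns the same subscript list for each, and 
raises an error flag only for the split form.

```python
def parse_subscripted_ref(tokens):
    """Return (name, subscripts, error_flag) for a subscripted reference.

    tokens: e.g. ["dataref", "(", "1", "2", "3", ")"]
    More than one parenthesized group is the _not_okay shape: the
    subscripts are still collected, but error_flag is set.
    """
    name, rest = tokens[0], tokens[1:]
    groups = []
    i = 0
    while i < len(rest):
        if rest[i] != "(":
            raise ValueError("expected '(' at position %d" % (i + 1))
        close = rest.index(")", i)  # raises ValueError if unbalanced
        groups.append(rest[i + 1:close])
        i = close + 1
    subscripts = [sub for group in groups for sub in group]
    error_flag = len(groups) > 1
    return name, subscripts, error_flag
```

Either way the caller receives a usable reference, so the enclosing verb rule 
can keep going and leave the flag for later reporting.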

This is a major payoff. Many problems that would otherwise fly out as errors 
to some mother rule up above now simply reside quietly down below, flipping 
error_flags to guide emissions to semantics. We do not have to be infinitely 
clear on exactly how far we will take this to get on the basic path.

I would eventually take this very far to keep problems out of semantics.  No 
one else would agree with this at this time, but if you have the basic 
_okay/_not_okay split on all reference landing pods you can add precision 
over time.  If we can distinguish reference modifiable items from those upon 
which reference modification is illegal we can trap inappropriate reference 
modifications in the rules and semantics does not have to deal with it. For 
example I believe it is not reasonable to reference modify a pointer 
elementary item. By the time that we reach the reference in the procedure 
division everything we need to know for this is present in the symbol table; 
the lexer or an intermediate filter can burp it up if we need it.

So the sale is this: split every reference point into _okay/_not_okay 
subrules, and elaborate as time permits. Without being intricate about 
reference modifiability or subscriptability in the beginning, we can easily 
do the no-brainers; for instance, references to collating sequences are 
certainly not appropriate as procedure names in PERFORM statements. 
Eventually we can get around to detail work in the data references. The 
binary attributes
    reference modifiable / not reference modifiable
    subscriptable / not subscriptable
    qualifiable / not qualifiable
are definitely discernible as of completion of the data division parse. We 
can easily write syntax rules that will scoop junk up out of the code and 
relieve the burden on semantics.
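
One way the lexer or an intermediate filter might expose those three binary 
attributes is to fold them into the token kind itself. The following Python 
sketch assumes a symbol table entry shaped like the hypothetical DataItem 
below; the token kind spelling (DATA_RM_SUB_QUAL and so on) is invented here 
and does not come from any existing gnubol code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """Hypothetical symbol table entry, filled in by the data division parse."""
    name: str
    ref_modifiable: bool
    subscriptable: bool
    qualifiable: bool


def token_kind(item):
    """Build a token kind that encodes the three binary attributes."""
    parts = [
        "RM" if item.ref_modifiable else "NRM",
        "SUB" if item.subscriptable else "NSUB",
        "QUAL" if item.qualifiable else "NQUAL",
    ]
    return "DATA_" + "_".join(parts)
```

Three binary attributes give at most eight token kinds, which is the sort of 
linear growth in the rule set that the _okay/_not_okay split relies on.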

Perhaps it is a reasonable guess that the interpretation of subscripting and 
reference modification is best left to semantics. In some instances these are 
expressed statically via constants, but why code it twice? The dynamic 
variation has to be in semantics anyway.

However, it is likewise reasonable to suggest that qualification can and 
perhaps should be accomplished in syntax. Qualification actually involves a 
tree walk of portions of the symbol table. That walk can occur as the 
{ OF_IN ref } recursion reduces step by step, or at the single action where 
the reference is reduced, or in the various verb constructs that reel in the 
references.

The step by step synchronous walk makes the most sense; in effect the grammar 
parse engine is enabling the walk in exactly the correct fashion. Why waste 
that opportunity? So anyway, if you do manifest type according to a few 
distinctions for rule identification of syntax errors, you are incidentally 
enabling qualification _resolution_ in syntax and providing at least for the 
possibility of eliminating that from semantics.
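
A rough Python sketch of that synchronous walk follows, under the assumption 
that the symbol table keeps a parent link on every entry. Node, start and 
qualify are hypothetical names. Each call to qualify stands in for one 
reduction of an OF_IN step: every surviving candidate climbs from where its 
walk currently stands until it finds an ancestor with the qualifier's name, 
and candidates that find none drop out.

```python
class Node:
    """Hypothetical symbol table entry with a link to its parent item."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def start(table, name):
    """Begin a walk: every entry with this name is a candidate.

    Each candidate is (item, cursor); cursor marks how far up we have walked.
    """
    return [(node, node) for node in table if node.name == name]


def qualify(candidates, qualifier):
    """Apply one OF/IN qualifier, keeping only candidates it can match."""
    narrowed = []
    for item, cursor in candidates:
        ancestor = cursor.parent
        # Qualifiers may skip levels, so climb until a match or the root.
        while ancestor is not None and ancestor.name != qualifier:
            ancestor = ancestor.parent
        if ancestor is not None:
            narrowed.append((item, ancestor))
    return narrowed
```

After the last qualifier, exactly one candidate means the reference is 
resolved, none means it is undefined, and more than one means it is ambiguous, 
so all three outcomes are known before semantics is ever involved.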

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com

