Re: gnubol: distinguishing data refs and conditional refs
In a message dated 11/3/99 3:05:50 AM EST,
Randall Bart (Barticus@att.net) writes:
<<
Can you give me an example where data names and procedure names can occur
in the same context? Procedure names always occur after GO, PERFORM, INPUT
PROCEDURE, OUTPUT PROCEDURE, ALTER (boo), or a period (when declared).
>>
I appreciate you responding to my posting. My orientation is to bullet-proof
the grammar with exacting specifications. The lexer cannot always do that
alone.
You have used the word 'context' in your reply. We can each mean several
different things by that word. The tools used and discussed in this project
are generally 'context free' processing mechanisms. My comments here are
basically oriented to that sense of the word, although near the end I drop
back to a use that is less technical.
In a context free grammar there could be a rule such as
valid_ref : simple_ref
| qualified_ref
| subscripted_ref
| reference_modified_ref
| qualified_subscripted_reference_modified_ref
{ blip_cross_reference_list($1,line_number); }
;
Rules like simple_ref and qualified_ref would be reduced in a context free
environment. They do not know that, in our minds, they are occurring beneath
the potential reduction of a rule like
move_stmt : MOVE valid_ref TO valid_ref
....
or
perf_stmt : PERFORM valid_ref THRU valid_ref
.....
So I am proposing that the SYMT (symbol table) be used in stages beneath the
parser to characterize references very clearly, so that a data reference is
distinct from a procedure reference before it ever gets to a rule that
attempts to reduce it as valid syntactically.
Data references would return as ref_data tokens and could reduce to
valid_data_ref, and procedure references would return as ref_proc and could
reduce to valid_proc_ref.
Rules for data referencing verbs would (on first approach) only want to see
data references (et cetera).
for example
move_stmt : MOVE valid_data_ref TO valid_data_ref
{positive production}
or
perf_stmt : PERFORM valid_proc_ref THRU valid_proc_ref
{positive production}
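To make the idea concrete, here is a minimal sketch of a filter sitting beneath the parser. It is written in Python rather than in the lex/yacc tooling under discussion, and everything in it is an assumption for illustration: the SYMT is modeled as a plain dictionary, and the names classify_reference, filter_tokens, RESERVED and ref_undefined are hypothetical.

```python
# Hypothetical symbol table: name -> kind. In the proposal this would be
# filled from the data division and from a pre-scan of section and
# paragraph names in the procedure division.
SYMT = {
    "CUSTOMER-NAME": "data",
    "WS-TOTAL": "data",
    "INIT-PARA": "proc",
    "EXIT-PARA": "proc",
}

# A tiny stand-in for the reserved word list.
RESERVED = {"MOVE", "TO", "PERFORM", "THRU"}


def classify_reference(name, symt=SYMT):
    """Return the (token, text) pair the parser would see for a user word."""
    kind = symt.get(name.upper())
    if kind == "data":
        return ("ref_data", name)
    if kind == "proc":
        return ("ref_proc", name)
    # Not in the table: this sketch leaves the word undecided.
    return ("ref_undefined", name)


def filter_tokens(words, symt=SYMT):
    """Pass reserved words through and characterize everything else."""
    tokens = []
    for word in words:
        if word.upper() in RESERVED:
            tokens.append((word.upper(), word))
        else:
            tokens.append(classify_reference(word, symt))
    return tokens
```

With tokens characterized this way, MOVE WS-TOTAL TO INIT-PARA reaches the parser as MOVE ref_data TO ref_proc, so the grammar can tell the two kinds of reference apart without any lexer state.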
An "example where data names and procedure names can occur
in the same context" is any program error that can produce that. For example
MOVE valid_data_ref TO valid_proc_ref
or
MOVE valid_proc_ref TO valid_data_ref
The 'context' in which we are compiling is not all possible valid programs,
but simply all possible programs. And there is a need to keep the compiler on
its feet.
In other posts I have speculated about a need to have lexer states (for the
procedure division) to raise an assumption about references as being either
data references or procedure references, based upon the most recent verb
recognized. That is not a possible parser action, as the ongoing rule for the
verb _has_not_reduced_yet_. States are dangerous exactly because of the
temporal asynchrony of lexer state and rule reduction. But there is a
possible need to be able to characterize references to _undefined_ items as
either (assumed) data references or (assumed) procedure references. States
_are_ context sensitive. The conundrum is that the lexer's state is ahead of
the parser (often).
With references to previously defined data items and (with a SYMT that
already contains procedure division section and paragraph names) previously
sensed procedure names, there is no need for the context sensitivity (lexer
state). We just need error productions like
rejected_move_stmt : MOVE valid_data_ref TO valid_proc_ref
{ error(line_number, " procedure name not valid in TO clause"); }
or
rejected_move_stmt : MOVE valid_data_ref TO valid_proc_ref
| MOVE valid_proc_ref TO valid_data_ref
| MOVE valid_proc_ref TO valid_proc_ref
{ error(line_number, " procedure name not valid in move statement"); }
But for _undefined_ items we may need to apply the assumption (in the lexer
or a filter beneath the parser).
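A rough sketch of how such an assumption might be applied, again in Python and purely illustrative. The table VERB_EXPECTS, the KEYWORDS set, and the token names ref_data_assumed and ref_proc_assumed are all hypothetical. Defined names are characterized from the symbol table alone, and only undefined names fall back on the most recent verb.

```python
# Hypothetical table: the kind of reference each verb would lead us to
# assume for a name that is not in the symbol table.
VERB_EXPECTS = {
    "MOVE": "data",
    "ADD": "data",
    "INSPECT": "data",
    "PERFORM": "proc",
    "GO": "proc",
}

# Other reserved words that pass through without changing the assumption.
KEYWORDS = {"TO", "THRU", "THROUGH", "FROM", "GIVING"}


def classify_with_assumption(words, symt):
    """Characterize each word; undefined names take the verb's assumption."""
    assumed = None  # set by the most recent verb seen
    tokens = []
    for word in words:
        upper = word.upper()
        if upper in VERB_EXPECTS:
            assumed = VERB_EXPECTS[upper]
            tokens.append((upper, word))
        elif upper in KEYWORDS:
            tokens.append((upper, word))
        elif upper in symt:
            # Defined item: no context sensitivity needed.
            tokens.append(("ref_" + symt[upper], word))
        elif assumed is not None:
            # Undefined item: mark it as an assumed reference.
            tokens.append(("ref_" + assumed + "_assumed", word))
        else:
            tokens.append(("ref_undefined", word))
    return tokens
```

Note that the state here lives in the filter, not in the parser, so it runs ahead of rule reduction in just the way described above. A defined data name after PERFORM still comes back as ref_data, which is what lets an error production catch it.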
Keep in mind that the reason for all of this is to thoroughly handle all of
the complexity of references down in the reference rules; exactly to keep it
out of the statement and expression rules (where we would just be multiplying
the problem and those rules are challenging enough). For example we do not
want to code determination and error handling of an invalid qualification (an
OF/IN clause group) inside the rule for MOVE and again for ADD. The same
applies to reference modification diagnosis (we do not want to do it again
and again in MOVE and INSPECT). Yet if we do reduce procedure references
separately, we need a few error productions up at the higher level, as the
rejected_move_stmt rule suggests.
My examples have implied that the nodes passed up from reference reduction
rules contain an error flag member. That flag can serve in three ways. The
statement and expression rules can interrogate it to short circuit code
emission. It can be passed along via a member in the AST structure so that
code generation can be halted later (passing the buck). Or it can record that
erroneous code was ignored and that the node is really a reference to a dummy
(reference substitution), to support complete source code compilation (but
prevent generation of invalid executables).
So trying to tie it all together, let me describe it this way.
If a reference is inherently invalid syntactically, that error should be
diagnosed in the rules for references. For example, reference modifying an
item that you cannot reference modify, or subscripting an item not dominated
by an occurs clause. This is to get the complexity out of the higher level
rules. But those rules (even the positive productions) must be prepared for
error flags or dummy references.
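As an illustration only, here is a small Python sketch of a reference rule that diagnoses an invalid subscript itself and passes up a flagged dummy node, plus a positive production that checks the flag. The RefNode layout, the symbol table shape, the DUMMY-ITEM name and both function names are assumptions, not anything settled in the project.

```python
from dataclasses import dataclass


@dataclass
class RefNode:
    name: str
    kind: str             # "data" or "proc"
    error: bool = False   # set when the reference itself was invalid
    message: str = ""


def reduce_subscripted_ref(name, symt):
    """Reference-level rule: the diagnosis happens here, once."""
    entry = symt.get(name)
    if entry is None or not entry.get("occurs"):
        # Substitute a dummy so compilation of the statement can continue.
        return RefNode(
            name="DUMMY-ITEM",
            kind="data",
            error=True,
            message=name + " is not subordinate to an OCCURS clause",
        )
    return RefNode(name=name, kind="data")


def reduce_add(operands):
    """Positive production: still prepared for error flags from below."""
    if any(node.error for node in operands):
        return None  # short circuit code emission
    return ("ADD", [node.name for node in operands])
```

The ADD rule knows nothing about OCCURS. It only asks whether any operand arrived flagged, and the same check would serve MOVE or INSPECT unchanged.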
When a syntactically valid reference occurs incorrectly in a statement (that
is, it is 'out of context'), we probably need an error production to trap it.
Such error productions are parallel to the valid productions for those kinds
of statements or expressions. These rules will need to turn on the sub-node
error flags and do substitutions to dummy references to support further
compilation.
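What the action of a rule like rejected_move_stmt might do can be sketched as follows, in Python for brevity. The Ref type, the DUMMY_DATA placeholder, the diagnostics list and the reduce_move function are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    name: str
    kind: str             # "data" or "proc"
    error: bool = False


# Hypothetical placeholder standing in for any rejected operand.
DUMMY_DATA = Ref("DUMMY-DATA", "data", error=True)


def reduce_move(src, dst, line_number, diagnostics):
    """Handle both the positive and the rejected forms of MOVE."""
    misplaced = [ref for ref in (src, dst) if ref.kind != "data"]
    if misplaced:
        # One diagnostic per statement, as in the rejected_move_stmt rule.
        diagnostics.append(
            (line_number, "procedure name not valid in move statement")
        )
        # Reference substitution keeps later phases on their feet.
        if src.kind != "data":
            src = DUMMY_DATA
        if dst.kind != "data":
            dst = DUMMY_DATA
    error = bool(misplaced) or src.error or dst.error
    return {"verb": "MOVE", "src": src, "dst": dst, "error": error}
```

The statement node always comes back with two data references, so later phases never see a procedure name inside a MOVE, and the error member tells code generation not to emit anything for it.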
Bob Rayhawk
RKRayhawk@aol.com
--
This message was sent through the gnu-cobol mailing list. To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body. For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.