
gnubol: problem 2.



In a message dated 12/22/99 6:19:34 PM EST, tej@melbpc.org.au writes:

<< 
 /* 
 
    problem 2.
    
    - after a qualified name, when hitting a "(" you can't tell if
 it
    is an array reference or a reference modification or a new
    expression (inside a function argument list)
    
   solution 2. 
   - change tokens in prescan as follows.
 
 /* from tk_left_parenthesis */
 
 %token tk_left_parenthesis_M_M_starts_refmod_M
  >>



  There are several problems here, and I am not sure I understand all of
  them. Taking perhaps just part of the set of problems:

I think we need to parse what we get, not what we want. I will start by 
putting the function calls to the side.

A form like 
    data_name     

with no following parenthetical form must be easy to detect.

A form like
   data_name ( something )

also is easy to detect, and should be easy to distinguish from 
    data_name ( something something )

With prudent rules and precedence we should be able to keep this distinct from
  data_name ( something :  )
and
  data_name ( something : something )
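
As a rough illustration of why these five shapes stay distinct, here is a
minimal sketch in Python (chosen only for brevity; a real parser would say
this in the grammar). The name classify_reference and the flat token list
are assumptions made for the example, and each "something" is taken to be
a single token, which is exactly the simplification that breaks down later.

```python
def classify_reference(tokens):
    """Classify a flat token list such as ["A", "(", "I", ":", "J", ")"]."""
    if len(tokens) == 1:
        return "simple"                      # data_name
    if len(tokens) < 3 or tokens[1] != "(" or tokens[-1] != ")":
        return "unrecognized"
    inner = tokens[2:-1]
    if not inner:
        return "unrecognized"                # data_name ( )
    if ":" in inner:
        # Reference modification: one start item, one colon, optional length.
        if inner.index(":") == 1 and inner.count(":") == 1 and len(inner) in (2, 3):
            return "refmod"
        return "unrecognized"
    return "subscripted"                     # one or more subscripts, no colon
```

The colon alone separates the last two shapes from the subscripted ones,
so no lookahead past the parenthetical form is needed in this simple case.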


We begin to see possible ambiguity when we allow the somethings to have 
optional add-on material (like increments for subscripts), and when we make 
the somethings into recursions (especially if we allow any recursion to have 
an /*empty*/ alternate that generates an epsilon).

So I guess I have to ask exactly what is the problem? And I am thinking there 
must be several.

Part of this gets easier if you accept that you must see all possible 
reference forms even where they do not belong.  Thus we have something like a 
set of low level rules that see all, and we gather them differently, as in 
accept_a_b_c/reject_d_e_f and just give me the result.


------
I would like you to cite examples of the kind of problems you see in the 
function invocations.
---------

There is another idea available to you that could radically change your view. 
It may be best to consider every touch point in the standard defined grammar 
where a data reference can occur to actually be a place where you should 
anticipate a list of data names.

In a large number of places that seems senseless, but applying this approach 
has some potential advantages in complex constructs and can add marginally to 
the robustness of the parser in the many minor cases where a list of 
data_refs is not appropriate.

So assume that every data_ref is switched to a data_list in your grammar.  
Now, in the rule actions, in many, many cases you test is_it_single_item: 
if yes, send it to semantics; if it was multiple items, issue a diagnostic. 
(Not much work, gain, or loss so far.)
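
A rule action of that kind might look roughly like the following. This is
a sketch under assumptions: the helper reduce_data_list, the list-of-strings
representation, and the diagnostic wording are all invented here; only the
name is_it_single_item comes from the text above.

```python
def is_it_single_item(items):
    """True when the gathered data_list holds exactly one reference."""
    return len(items) == 1

def reduce_data_list(items, diagnostics):
    """Action for a position where only one data reference is allowed."""
    if is_it_single_item(items):
        return items[0]                      # hand the lone reference to semantics
    diagnostics.append(
        f"expected one data reference, found {len(items)}: {' '.join(items)}"
    )
    # Recover by keeping the first reference so parsing can continue.
    return items[0] if items else None
```

The point of the design is that the grammar never fails on the extra
names; the complaint is raised in the action, and the parse carries on.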

Now on the arithmetic statements things are more interesting.  You can have 
data lists in certain combinations on the positions where a data reference of 
some kind is called for. INITIALIZE and SET also have this peculiarity.  So 
when you think about it, it is interesting that you also need to be able to 
stay on your feet when folk cram multiple data references into positions 
where there ought to be one. Hey, why not just always look for a list?

(By the by, USING clauses are lists.)


When you get home to the rule that asks for the subrules' gathering activity, 
open the bag and see what you got.

So anyway that is just some radical thinking for you. You can see how that 
might help with the function calls, and allow us to construct a parser that 
can cope with future functions.

If I catch the drift of your concern about function parameters: since commas 
are not necessary,
  FUNCTION funcname( something (something))
is not only 'apparently' ambiguous, it is absolutely ambiguous free of 
context. You simply cannot impose order on it with a context-free grammar. 
You must apply the rules associated with the external definition of the 
function's signature to know if the source code is right. That is context 
dependence. Which is okay; I just would not let that paint you into any 
corners on all the other dataname reference formats for this language.

There is a gray area here as to whether this is syntax or semantics.  But 
since the functions in this language are defined in standards it is possible 
to do this with hardcoded or table driven mechanisms at parse time.
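
A table-driven check of this sort could be sketched as below. Everything
here is illustrative: the table FUNCTION_ARITY, the helpers
candidate_splits and resolve, and the string form of the arguments are
assumptions for the example, and the three entries should be checked
against the standard before anyone relies on them.

```python
# Illustrative table: (minimum, maximum) argument counts; None = no upper bound.
FUNCTION_ARITY = {
    "LENGTH": (1, 1),
    "MOD": (2, 2),
    "MAX": (1, None),
}

def candidate_splits(name, inner):
    """Both context-free readings of `name ( inner )` in an argument list."""
    return [
        [f"{name}({inner})"],        # one argument: name subscripted by inner
        [name, f"({inner})"],        # two arguments: name, then ( inner )
    ]

def resolve(function, name, inner):
    """Keep only the readings whose argument count fits the function's table entry."""
    low, high = FUNCTION_ARITY[function]
    return [
        args for args in candidate_splits(name, inner)
        if len(args) >= low and (high is None or len(args) <= high)
    ]
```

With a fixed argument count the table settles the question. For a
function that takes a variable number of arguments both readings survive,
and something further (for example, whether the name is declared as a
table) would have to decide.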

There is an analogous problem with the EVALUATE ALSO phrases in WHEN clauses. 
How many ALSO phrases should you have in a given situation?  Here we have a 
dynamic topology that is specified on the fly each time the coder flings out 
another EVALUATE statement. That is the ultimate in context dependence. This 
too will end up in the actions of the parser or in semantics.
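
The check that ends up in the actions could be as small as the sketch
below. It assumes the parser has already counted the selection subjects of
the EVALUATE and the selection objects of each WHEN; the function name
check_evaluate and the message text are made up for the example, and
WHEN OTHER is not modeled.

```python
def check_evaluate(subject_count, when_object_counts):
    """Diagnose WHEN phrases whose object count differs from the subject count."""
    return [
        f"WHEN #{i}: expected {subject_count} selection objects, found {n}"
        for i, n in enumerate(when_object_counts, start=1)
        if n != subject_count
    ]
```

For an EVALUATE with two subjects and three WHEN phrases carrying 2, 1 and
3 objects, check_evaluate(2, [2, 1, 3]) reports the second and third WHEN.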

Anyway, back to the mainline here. I think you can specify the simple data 
reference, the subscripted data reference, the reference modified data 
reference and combinations thereof without ambiguity.  IMHO it is best to 
isolate the function invocation argument list and consider manual code as a 
final check of it.

It may be useful for you to begin looking at making every data reference a 
list of data references.


Best Wishes
Bob Rayhawk
RKRayhawk@aol.com

--
This message was sent through the gnu-cobol mailing list.  To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body.  For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.