gnubol: problem 2.
In a message dated 12/22/99 6:19:34 PM EST, tej@melbpc.org.au writes:
<<
/*
problem 2.
- after a qualified name, when hitting a "(" you can't tell if it
is an array reference or a reference modification or a new
expression (inside a function argument list)
solution 2.
- change tokens in prescan as follows.
/* from tk_left_parenthesis */
%token tk_left_parenthesis_M_M_starts_refmod_M
>>
There are several problems here, and I am not sure I understand all of
them. Taking perhaps just part of the set of problems:
I think we need to parse what we get, not what we want. I will start by
putting the function calls to one side.
A form like
data_name
with no following parenthetical form must be easy to detect.
A form like
data_name ( something )
also is easy to detect, and should be easy to distinguish from
data_name ( something something )
With prudent rules and precedence we should be able to keep this distinct from
data_name ( something : )
and
data_name ( something : something )
We begin to see possible ambiguity when we allow the somethings to have
optional add-on material (like increments for subscripts), and when we make
the somethings into recursions (especially if we allow any recursion to have
an /*empty*/ alternate that generates an epsilon).
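To make the distinction concrete, here is a minimal Python sketch of the kind of look-ahead a prescan could do. It is not taken from the gnubol sources: the token lists and the names classify_paren and classify_reference are invented for illustration, and it assumes the lexer has already split the source into simple string tokens. The idea is only that a ":" at the top level of a parenthetical group marks a reference modification, and anything else is treated as a subscript list.

```python
def classify_paren(tokens, i):
    """Classify the parenthetical group that opens at tokens[i].

    Returns (kind, index_of_matching_close), where kind is "refmod" if a
    ":" appears at the top level of the group and "subscript" otherwise.
    """
    if tokens[i] != "(":
        raise ValueError("expected '(' at position %d" % i)
    depth = 0
    kind = "subscript"
    for j in range(i, len(tokens)):
        tok = tokens[j]
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth == 0:
                return kind, j
        elif tok == ":" and depth == 1:
            # A colon nested deeper belongs to an inner reference.
            kind = "refmod"
    raise ValueError("unbalanced parentheses")


def classify_reference(tokens):
    """Return the kinds of the groups that follow the data name in tokens[0]."""
    kinds = []
    i = 1
    while i < len(tokens) and tokens[i] == "(":
        kind, close = classify_paren(tokens, i)
        kinds.append(kind)
        i = close + 1
    return kinds


print(classify_reference(["A", "(", "I", "J", ")", "(", "1", ":", "2", ")"]))
```

A prescan built this way could rewrite the "(" token before the parser sees it, which is roughly what the proposed tk_left_parenthesis variants are for. It does not settle the function argument case, which needs more than look-ahead.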
So I guess I have to ask: exactly what is the problem? I am thinking there
must be several.
Part of this gets easier if you accept that you must see all possible
reference forms even where they do not belong. Thus we have something like a
set of low level rules that see all, and we gather them differently, as in
accept_a_b_c/reject_d_e_f and just give me the result.
------
I would like you to cite examples of the kind of problems you see in the
function invocations.
---------
There is another idea available to you that could radically change your view.
It may be best to consider every touch point in the standard defined grammar
where a data reference can occur to actually be a place where you should
anticipate a list of data names.
In a large number of places that seems senseless, but applying this approach
has some potential advantages in complex constructs and can add marginally to
the robustness of the parser in the many minor cases where a list of
data_refs is not appropriate.
So assume that every data_ref is switched to a data_list in your grammar.
Now, in the rule actions, in many, many cases you test is_it_single_item:
if yes, send it to semantics; if it was multiple items, issue a diagnostic
(not much work, gain or loss so far).
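As a rough picture of such a rule action, here is a small Python sketch. The function name reduce_single_ref, the diagnostics list, and the message wording are all hypothetical; a real yacc action would be written in C against the parser's own structures. The choice to carry on with the first item after an error is also just one possible recovery policy.

```python
def reduce_single_ref(data_list, diagnostics, where):
    """Hypothetical action for a position that wants exactly one data reference.

    data_list   -- the references the low-level list rule gathered
    diagnostics -- a list that collects message strings
    where       -- a label for the position, used only in messages
    Returns the reference to hand to semantics, or None if there is none.
    """
    if len(data_list) == 1:
        return data_list[0]
    if not data_list:
        diagnostics.append("%s: data reference expected" % where)
        return None
    diagnostics.append(
        "%s: expected one data reference, found %d" % (where, len(data_list))
    )
    # Keep the first item so later checks still have something to work on.
    return data_list[0]


diags = []
print(reduce_single_ref(["WS-A", "WS-B"], diags, "MOVE source"))
print(diags)
```

The list rule itself never fails on a second reference, so the parser stays on its feet and the complaint comes from the action instead of from a syntax error.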
Now on the arithmetic statements things are more interesting. You can have
data lists in certain combinations on the positions where a data reference of
some kind is called for.
INITIALIZE and SET also have this peculiarity. So when you think about it,
it is interesting that you also need to be able to stay on your feet when
folk cram multiple data references into positions where there ought to be
one. Hey, why not just always look for a list?
(By the by, USING clauses are lists.)
When you get home to the rule that asks for the subrules' gathering activity,
open the bag and see what you got.
So anyway that is just some radical thinking for you. You can see how that
might help with the function calls, and allow us to construct a parser that
can cope with future functions.
If I catch the drift of your concern about function parameters: since commas
are not necessary,
FUNCTION funcname( something (something))
is not only 'apparently' ambiguous, it is absolutely ambiguous when taken
free of context. You simply cannot impose order on it with a context free
grammar. You must apply the rules associated with the external definition of
the function's signature to know if the source code is right. That is context
dependence. Which is okay, I just would not let that paint you into any
corners on all the other dataname reference formats for this language.
There is a gray area here as to whether this is syntax or semantics. But
since the functions in this language are defined in standards it is possible
to do this with hardcoded or table driven mechanisms at parse time.
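A table driven check of that kind might look like the following Python sketch. Everything in it is an assumption made for illustration: the ARITY table holds only two sample entries and is not a complete or authoritative list, the helper names are invented, and the argument list is taken to be already tokenized. The sketch enumerates both readings of each "name ( ... )" pair, as one subscripted reference or as two separate arguments, and keeps the readings whose argument count matches the table.

```python
from itertools import product

# Illustrative arity table; a real one would come from the standard.
ARITY = {"LENGTH": 1, "MOD": 2}


def split_items(tokens):
    """Split an argument list into top-level items: names and paren groups."""
    items, depth, group = [], 0, []
    for tok in tokens:
        if tok == "(":
            depth += 1
            group.append(tok)
        elif tok == ")":
            depth -= 1
            group.append(tok)
            if depth == 0:
                items.append(" ".join(group))
                group = []
        elif depth > 0:
            group.append(tok)
        else:
            items.append(tok)
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    return items


def readings(tokens):
    """Every way to attach or detach a paren group that follows a name."""
    items = split_items(tokens)
    spots = [k for k in range(1, len(items))
             if items[k].startswith("(") and not items[k - 1].startswith("(")]
    result = []
    for choice in product([True, False], repeat=len(spots)):
        attach = {s for s, c in zip(spots, choice) if c}
        args = []
        for k, item in enumerate(items):
            if k in attach:
                args[-1] = args[-1] + " " + item   # subscripted reference
            else:
                args.append(item)                  # separate argument
        result.append(args)
    return result


def resolve(funcname, tokens):
    """Keep only the readings whose argument count matches the table."""
    return [r for r in readings(tokens) if len(r) == ARITY[funcname]]


print(resolve("LENGTH", ["A", "(", "B", ")"]))
print(resolve("MOD", ["A", "(", "B", ")"]))
```

The same tokens come out as one argument for a one-argument function and as two for a two-argument function, which is the context dependence in question. A function with a variable number of arguments would leave more than one reading standing, and that is where manual code or semantics has to take over.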
There is an analogous problem with the EVALUATE ALSO phrases in WHEN clauses.
How many ALSO phrases should you have in a given situation? Here we have a
dynamic topology that is specified on the fly each time the coder flings out
another EVALUATE statement. That is the ultimate in context dependence. This
too will end up in the actions of the parser or in semantics.
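For the EVALUATE case, the action level check can be as small as the Python sketch below. The function name and data shapes are invented for illustration: it assumes the parser has already counted the subjects on the EVALUATE line and gathered each WHEN phrase into a list of its ALSO-separated objects. WHEN OTHER is left out of the sketch.

```python
def check_evaluate(subject_count, when_clauses):
    """Report WHEN phrases whose object count differs from the subject count.

    subject_count -- number of ALSO-separated subjects after EVALUATE
    when_clauses  -- one list of objects per WHEN phrase
    """
    diagnostics = []
    for n, objects in enumerate(when_clauses, start=1):
        if len(objects) != subject_count:
            diagnostics.append(
                "WHEN %d: expected %d objects, found %d"
                % (n, subject_count, len(objects))
            )
    return diagnostics


# EVALUATE A ALSO B, with a second WHEN that is one object short.
print(check_evaluate(2, [["1", "ANY"], ["2"]]))
```

The expected count changes with every EVALUATE statement, so it has to be carried as a value in the action or in semantics; no fixed grammar rule can state it.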
Anyway, back to the mainline here. I think you can specify the simple data
reference, the subscripted data reference, the reference modified data
reference and combinations thereof without ambiguity. IMHO it is best to
isolate the function invocation argument list and consider manual code as a
final check of it.
It may be useful for you to begin looking at making every data reference a
list of data references.
Best Wishes
Bob Rayhawk
RKRayhawk@aol.com