
Re: gnubol: New bison Grammar available (long)



RKRayhawk@aol.com wrote:

> Your layout of %prec precedence tokens makes all of them less than all tokens.
> That is a bit nonsensical, meant mildly (once you catch the drift of this
> you too will laugh). So if you tag anything with any of your %prec tokens,
> then everything will try to shift on. If that made sense, all you would need
> is
> 
> %nonassoc PREC_LESS_SHIFT_ANYTHING_JOB_DONE
> 
> and nothing more.
> 
> That is not going to work.
> 

As I understand it, the precedence is only needed and used where
there is a (SR or RR) conflict in the grammar.

For example

IF a
    display y
    if b
       display z
    else *> #1
       display z2
else
    display z
#fullstop

if d
    display xx
else #2
    display yy
end-if
.
        

At #1 there is a conflict. I have to force a shift (actually it
defaults to shift but it is better to be explicit).

At #2 there is no conflict.

In fact everywhere there is an SR conflict I want a shift - it
just happens to be that way in this case. So I think it is valid
that all the precedence tokens are declared before the real tokens
and so have lower priority. That way the tokens get shifted.

If there were cases where I wanted a reduce to resolve the
conflict, I would have to declare the precedence token later on.
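
To make the shift-versus-reduce rule concrete, here is a minimal Python sketch of how a yacc-style tool settles an SR conflict once both the rule and the lookahead token carry a precedence. The function name and the integer levels are assumptions for illustration only (a larger number stands for "declared later"); this is not bison's actual code.

```python
def resolve_shift_reduce(rule_prec, token_prec, token_assoc):
    """Decide an SR conflict the way yacc-style precedence does.

    rule_prec / token_prec are illustrative integer levels (larger means
    declared later, i.e. higher precedence); None means "no precedence".
    """
    if rule_prec is None or token_prec is None:
        # Precedence cannot decide; the tool reports the conflict
        # and falls back to shifting.
        return "shift"
    if token_prec > rule_prec:
        return "shift"
    if token_prec < rule_prec:
        return "reduce"
    # Equal levels: associativity of the token decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[token_assoc]


# Hypothetical levels: the %prec tag on the IF rule is declared first (1),
# the ELSE token after it (2), so the ELSE at #1 is shifted.
PREC_IF_RULE, ELSE_TOKEN = 1, 2
print(resolve_shift_reduce(PREC_IF_RULE, ELSE_TOKEN, "nonassoc"))
```

Swapping the two levels makes the same call return "reduce", which is the "declare the precedence token later" case.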

The precedence will not create a parse error where none exists.
So I think I am OK. I will test your example to make sure.

> 
> --------------------------------------------------
> Part II
> --------------------------------------------------

later

> An idealistic hierarchical grammar is
> going to tax resources.

G++ supports 418 nested levels with the default stack size, and
this can be overridden.

> It turns out to be greatly useful, IMHO, to recognize
> that early emits are fine, even in an LR tool endeavor!
> 

All the parser will do is build a tree. Code emits are a lot
later. This is to aid modularity. 

> When you break up the hierarchy, all pressure is removed from the parser's
> stack: no matter what the nesting level of actual source code.

The main problems with the stack would be:

- if you had right-recursive list rules, in which case every
element of a list gets pushed onto the stack before anything is
reduced. If you use left recursion instead, each element gets
reduced as you go. In reality you have to do it that way anyway,
because you get SR conflicts with epsilon productions otherwise.

- if there are deeply nested programs. I don't think there is a
lot of this. Many of the compilers only allow a limited level of
nesting in any case. But you may need this for machine generated
code. Still, I think the stack can be very large, and if g++ is
any guide this should not be a problem.
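
The recursion point in the first item can be checked with a toy simulation. The sketch below is not a real LR parser; it only mimics the order of shifts and reductions for the two ways of writing a list rule, and the function names are made up for the illustration.

```python
def max_stack_depth_left(items):
    """list : item | list item   (left recursion: reduce after every item)"""
    stack, deepest = [], 0
    for _ in items:
        stack.append("item")                 # shift the next item
        deepest = max(deepest, len(stack))
        if stack == ["item"]:
            stack = ["list"]                 # reduce  list : item
        else:
            stack[-2:] = ["list"]            # reduce  list : list item
    return deepest


def max_stack_depth_right(items):
    """list : item | item list   (right recursion: nothing reduces early)"""
    stack, deepest = [], 0
    for _ in items:
        stack.append("item")                 # every item is shifted first
        deepest = max(deepest, len(stack))
    if stack:
        stack[-1:] = ["list"]                # reduce  list : item  (last one)
    while len(stack) > 1:
        stack[-2:] = ["list"]                # reduce  list : item list
    return deepest


print(max_stack_depth_left(range(1000)), max_stack_depth_right(range(1000)))
```

With left recursion the depth stays at two however long the list is; with right recursion it grows with the number of elements.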

> You may not be ready for that message.

Obviously not. But I have been warned.

> I am all for you. I believe that much will be done manually in
> the end.

In the data division it will be because parsing the nested levels
in bison is close to impossible.

> My initial concern was that optimistic hierarchies would tend to collapse
> upon unexpected input (omissions and commissions). Posters tell me not
> to worry.

This is valid IMO. But if there is a dangling end-if within the
code, what can you actually do with it? You can't guess what the
programmer intended.

In my grammar the error recovery will panic recover to the next
verb, so all is not lost, including existing nested structure.
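
As a rough sketch of what "panic recover to the next verb" means, assuming a flat token list and a tiny, hypothetical verb set (the real grammar does this inside bison's error handling, not in Python):

```python
# Hypothetical subset of COBOL verbs, for illustration only.
VERBS = {"DISPLAY", "MOVE", "IF", "PERFORM", "ADD"}


def resync_to_next_verb(tokens, error_pos, verbs=VERBS):
    """Return the index of the first verb after error_pos.

    Everything between the offending token and that verb is skipped.
    Returns len(tokens) if no further verb exists.
    """
    pos = error_pos + 1
    while pos < len(tokens) and tokens[pos].upper() not in verbs:
        pos += 1
    return pos


tokens = ["DISPLAY", "X", "END-IF", "Y", "MOVE", "A", "TO", "B"]
# The stray END-IF at index 2 is the error; parsing resumes at MOVE.
print(resync_to_next_verb(tokens, 2))
```

Statements already parsed before the error are kept, which is why the existing nested structure survives.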

> The earliest posts in this
> project's discussions, seem to dismiss out of hand even considering a manual
> parser

I wasn't around then, but having written manual lexers and
parsers I wish to avoid this if possible.

- Requires lots of code => lots of bugs. The most precious
resource in any free software project is people's time so we must
optimise it.

- It is hard to 'see' the grammar in the code

> A manual lexer is probably considered rightly out
> of reasonable bounds, but a hybrid tool assisted manual parser may well be
> useful if not outright necessitated.

It is inevitable that part of the parsing is beyond bison, e.g. in
the data division. Also the section/paragraph/sentence structure
is manual at present.

The only discussion is how far to push bison, in particular
whether it should do nested statements. At present I think it can
do them - so it will do them unless it fails in some respect.

Let's say I am wrong. Then we remove some of the rules and
delegate the task to the parser supervisor/cleaner-upper. Not a
lot of wasted code I think.


> (parser semantics interface)

I sketched out some ideas in my posting with the grammar. My
thinking is that it would pass a simplified parse tree back up.
Not all the productions would be reflected. The nesting would be
implied in the structure and there would be no noise words (like
end-if, which tells you nothing more once the structure has been
determined). Lists would probably be in arrays (e.g. lists of
children of a data item, lists of statements such as in an on size
error clause). It would probably be COBOL-friendly.
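
One possible shape for such a simplified tree, sketched in Python purely to illustrate the idea. The class and field names are assumptions, not the interface proposed in the grammar posting.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Display:
    operand: str


@dataclass
class If:
    condition: str
    # Nesting is carried by the lists themselves; there is no node for
    # ELSE or END-IF once the structure has been determined.
    then_branch: List["Statement"] = field(default_factory=list)
    else_branch: List["Statement"] = field(default_factory=list)


Statement = Union[Display, If]


def nesting_depth(stmt):
    """Count how deeply IF statements are nested below stmt."""
    if isinstance(stmt, If):
        children = stmt.then_branch + stmt.else_branch
        return 1 + max((nesting_depth(c) for c in children), default=0)
    return 0


# The first IF from the example earlier in this message.
outer = If("a",
           [Display("y"), If("b", [Display("z")], [Display("z2")])],
           [Display("z")])
print(nesting_depth(outer))
```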

...(a side note)

If you think COBOL is hard to parse, spare a thought for those
working on Fortran. It is far worse.

...

Rereading this it seems a bit terse. This is only to minimise
typing time.

Tim Josling
