
Re: gnubol: Re: Cost of Backtracking





RKRayhawk@aol.com wrote:

> (PCCTS is the chosen one.)

I am not convinced about PCCTS. It is more friendly in some respects but
you need to do a lot of tweaking.

> But the real parser must go after errors.

You are right that error productions tend to produce a huge number of
parser conflicts. But I am doubtful how much effort is worth putting
into recovering from parse errors.

These days I fix the first 2-3 errors and do another make. After the
first few you inevitably end up with a cascade of derived or repeated
errors. I was planning:

- "syntax error in statement starting at w, at x expecting a or b".
- panic mode recovery - skip to the next convenient point (e.g. the next
verb or period).
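
As a rough sketch of that plan (Python, not real compiler code): the
verb list, the token format and the names resync and report are all
made up for illustration. Tokens are assumed to be plain upper-case
strings, with "." as the period.

```python
# Hypothetical sync points; a real COBOL grammar has far more verbs.
VERBS = {"MOVE", "ADD", "IF", "PERFORM", "DISPLAY", "COMPUTE"}


def resync(tokens, pos):
    """Panic mode: return the index where parsing should resume.

    pos is the index of the offending token. We always consume it, then
    skip forward to the next verb, or to just past the next period,
    whichever comes first.
    """
    if pos < len(tokens) and tokens[pos] == ".":
        return pos + 1
    i = pos + 1
    while i < len(tokens):
        if tokens[i] == ".":
            return i + 1          # resume after the period
        if tokens[i] in VERBS:
            return i              # resume at the verb itself
        i += 1
    return len(tokens)            # ran off the end of the input


def report(tokens, start, pos, expected):
    """Build the one-line message: statement start, error point, expected set."""
    found = tokens[pos] if pos < len(tokens) else "end of file"
    return ("syntax error in statement starting at %s, at %s expecting %s"
            % (tokens[start], found, " or ".join(expected)))


tokens = "MOVE A TOO B MOVE C TO D .".split()
print(report(tokens, 0, 2, ["TO"]))
print("resume at token", resync(tokens, 2))
```

The point of always consuming the offending token is that recovery
cannot loop on the same error.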

> Now send in some junk, such as evil stuff to conform to those rules, and some
> other
> stuff like
>  data (sub1) of data (sub2) of data (sub3) (refmod)

I think this will croak on the first "of". The array references all go
at the end.
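
To make that concrete, here is a small Python recognizer for the shape
I have in mind: name {OF name} [(subscripts)] [(start:length)]. It is
only a sketch: tokens are assumed to be pre-split upper-case strings,
parentheses are assumed not to nest, and parse_identifier is a made-up
name, not anything from an existing parser.

```python
def _is_name(tok):
    return tok not in {"OF", "(", ")", ":"}


def _group(tokens, i):
    """If tokens[i] opens a parenthesised group, return (contents, next_index)."""
    if i < len(tokens) and tokens[i] == "(":
        try:
            close = tokens.index(")", i)
        except ValueError:
            raise SyntaxError("missing )")
        return tokens[i + 1:close], close + 1
    return None


def parse_identifier(tokens):
    """Return (names, subscripts, refmod) or raise SyntaxError."""
    if not tokens or not _is_name(tokens[0]):
        raise SyntaxError("data name expected")
    names, i = [tokens[0]], 1
    # All the qualifiers come first ...
    while i < len(tokens) and tokens[i] == "OF":
        if i + 1 >= len(tokens) or not _is_name(tokens[i + 1]):
            raise SyntaxError("data name expected after OF")
        names.append(tokens[i + 1])
        i += 2
    # ... then one subscript list, then an optional reference modifier.
    subs, refmod = [], None
    group = _group(tokens, i)
    if group and ":" not in group[0]:
        subs, i = group
        group = _group(tokens, i)
    if group:
        if ":" not in group[0]:
            raise SyntaxError("reference modifier expected")
        refmod, i = group
    if i != len(tokens):
        raise SyntaxError("unexpected %r at token %d" % (tokens[i], i))
    return names, subs, refmod


print(parse_identifier("A OF B OF C ( I J K ) ( 1 : 2 )".split()))
try:
    parse_identifier("A ( I ) OF B ( J )".split())
except SyntaxError as err:
    print(err)
```

Note that telling a subscript list from a reference modifier here needs
a look ahead for the colon, which is the ref mod vs array problem in
miniature.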

> if you can stay on your feet at all push some of that to your current counts.
> See what happens.
> 
> Did you say 1000? Send in 50,000 or a 100,000 and have real apps going
> against your
> HD resources, be sure to add error message output from your own compiler to
> the resource drain picture. Typically business environments do not load COBOL
> compilers into idle CPUs.

Watching the program, it was linear in the size of the input, although
if you start paging it will slow down dramatically. Given the price of
memory I am not putting any effort into saving memory. If it is not
linear, you have an explosion of combinations. That's why you have to
have YYACCEPT when you can, to prune the tree of live options.
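
A toy model of why the pruning matters (Python; this only counts
alternatives, it is not a parser, and live_options and its parameters
are invented for the illustration). Assume every statement can be read
in a fixed number of ways until it ends, and that "commit" stands in
for whatever the parser generator offers to discard the losing
alternatives.

```python
def live_options(n_statements, readings_per_statement, commit):
    """Return the peak number of parses kept alive at any one time."""
    live = 1
    peak = 1
    for _ in range(n_statements):
        live *= readings_per_statement   # every live parse forks
        peak = max(peak, live)
        if commit:
            live = 1                     # keep one reading, drop the rest
    return peak


print(live_options(10, 2, commit=False))  # grows as 2**10
print(live_options(10, 2, commit=True))   # stays at 2
```

Without the commit the peak grows exponentially with the number of
statements; with it the peak stays constant, so the total work is
linear in the input.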

> Do try to extend the length of your data_names, you are cheating.

This is linear also. I agree if you start doing lots of I/O it will be
slow. This would mainly happen if the tree of live options does not get
pruned.

> The use of PCCTS predicates is equivalent functionally to a hack in the lexer
> to parser interface.

Plus backtracking and multiple symbol lookahead.

> If the goal is a compiler and not a COBOL to C translator, _T_H_E_N_ our task
> _I_S_ error
> diagnosing, not valid program code gen.

I am not convinced on the need for elaborate error handling. I don't see
it in compilers I have used and as you point out so eloquently, it
causes a lot of grief. In a backtracking compiler you get an exponential
explosion. In an LALR(1) compiler, you get lots of conflicts.

You also had some comments about exporting the work from the parser into
manual checking. I *feel* that you can do better with manual checking,
provided the parser puts it into the right tree. But this view is based
on ignorance. I will have to learn by my own mistakes.

All this discussion is suggesting to me I should try a complete grammar
in bison and/or btyacc, to find all these issues. There must be heaps
more than these:

(ref mod vs array; ref mod vs expression; 88 vs <80 level in condition;
section headers on their own line; copy/replace verb horrors;
comment-entries; strange rules about what can and cannot be continued
like == can't be split).

Tim Josling

> 
> Best Wishes,
> Bob Rayhawk
> RKRayhawk@aol.com


--
This message was sent through the gnu-cobol mailing list.  To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body.  For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.