
Re: gnubol: Hacks needed to parse COBOL



I basically agree with you about what the parser should do. My main
objective is to get the parser to produce the correct structure in the
first place, so that I don't have to fix the structure up later or, worse,
have the parser get confused and output a spurious error message. The
hacks are just there to avoid that. See below.

Tim Josling

"J & C Migrations, Pty." wrote:

> ... not everything needs to be done by the parser...
> Jonathan
> At 06:13 AM 12/2/99 +1000, Tim Josling wrote:
> ... (we need to know if something is an array or not)
>
>
> >/* problem 2.
> >   - after a qualified name, when hitting a "(" you can't tell if it is an
> array reference or a reference modification
> ...I think a reference may be:
>         if c(3) of b > 0 ...
> I may be mistaken.

A brief moment of *panic* later...

"The general format for identifier is:
data-name [of/in data-name]... [of/in file/cd/report]
[({subscript} ...)] [(leftmost-character-position: [length])]"

Standard 4.3.8.4 - cut/paste didn't work so I had to slightly
paraphrase.
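
Given that format, one way to tell the two parenthesised forms apart is to
pre-scan the token stream from the "(" and look for a ":" at the outermost
nesting level. The following is only a sketch under that assumption, not
gnubol code; the function name and the flat list-of-strings token
representation are made up for illustration.

```python
def classify_paren_group(tokens, i):
    """Classify the parenthesised group that starts at tokens[i] == "(".

    Returns "refmod" if a ":" is seen at nesting depth 1 before the
    matching ")", otherwise "subscript".
    """
    depth = 0
    for tok in tokens[i:]:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth == 0:
                # Closed the group without seeing a top-level ":".
                return "subscript"
        elif tok == ":" and depth == 1:
            return "refmod"
    raise ValueError("unbalanced parentheses")


# c of b (3)      -> subscript
# c of b (2:5)    -> reference modification
kind_a = classify_paren_group(["C", "OF", "B", "(", "3", ")"], 3)
kind_b = classify_paren_group(["C", "OF", "B", "(", "2", ":", "5", ")"], 3)
```

A lexer could use the result to return two different token types for "(",
so the grammar itself never has to guess.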

>
> >/* problem 4.
> >   - you need to know that there is a giving coming to know whether a
> literal is
> >valid after the to
> >   and whether rounded is permitted
>
> If a literal follows the TO or FROM in the arithmetic statement, can't you
> require a subsequent GIVING?  That won't require any look-ahead.

Thanks. I have fixed this - I don't need the hack.

> >/* problem 11.
> >   - you need infinite lookahead to determine whether you are expecting a
> >procedure name or an expression after 'perform'.
> >     You could look up the type in the symbol table but what about forward
> >references etc.
>
> I think that with a single look-ahead you can determine whether the word
> that follows the PERFORM is a reserved word ...

Consider:

01 n pic 9.

perform *n* times
    display ...
end-perform
.
perform *M*
        *n* times
.
goback.

Section M.
p1.

After the perform, you can have an identifier which can be a procedure
name or a data item. Both can be qualified by at least one extra level
(two more tokens) before you know which one it is. The end-perform tells
you that it is an in-line perform with a possible expression following;
otherwise a procedure name must follow. I think this is the simplest
solution.
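
To illustrate the amount of lookahead involved, here is a rough sketch of a
token pre-scan. Note that it uses a different test from the end-perform
check described above: it skips the (possibly qualified or subscripted)
name and checks whether TIMES follows. The reserved-word set is a tiny
hypothetical stand-in, and the function names and token representation are
assumptions for illustration only.

```python
# Hypothetical, heavily reduced reserved-word list; a real one is far longer.
RESERVED = {"UNTIL", "VARYING", "WITH", "TEST", "TIMES", "THRU", "THROUGH",
            "OF", "IN", "DISPLAY", "MOVE", "ADD", "END-PERFORM"}


def skip_name(tokens, i):
    """Skip `name [OF|IN name]...` plus any parenthesised groups.

    tokens[i] is the name; returns the index of the first token after it.
    """
    i += 1
    while i + 1 < len(tokens) and tokens[i] in ("OF", "IN"):
        i += 2  # qualifier keyword and the qualifying name
    while i < len(tokens) and tokens[i] == "(":
        depth = 0
        while i < len(tokens):
            if tokens[i] == "(":
                depth += 1
            elif tokens[i] == ")":
                depth -= 1
            i += 1
            if depth == 0:
                break
    return i


def classify_perform(tokens, i):
    """tokens[i] is PERFORM; return "inline" or "out-of-line"."""
    nxt = tokens[i + 1] if i + 1 < len(tokens) else "."
    if nxt in RESERVED:
        return "inline"  # e.g. PERFORM UNTIL ..., PERFORM DISPLAY ...
    after = skip_name(tokens, i + 1)
    if after < len(tokens) and tokens[after] == "TIMES":
        return "inline"  # the name was the repeat count
    return "out-of-line"  # the name was a procedure name


first = classify_perform(
    ["PERFORM", "N", "TIMES", "DISPLAY", "X", "END-PERFORM", "."], 0)
second = classify_perform(["PERFORM", "M", "N", "TIMES", "."], 0)
```

The scan is unbounded in principle, because the qualification chain can be
of any length, which is why a fixed one-token lookahead is not enough here.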

>
> J & C Migrations, Pty.

I found another problem while fixing these. It comes from accommodating
nested conditional expressions.

add a to b 
    size error
       unstring
           on

After 'on' could be 'overflow' or 'size'.

If it's 'overflow', then the unstring continues; if it's 'size', then the
unstring is finished and should be reduced. This is a shift/reduce (SR)
conflict, but the default shift does not resolve it. We need to know what
is coming.
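
One common way to give the parser that knowledge is to have the lexer peek
one token past 'on' and hand back a single fused token. The sketch below
assumes a flat list of upper-cased token strings; the fused token names
(ON-SIZE, ON-OVERFLOW) and the function name are hypothetical and only
illustrate the idea.

```python
# Words that may follow ON in a conditional phrase (reduced, illustrative set).
FUSABLE = {"SIZE", "OVERFLOW", "EXCEPTION"}


def fuse_on(tokens):
    """Merge ON with the word after it, e.g. ON SIZE -> ON-SIZE.

    The parser then sees which phrase is starting with a single token of
    lookahead instead of needing two.
    """
    out = []
    i = 0
    while i < len(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tokens[i] == "ON" and nxt in FUSABLE:
            out.append("ON-" + nxt)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


fused = fuse_on(["UNSTRING", "X", "INTO", "Y", "ON", "OVERFLOW", "DISPLAY", "Z"])
```

With ON-OVERFLOW as the lookahead the parser shifts and the unstring
continues; with ON-SIZE it reduces the unstring first.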

So I have removed one hack and added another, but the parser is
improved, so thanks.

The 'section' lookahead problem also comes from accommodating the fact
that paragraph names may be missing from sections and the first sentence
in the procedure division may have no paragraph name before it (IBM
supports this). In this case an identifier after a "section x ." could
be the next section name or a paragraph name. There is an SR conflict
over whether to reduce the optional paragraph or not.
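
A rough sketch of the extra peek this needs, assuming headers are tokenised
in the standard order "name SECTION ." and that the word being classified
starts a sentence. The function name and the returned labels are invented
for illustration.

```python
def classify_label(tokens, i):
    """Classify the user-defined word at tokens[i] by the token after it."""
    nxt = tokens[i + 1] if i + 1 < len(tokens) else None
    if nxt == "SECTION":
        return "section-name"
    if nxt == ".":
        return "paragraph-name"
    # Anything else: the sentence starts without a paragraph name.
    return "statement"


stream = ["M", "SECTION", ".", "P1", ".", "N", "SECTION", "."]
kinds = [classify_label(stream, i) for i in (0, 3, 5)]
```

If the lexer returns distinct token types for the two kinds of name, the
grammar can decide whether to reduce the empty optional paragraph without
looking at a second token.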

Tim Josling


--
This message was sent through the gnu-cobol mailing list.  To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body.  For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.