
Re: gnubol: Re: which parser tool

To: gnu-cobol@lusars.net
Subject: Re: gnubol: Re: which parser tool
From: RKRayhawk@aol.com
Date: Thu, 25 Nov 1999 00:59:04 EST
Delivered-To: gnu-cobol-outgoing@wallace.lusars.net
Reply-To: gnu-cobol@lusars.net
Sender: owner-gnu-cobol@wallace.lusars.net

In a message dated 11/24/99 1:02:13 PM EST, TIMJOSLING@prodigy.net writes:

<< 
 tk_condition_name
 
 So it's a bit like a sed s/a/b/ statement. I can scan for this in the error
 messages and fix them up to meaningful values including removing tk_.
  >>

In addition to gravitating to an API for error processing, let me urge a 
complete and total externalization of error text.  Inside we should use error 
numbers, possibly enums (that can get lengthy).   All things like tokens and 
types, such as 'condition-name', should be reduced to encoded values, 
basically ints.

Text outbound to the error processor should not actually be moved around. It 
will be hanging in trees, so the tree needs to move to the error process, not 
the particles of text.  Actually the details about the error can go on the 
tree too.  But it need not be perfectly clear in the beginning.

Think of a 
    gnubol_error(file_id, line_number, error_code, detail_1, detail_2,
                 AST_NODE);

where detail_1 and detail_2 are frequently NULL, and error_code could be a 
literal:

    gnubol_error(file_id, line_number, 305, COND_REF_TOKEN, NULL, AST_NODE);
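
A minimal sketch of how such a call might look in C, offered only as an 
illustration. The record layout, the queue, the value 17 for COND_REF_TOKEN 
and the use of int details (with TK_NONE standing in for NULL) are all 
assumptions, not settled design. The error code is written as 305 rather than 
0305 because C reads a literal with a leading zero as octal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical encoded values; the real numbering is not settled here. */
enum { TK_NONE = 0, COND_REF_TOKEN = 17 };

struct ast_node;                      /* opaque to the error API */

/* One queued error: integers and a tree pointer only, no text. */
struct gnubol_error_rec {
    int file_id;
    int line_number;
    int error_code;                   /* key into the external message file */
    int detail_1;                     /* encoded value, TK_NONE if unused */
    int detail_2;
    const struct ast_node *node;      /* tree the error hangs on, may be NULL */
};

#define GNUBOL_MAX_ERRORS 64

static struct gnubol_error_rec gnubol_errors[GNUBOL_MAX_ERRORS];
static int gnubol_error_count = 0;

/* Queue one error for the error process.  Returns the slot used,
   or -1 when the queue is full. */
int gnubol_error(int file_id, int line_number, int error_code,
                 int detail_1, int detail_2, const struct ast_node *node)
{
    struct gnubol_error_rec *rec;

    assert(error_code > 0);
    if (gnubol_error_count >= GNUBOL_MAX_ERRORS)
        return -1;

    rec = &gnubol_errors[gnubol_error_count];
    rec->file_id = file_id;
    rec->line_number = line_number;
    rec->error_code = error_code;
    rec->detail_1 = detail_1;
    rec->detail_2 = detail_2;
    rec->node = node;
    return gnubol_error_count++;
}
```

A production action would then do no more than 
gnubol_error(1, 42, 305, COND_REF_TOKEN, TK_NONE, NULL); and carry on.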


Before we get too far along we need to make commitments on that API so there 
is not too much to go back and change. It is not inconceivable that all the 
decorations would be on the AST_NODE, so

    gnubol_error( AST_NODE);

with the error details set into the members of AST_NODE beforehand.
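
The single-argument form could be sketched roughly like this. The node 
members and the error chain (error_head, error_tail, next_error) are 
hypothetical names chosen for illustration; the point is only that the node 
itself carries the details and nothing is copied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical AST node carrying its own error decorations. */
struct ast_node {
    int file_id;
    int line_number;
    int token_code;
    int error_code;               /* 0 means no error on this node */
    int detail_1;
    int detail_2;
    struct ast_node *next_error;  /* link used only by the error process */
};

static struct ast_node *error_head = NULL;
static struct ast_node *error_tail = NULL;

/* Hand an already-decorated node to the error process.  The node is
   appended to a chain in the order reported; no text is touched. */
void gnubol_error(struct ast_node *node)
{
    assert(node != NULL && node->error_code != 0);

    node->next_error = NULL;
    if (error_tail != NULL)
        error_tail->next_error = node;
    else
        error_head = node;
    error_tail = node;
}
```

The caller sets node->error_code (and any details) first, then calls 
gnubol_error(node).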

Generally the API to the error process can be sketched as identical to the 
API to semantics; we just have two different flows: one is the road ahead, 
one is the ditch.

But early on, we ought to utterly externalize text.  We should not be 
compiling error text. It should reside in a file, perhaps one that is 
accessed randomly.  The tuples in the file can have resolvable printf/scanf % 
specifications: a %s to resolve a PROG_NAME in an error message, a %d for the 
line number, and a %d for the error number itself. The error number need not 
be stored in the file, as it is implicit in the ordinality of the tuple; the 
value will be on hand anyway, because that is how we look up the error 
message text. We will need a mechanism to resolve conventionalized token 
values to text; we definitely do not need that up in the source code.
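
As a rough sketch of the expansion step only, assuming the template string 
has already been fetched from the message file by error number. The function 
name gnubol_expand and its argument convention (ints and strings consumed in 
order of appearance) are assumptions for illustration. It handles just %d and 
%s, and walks the template itself rather than handing file-supplied text to 
printf as a format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand %d and %s in tmpl into out, taking values in order from ints[]
   and strs[].  Returns the length written (excluding the NUL), or -1 if
   out is too small or the template asks for more values than supplied. */
int gnubol_expand(char *out, size_t outsz, const char *tmpl,
                  const int *ints, int n_ints,
                  const char *const *strs, int n_strs)
{
    size_t used = 0;
    int ii = 0, si = 0;

    assert(out != NULL && tmpl != NULL);
    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*tmpl != '\0') {
        size_t room = outsz - used;          /* always at least 1 */

        if (tmpl[0] == '%' && (tmpl[1] == 'd' || tmpl[1] == 's')) {
            int n;
            if (tmpl[1] == 'd') {
                if (ii >= n_ints)
                    return -1;
                n = snprintf(out + used, room, "%d", ints[ii++]);
            } else {
                if (si >= n_strs)
                    return -1;
                n = snprintf(out + used, room, "%s", strs[si++]);
            }
            if (n < 0 || (size_t)n >= room)
                return -1;
            used += (size_t)n;
            tmpl += 2;
        } else {
            /* Copy literal text up to the next '%'; a stray '%' that is
               not followed by d or s is copied as an ordinary character. */
            size_t run = strcspn(tmpl, "%");
            if (run == 0)
                run = 1;
            if (run >= room)
                return -1;
            memcpy(out + used, tmpl, run);
            used += run;
            out[used] = '\0';
            tmpl += run;
        }
    }
    return (int)used;
}
```

With the template "%s: line %d: error %d: condition-name expected", ints 
{12, 305} and strs {"PAYROLL"}, the buffer ends up holding 
"PAYROLL: line 12: error 305: condition-name expected".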

If we resolve token SECTION to an integer value, say 397, then it is 397 that 
is passed around, expressed possibly as an enum like rw_SECTION, but it is a 
number, not text, until the error process goes to work and expands it by 
simple table look-up.
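
The look-up itself can be as small as the following sketch. Only rw_SECTION 
= 397 comes from the paragraph above; the neighbouring codes, the rw_FIRST 
and rw_LAST bounds and the name gnubol_token_text are made up for the 
example, and in practice the text table would live with the error process 
(or in its file), not in the parser.

```c
#include <assert.h>
#include <stddef.h>

/* Only rw_SECTION = 397 is taken from the discussion; the neighbouring
   values are invented so the table has more than one entry. */
enum gnubol_rw {
    rw_FIRST     = 396,
    rw_PROCEDURE = 396,
    rw_SECTION   = 397,
    rw_DIVISION  = 398,
    rw_LAST      = 398
};

/* Text for each code, indexed by (code - rw_FIRST). */
static const char *const rw_text[] = { "PROCEDURE", "SECTION", "DIVISION" };

/* Expand an encoded reserved word to its text, or NULL if the code is
   outside the table.  Only the error process needs to call this. */
const char *gnubol_token_text(int code)
{
    /* The table must cover the whole enum range. */
    assert((int)(sizeof rw_text / sizeof rw_text[0])
           == rw_LAST - rw_FIRST + 1);

    if (code < rw_FIRST || code > rw_LAST)
        return NULL;
    return rw_text[code - rw_FIRST];
}
```

The parser only ever sees rw_SECTION; gnubol_token_text(rw_SECTION) is 
called once, at the point where the message is finally written out.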

All of the above is IMHO, but this will save lots and lots of time. If you 
try to paste the error text together in production actions then in C you 
inevitably move the same stuff on and off the stack from global constants. In 
the IA-32 architecture that will exhaust registers pretty emphatically.  It 
is all unnecessary in any event. The error processor should be our text 
processor. The parser(s) should be really focused on the source code token 
stream.

And perhaps one aside that fits here but relates to a previous interaction.  
In the preprocess, complications we may wish to diagnose might very well span 
lines of source code. But generally, error messages need to refer to only one 
line. I would even consider enforcing that restriction to keep the bandwidth 
to error processing small. A really difficult situation just turns into two 
error messages. That can happen through two API invocations, or through some 
nifty thing in a tuple in the file that says, read the next error message 
too. But generally it may be feasible to think of the interaction with the 
compiler user as referring to only one line at a time. What I am after in 
this last comment is to continue to support the interest of keeping the lex 
to parser interface as simple as possible, and the error interface simple.

Generally, IMHO, we should not pass text through the error process API, 
except as it already resides in AST nodes (or similar trees). Everything else 
is conventionalized into integers, which may reside in such trees bound for 
the error process.

Best Wishes
Bob Rayhawk
RKRayhawk@aol.com


