Re: gnubol: Re: which parser tool
In a message dated 11/24/99 1:02:13 PM EST, TIMJOSLING@prodigy.net writes:
<<
tk_condition_name
So it's a bit like a sed s/a/b/ statement. I can scan for this in the error
messages and fix them up to meaningful values including removing tk_.
>>
In addition to gravitating to an API for error processing, let me urge a
complete and total externalization of error text. Internally we should use error
numbers, possibly enums (though those can get lengthy). All things like tokens and
types, such as 'condition-name', should be reduced to encoded values,
basically ints.
Text outbound to the error processor should not actually be moved around. It
will already be hanging in trees, so the tree should move to the error
processor, not the particles of text. The details about the error can go on
the tree too, though that need not be perfectly clear in the beginning.
Think of a
    gnubol_error(file_id, line_number, error_code, detail_1, detail_2,
                 AST_NODE);
where detail_1 and detail_2 are frequently NULL. The error_code could be a literal:
    gnubol_error(file_id, line_number, 305, COND_REF_TOKEN, NULL, AST_NODE);
In C the literal has to be 305 rather than 0305, because a leading zero makes
it octal.
Before we get too far along we need to make commitments on that API so there
is not too much to go back and change. It is not inconceivable that all the
decorations would be on the AST_NODE, so
    gnubol_error(AST_NODE);
with the error details set into the members of AST_NODE beforehand.
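
As a rough sketch of how the two call shapes could coexist, assuming details are encoded as ints and that the node carries the error members itself: the struct layout, the names gnubol_error_node, TK_NONE, and the pending queue are all hypothetical, and since C has no overloading the one-argument form gets its own name here.

```c
#include <stddef.h>

/* Hypothetical encodings; real values would come from the project's tables. */
enum { TK_NONE = 0, COND_REF_TOKEN = 1 };

/* Hypothetical AST node, reduced to the members the error path needs. */
struct ast_node {
    int file_id;
    int line_number;
    int error_code;    /* 0 means no error is attached */
    int detail_1;      /* encoded token or type, TK_NONE when unused */
    int detail_2;
    const char *text;  /* source text that already lives on the tree */
};

/* Queue of nodes bound for the error processor; only pointers are stored. */
#define MAX_PENDING 64
static const struct ast_node *pending[MAX_PENDING];
static size_t pending_count = 0;

/* Short form: every decoration is already on the node. */
int gnubol_error_node(const struct ast_node *node)
{
    if (node == NULL || node->error_code == 0 || pending_count >= MAX_PENDING)
        return -1;
    pending[pending_count++] = node;
    return 0;
}

/* Long form: set the details on the node, then hand the node over. */
int gnubol_error(int file_id, int line_number, int error_code,
                 int detail_1, int detail_2, struct ast_node *node)
{
    if (node == NULL)
        return -1;
    node->file_id = file_id;
    node->line_number = line_number;
    node->error_code = error_code;
    node->detail_1 = detail_1;
    node->detail_2 = detail_2;
    return gnubol_error_node(node);
}
```

A production action would then call something like gnubol_error(file_id, line, 305, COND_REF_TOKEN, TK_NONE, node) and carry on parsing; no message text is built or copied at that point.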
Generally the API to the error processor can be sketched as identical to the
API to semantics. We just have two different flows: one is the road ahead, one
is the ditch.
But early on, we ought to utterly externalize text. We should not be
compiling error text. It should reside in a file that is accessed randomly,
perhaps. The tuples in the file can have resolvable printf/scanf %
specifications: a %s to resolve a PROG_NAME in an error message, a %d for the
line number, and a %d to resolve the error number itself. The error number
need not be stored in the file, as it is implicit in the ordinality of the
tuple, but the value will be on hand because that is how we look up the error
message text.
We will need a mechanism to resolve conventionalized token values to text; we
definitely do not need that up in the source code.
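
To make the message file idea concrete, here is a minimal sketch. It assumes fixed-width, NUL-padded records so that the error number alone gives the file offset, and it expands only %d, %s and %% in order rather than handing the stored text to printf directly. RECORD_SIZE, catalog_lookup and expand_message are hypothetical names, not an agreed interface.

```c
#include <stdio.h>
#include <string.h>

#define RECORD_SIZE 80  /* assumed fixed record width, NUL-padded */

/* Fetch the template for error_code by seeking straight to its record.
   The code is not stored in the file; it is the record's ordinal.
   Returns 0 on success, -1 on failure. */
int catalog_lookup(FILE *catalog, int error_code, char out[RECORD_SIZE + 1])
{
    if (catalog == NULL || error_code < 0)
        return -1;
    if (fseek(catalog, (long)error_code * RECORD_SIZE, SEEK_SET) != 0)
        return -1;
    if (fread(out, 1, RECORD_SIZE, catalog) != RECORD_SIZE)
        return -1;
    out[RECORD_SIZE] = '\0';
    return 0;
}

/* Append piece to out, truncating if needed; returns the new length. */
static size_t append(char *out, size_t out_size, size_t used, const char *piece)
{
    size_t room = out_size - used - 1;
    size_t n = strlen(piece);
    if (n > room)
        n = room;
    memcpy(out + used, piece, n);
    out[used + n] = '\0';
    return used + n;
}

/* Expand %d and %s in order from the supplied arrays; %% gives a percent
   sign. Anything else, including a specifier with no value left, is
   copied through unchanged. */
void expand_message(const char *tmpl,
                    const int *ints, size_t n_ints,
                    const char *const *strs, size_t n_strs,
                    char *out, size_t out_size)
{
    size_t used = 0, i = 0, s = 0;
    if (out_size == 0)
        return;
    out[0] = '\0';
    for (const char *p = tmpl; *p != '\0'; p++) {
        char num[16];
        if (p[0] == '%' && p[1] == 'd' && i < n_ints) {
            snprintf(num, sizeof num, "%d", ints[i++]);
            used = append(out, out_size, used, num);
            p++;
        } else if (p[0] == '%' && p[1] == 's' && s < n_strs) {
            used = append(out, out_size, used, strs[s++]);
            p++;
        } else if (p[0] == '%' && p[1] == '%') {
            used = append(out, out_size, used, "%");
            p++;
        } else {
            char one[2] = { *p, '\0' };
            used = append(out, out_size, used, one);
        }
    }
}
```

Doing the expansion by hand keeps a damaged or mistyped record from driving printf with arguments that are not there, at the cost of supporting only the few specifiers mentioned above.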
If we resolve token SECTION to an integer value, say 397, then it is 397 that
is passed around, expressed possibly as an enum like rw_SECTION, but it is a
number, not text, until the error processor goes to work and expands it by
simple table look-up.
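
The table look-up could be as small as the following sketch. Only rw_SECTION = 397 comes from the text above; the other enum members, the table contents and the token_text name are made up for illustration.

```c
#include <stddef.h>

/* rw_SECTION = 397 is the value used in the text; the rest are invented. */
enum reserved_word {
    rw_DIVISION  = 396,
    rw_SECTION   = 397,
    rw_PARAGRAPH = 398
};

struct token_name {
    int code;
    const char *text;
};

/* Lives with the error processor, not in the parser's production actions. */
static const struct token_name token_names[] = {
    { rw_DIVISION,  "DIVISION"  },
    { rw_SECTION,   "SECTION"   },
    { rw_PARAGRAPH, "PARAGRAPH" },
};

/* Expand an encoded token to its text; unknown codes get a placeholder. */
const char *token_text(int code)
{
    size_t n = sizeof token_names / sizeof token_names[0];
    for (size_t i = 0; i < n; i++) {
        if (token_names[i].code == code)
            return token_names[i].text;
    }
    return "<unknown token>";
}
```

The parser only ever sees rw_SECTION; the string "SECTION" appears in exactly one place, and that table could itself be loaded from an external file.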
All of the above is IMHO, but this will save lots and lots of time. If you
try to paste the error text together in production actions then in C you
inevitably move the same stuff on and off the stack from global constants. In
the IA-32 architecture that will exhaust registers pretty emphatically. It
is all unnecessary in any event. The error processor should be our text
processor. The parser(s) should be really focused on the source code token
stream.
And perhaps one aside that fits here but relates to a previous interaction.
In the preprocessor, complications we may wish to diagnose might very well span
lines of source code. But generally, error messages need to refer to only one
line. I would even consider enforcing that restriction to keep the bandwidth
to error processing small. A really difficult situation just turns into two
error messages. That can happen through two API invocations, or through some
nifty thing in a tuple in the file that says, read the next error message too.
But generally it may be feasible to think of the interaction with the compiler
user as referring to only one line at a time. What I am after in this last
comment is to continue to support the interest of keeping the lex-to-parser
interface as simple as possible, and the error interface simple.
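
One possible shape for the "read the next error message too" idea, purely as a sketch: each tuple carries a flag that chains it to the following tuple, and every emitted message still names exactly one line. The tuple layout, the sample texts and the report_error name are all hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

struct message_tuple {
    const char *text;  /* message body; the line number is prefixed on output */
    int chain_next;    /* nonzero: also emit the tuple that follows */
};

/* Hypothetical catalog contents, indexed by error code. */
static const struct message_tuple catalog[] = {
    { "no error", 0 },
    { "COPY statement starts here and is not terminated", 1 },
    { "next statement begins before the COPY was closed", 0 },
};

/* Emit the message for error_code plus any chained tuples, consuming one
   line number per message. Returns the number of messages written. */
int report_error(FILE *out, int error_code, const int *lines, size_t n_lines)
{
    size_t n_catalog = sizeof catalog / sizeof catalog[0];
    size_t code;
    int emitted = 0;

    if (out == NULL || error_code < 0)
        return 0;
    code = (size_t)error_code;
    while (code < n_catalog && (size_t)emitted < n_lines) {
        fprintf(out, "line %d: %s\n", lines[emitted], catalog[code].text);
        emitted++;
        if (!catalog[code].chain_next)
            break;
        code++;
    }
    return emitted;
}
```

With this arrangement a construct spanning two source lines costs one API call carrying two line numbers, and the user still reads two ordinary one-line messages.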
Generally, IMHO, we should not pass text through the error processor API,
except as it already resides in AST nodes (or similar trees). Everything else
is conventionalized into integers, which may reside in such trees bound for
the error processor.
Best Wishes
Bob Rayhawk
RKRayhawk@aol.com
--
This message was sent through the gnu-cobol mailing list. To remove yourself
from this mailing list, send a message to majordomo@lusars.net with the
words "unsubscribe gnu-cobol" in the message body. For more information on
the GNU COBOL project, send mail to gnu-cobol-owner@lusars.net.