SUMMARY: If you have a space character in your regular expression, even inside square brackets or quotes, you must put a backslash before it! This is not ideal; I consider the backslash to be a workaround. -- A. Appel Date: Tue, 29 Jul 1997 15:24:30 -0400 From: bryant@airmics.gatech.edu (Dr. Barrett Bryant) Message-Id: <199707291924.PAA10166@polk.airmics> To: appel@CS.Princeton.EDU Subject: Re: JLex Content-Type: X-sun-attachment ---------- X-Sun-Data-Type: text X-Sun-Data-Description: text X-Sun-Data-Name: text X-Sun-Charset: us-ascii X-Sun-Content-Lines: 15 The state table is not being constructed properly. I have an example, which is essentially an adaptation of the lex example in Aho, Sethi and Ullman. The regular expression for white space doesn't cause any state to be produced so the first blank is flagged as an error. An abstraction of this example is attached (in the form of the JLex source, a driver main program, test input, and typescript), with the same problem occurring. (You'll notice that the action corresponding to white space doesn't appear anywhere in the lexical analyzer.) By the way, this same problem occurs with a declaration of a whitespace regular definition or an (" "|"\t"|"\n") expression. Encountering this error, I didn't make it very far into my toy compiler. As I mentioned, I would like to use this in fall, so any corrected version that arises from the new project maintainer would be greatly appreciated. Also, please let me know if I am using the system in a totally wrong way. Thanks. ---------- X-Sun-Data-Type: default X-Sun-Data-Description: default X-Sun-Data-Name: Ws.jlex X-Sun-Charset: us-ascii X-Sun-Content-Lines: 18 %% %{ private void Echo () { System . out . print (yytext ()); } %} %integer %eofval{ return -1; %eofval} digit [0-9] letter [A-Za-z] %% {digit}+ { System . out . print ("integer "); } {letter}+ { System . out . print ("id "); } [ \t\n] { System . out . print ("white space '"); Echo (); System . out . print ("'"); } . { System . out . println ("illegal character"); System . exit (0); } ---------- X-Sun-Data-Type: default X-Sun-Data-Description: default X-Sun-Data-Name: Ws.java X-Sun-Charset: us-ascii X-Sun-Content-Lines: 10 class Ws { public static void main (String [] args) throws java.io.IOException { Yylex lexer = new Yylex (System . in); while (lexer . yylex () >= 0); } } ---------- X-Sun-Data-Type: default X-Sun-Data-Description: default X-Sun-Data-Name: test.txt X-Sun-Charset: us-ascii X-Sun-Content-Lines: 2 This is a test 0123456789 0123456789 OK ---------- X-Sun-Data-Type: default X-Sun-Data-Description: default X-Sun-Data-Name: typescript X-Sun-Charset: us-ascii X-Sun-Content-Lines: 8 Script started on Tue Jul 29 15:09:30 1997 ]lbryant@polk:/usr/lincoln1/cisd/bryant/java/pl0\polk% java Ws < test.txt java.lang.Error: Lexical Error: Unmatched Input. at Yylex.yylex(Ws.jlex.java:202) at Ws.main(Ws.java:7) polk% exit polk% script done on Tue Jul 29 15:09:55 1997