Absimpa v196 | ||||||||
java.lang.Object
  absimpa.lexer.SimpleLexer<N,C>
Type Parameters:
C - is an enumeration and describes the token codes provided to the
parser. In addition, the enum knows how to transform a token code
into an N
N - is the data type returned for a token when the parser has
recognized it and calls next()
public class SimpleLexer<N,C extends java.lang.Enum<C>>
is an example implementation of a Lexer which analyzes a
string by trying out regular expressions for tokens until a match is
found. This is not intended for production use. It is merely an example.
This lexer is set up by specifying a list of pairs (regex, C), where
C is some enumeration type, the generic parameter of this
class. To analyze an input string, the lexer tries to match each of the
regular expressions at the beginning of the input string. If it finds a
match, the associated C represents the current token code
provided to the parser. If next() is called, the matching prefix of
the input is converted to an N by means of the LeafFactory
implemented by the C type. The result is returned, while the lexer
starts over with the next token.
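
The matching procedure described above can be illustrated without the Absimpa classes. The following is a minimal sketch, not the SimpleLexer source: `PrefixMatchSketch`, `Rule`, `Code` and `tokenize` are hypothetical names, and only `java.util.regex` is used. Each pattern is tried in registration order at the current position with `Matcher.lookingAt()`, and the first match decides the token code. Characters that no pattern matches are simply dropped here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of first-match-wins prefix lexing; not the Absimpa source.
class PrefixMatchSketch {
  enum Code { NUMBER, PLUS, EOF }

  // One (regex, code) pair.
  static final class Rule {
    final Pattern pattern;
    final Code code;

    Rule(String regex, Code code) {
      this.pattern = Pattern.compile(regex);
      this.code = code;
    }
  }

  // The list order is the order in which the patterns are tried.
  static final List<Rule> RULES = Arrays.asList(
      new Rule("[0-9]+", Code.NUMBER),
      new Rule("[+]", Code.PLUS));

  // Returns "CODE:text" entries for the whole input, ending with "EOF:".
  static List<String> tokenize(CharSequence input) {
    List<String> result = new ArrayList<String>();
    int pos = 0;
    while (pos < input.length()) {
      boolean matched = false;
      for (Rule rule : RULES) {
        // Restrict matching to the unread part of the input.
        Matcher m = rule.pattern.matcher(input).region(pos, input.length());
        if (m.lookingAt() && m.end() > pos) {
          result.add(rule.code + ":" + m.group());
          pos = m.end();
          matched = true;
          break; // first matching rule wins
        }
      }
      if (!matched) {
        pos++; // no rule matched: drop one character and retry
      }
    }
    result.add(Code.EOF + ":");
    return result;
  }
}
```

For the input `12+3`, `tokenize` yields `NUMBER:12`, `PLUS:+`, `NUMBER:3` and a final `EOF:` entry.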
If no match can be found, the behaviour depends on whether
setSkipRe() was called. If yes, the regular expression
is tried, and if it matches, the corresponding text is ignored and the
lexer starts over trying to match the regular expressions. If the skip
regular expression does not match, a ParseException is thrown. If no
regular expression to skip was set, or if it was set to null, the
lexer behaves as if every non-matching character may be skipped.
Consequently, input that cannot be matched is then silently discarded.
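
The three cases of the skip rule can be written down as a small decision function. This is a sketch under assumptions, not the SimpleLexer source: `SkipRuleSketch` and `skipFrom` are hypothetical names, and `IllegalStateException` stands in for the library's ParseException so that the example needs nothing beyond the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the skip rule; not the Absimpa source.
class SkipRuleSketch {
  // Called when no token pattern matches at pos. Returns the position
  // at which token matching should be retried.
  static int skipFrom(CharSequence input, int pos, Pattern skipRe) {
    if (skipRe == null) {
      return pos + 1; // no skip expression: silently discard one character
    }
    Matcher m = skipRe.matcher(input).region(pos, input.length());
    if (m.lookingAt() && m.end() > pos) {
      return m.end(); // the skipped text is ignored
    }
    // Stand-in for the ParseException thrown by the real lexer.
    throw new IllegalStateException("cannot analyze input at position " + pos);
  }
}
```

With a skip expression such as `\s+`, whitespace between tokens is ignored and any other unmatched character is reported as an error, which is usually preferable to silently discarding input.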
| Constructor Summary | |
|---|---|
| SimpleLexer(java.lang.Class<C> tokenCode, LeafFactory<N,C> leafFactory) | adds every constant found in class tokenCode with addToken(C, java.lang.String), except the constant identical to the LexerInfo.eofCode() it provides. |
| SimpleLexer(C eofCode, LeafFactory<N,C> leafFactory) | creates a TrivialLexer to return eofCode when the end of input is encountered. |
| Method Summary | |
|---|---|
| SimpleLexer<N,C> | addToken(C tc, java.lang.String regex): adds a mapping from a regular expression to the given token code. |
| C | current(): provides the current token code. |
| java.lang.String | currentText() |
| Token<N,C> | currentToken(): returns the current token. |
| void | initAnalysis(java.lang.CharSequence text): resets the lexer and initializes it to analyze the given text. |
| N | next(): discards the current token and advances to the next one. |
| ParseException | parseException(java.util.Set<C> expectedTokens): creates a ParseException on request from the parser. |
| void | setSkipRe(java.lang.String regex) |
| java.lang.String | toString() |
| Methods inherited from class java.lang.Object |
|---|
| equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public SimpleLexer(C eofCode,
LeafFactory<N,C> leafFactory)
creates a TrivialLexer to return eofCode
when the end of input is encountered.
public SimpleLexer(java.lang.Class<C> tokenCode,
                   LeafFactory<N,C> leafFactory)
adds every constant found in class tokenCode with addToken(C, java.lang.String),
except the constant identical to the LexerInfo.eofCode() it provides.
It is assumed that toString() of a code returns a regular
expression that defines the strings representing the token.
IMPORTANT: Make sure to define the code constants of <C>
in the order you want the regular expressions tried out by the lexer.
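
A token code enum suitable for this constructor might look like the sketch below. All names (`EnumOrderSketch`, `Code`, `regexesInOrder`) are hypothetical and nothing here calls Absimpa. The helper only shows the order in which a class-based setup would see the regular expressions, assuming it walks the constants in declaration order and leaves out the end-of-input code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token code enum whose toString() is the token's regex.
class EnumOrderSketch {
  enum Code {
    IF("if"),          // keyword first, so it is tried before NAME
    NAME("[a-z]+"),
    NUMBER("[0-9]+"),
    EOF("");           // end-of-input code, never registered as a regex

    private final String regex;

    Code(String regex) {
      this.regex = regex;
    }

    @Override
    public String toString() {
      return regex;
    }
  }

  // Collects the regexes in declaration order, skipping the EOF constant.
  static <C extends Enum<C>> List<String> regexesInOrder(Class<C> codes, C eofCode) {
    List<String> out = new ArrayList<String>();
    for (C c : codes.getEnumConstants()) { // declaration order
      if (c != eofCode) {
        out.add(c.toString());
      }
    }
    return out;
  }
}
```

Because `Class.getEnumConstants()` returns the constants in declaration order, moving `NAME` above `IF` in this enum would change which expression is tried first.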
| Method Detail |
|---|
public void initAnalysis(java.lang.CharSequence text)
throws ParseException
resets the lexer and initializes it to analyze the given
text. To prepare the first token, next() is
called internally.
Throws: ParseException
public void setSkipRe(java.lang.String regex)
public ParseException parseException(java.util.Set<C> expectedTokens)
Description copied from interface: Lexer
creates a ParseException on request from the parser. This method
is called by the parser if it finds a token code that does not fit its
grammar. It is up to the Lexer implementation to provide as much
information as possible in the exception about the current position of
the input.
Specified by: parseException in interface Lexer<N,C extends java.lang.Enum<C>>
Parameters: expectedTokens - a set of tokens that the parser would have
expected at the current position.
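
What "as much information as possible" could mean in practice is sketched below. This is an illustration only, not what SimpleLexer actually reports: `ErrorPositionSketch` and `describe` are hypothetical names, and the method returns a plain message string instead of constructing the library's ParseException. It converts a character offset into a line and column and lists the expected token codes.

```java
import java.util.Set;

// Hypothetical helper that builds a position-aware error message.
class ErrorPositionSketch {
  enum Code { NUMBER, PLUS, EOF }

  static String describe(CharSequence input, int pos, Code found, Set<Code> expected) {
    int line = 1;
    int column = 1;
    // Walk the input up to pos, counting line breaks.
    for (int i = 0; i < pos && i < input.length(); i++) {
      if (input.charAt(i) == '\n') {
        line++;
        column = 1;
      } else {
        column++;
      }
    }
    return "line " + line + ", column " + column
        + ": found " + found + ", expected one of " + expected;
  }
}
```

Passing the expected codes as an `EnumSet` keeps them in declaration order in the message.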
public SimpleLexer<N,C> addToken(C tc,
java.lang.String regex)
adds a mapping from a regular expression to the given token code. No
provisions are made to detect conflicting regular expressions, i.e.
regular expressions with common matches. To define a specific keyword,
e.g. package, and also a general identifier, e.g.
[a-z]+, make sure to call addToken first
for the more specific token. Otherwise the more specific token will never be matched.
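
The effect of registration order can be reproduced with plain `java.util.regex`. The sketch below is not Absimpa code: `TokenOrderSketch` and `firstMatch` are hypothetical names, and each rule is a `{name, regex}` string pair tried in array order against the start of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demonstration of why the keyword must be registered first.
class TokenOrderSketch {
  // Returns the name of the first rule whose regex matches a prefix
  // of the input, or null if no rule matches.
  static String firstMatch(String[][] rules, String input) {
    for (String[] rule : rules) {
      Matcher m = Pattern.compile(rule[1]).matcher(input);
      if (m.lookingAt()) {
        return rule[0];
      }
    }
    return null;
  }

  static final String[][] KEYWORD_FIRST = {
    {"PACKAGE", "package"},
    {"IDENT", "[a-z]+"},
  };

  static final String[][] IDENT_FIRST = {
    {"IDENT", "[a-z]+"},
    {"PACKAGE", "package"},
  };
}
```

For the input `package foo`, `KEYWORD_FIRST` reports `PACKAGE` while `IDENT_FIRST` reports `IDENT`, so the keyword rule is never reached. With pure prefix matching as in this sketch, an input such as `packages` would also begin with a `PACKAGE` match; a word boundary, e.g. `package\b`, is one way to avoid that.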
public C current()
Description copied from interface: Lexer
provides the current token code. This method must always return the
same token code as long as Lexer.next() is not called.
Specified by: current in interface Lexer<N,C extends java.lang.Enum<C>>
public N next()
throws ParseException
discards the current token and advances to the next one. This may involve
skipping over input that cannot be matched by any regular expression
added with addToken(C, java.lang.String).
Specified by: next in interface Lexer<N,C extends java.lang.Enum<C>>
Throws: ParseException
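
The interplay of current() and next() is easiest to see in a miniature lexer. The class below is a hypothetical stand-in, not the SimpleLexer source: `CursorSketch` recognizes only decimal numbers, uses `Integer` where the real class uses the type parameter N, and discards every character that is not part of a number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical miniature of the current()/next() protocol.
class CursorSketch {
  enum Code { NUMBER, EOF }

  private static final Pattern NUMBER_RE = Pattern.compile("[0-9]+");

  private final CharSequence text;
  private int pos = 0;
  private Code code;
  private String tokenText = "";

  CursorSketch(CharSequence text) {
    this.text = text;
    advance(); // prepare the first token before the parser asks for it
  }

  // Stable until next() is called.
  Code current() {
    return code;
  }

  // Returns the value of the current token, then moves to the following one.
  Integer next() {
    Integer value = code == Code.NUMBER ? Integer.valueOf(tokenText) : null;
    advance();
    return value;
  }

  private void advance() {
    while (pos < text.length()) {
      Matcher m = NUMBER_RE.matcher(text).region(pos, text.length());
      if (m.lookingAt()) {
        code = Code.NUMBER;
        tokenText = m.group();
        pos = m.end();
        return;
      }
      pos++; // unmatched character: discard it
    }
    code = Code.EOF;
    tokenText = "";
  }

  // Typical consumer loop: inspect current(), then consume with next().
  static List<Integer> readAll(CharSequence text) {
    CursorSketch lexer = new CursorSketch(text);
    List<Integer> values = new ArrayList<Integer>();
    while (lexer.current() != Code.EOF) {
      values.add(lexer.next());
    }
    return values;
  }
}
```

A parser would drive the lexer the same way `readAll` does: it decides what to do from `current()` and calls `next()` only once it has accepted the token.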
public Token<N,C> currentToken()
returns the current token.
public java.lang.String currentText()
public java.lang.String toString()
Overrides: toString in class java.lang.Object