NuriaProject Framework
0.1
Storage of rules used by Nuria::Tokenizer.
#include <tokenizer.hpp>
Public Types
typedef std::function< bool(Token &, Tokenizer *) > TokenAction
enum WhitespaceMode { AutoHandleWhitespace = 0, ManualWhitespaceHandling }
Public Member Functions
TokenizerRules (WhitespaceMode mode=AutoHandleWhitespace)
TokenizerRules (const TokenizerRules &other)
~TokenizerRules ()
void addRegexToken (int tokenId, const QByteArray &regularExpression)
void addRegexToken (int tokenId, const std::regex &regularExpression)
void addStringToken (int tokenId, const QByteArray &terminal)
TokenizerRules & operator= (const TokenizerRules &other)
void setTokenAction (int tokenId, TokenAction action)
void setWhitespaceMode (WhitespaceMode mode)
WhitespaceMode whitespaceMode () const
Friends
class Tokenizer
An instance of this class defines a rule-set for Nuria::Tokenizer, describing how to read the data stream.
Rules are not stored directly in Nuria::Tokenizer, so that complex grammars requiring multiple rule-sets can be supported. Please see Nuria::Tokenizer for usage details.
During matching, string tokens are tested first, so string tokens take precedence over regular-expression tokens. Within each kind, rules are tried in the order in which they were added to the rule-set: the rule added first is tried first.
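The precedence rules above can be illustrated with a small stand-alone sketch. It does not use the NuriaProject API: MiniRules and matchAt are hypothetical names, std::string stands in for QByteArray, and the matching strategy is an assumption derived only from the description above, not the library's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a rule-set; not the NuriaProject implementation.
struct MiniRules {
    std::vector<std::pair<int, std::string>> stringTokens; // in insertion order
    std::vector<std::pair<int, std::regex>> regexTokens;   // in insertion order
};

// Returns the id of the first rule matching at 'pos', or -1 if none matches.
// String tokens are tried first, then regular expressions, each group in the
// order in which its rules were added.
int matchAt(const MiniRules &rules, const std::string &input, std::size_t pos) {
    assert(pos <= input.size());
    for (const auto &rule : rules.stringTokens) {
        if (input.compare(pos, rule.second.size(), rule.second) == 0) {
            return rule.first;
        }
    }
    for (const auto &rule : rules.regexTokens) {
        std::smatch match;
        // match_continuous anchors the search at 'pos'.
        if (std::regex_search(input.begin() + pos, input.end(), match, rule.second,
                              std::regex_constants::match_continuous)) {
            return rule.first;
        }
    }
    return -1;
}
```

In this sketch, a string rule for "if" wins over a regular-expression rule such as [a-z]+ on input starting with "if", even when the regular-expression rule was added first.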
TokenizerRules can handle whitespace automatically, in which case whitespace is skipped. This is the default setting (AutoHandleWhitespace).
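As a rough illustration of the difference between the two whitespace modes, the following sketch shows where the next match attempt would start in each mode. nextTokenStart is a hypothetical helper, not part of the library, and using std::isspace to decide what counts as whitespace is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Mirrors the enumerators of TokenizerRules::WhitespaceMode for illustration.
enum WhitespaceMode { AutoHandleWhitespace = 0, ManualWhitespaceHandling };

// Hypothetical helper: returns the position at which the next match attempt
// would start. In automatic mode leading whitespace is skipped; in manual
// mode the position is returned unchanged.
std::size_t nextTokenStart(const std::string &input, std::size_t pos, WhitespaceMode mode) {
    assert(pos <= input.size());
    if (mode == ManualWhitespaceHandling) {
        return pos;
    }
    while (pos < input.size() && std::isspace(static_cast<unsigned char>(input[pos]))) {
        ++pos;
    }
    return pos;
}
```

With manual handling, whitespace reaches the rules like any other input, so a rule-set would presumably need its own tokens for it.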
typedef std::function< bool(Token &, Tokenizer *) > Nuria::TokenizerRules::TokenAction
Typedef of a token action. See setTokenAction() for details.
Nuria::TokenizerRules::TokenizerRules (WhitespaceMode mode = AutoHandleWhitespace)
Constructs an empty rule-set.
Nuria::TokenizerRules::TokenizerRules (const TokenizerRules &other)
Copy constructor.
Nuria::TokenizerRules::~TokenizerRules ()
Destructor.
void Nuria::TokenizerRules::addRegexToken (int tokenId, const QByteArray &regularExpression)
Adds a token matching regularExpression. This method is provided for convenience. It's equivalent to calling the std::regex overload of addRegexToken() with a regular expression constructed from regularExpression.
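A forwarding overload of this kind could look like the following stand-alone sketch. MiniRules, regexTokenCount and matches are hypothetical names, and std::string replaces QByteArray so that the example compiles without Qt or NuriaProject; how the library actually converts the QByteArray is not shown in this documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the rule-set; not the NuriaProject implementation.
class MiniRules {
public:
    // Primary overload: stores an already compiled regular expression.
    void addRegexToken(int tokenId, const std::regex &regularExpression) {
        m_regexTokens.emplace_back(tokenId, regularExpression);
    }

    // Convenience overload: compiles the pattern and forwards it.
    void addRegexToken(int tokenId, const std::string &regularExpression) {
        addRegexToken(tokenId, std::regex(regularExpression));
    }

    std::size_t regexTokenCount() const { return m_regexTokens.size(); }

    // True if the rule stored at 'index' matches the whole of 'text'.
    bool matches(std::size_t index, const std::string &text) const {
        assert(index < m_regexTokens.size());
        return std::regex_match(text, m_regexTokens[index].second);
    }

private:
    std::vector<std::pair<int, std::regex>> m_regexTokens;
};
```

Both overloads end up storing a compiled std::regex, so the two calls are interchangeable from the caller's point of view.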
void Nuria::TokenizerRules::addRegexToken (int tokenId, const std::regex &regularExpression)
Adds a token matching the regularExpression.
void Nuria::TokenizerRules::addStringToken (int tokenId, const QByteArray &terminal)
Adds a token matching exactly terminal.
TokenizerRules& Nuria::TokenizerRules::operator= (const TokenizerRules &other)
Assignment operator.
void Nuria::TokenizerRules::setTokenAction (int tokenId, TokenAction action)
Sets action as the handler for all tokens of type tokenId. The handler is called every time a token of that type is encountered. The prototype of action is bool(Token &, Tokenizer *), as given by the TokenAction typedef.
The handler is passed the token itself as a mutable reference, along with the Tokenizer instance. On success, the handler must return true. If the handler returns false, tokenizing fails.
The handler may change the contents of the passed token. If the handler changes the tokenId of the token, the token action handler of that new tokenId will not be invoked.
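The handler semantics described above can be sketched without the library. Token, Tokenizer, dispatch and exampleActions below are hypothetical stand-ins written for this illustration; only the shape of TokenAction is taken from the typedef documented on this page.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins for Nuria::Token and Nuria::Tokenizer.
struct Token {
    int tokenId;
    std::string value;
};
struct Tokenizer {};

using TokenAction = std::function<bool(Token &, Tokenizer *)>;

// Runs the action registered for the token's original id, if any. The lookup
// happens once, so an action that changes token.tokenId does not trigger the
// action registered for the new id. Returns false if the action fails.
bool dispatch(const std::map<int, TokenAction> &actions, Token &token, Tokenizer *tokenizer) {
    assert(tokenizer != nullptr);
    auto it = actions.find(token.tokenId);
    if (it == actions.end()) {
        return true; // No action registered: the token passes through as-is.
    }
    return it->second(token, tokenizer);
}

enum { Identifier = 1, Keyword = 2 };

// Example actions: re-tag the identifier "if" as a keyword. The Keyword
// action always fails, which makes an unwanted second invocation observable.
std::map<int, TokenAction> exampleActions() {
    std::map<int, TokenAction> actions;
    actions[Identifier] = [](Token &token, Tokenizer *) {
        if (token.value == "if") {
            token.tokenId = Keyword;
        }
        return true;
    };
    actions[Keyword] = [](Token &, Tokenizer *) { return false; };
    return actions;
}
```

Dispatching an Identifier token with the value "if" succeeds and leaves it tagged as Keyword, because the failing Keyword action is never consulted for a token that was re-tagged by another handler.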
void Nuria::TokenizerRules::setWhitespaceMode (WhitespaceMode mode)
Sets the whitespace handling mode.
WhitespaceMode Nuria::TokenizerRules::whitespaceMode () const
Returns the current whitespace handling mode.