Class FormulaParser<T>

java.lang.Object
io.jenetics.ext.internal.util.FormulaParser<T>
Type Parameters:
T - the token type used as input for the parser

public final class FormulaParser<T> extends Object
This class allows you to convert a sequence of tokens, which represents some kind of (mathematical) formula, into a tree structure. To do this, it is assumed that the given tokens can be categorized. The two main categories are structural tokens and operational tokens.

Structural tokens

Structural tokens are used to influence the hierarchy of the parsed tokens and are also part of function definitions. This kind of token will not be part of the generated tree representation.
  1. lparen: Represents a left parenthesis, which starts a sub-tree or opens a function argument list.
  2. rparen: Represents a right parenthesis, which closes a sub-tree or a function argument list. lparen and rparen must be balanced.
  3. comma: Separator token for function arguments.

Operational tokens

Operational tokens define the actual behaviour of the created tree.
  1. identifier: This kind of token usually represents variable names or numbers.
  2. function: Function tokens represent identifiers for functions. Valid functions have the following form: 'fun' 'lparen' arg ['comma' args]* 'rparen'
  3. binary operator: Binary operators are written in infix order and have a precedence. Typical examples are the arithmetic operators '+' and '*', where '*' has a higher precedence than '+'.
  4. unary operator: Unary operators are prefix operators. A typical example is the arithmetic negation operator '-'. All unary operators have the same precedence, which is higher than the precedence of every binary operator.
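To make the interplay of these token categories concrete, the following is a minimal precedence-climbing sketch. It is not the Jenetics implementation: the class name PrecedenceSketch, the fixed operator tables and the prefix-notation string output are assumptions chosen for illustration, and operators of equal precedence are assumed to be left-associative.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only; hypothetical class, not the Jenetics implementation. */
final class PrecedenceSketch {
    // Assumed operator tables for this sketch.
    private static final Map<String, Integer> BINARY =
        Map.of("+", 11, "-", 11, "*", 12, "/", 12);
    private static final Set<String> UNARY = Set.of("+", "-");
    private static final Set<String> FUNCTIONS = Set.of("sin", "cos", "pow");

    private final List<String> tokens;
    private int pos = 0;

    private PrecedenceSketch(final List<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns the parsed formula in prefix notation, e.g. "(+ x y)". */
    static String parse(final List<String> tokens) {
        final PrecedenceSketch p = new PrecedenceSketch(tokens);
        final String tree = p.expression(0);
        if (p.pos != tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + tokens.get(p.pos));
        }
        return tree;
    }

    // Consumes binary operators whose precedence is at least 'min'.
    private String expression(final int min) {
        String left = operand();
        while (pos < tokens.size()) {
            final Integer prec = BINARY.get(tokens.get(pos));
            if (prec == null || prec < min) {
                break;
            }
            final String op = tokens.get(pos++);
            // 'prec + 1' keeps operators of equal precedence left-associative.
            final String right = expression(prec + 1);
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    // Parses an identifier, a unary operation, a parenthesized
    // sub-expression or a function call.
    private String operand() {
        final String token = next();
        if (UNARY.contains(token)) {
            // Recursing into operand() lets unary operators bind tighter
            // than every binary operator.
            return "(" + token + " " + operand() + ")";
        }
        if (token.equals("(")) {
            final String inner = expression(0);
            expect(")");
            return inner; // structural tokens do not appear in the result
        }
        if (FUNCTIONS.contains(token)) {
            expect("(");
            final StringBuilder call =
                new StringBuilder("(" + token + " " + expression(0));
            while (pos < tokens.size() && tokens.get(pos).equals(",")) {
                pos++;
                call.append(' ').append(expression(0));
            }
            expect(")");
            return call.append(')').toString();
        }
        if (BINARY.containsKey(token) || token.equals(")") || token.equals(",")) {
            throw new IllegalArgumentException("Unexpected token: " + token);
        }
        return token; // everything else is treated as an identifier
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        return tokens.get(pos++);
    }

    private void expect(final String expected) {
        final String actual = next();
        if (!actual.equals(expected)) {
            throw new IllegalArgumentException(
                "Expected '" + expected + "' but found '" + actual + "'");
        }
    }

    public static void main(final String[] args) {
        // prints (+ (- x) (* y z))
        System.out.println(parse(List.of("-", "x", "+", "y", "*", "z")));
    }
}
```

In this sketch a '+' or '-' is read as a unary operator whenever it appears where an operand is expected, and as a binary operator otherwise.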
This class is only responsible for the parsing step. The tokenization must be implemented separately. Another possible token source would be a generating grammar, where the output is already a list of tokens (aka sentence). The following example parser can be used to parse arithmetic expressions.
    final FormulaParser<String> parser = FormulaParser.<String>builder()
        // Structural tokens.
        .lparen("(")
        .rparen(")")
        .separator(",")
        // Operational tokens.
        .unaryOperators("+", "-")
        .binaryOperators(ops -> ops
            .add(11, "+", "-")
            .add(12, "*", "/")
            .add(14, "^", "**"))
        .identifiers("x", "y", "z")
        .functions("pow", "sin", "cos")
        .build();
This parser allows you to parse the following token list:
    final List<String> tokens = List.of(
        "x", "*", "x", "+",
        "sin", "(", "z", ")", "-",
        "cos", "(", "x", ")", "+",
        "y", "/", "z", "-",
        "pow", "(", "z", ",", "x", ")"
    );
    final Tree<String, ?> tree = parser.parse(tokens);
which will result in the following parsed tree:
    "-"
    ├── "+"
    │   ├── "-"
    │   │   ├── "+"
    │   │   │   ├── "*"
    │   │   │   │   ├── "x"
    │   │   │   │   └── "x"
    │   │   │   └── "sin"
    │   │   │       └── "z"
    │   │   └── "cos"
    │   │       └── "x"
    │   └── "/"
    │       ├── "y"
    │       └── "z"
    └── "pow"
        ├── "z"
        └── "x"
Note that the generated (parsed) tree is of type Tree<String, ?>. To evaluate this tree, additional steps are necessary. If you want to create an executable tree, you have to use the parse(Iterable, TokenConverter) method for parsing the tokens.

The following code snippet shows how to create an executable AST from a token list. The MathExpr class in the io.jenetics.prog module uses a similar FormulaParser.TokenConverter.

    final Tree<Op<Double>, ?> tree = parser.parse(
        tokens,
        (token, type) -> switch (token) {
            case "+" -> type == TokenType.UNARY_OPERATOR ? MathOp.ID : MathOp.ADD;
            case "-" -> type == TokenType.UNARY_OPERATOR ? MathOp.NEG : MathOp.SUB;
            case "*" -> MathOp.MUL;
            case "/" -> MathOp.DIV;
            case "^", "**", "pow" -> MathOp.POW;
            case "sin" -> MathOp.SIN;
            case "cos" -> MathOp.COS;
            default -> {
                // Identifiers become variables; any other token is rejected.
                if (type == TokenType.IDENTIFIER) {
                    yield Var.of(token);
                }
                throw new IllegalArgumentException("Unknown token: " + token);
            }
        }
    );
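Once every node carries an operation, evaluation is a recursive walk over the tree. The following self-contained sketch illustrates the idea without any Jenetics types: the Node record and the eval method are hypothetical stand-ins, and the number of children is used to tell a unary operator from a binary one, which plays the role of the token type in the converter above.

```java
import java.util.List;
import java.util.Map;

/** Illustrative only; hypothetical types, not part of Jenetics. */
final class TreeEvalSketch {
    private TreeEvalSketch() {}

    /** Stand-in for a parsed tree node: a token plus its child nodes. */
    record Node(String token, List<Node> children) {
        static Node of(final String token, final Node... children) {
            return new Node(token, List.of(children));
        }
    }

    /** Evaluates the tree bottom-up, looking up identifiers in 'vars'. */
    static double eval(final Node node, final Map<String, Double> vars) {
        final List<Node> c = node.children();
        return switch (node.token()) {
            // One child means unary operator, two children mean binary.
            case "+" -> c.size() == 1
                ? eval(c.get(0), vars)
                : eval(c.get(0), vars) + eval(c.get(1), vars);
            case "-" -> c.size() == 1
                ? -eval(c.get(0), vars)
                : eval(c.get(0), vars) - eval(c.get(1), vars);
            case "*" -> eval(c.get(0), vars) * eval(c.get(1), vars);
            case "/" -> eval(c.get(0), vars) / eval(c.get(1), vars);
            case "^", "**", "pow" ->
                Math.pow(eval(c.get(0), vars), eval(c.get(1), vars));
            case "sin" -> Math.sin(eval(c.get(0), vars));
            case "cos" -> Math.cos(eval(c.get(0), vars));
            default -> {
                // Anything else must be a known variable.
                final Double value = vars.get(node.token());
                if (value == null) {
                    throw new IllegalArgumentException(
                        "Unknown token: " + node.token());
                }
                yield value;
            }
        };
    }
}
```

With x bound to 3.0 and y bound to 1.0, evaluating the tree for x * x + y in this sketch yields 10.0.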
Since:
7.1
Version:
7.1
Implementation Note:
This class is immutable and thread-safe.
  • Method Details

    • parse

      public <V> TreeNode<V> parse(Supplier<? extends T> tokens, FormulaParser.TokenConverter<? super T,? extends V> mapper)
      Parses the given token sequence according to this formula definition. A null value returned by the tokens supplier signals that no further tokens are available.
      Parameters:
      tokens - the tokens which form the formula
      mapper - the mapper function which maps the token type to the parse tree value type
      Returns:
      the parsed formula as a tree
      Throws:
      NullPointerException - if one of the arguments is null
      IllegalArgumentException - if the given tokens can't be parsed
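The supplier contract described for this method, where a null return value marks the end of the input, can be met by wrapping an iterator. A minimal sketch, assuming a hypothetical helper class TokenSuppliers that is not part of Jenetics:

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper, not part of Jenetics. */
final class TokenSuppliers {
    private TokenSuppliers() {}

    /**
     * Wraps the given tokens in a supplier which returns one token per
     * call and null once the tokens are exhausted.
     */
    static <T> Supplier<T> of(final Iterable<? extends T> tokens) {
        final Iterator<? extends T> it = tokens.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(final String[] args) {
        final Supplier<String> tokens = of(List.of("x", "+", "y"));
        // Drain the supplier until it signals the end with null.
        for (String t = tokens.get(); t != null; t = tokens.get()) {
            System.out.println(t);
        }
    }
}
```

The returned supplier is stateful and not thread-safe, so each parse call needs its own instance.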
    • parse

      public TreeNode<T> parse(Supplier<? extends T> tokens)
      Parses the given token sequence according to this formula definition. A null value returned by the tokens supplier signals that no further tokens are available.
      Parameters:
      tokens - the tokens which form the formula
      Returns:
      the parsed formula as a tree
      Throws:
      NullPointerException - if the argument is null
      IllegalArgumentException - if the given tokens can't be parsed
    • parse

      public <V> TreeNode<V> parse(Iterable<? extends T> tokens, FormulaParser.TokenConverter<? super T,? extends V> mapper)
      Parses the given token sequence according to this formula definition.
      Parameters:
      tokens - the tokens which form the formula
      mapper - the mapper function which maps the token type to the parse tree value type
      Returns:
      the parsed formula as a tree
      Throws:
      NullPointerException - if one of the arguments is null
      IllegalArgumentException - if the given tokens can't be parsed
    • parse

      public TreeNode<T> parse(Iterable<? extends T> tokens)
      Parses the given token sequence according to this formula definition.
      Parameters:
      tokens - the tokens which form the formula
      Returns:
      the parsed formula as a tree
      Throws:
      NullPointerException - if the argument is null
      IllegalArgumentException - if the given tokens can't be parsed
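Because tokenization is left to the caller, the Iterable passed to this method has to be produced by a separate step. A minimal sketch of such a step, assuming a hypothetical helper class SimpleTokenizer (not part of Jenetics) that splits an arithmetic expression string with a regular expression into String tokens:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Jenetics. */
final class SimpleTokenizer {
    // Optional whitespace followed by one token. Alternatives are tried
    // from left to right, so "**" is listed before the single "*".
    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(\\*\\*|[A-Za-z_][A-Za-z_0-9]*|\\d+(?:\\.\\d+)?|[-+*/^(),])"
    );

    private SimpleTokenizer() {}

    /** Splits the expression into identifiers, numbers, operators and structural tokens. */
    static List<String> tokenize(final String expression) {
        final String input = expression.strip();
        final List<String> tokens = new ArrayList<>();
        final Matcher matcher = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Match exactly at the current position, never further ahead.
            matcher.region(pos, input.length());
            if (!matcher.lookingAt()) {
                throw new IllegalArgumentException(
                    "Unexpected input at index " + pos + ": " + input.substring(pos));
            }
            tokens.add(matcher.group(1));
            pos = matcher.end();
        }
        return List.copyOf(tokens);
    }

    public static void main(final String[] args) {
        // prints [x, *, x, +, sin, (, z, )]
        System.out.println(tokenize("x*x + sin(z)"));
    }
}
```

The tokenizer does not classify tokens; in this sketch that decision is left entirely to the parser configuration.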
    • builder

      public static <T> FormulaParser.Builder<T> builder()
      Returns a new builder for creating formula parsers.
      Type Parameters:
      T - the token type
      Returns:
      a new formula parser builder