libSBML
"5.10.0"

LibSBML facilities for manipulating mathematical expressions

• Basic Concepts
• Converting between ASTs and Text Strings
• The String Formula Syntax and Differences with MathML
• Methods for working with libSBML's Abstract Syntax Trees
• Reading and Writing MathML from/to ASTs
This section describes libSBML's facilities for working with SBML representations of mathematical expressions. Unless otherwise noted, all classes are in the Java package org.sbml.libsbml.

Basic Concepts

LibSBML uses Abstract Syntax Trees (ASTs) to provide a canonical, in-memory representation for all mathematical formulas regardless of their original format (i.e., C-like infix strings or MathML 2.0). In libSBML, an AST is a collection of one or more objects of class ASTNode. An AST node in libSBML is a recursive structure containing a pointer to the node's value (which might be, for example, a number or a symbol) and a list of child nodes. Each ASTNode may have zero, one, two, or more children, depending on its type. The following diagram illustrates how the mathematical expression "1 + 2" is represented as an AST with one plus node having two integer child nodes for the numbers 1 and 2. The figure also shows the corresponding MathML 2.0 representation:

Example AST representation of a mathematical expression.

Infix: 1 + 2

AST: (diagram not reproduced: one plus node with two integer child nodes, 1 and 2)

MathML:
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply>
    <plus/>
    <cn type="integer"> 1 </cn>
    <cn type="integer"> 2 </cn>
  </apply>
</math>
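
The recursive structure described above can be illustrated with a small, self-contained sketch. The SketchNode class below is a hypothetical stand-in written for this illustration only; it is not libSBML's ASTNode and shares none of its API. It assumes every non-leaf node is a binary operator, which is enough to model "1 + 2".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; this is NOT libSBML's ASTNode class.
class SketchNode {
    final String value;                                   // e.g. "+", "1", "x"
    final List<SketchNode> children = new ArrayList<>();  // empty for leaf nodes

    SketchNode(String value, SketchNode... kids) {
        this.value = value;
        for (SketchNode k : kids) {
            children.add(k);
        }
    }

    // Render a leaf as its value and a (binary) operator node as "left op right".
    String toInfix() {
        if (children.isEmpty()) {
            return value;
        }
        return children.get(0).toInfix() + " " + value + " " + children.get(1).toInfix();
    }

    // Count every node in the tree by recursing into the children.
    int size() {
        int n = 1;
        for (SketchNode c : children) {
            n += c.size();
        }
        return n;
    }
}

class AstSketch {
    public static void main(String[] args) {
        // One plus node with two integer leaves, as in the figure.
        SketchNode sum = new SketchNode("+", new SketchNode("1"), new SketchNode("2"));
        System.out.println(sum.toInfix());
        System.out.println(sum.size());
    }
}
```

Running AstSketch prints the infix form 1 + 2 followed by the node count 3: the plus node and its two children.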

The following are noteworthy about the AST representation in libSBML:

For many applications, the details of ASTs are irrelevant because the applications can use libSBML's text-string based translation functions such as libsbml.formulaToString(ASTNode), libsbml.parseFormula(java.lang.String) and libsbml.parseL3Formula(java.lang.String). If you find the complexity of using the AST representation of expressions too high for your purposes, perhaps the string-based functions will be more suitable.

Finally, it is worth noting that the AST and MathML handling code in libSBML remains written in C, not C++, as all of libSBML was originally written in C. Readers may occasionally wonder why some aspects are more C-like than following a C++ style, and that's the reason.

Converting between ASTs and Text Strings

SBML Levels 2 and 3 represent mathematical expressions using MathML 2.0 (more specifically, a subset of the content portion of MathML 2.0), but most software applications using libSBML do not use MathML directly. Instead, applications generally either interact with mathematics in text-string form, or else they use the API for working with Abstract Syntax Trees (described below). LibSBML provides support for both approaches. The libSBML formula parser has been carefully engineered so that transformations from MathML to infix string notation and back are possible with a minimum of disruption to the structure of the mathematical expression.

The example below shows a simple program that, when run, takes a MathML string compiled into the program, converts it to an AST, converts that to an infix representation of the formula, compares it to the expected form of that formula, and finally translates that formula back to MathML and displays it. The output displayed on the terminal should have the same structure as the MathML it started with. The program is a simple example of using the various MathML and AST reading and writing methods, and shows that libSBML preserves the ordering and structure of the mathematical expressions.

import org.sbml.libsbml.ASTNode;
import org.sbml.libsbml.libsbml;

public class example
{
  public static void main (String[] args)
  {        
      String expected = "1 + f(x)";
      String input_mathml = "<?xml version='1.0' encoding='UTF-8'?>" 
          + "<math xmlns='http://www.w3.org/1998/Math/MathML'>"
          + "  <apply> <plus/> <cn> 1 </cn>"
          + "                  <apply> <ci> f </ci> <ci> x </ci> </apply>"
          + "  </apply>"
          + "</math>";

      ASTNode ast_result   = libsbml.readMathMLFromString(input_mathml);
      String ast_as_string = libsbml.formulaToString(ast_result);

      if (ast_as_string.equals(expected))
      {
          System.out.println("Got expected result.");
      }
      else
      {
          System.out.println("Mismatch after readMathMLFromString().");
          System.exit(1);
      }

      ASTNode new_mathml = libsbml.parseFormula(ast_as_string);
      String new_string  = libsbml.writeMathMLToString(new_mathml);

      System.out.println("Result of writing AST to string:");
      System.out.print(new_string);
      System.out.println();
  }

  static 
  {
    try 
    {
      System.loadLibrary("sbmlj");
    }
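    // System.loadLibrary() reports a missing library with UnsatisfiedLinkError,
    // which is an Error and would not be caught by catch (Exception e).
    catch (UnsatisfiedLinkError | SecurityException e)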
    {
      System.err.println("Could not load libSBML library:" + e.getMessage());
    }
  }
}

The text-string form of mathematical formulas produced by libsbml.formulaToString(ASTNode) and read by libsbml.parseFormula(java.lang.String) and libsbml.parseL3Formula(java.lang.String) is a simple C-inspired infix notation, summarized in the next section. A formula in this text-string form can therefore be handed to a program that understands SBML mathematical expressions, or used as part of a translation system. In summary, the functions discussed here are the following:

• libsbml.formulaToString(ASTNode)
• libsbml.parseFormula(java.lang.String)
• libsbml.parseL3Formula(java.lang.String)

The String Formula Syntax and Differences with MathML

The text-string formula syntax is an infix notation essentially derived from the syntax of the C programming language and was originally used in SBML Level 1. The formula strings may contain operators, function calls, symbols, and white space characters. The allowable white space characters are tab and space. The following are illustrative examples of formulas expressed in the syntax:

0.10 * k4^2
(vm * s1)/(km + s1)

The following table shows the precedence rules in this syntax. In the Class column, operand implies the construct is an operand, prefix implies the operation is applied to the following arguments, unary implies there is one argument, and binary implies there are two arguments. The values in the Precedence column show how the order of different types of operation is determined. For example, the expression a * b + c is evaluated as (a * b) + c because the * operator has higher precedence. The Associates column shows how the order of similar precedence operations is determined; for example, a - b + c is evaluated as (a - b) + c because the + and - operators are left-associative. The precedence and associativity rules are taken from the C programming language, except for the symbol ^, which is used in C for a different purpose. (Exponentiation can be invoked using either ^ or the function power.)

Token         Operation            Class    Precedence  Associates
name          symbol reference     operand  6           n/a
(expression)  expression grouping  operand  6           n/a
f(...)        function call        prefix   6           left
-             negation             unary    5           right
^             power                binary   4           left
*             multiplication       binary   3           left
/             division             binary   3           left
+             addition             binary   2           left
-             subtraction          binary   2           left
,             argument delimiter   binary   1           left
A table of the expression operators and their precedence in the text-string format for mathematical expressions used by libsbml.parseFormula(java.lang.String).
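
The effect of the precedence and associativity rules in the table can be made concrete with a small sketch. The PrecedenceSketch class below is hypothetical and is not libSBML's parser. It is a minimal precedence-climbing routine that uses the table's precedence values for negation and for the binary operators ^, *, /, + and -, and returns the formula with every operation explicitly parenthesized. It assumes well-formed input and does not handle function calls or the argument delimiter.

```java
// Hypothetical illustration of the table's rules; NOT libSBML's formula parser.
class PrecedenceSketch {
    private final String src;
    private int pos = 0;

    private PrecedenceSketch(String src) {
        this.src = src;
    }

    // Return the formula with explicit parentheses around every operation.
    static String group(String formula) {
        return new PrecedenceSketch(formula).parseBinary(2);
    }

    // Precedence values of the binary operators, copied from the table.
    private static int precedence(char op) {
        switch (op) {
            case '^': return 4;
            case '*': case '/': return 3;
            case '+': case '-': return 2;
            default: return -1;  // not a binary operator (end of input, ')')
        }
    }

    // Skip spaces and tabs, then return the next character ('\0' at end of input).
    private char peek() {
        while (pos < src.length() && (src.charAt(pos) == ' ' || src.charAt(pos) == '\t')) {
            pos++;
        }
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Parse a chain of binary operators whose precedence is at least minPrec.
    private String parseBinary(int minPrec) {
        String left = parseUnary();
        while (true) {
            char op = peek();
            int prec = precedence(op);
            if (prec < minPrec) {
                return left;
            }
            pos++;
            // Left-associative: the right operand may only contain tighter operators.
            String right = parseBinary(prec + 1);
            left = "(" + left + " " + op + " " + right + ")";
        }
    }

    // Negation (precedence 5) binds tighter than any binary operator.
    private String parseUnary() {
        if (peek() == '-') {
            pos++;
            return "(-" + parseUnary() + ")";
        }
        return parseOperand();
    }

    // An operand is a parenthesized expression, a name, or a number.
    private String parseOperand() {
        if (peek() == '(') {
            pos++;
            String inner = parseBinary(2);
            peek();
            pos++;  // skip the closing ')'
            return inner;
        }
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos))
                    || src.charAt(pos) == '.' || src.charAt(pos) == '_')) {
            pos++;
        }
        return src.substring(start, pos);
    }

    public static void main(String[] args) {
        System.out.println(group("a * b + c"));
        System.out.println(group("a - b + c"));
    }
}
```

With this sketch, group("a * b + c") yields ((a * b) + c) and group("a - b + c") yields ((a - b) + c), matching the two examples given in the text.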

A program parsing a formula in an SBML model should assume that names appearing in the formula are the identifiers of Species, Compartment, Parameter, FunctionDefinition, (in Level 2) Reaction, or (in Level 3) SpeciesReference objects defined in a model. When a function call is involved, the syntax consists of a function identifier, followed by optional white space, followed by an opening parenthesis, followed by a sequence of zero or more arguments separated by commas (with each comma optionally preceded and/or followed by zero or more white space characters), followed by a closing parenthesis. There is an almost one-to-one mapping between the list of predefined functions available, and those defined in MathML. All of the MathML functions are recognized; this set is larger than the functions defined in SBML Level 1. In the subset of functions that overlap between MathML and SBML Level 1, there exist a few differences. The following table summarizes the differences between the predefined functions in SBML Level 1 and the MathML equivalents in SBML Levels 2 and 3:

Text string formula functions  MathML equivalents in SBML Levels 2 and 3
acos                           arccos
asin                           arcsin
atan                           arctan
ceil                           ceiling
log                            ln
log10(x)                       log(10, x)
pow(x, y)                      power(x, y)
sqr(x)                         power(x, 2)
sqrt(x)                        root(2, x)
Table comparing the names of certain functions in the SBML text-string formula syntax and MathML. The left column shows the names of functions recognized by libsbml.parseFormula(java.lang.String); the right column shows their equivalent function names in MathML 2.0, used in SBML Levels 2 and 3.
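
For the rows of the table that are pure renames, the correspondence can be written as a simple lookup. The FunctionNameSketch class and its toMathMLName helper below are hypothetical names introduced for illustration; they are not part of libSBML. The sketch assumes that any name absent from the lookup is passed through unchanged, and it deliberately omits log10, pow, sqr and sqrt, because those rows also change the name together with the arguments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup built from the table above; NOT part of the libSBML API.
class FunctionNameSketch {
    // Text-string formula name -> MathML name, for the pure renames only.
    static final Map<String, String> TO_MATHML = new LinkedHashMap<>();

    static {
        TO_MATHML.put("acos", "arccos");
        TO_MATHML.put("asin", "arcsin");
        TO_MATHML.put("atan", "arctan");
        TO_MATHML.put("ceil", "ceiling");
        TO_MATHML.put("log", "ln");
        // log10(x), pow(x, y), sqr(x) and sqrt(x) are omitted: mapping them
        // requires rewriting the argument list, not only the name.
    }

    // Assumption: names that are not in the table are the same in both syntaxes.
    static String toMathMLName(String name) {
        return TO_MATHML.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        System.out.println(toMathMLName("ceil"));
        System.out.println(toMathMLName("log"));
    }
}
```

Running the sketch prints ceiling and then ln. Note that log in the text-string syntax corresponds to the natural logarithm, which is an easy detail to miss when translating formulas by hand.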

Methods for working with libSBML's Abstract Syntax Trees

Every ASTNode in a libSBML abstract syntax tree has an associated type, which is a value taken from a set of constants having names beginning with AST_ and defined in org.sbml.libsbml.libsbmlConstants. The list of possible AST types in libSBML is quite long, because it covers all the mathematical functions that are permitted in SBML. The values are shown in the following table; their names hopefully evoke the construct that they represent:

AST_CONSTANT_E AST_FUNCTION_COT AST_LOGICAL_NOT
AST_CONSTANT_FALSE AST_FUNCTION_COTH AST_LOGICAL_OR
AST_CONSTANT_PI AST_FUNCTION_CSC AST_LOGICAL_XOR
AST_CONSTANT_TRUE AST_FUNCTION_CSCH AST_MINUS
AST_DIVIDE AST_FUNCTION_DELAY AST_NAME
AST_FUNCTION AST_FUNCTION_EXP AST_NAME_AVOGADRO (Level 3 only)
AST_FUNCTION_ABS AST_FUNCTION_FACTORIAL AST_NAME_TIME
AST_FUNCTION_ARCCOS AST_FUNCTION_FLOOR AST_PLUS
AST_FUNCTION_ARCCOSH AST_FUNCTION_LN AST_POWER
AST_FUNCTION_ARCCOT AST_FUNCTION_LOG AST_RATIONAL
AST_FUNCTION_ARCCOTH AST_FUNCTION_PIECEWISE AST_REAL
AST_FUNCTION_ARCCSC AST_FUNCTION_POWER AST_REAL_E
AST_FUNCTION_ARCCSCH AST_FUNCTION_ROOT AST_RELATIONAL_EQ
AST_FUNCTION_ARCSEC AST_FUNCTION_SEC AST_RELATIONAL_GEQ
AST_FUNCTION_ARCSECH AST_FUNCTION_SECH AST_RELATIONAL_GT
AST_FUNCTION_ARCSIN AST_FUNCTION_SIN AST_RELATIONAL_LEQ
AST_FUNCTION_ARCSINH AST_FUNCTION_SINH AST_RELATIONAL_LT
AST_FUNCTION_ARCTAN AST_FUNCTION_TAN AST_RELATIONAL_NEQ
AST_FUNCTION_ARCTANH AST_FUNCTION_TANH AST_TIMES
AST_FUNCTION_CEILING AST_INTEGER AST_UNKNOWN
AST_FUNCTION_COS AST_LAMBDA
AST_FUNCTION_COSH AST_LOGICAL_AND

There are a number of methods for interrogating the type of an ASTNode and for testing whether a node belongs to a general category of constructs. The methods defined by the ASTNode class are the following:

Programs manipulating AST node structures should check the type of a given node before calling methods that return a value from the node. The following methods are available for returning values from nodes:

Of course, all of this would be of little use if libSBML didn't also provide methods for setting the values of AST node objects! And it does. The methods are the following:

Finally, ASTNode also defines some miscellaneous methods for manipulating ASTs.

Reading and Writing MathML from/to ASTs

As mentioned above, applications often can avoid working with raw MathML by using either libSBML's text-string interface or the AST API. However, when needed, reading MathML content directly and creating ASTs, as well as the converse task of writing MathML, is easily done using two methods designed for this purpose: libsbml.readMathMLFromString(java.lang.String) and libsbml.writeMathMLToString(ASTNode).

The example program given above demonstrates the use of these methods.




LibSBML "5.10.0", an application programming interface (API) library for SBML.