libSBML Python API  5.10.0
Working with mathematical expressions

This section describes libSBML's facilities for working with SBML representations of mathematical expressions.

Basic concepts

LibSBML uses Abstract Syntax Trees (ASTs) to provide a canonical, in-memory representation for all mathematical formulas regardless of their original format (i.e., C-like infix strings or MathML 2.0). In libSBML, an AST is a collection of one or more objects of class ASTNode.

Converting between ASTs and text strings

SBML Levels 2 and 3 represent mathematical expressions using MathML 2.0 (more specifically, a subset of the content portion of MathML 2.0), but most applications using libSBML do not use MathML directly. Instead, applications generally either interact with mathematics in text-string form, or else they use the API for working with Abstract Syntax Trees (described below). LibSBML provides support for both approaches. The libSBML formula parser has been carefully engineered so that transformations from MathML to infix string notation and back are possible with a minimum of disruption to the structure of the mathematical expression.

The example below shows a simple program that, when run, takes a MathML string compiled into the program, converts it to an AST, converts that to an infix representation of the formula, compares it to the expected form of that formula, and finally translates that formula back to MathML and displays it. The output displayed on the terminal should have the same structure as the MathML it started with. The program is a simple example of using the various MathML and AST reading and writing methods, and shows that libSBML preserves the ordering and structure of the mathematical expressions.

import libsbml

# The infix form we expect libSBML to produce from the MathML below.
expected = "1 + f(x)"

xml = "<?xml version='1.0' encoding='UTF-8'?>"\
      "<math xmlns='http://www.w3.org/1998/Math/MathML'>"\
      "  <apply> <plus/> <cn> 1 </cn>"\
      "                  <apply> <ci> f </ci> <ci> x </ci> </apply>"\
      "  </apply>"\
      "</math>"

# MathML -> AST -> infix string
ast    = libsbml.readMathMLFromString(xml)
result = libsbml.formulaToString(ast)

if result == expected:
    print("Got expected result")
else:
    print("Mismatch after readMathMLFromString()")

# infix string -> AST -> MathML
new_ast    = libsbml.parseFormula(result)
new_mathml = libsbml.writeMathMLToString(new_ast)

print("Result of writing AST to string: ")
print(new_mathml)

The text-string form of mathematical formulas produced by libsbml.formulaToString() and read by libsbml.parseFormula() is a simple C-inspired infix notation, summarized in the next section. A formula in this text-string form can therefore be handed to a program that understands SBML mathematical expressions, or used as part of a translation system. The libSBML distribution comes with an example program in the "examples" subdirectory called translateMath that implements an interactive command-line demonstration of translating infix formulas into MathML and vice-versa. In summary, the functions available are the following:

  • libsbml.formulaToString(ASTNode) reads an AST, converts it to a text string in SBML Level 1 formula syntax, and returns a string. In Python, the result is an ordinary string object managed by the interpreter, so no explicit freeing is needed.
  • libsbml.parseFormula(string) reads a text-string containing a mathematical expression in SBML Level 1 syntax, and returns an ASTNode object corresponding to the expression.
  • libsbml.parseL3Formula(string) reads a text-string containing a mathematical expression in an expanded syntax more compatible with SBML Levels 2 and 3, and returns an ASTNode object corresponding to the expression.

The string formula syntax and differences with MathML

The text-string formula syntax is an infix notation essentially derived from the syntax of the C programming language and was originally used in SBML Level 1. The formula strings may contain operators, function calls, symbols, and white space characters. The allowable white space characters are tab and space. The following are illustrative examples of formulas expressed in the syntax:

0.10 * k4^2
(vm * s1)/(km + s1)

The following table shows the precedence rules in this syntax. In the Class column, operand implies the construct is an operand, prefix implies the operation is applied to the following arguments, unary implies there is one argument, and binary implies there are two arguments. The values in the Precedence column show how the order of different types of operation is determined. For example, the expression a * b + c is evaluated as (a * b) + c because the * operator has higher precedence. The Associates column shows how the order of similar precedence operations is determined; for example, a - b + c is evaluated as (a - b) + c because the + and - operators are left-associative. The precedence and associativity rules are taken from the C programming language, except for the symbol ^, which is used in C for a different purpose. (Exponentiation can be invoked using either ^ or the function power.)

Token         Operation            Class     Precedence  Associates
name          symbol reference     operand   6           n/a
(expression)  expression grouping  operand   6           n/a
f(...)        function call        prefix    6           left
-             negation             unary     5           right
^             power                binary    4           left
*             multiplication       binary    3           left
/             division             binary    3           left
+             addition             binary    2           left
-             subtraction          binary    2           left
,             argument delimiter   binary    1           left
A table of the expression operators and their precedence in the text-string format for mathematical expressions used by libsbml.parseFormula().

A program parsing a formula in an SBML model should assume that names appearing in the formula are the identifiers of Species, Parameter, Compartment, FunctionDefinition, (in Level 2) Reaction, or (in Level 3) SpeciesReference objects defined in a model. When a function call is involved, the syntax consists of a function identifier, followed by optional white space, followed by an opening parenthesis, followed by a sequence of zero or more arguments separated by commas (with each comma optionally preceded and/or followed by zero or more white space characters), followed by a closing parenthesis. There is an almost one-to-one mapping between the list of predefined functions available, and those defined in MathML. All of the MathML functions are recognized; this set is larger than the functions defined in SBML Level 1. In the subset of functions that overlap between MathML and SBML Level 1, there exist a few differences. The following table summarizes the differences between the predefined functions in SBML Level 1 and the MathML equivalents in SBML Levels 2 and 3:

Text string formula functions   MathML equivalents in SBML Levels 2 and 3
acos                            arccos
asin                            arcsin
atan                            arctan
ceil                            ceiling
log                             ln
log10(x)                        log(10, x)
pow(x, y)                       power(x, y)
sqr(x)                          power(x, 2)
sqrt(x)                         root(2, x)
Table comparing the names of certain functions in the SBML text-string formula syntax and MathML. The left column shows the names of functions recognized by libsbml.parseFormula(); the right column shows their equivalent function names in MathML 2.0, used in SBML Levels 2 and 3.

Methods for working with libSBML's Abstract Syntax Trees

There are a number of methods for interrogating the type of an ASTNode and for testing whether a node belongs to a general category of constructs. The methods on ASTNode for this purpose are the following:

  • long ASTNode.getType() returns the type of this AST node.
  • bool ASTNode.isConstant() returns True if this AST node is a MathML constant (True, False, pi, exponentiale), False otherwise.
  • bool ASTNode.isBoolean() returns True if this AST node returns a boolean value (by being either a logical operator, a relational operator, or the constant True or False).
  • bool ASTNode.isFunction() returns True if this AST node is a function (i.e., a MathML defined function such as exp or else a function defined by a FunctionDefinition in the Model).
  • bool ASTNode.isInfinity() returns True if this AST node is the special IEEE 754 value infinity.
  • bool ASTNode.isInteger() returns True if this AST node is holding an integer value.
  • bool ASTNode.isNumber() returns True if this AST node is holding any number.
  • bool ASTNode.isLambda() returns True if this AST node is a MathML lambda construct.
  • bool ASTNode.isLog10() returns True if this AST node represents the log10 function, specifically, that its type is AST_FUNCTION_LOG and it has two children, the first of which is an integer equal to 10.
  • bool ASTNode.isLogical() returns True if this AST node is a logical operator (and, or, not, xor).
  • bool ASTNode.isName() returns True if this AST node is a user-defined name or (in SBML Level 2) one of the two special csymbol constructs "delay" or "time".
  • bool ASTNode.isNaN() returns True if this AST node has the special IEEE 754 value "not a number" (NaN).
  • bool ASTNode.isNegInfinity() returns True if this AST node has the special IEEE 754 value of negative infinity.
  • bool ASTNode.isOperator() returns True if this AST node is an operator (e.g., + or -).
  • bool ASTNode.isPiecewise() returns True if this AST node is the MathML piecewise function.
  • bool ASTNode.isRational() returns True if this AST node is a rational number having a numerator and a denominator.
  • bool ASTNode.isReal() returns True if this AST node is a real number (specifically, AST_REAL, AST_REAL_E or AST_RATIONAL).
  • bool ASTNode.isRelational() returns True if this AST node is a relational operator.
  • bool ASTNode.isSqrt() returns True if this AST node is the square-root operator.
  • bool ASTNode.isUMinus() returns True if this AST node is a unary minus.
  • bool ASTNode.isUnknown() returns True if this AST node's type is unknown.

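Programs manipulating AST node structures should check the type of a given node before calling methods that return a value from the node. The following are the ASTNode object methods available for returning values from nodes:

  • long ASTNode.getInteger() returns the value of this node as an integer.
  • char ASTNode.getCharacter() returns the value of this node as a single character.
  • string ASTNode.getName() returns the value of this node as a string.
  • long ASTNode.getNumerator() returns the numerator of this node's rational value.
  • long ASTNode.getDenominator() returns the denominator of this node's rational value.
  • float ASTNode.getReal() returns the value of this node as a real number.
  • float ASTNode.getMantissa() returns the mantissa of this node's value.
  • long ASTNode.getExponent() returns the exponent of this node's value.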

Of course, all of this would be of little use if libSBML didn't also provide methods for setting the values of AST node objects! And it does. The methods are the following:

  • ASTNode.setCharacter(char) sets the value of this ASTNode to the given character. If character is one of +, -, *, / or ^, the node type will be set to the appropriate operator type. For all other characters, the node type will be set to AST_UNKNOWN.
  • ASTNode.setName(string) sets the value of this AST node to the given name. The node type will be set (to AST_NAME) only if the AST node was previously an operator (isOperator() returns True) or number (isNumber() returns True). This allows names to be set for AST_FUNCTIONs and the like.
  • ASTNode.setValue(int) sets the value of the node to the given integer. Equivalent to the next method.
  • ASTNode.setValue(long) sets the value of the node to the given integer.
  • ASTNode.setValue(long, long) sets the value of this ASTNode to the given rational in two parts: the numerator and denominator. The node type is set to AST_RATIONAL.
  • ASTNode.setValue(float) sets the value of this ASTNode to the given real (float) and sets the node type to AST_REAL.
  • ASTNode.setValue(float, long) sets the value of this ASTNode to the given real (float) in two parts: the mantissa and the exponent. The node type is set to AST_REAL_E.

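Finally, ASTNode also defines some miscellaneous methods for manipulating ASTs:

  • ASTNode(long type) creates a new ASTNode of the given type.
  • ASTNode.getNumChildren() returns the number of children of this AST node.
  • ASTNode.addChild(ASTNode) adds the given node as a child of this AST node. Children are added in left-to-right order.
  • ASTNode.prependChild(ASTNode) adds the given node as a child of this AST node, placing it before the existing children.
  • ASTNode.getChild(long n) returns the nth child of this AST node, counting from zero.
  • ASTNode.getLeftChild() returns the first child of this AST node.
  • ASTNode.getRightChild() returns the last child of this AST node.
  • ASTNode.swapChildren(ASTNode) swaps the children of this AST node with the children of the given node.
  • ASTNode.setType(long) sets the type of this AST node to the given type.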

Reading and Writing Mathematical Expressions into ASTs

As mentioned above, applications often can avoid working with raw MathML by using either libSBML's text-string interface or the AST API. However, when needed, reading MathML content directly and creating ASTs is easily done in libSBML using a method designed for this purpose:

  • ASTNode readMathMLFromString(string) reads raw MathML from a text string, constructs an AST from it, then returns the root ASTNode of the resulting expression tree.

Similarly, writing out Abstract Syntax Tree structures is easily done using the following method:

  • string writeMathMLToString(ASTNode) writes an AST to a string. In Python, the result is an ordinary string object managed by the interpreter, so no explicit freeing is needed.

The example program given above demonstrates the use of these methods.