org.htmlparser.nodes

Class TextNode

Implemented Interfaces:
Cloneable, Node, Serializable, Text

public class TextNode
extends AbstractNode
implements Text

Normal text in the HTML document is represented by this class.

Field Summary

protected String
mText
The contents of the string node, or override text.

Fields inherited from class org.htmlparser.nodes.AbstractNode

children, mPage, nodeBegin, nodeEnd, parent

Constructor Summary

TextNode(String text)
Constructor takes in the text string.
TextNode(Page page, int start, int end)
Constructor takes in the page and the beginning and ending positions.

Method Summary

void
accept(NodeVisitor visitor)
Text node visiting code: invokes visitStringNode() on the given visitor.
String
getText()
Returns the text of the node.
boolean
isWhiteSpace()
Returns whether the node consists only of white space.
void
setText(String text)
Sets the string contents of the node.
String
toHtml(boolean verbatim)
Returns the text of the node.
String
toPlainTextString()
Returns the text of the node.
String
toString()
Express this string node as a printable string. This is suitable for display in a debugger or output to a printout.

Methods inherited from class org.htmlparser.nodes.AbstractNode

accept, clone, collectInto, doSemanticAction, getChildren, getEndPosition, getFirstChild, getLastChild, getNextSibling, getPage, getParent, getPreviousSibling, getStartPosition, getText, setChildren, setEndPosition, setPage, setParent, setStartPosition, setText, toHtml, toHtml, toPlainTextString, toString

Field Details

mText

protected String mText
The contents of the string node, or override text.

Constructor Details

TextNode

public TextNode(String text)
Constructor takes in the text string.
Parameters:
text - The string node text. For correct generation of HTML, this should not contain representations of tags (unless they are balanced).

TextNode

public TextNode(Page page,
                int start,
                int end)
Constructor takes in the page and the beginning and ending positions.
Parameters:
page - The page this string is on.
start - The beginning position of the string.
end - The ending position of the string.
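
The two constructors, together with the description of mText as "the contents of the string node, or override text", suggest that a node either reads its text from a span of the page or returns an explicitly supplied string. The sketch below illustrates that idea only. TextSpanSketch is a hypothetical stand-in, not the org.htmlparser class; it uses a plain String in place of a Page and assumes the ending position is exclusive, which this page does not state.

```java
/** Hypothetical stand-in for a text node; NOT the org.htmlparser.nodes.TextNode class. */
final class TextSpanSketch {
    private final String source; // stands in for the Page's character data
    private final int start;     // beginning position within the source
    private final int end;       // ending position, assumed exclusive in this sketch
    private String override;     // plays the role of mText; null means "read from the source"

    /** Mirrors TextNode(Page page, int start, int end), with a String instead of a Page. */
    TextSpanSketch(String source, int start, int end) {
        if (start < 0 || end < start || end > source.length()) {
            throw new IllegalArgumentException("invalid span: " + start + ".." + end);
        }
        this.source = source;
        this.start = start;
        this.end = end;
    }

    /** Mirrors TextNode(String text): there is no page, so the text is stored directly. */
    TextSpanSketch(String text) {
        this.source = "";
        this.start = 0;
        this.end = 0;
        this.override = text;
    }

    /** Returns the override text if one is set, otherwise the span of the source. */
    String getText() {
        return override != null ? override : source.substring(start, end);
    }

    /** Replaces the node contents; later getText() calls ignore the source span. */
    void setText(String text) {
        this.override = text;
    }
}
```

With the source "<p>Hello</p>" and the span 3 to 8, getText() in this sketch yields "Hello" until setText() supplies replacement text.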

Method Details

accept

public void accept(NodeVisitor visitor)
Text node visiting code: invokes visitStringNode() on the given visitor.
Specified by:
accept in interface Node
Overrides:
accept in class AbstractNode
Parameters:
visitor - The NodeVisitor object to invoke visitStringNode() on.
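
The dispatch described above is the usual visitor pattern: the node hands itself to the visitor's text callback. The following is a minimal, self-contained sketch of that pattern. VisitorSketch, TextSketch, and TextCollector are hypothetical names introduced for illustration; only the method name visitStringNode() is taken from this page, and the real NodeVisitor class may declare additional callbacks.

```java
/** Hypothetical visitor interface; the real NodeVisitor is a class in org.htmlparser.visitors. */
interface VisitorSketch {
    void visitStringNode(TextSketch text);
}

/** Hypothetical text node that dispatches itself to a visitor. */
final class TextSketch {
    private final String text;

    TextSketch(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }

    /** Double dispatch: the node selects the visitor callback that matches its own type. */
    void accept(VisitorSketch visitor) {
        visitor.visitStringNode(this);
    }
}

/** Example visitor that concatenates the text of every node it visits. */
final class TextCollector implements VisitorSketch {
    private final StringBuilder collected = new StringBuilder();

    @Override
    public void visitStringNode(TextSketch text) {
        collected.append(text.getText());
    }

    String result() {
        return collected.toString();
    }
}
```

Passing one TextCollector to several nodes in turn accumulates their text in visiting order, which is the typical reason to use a visitor rather than calling getText() on each node by hand.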

getText

public String getText()
Specified by:
getText in interface Text
getText in interface Node
Overrides:
getText in class AbstractNode
Returns:
The contents of this text node.

isWhiteSpace

public boolean isWhiteSpace()
Returns whether the node consists only of white space. White space includes characters such as spaces and new lines.
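
As an illustration of what such a check can look like, here is a minimal sketch in plain Java. It is not the library's implementation: WhitespaceSketch is a hypothetical helper, the use of Character.isWhitespace is an assumption about which characters count as white space, and treating null or empty text as white space is likewise an assumption.

```java
/** Hypothetical helper; not part of org.htmlparser. */
final class WhitespaceSketch {
    private WhitespaceSketch() {
    }

    /**
     * Returns true if the text contains no non-whitespace character.
     * Null and empty text are treated as white space (an assumption of this sketch).
     */
    static boolean isWhiteSpace(String text) {
        if (text == null) {
            return true;
        }
        for (int i = 0; i < text.length(); i++) {
            // Character.isWhitespace covers space, tab, and line terminators,
            // but not the non-breaking space character.
            if (!Character.isWhitespace(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Because Character.isWhitespace excludes the non-breaking space, text that consists only of that character is reported as non-white-space by this sketch; whether TextNode behaves the same way is not stated on this page.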

setText

public void setText(String text)
Sets the string contents of the node.
Specified by:
setText in interface Text
setText in interface Node
Overrides:
setText in class AbstractNode
Parameters:
text - The new text for the node.

toHtml

public String toHtml(boolean verbatim)
Returns the text of the node.
Specified by:
toHtml in interface Node
Overrides:
toHtml in class AbstractNode
Parameters:
verbatim - If true return as close to the original page text as possible.
Returns:
The contents of this text node.

toPlainTextString

public String toPlainTextString()
Specified by:
toPlainTextString in interface Node
Overrides:
toPlainTextString in class AbstractNode
Returns:
The contents of this text node.

toString

public String toString()
Express this string node as a printable string. This is suitable for display in a debugger or output to a printout. Control characters are replaced by their equivalent escape sequences and the contents are truncated to 80 characters.
Specified by:
toString in interface Node
Overrides:
toString in class AbstractNode
Returns:
A string representation of the string node.
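
The escaping and truncation described for toString() can be sketched as follows. This is only an approximation under stated assumptions: PrintableSketch is a hypothetical helper, the choice of escape sequences is a guess, the 80-character limit is applied here to the escaped output, and any prefix or position information the real method may add is not reproduced.

```java
/** Hypothetical helper; not part of org.htmlparser. */
final class PrintableSketch {
    /** Limit taken from the description above; applied to the escaped output in this sketch. */
    static final int MAX_LENGTH = 80;

    private PrintableSketch() {
    }

    /** Escapes control characters, then truncates the result to MAX_LENGTH characters. */
    static String printable(String text) {
        StringBuilder out = new StringBuilder();
        // Stop early once enough output exists; the final trim handles any overshoot.
        for (int i = 0; i < text.length() && out.length() < MAX_LENGTH; i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\t':
                    out.append("\\t");
                    break;
                case '\n':
                    out.append("\\n");
                    break;
                case '\r':
                    out.append("\\r");
                    break;
                default:
                    if (Character.isISOControl(c)) {
                        // Other control characters become a backslash, 'u', and four hex digits.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        if (out.length() > MAX_LENGTH) {
            out.setLength(MAX_LENGTH); // may cut an escape sequence in half
        }
        return out.toString();
    }
}
```

Truncating after escaping keeps the printed line at a predictable width, at the cost of occasionally splitting an escape sequence at the boundary.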

HTML Parser is an open source library released under LGPL.