org.htmlparser
Interface NodeFactory
All Known Implementing Classes: Lexer, PrototypicalNodeFactory
public interface NodeFactory
This interface defines the methods needed to create new nodes.
The factory is used when lexing to generate the nodes passed
back to the caller. By implementing this interface and setting
that concrete object as the node factory for the lexer (perhaps
via the parser), the way that nodes are generated can be customized.
In general, replacing the factory with a custom factory is not required
because of the flexibility of the PrototypicalNodeFactory.
Creation of Text and Remark nodes is straightforward because they are
essentially just sequences of characters extracted from the page. Creation of a
Tag node requires that the attributes from the tag be remembered as well.
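The customization described above can be sketched with a few stand-in types. This is a minimal illustration, not the library's code: `Page`, `TextNode`, `UpperCaseTextNode`, `TextFactory`, and `TinyLexer` are hypothetical simplifications written for this example, and the stand-in factory has only one method rather than the three declared by this interface. It shows a text node as a character range on a page, and a lexer that hands node creation to whichever factory has been set on it.

```java
import java.util.Locale;

public class NodeFactorySketch {
    // Hypothetical stand-in for the library's Page: it only holds the source text.
    static final class Page {
        final String source;
        Page(String source) { this.source = source; }
    }

    // Stand-in text node: a range of characters on a page.
    static class TextNode {
        final Page page;
        final int start;
        final int end;
        TextNode(Page page, int start, int end) {
            this.page = page;
            this.start = start;
            this.end = end;
        }
        String getText() { return page.source.substring(start, end); }
    }

    // A customized node type that a replacement factory might return.
    static class UpperCaseTextNode extends TextNode {
        UpperCaseTextNode(Page page, int start, int end) { super(page, start, end); }
        @Override
        String getText() { return super.getText().toUpperCase(Locale.ROOT); }
    }

    // Simplified factory with only the text-creation method (assumption for brevity).
    interface TextFactory {
        TextNode createStringNode(Page page, int start, int end);
    }

    // Stand-in lexer: delegates node creation to the factory currently set.
    static final class TinyLexer {
        private TextFactory factory = TextNode::new; // default factory
        void setNodeFactory(TextFactory factory) { this.factory = factory; }
        TextNode lexText(Page page, int start, int end) {
            return factory.createStringNode(page, start, end);
        }
    }

    public static void main(String[] args) {
        Page page = new Page("<p>hello</p>");
        TinyLexer lexer = new TinyLexer();
        System.out.println(lexer.lexText(page, 3, 8).getText()); // default node
        lexer.setNodeFactory(UpperCaseTextNode::new);            // swap in a custom factory
        System.out.println(lexer.lexText(page, 3, 8).getText()); // customized node
    }
}
```

Because the caller only ever asks the lexer for nodes, swapping the factory changes the node type without touching the lexing code.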
createRemarkNode
public Remark createRemarkNode(Page page,
int start,
int end)
throws ParserException
Create a new remark node.
Parameters:
page - The page the node is on.
start - The beginning position of the remark.
end - The ending position of the remark.
Returns:
A remark node comprising the indicated characters from the page.
Throws:
ParserException - If there is a problem encountered when creating the node.
createStringNode
public Text createStringNode(Page page,
int start,
int end)
throws ParserException
Create a new text node.
Parameters:
page - The page the node is on.
start - The beginning position of the string.
end - The ending position of the string.
Returns:
A text node comprising the indicated characters from the page.
Throws:
ParserException - If there is a problem encountered when creating the node.
createTagNode
public Tag createTagNode(Page page,
int start,
int end,
Vector attributes)
throws ParserException
Create a new tag node.
Note that the attributes vector contains at least one element,
which is the tag name (standalone attribute) at position zero.
This can be used to decide which type of node to create, or
gate other processing that may be appropriate.
Parameters:
page - The page the node is on.
start - The beginning position of the tag.
end - The ending position of the tag.
attributes - The attributes contained in this tag.
Returns:
A tag node comprising the indicated characters from the page.
Throws:
ParserException - If there is a problem encountered when creating the node.
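The note on createTagNode says that element zero of the attributes vector is the tag name and can be used to decide which type of node to create. A minimal sketch of that dispatch follows. It uses hypothetical stand-in types (`Attr`, `TagNode`, `LinkNode`) rather than the library's classes, and the stand-in `createTagNode` omits the page, start, and end parameters for brevity.

```java
import java.util.Vector;

public class TagDispatchSketch {
    // Hypothetical stand-in for a tag attribute: a name and an optional value.
    static final class Attr {
        final String name;
        final String value;
        Attr(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Generic stand-in tag node that remembers its attributes.
    static class TagNode {
        final Vector<Attr> attributes;
        TagNode(Vector<Attr> attributes) { this.attributes = attributes; }
        String getTagName() { return attributes.get(0).name; }
    }

    // Specialized node chosen when the tag name is "a".
    static class LinkNode extends TagNode {
        LinkNode(Vector<Attr> attributes) { super(attributes); }
        String getLink() {
            // Start at 1: element zero is the tag name, not a real attribute.
            for (int i = 1; i < attributes.size(); i++) {
                Attr attr = attributes.get(i);
                if ("href".equalsIgnoreCase(attr.name)) {
                    return attr.value;
                }
            }
            return null; // no href attribute present
        }
    }

    // Simplified factory method: inspects element zero to pick the node type.
    static TagNode createTagNode(Vector<Attr> attributes) {
        String tagName = attributes.get(0).name;
        if ("a".equalsIgnoreCase(tagName)) {
            return new LinkNode(attributes);
        }
        return new TagNode(attributes);
    }

    public static void main(String[] args) {
        Vector<Attr> anchor = new Vector<>();
        anchor.add(new Attr("a", null));            // tag name at position zero
        anchor.add(new Attr("href", "index.html")); // ordinary attribute
        TagNode node = createTagNode(anchor);
        System.out.println(node.getClass().getSimpleName());
        System.out.println(((LinkNode) node).getLink());
    }
}
```

Any tag name without a specialized class falls through to the generic node, so unknown tags still produce a usable result.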
© 2005 Derrick Oswald, May 08, 2008
HTML Parser is an open source library released under LGPL.