java.lang.Object
org.htmlparser.nodes.AbstractNode
org.htmlparser.nodes.TagNode
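Field Summary
protected static Hashtable | breakTags | Set of tags that breaks the flow.
protected Vector | mAttributes | The tag attributes.
protected static Scanner | mDefaultScanner | The default scanner for non-composite tags.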
Fields inherited from class org.htmlparser.nodes.AbstractNode | |
children , mPage , nodeBegin , nodeEnd , parent |
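Constructor Summary
TagNode() | Create an empty tag.
TagNode(Page page, int start, int end, Vector attributes) | Create a tag with the location and attributes provided.
TagNode(TagNode tag, TagScanner scanner) | Create a tag like the one provided.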
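Method Summary
void | accept(NodeVisitor visitor) | Default tag visiting code.
boolean | breaksFlow() | Determines if the given tag breaks the flow of text.
String | getAttribute(String name) | Returns the value of an attribute.
Attribute | getAttributeEx(String name) | Returns the attribute with the given name.
Vector | getAttributesEx() | Gets the attributes in the tag.
Tag | getEndTag() | Get the end tag for this (composite) tag.
String[] | getEndTagEnders() | Return the set of end tag names that cause this tag to finish.
String[] | getEnders() | Return the set of tag names that cause this tag to finish.
int | getEndingLineNumber() | Get the line number where this tag ends.
String[] | getIds() | Return the set of names handled by this tag.
String | getRawTagName() | Return the name of this tag.
int | getStartingLineNumber() | Get the line number where this tag starts.
int | getTagBegin() | Gets the nodeBegin.
int | getTagEnd() | Gets the nodeEnd.
String | getTagName() | Return the name of this tag.
String | getText() | Return the text contained in this tag.
Scanner | getThisScanner() | Return the scanner associated with this tag.
boolean | isEmptyXmlTag() | Is this an empty xml tag of the form <tag/>.
boolean | isEndTag() | Predicate to determine if this tag is an end tag (e.g. </HTML>).
void | removeAttribute(String key) | Remove the attribute with the given key, if it exists.
void | setAttribute(String key, String value) | Set attribute with given key, value pair.
void | setAttribute(String key, String value, char quote) | Set attribute with given key, value pair where the value is quoted by quote.
void | setAttribute(Attribute attribute) | Set an attribute.
void | setAttributeEx(Attribute attribute) | Set an attribute.
void | setAttributesEx(Vector attribs) | Sets the attributes.
void | setEmptyXmlTag(boolean emptyXmlTag) | Set this tag to be an empty xml node, or not.
void | setEndTag(Tag end) | Set the end tag for this (composite) tag.
void | setTagBegin(int tagBegin) | Sets the nodeBegin.
void | setTagEnd(int tagEnd) | Sets the nodeEnd.
void | setTagName(String name) | Set the name of this tag.
void | setText(String text) | Parses the given text to create the tag contents.
void | setThisScanner(Scanner scanner) | Set the scanner associated with this tag.
String | toHtml(boolean verbatim) | Render the tag as HTML.
String | toPlainTextString() | Get the plain text from this node.
String | toString() | Print the contents of the tag.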
Methods inherited from class org.htmlparser.nodes.AbstractNode | |
accept , clone , collectInto , doSemanticAction , getChildren , getEndPosition , getFirstChild , getLastChild , getNextSibling , getPage , getParent , getPreviousSibling , getStartPosition , getText , setChildren , setEndPosition , setPage , setParent , setStartPosition , setText , toHtml , toHtml , toPlainTextString , toString |
protected static Hashtable breakTags
Set of tags that breaks the flow.
protected Vector mAttributes
The tag attributes. Objects of type Attribute. The first element is the tag name, subsequent elements being either whitespace or real attributes.
protected static final Scanner mDefaultScanner
The default scanner for non-composite tags.
public TagNode()
Create an empty tag.
public TagNode(Page page, int start, int end, Vector attributes)
Create a tag with the location and attributes provided.
- Parameters:
page
- The page this tag was read from.
start
- The starting offset of this node within the page.
end
- The ending offset of this node within the page.
attributes
- The list of attributes that were parsed in this tag.
- See Also:
Attribute
public TagNode(TagNode tag, TagScanner scanner)
Create a tag like the one provided.
- Parameters:
tag
- The tag to emulate.
scanner
- The scanner for this tag.
public void accept(NodeVisitor visitor)
Default tag visiting code. Based on isEndTag(), calls either visitTag() or visitEndTag().
- Overrides:
- accept in class AbstractNode
- Parameters:
visitor
- The visitor that is visiting this node.
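The dispatch described for accept() can be sketched in isolation. The block below is a minimal, standalone illustration and not the library's code: SketchVisitor, SketchTag, and AcceptSketch are hypothetical stand-ins, and the rule that an end tag has a raw name starting with "/" is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the org.htmlparser types.
interface SketchVisitor {
    void visitTag(SketchTag tag);
    void visitEndTag(SketchTag tag);
}

class SketchTag {
    private final String rawName;

    SketchTag(String rawName) {
        this.rawName = rawName;
    }

    String rawName() {
        return rawName;
    }

    // Assumption for this sketch: an end tag is one whose raw name starts with '/'.
    boolean isEndTag() {
        return rawName.startsWith("/");
    }

    // Dispatch on isEndTag(), mirroring the behavior described for accept().
    void accept(SketchVisitor visitor) {
        if (isEndTag()) {
            visitor.visitEndTag(this);
        } else {
            visitor.visitTag(this);
        }
    }
}

class AcceptSketch {
    // Visits each tag in order and records which callback fired.
    static List<String> trace(String... rawNames) {
        final List<String> calls = new ArrayList<>();
        SketchVisitor recorder = new SketchVisitor() {
            @Override
            public void visitTag(SketchTag tag) {
                calls.add("visitTag:" + tag.rawName());
            }

            @Override
            public void visitEndTag(SketchTag tag) {
                calls.add("visitEndTag:" + tag.rawName());
            }
        };
        for (String rawName : rawNames) {
            new SketchTag(rawName).accept(recorder);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(trace("HTML", "/HTML"));
    }
}
```

Keeping the start/end decision inside accept() means a visitor never has to test isEndTag() itself.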
public boolean breaksFlow()
Determines if the given tag breaks the flow of text.
- Specified by:
- breaksFlow in interface Tag
- Returns:
true if following text would start on a new line, false otherwise.
public String getAttribute(String name)
Returns the value of an attribute.
- Specified by:
- getAttribute in interface Tag
- Parameters:
name
- Name of attribute, case insensitive.
- Returns:
- The value associated with the attribute or null if it does not exist, or is a stand-alone or
public Attribute getAttributeEx(String name)
Returns the attribute with the given name.
- Specified by:
- getAttributeEx in interface Tag
- Parameters:
name
- Name of attribute, case insensitive.
- Returns:
- The attribute or null if it does not exist.
public Vector getAttributesEx()
Gets the attributes in the tag.
- Specified by:
- getAttributesEx in interface Tag
- Returns:
- The list of Attributes in the tag. The first element is the tag name, subsequent elements being either whitespace or real attributes.
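The list layout described above (tag name first, then whitespace or real attributes) can be modeled with a small standalone sketch. SketchAttribute and AttributeListSketch are hypothetical names, not the library's Attribute class, and representing whitespace entries with a null name is an assumption of this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one entry in the attribute list; not the library's Attribute class.
class SketchAttribute {
    final String name;  // null for a whitespace entry (assumption of this sketch)
    final String value; // null for the tag name, whitespace, or a stand-alone attribute

    SketchAttribute(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

class AttributeListSketch {
    // Builds the list for <a href="index.html" disabled>: element 0 is the tag name.
    static List<SketchAttribute> example() {
        List<SketchAttribute> attributes = new ArrayList<>();
        attributes.add(new SketchAttribute("a", null));            // tag name
        attributes.add(new SketchAttribute(null, null));           // whitespace
        attributes.add(new SketchAttribute("href", "index.html")); // real attribute
        attributes.add(new SketchAttribute(null, null));           // whitespace
        attributes.add(new SketchAttribute("disabled", null));     // stand-alone attribute
        return attributes;
    }

    // Case-insensitive lookup that skips element 0 (the tag name) and whitespace entries.
    static String getValue(List<SketchAttribute> attributes, String name) {
        for (int i = 1; i < attributes.size(); i++) {
            SketchAttribute attribute = attributes.get(i);
            if (attribute.name != null && attribute.name.equalsIgnoreCase(name)) {
                return attribute.value;
            }
        }
        return null; // no attribute with that name
    }

    public static void main(String[] args) {
        System.out.println(getValue(example(), "HREF"));
    }
}
```

Starting the search at index 1 keeps a tag named like an attribute, for example <a a="1">, from matching its own name.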
public Tag getEndTag()
Get the end tag for this (composite) tag. For a non-composite tag this always returns null.
- Returns:
- The tag that terminates this composite tag, e.g. </HTML>.
public String[] getEndTagEnders()
Return the set of end tag names that cause this tag to finish. These are the end tags that, if encountered while scanning (a composite tag), will cause the generation of a virtual tag. Since this is a non-composite tag, it has no end tag enders.
- Specified by:
- getEndTagEnders in interface Tag
- Returns:
- The names of following end tags that stop further scanning.
public String[] getEnders()
Return the set of tag names that cause this tag to finish. These are the normal (non-end) tags that, if encountered while scanning (a composite tag), will cause the generation of a virtual tag. Since this is a non-composite tag, the default is no enders.
- Returns:
- The names of following tags that stop further scanning.
public int getEndingLineNumber()
Get the line number where this tag ends.
- Specified by:
- getEndingLineNumber in interface Tag
- Returns:
- The (zero based) line number in the page where this tag ends.
public String[] getIds()
Return the set of names handled by this tag. Since this is a generic tag, it has no ids.
- Returns:
- The names to be matched that create tags of this type.
public String getRawTagName()
Return the name of this tag.
- Specified by:
- getRawTagName in interface Tag
- Returns:
- The tag name or null if this tag contains nothing or only whitespace.
public int getStartingLineNumber()
Get the line number where this tag starts.
- Specified by:
- getStartingLineNumber in interface Tag
- Returns:
- The (zero based) line number in the page where this tag starts.
public int getTagBegin()
Gets the nodeBegin.
- Returns:
- The nodeBegin value.
public int getTagEnd()
Gets the nodeEnd.
- Returns:
- The nodeEnd value.
public String getTagName()
Return the name of this tag. Note: This value is converted to uppercase and does not begin with "/" if it is an end tag. Nor does it end with a slash in the case of an XML type tag. To get at the original text of the tag name use getRawTagName(). The conversion to uppercase is performed with an ENGLISH locale.
- Specified by:
- getTagName in interface Tag
- Returns:
- The tag name.
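The normalization rules stated for getTagName() can be sketched as a small standalone function. TagNameSketch and normalize are hypothetical names for illustration and this is not the library's implementation. Only Locale.ENGLISH and String.toUpperCase(Locale) are real JDK calls.

```java
import java.util.Locale;

class TagNameSketch {
    // Applies the three documented rules to a raw tag name:
    // drop a leading '/', drop a trailing '/', uppercase with an ENGLISH locale.
    static String normalize(String rawName) {
        String name = rawName;
        if (name.startsWith("/")) {
            name = name.substring(1); // end tag such as "/html"
        }
        if (name.endsWith("/")) {
            name = name.substring(0, name.length() - 1); // XML type tag such as "br/"
        }
        // A fixed locale keeps the result independent of the JVM default locale.
        return name.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/html"));
        System.out.println(normalize("br/"));
    }
}
```

A fixed locale matters because uppercasing with the default locale can change letters such as "i" on systems configured for Turkish, which would make name comparisons unreliable.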
public String getText()
Return the text contained in this tag.
- Overrides:
- getText in class AbstractNode
- Returns:
- The complete contents of the tag (within the angle brackets).
public Scanner getThisScanner()
Return the scanner associated with this tag.
- Specified by:
- getThisScanner in interface Tag
- Returns:
- The scanner associated with this tag.
public boolean isEmptyXmlTag()
Is this an empty xml tag of the form <tag/>.
- Specified by:
- isEmptyXmlTag in interface Tag
- Returns:
- true if the last character of the last attribute is a '/'.
public boolean isEndTag()
Predicate to determine if this tag is an end tag (e.g. </HTML>).
- Returns:
true if this tag is an end tag.
public void removeAttribute(String key)
Remove the attribute with the given key, if it exists.
- Specified by:
- removeAttribute in interface Tag
- Parameters:
key
- The name of the attribute.
public void setAttribute(String key, String value)
Set attribute with given key, value pair. Figures out a quote character to use if necessary.
- Specified by:
- setAttribute in interface Tag
- Parameters:
key
- The name of the attribute.
value
- The value of the attribute.
public void setAttribute(String key, String value, char quote)
Set attribute with given key, value pair where the value is quoted by quote.
- Specified by:
- setAttribute in interface Tag
- Parameters:
key
- The name of the attribute.
value
- The value of the attribute.
quote
- The quote character to be used around value. If zero, it is an unquoted value.
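The effect of the quote parameter, including the zero case, can be illustrated with a standalone sketch. QuoteSketch, render, and chooseQuote are hypothetical names. The chooseQuote heuristic is an assumption about what "figures out a quote character" might mean and is not taken from the library's source.

```java
class QuoteSketch {
    // Renders key=value, wrapping the value in the quote character unless quote is zero.
    static String render(String key, String value, char quote) {
        if (quote == 0) {
            return key + "=" + value; // unquoted value
        }
        return key + "=" + quote + value + quote;
    }

    // Assumed heuristic, not the library's rule: no quote for simple values,
    // single quotes when the value contains a double quote, double quotes otherwise.
    // A value containing both kinds of quote is not handled by this sketch.
    static char chooseQuote(String value) {
        boolean needsQuote = value.isEmpty();
        boolean hasDoubleQuote = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isWhitespace(c) || c == '"' || c == '\'' || c == '>' || c == '=') {
                needsQuote = true;
            }
            if (c == '"') {
                hasDoubleQuote = true;
            }
        }
        if (!needsQuote) {
            return 0;
        }
        return hasDoubleQuote ? '\'' : '"';
    }

    public static void main(String[] args) {
        String value = "two words";
        System.out.println(render("alt", value, chooseQuote(value)));
    }
}
```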
public void setAttribute(Attribute attribute)
Set an attribute. This replaces an attribute of the same name. To set the zeroth attribute (the tag name), use setTagName().
- Parameters:
attribute
- The attribute to set.
public void setAttributeEx(Attribute attribute)
Set an attribute.
- Specified by:
- setAttributeEx in interface Tag
- Parameters:
attribute
- The attribute to set.
- See Also:
setAttribute(Attribute)
public void setAttributesEx(Vector attribs)
Sets the attributes. NOTE: Values of the extended hashtable are two element arrays of String, with the first element being the original name (not uppercased), and the second element being the value.
- Specified by:
- setAttributesEx in interface Tag
- Parameters:
attribs
- The attribute collection to set.
public void setEmptyXmlTag(boolean emptyXmlTag)
Set this tag to be an empty xml node, or not. Adds or removes an ending slash on the tag.
- Specified by:
- setEmptyXmlTag in interface Tag
- Parameters:
emptyXmlTag
- If true, ensures there is an ending slash in the node, i.e. <tag/>, otherwise removes it.
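The documented rule for isEmptyXmlTag() (the last character of the last attribute is a '/') and the add-or-remove behavior of setEmptyXmlTag() can be sketched on a plain list of strings. EmptyXmlSketch is a hypothetical class. Appending a separate "/" element when setting the flag is an assumption of this sketch, not a statement about how the library stores the slash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EmptyXmlSketch {
    // True if the last character of the last attribute is '/'.
    static boolean isEmptyXmlTag(List<String> attributes) {
        if (attributes.isEmpty()) {
            return false;
        }
        String last = attributes.get(attributes.size() - 1);
        return last.endsWith("/");
    }

    // Adds or removes the ending slash so the list matches the requested state.
    static void setEmptyXmlTag(List<String> attributes, boolean emptyXmlTag) {
        boolean already = isEmptyXmlTag(attributes);
        if (emptyXmlTag && !already) {
            attributes.add("/"); // assumption: the slash is stored as its own element
        } else if (!emptyXmlTag && already) {
            int lastIndex = attributes.size() - 1;
            String last = attributes.get(lastIndex);
            String trimmed = last.substring(0, last.length() - 1);
            if (trimmed.isEmpty()) {
                attributes.remove(lastIndex);
            } else {
                attributes.set(lastIndex, trimmed);
            }
        }
    }

    // Joins the elements back into tag text inside angle brackets.
    static String render(List<String> attributes) {
        return "<" + String.join("", attributes) + ">";
    }

    public static void main(String[] args) {
        List<String> attributes = new ArrayList<>(Arrays.asList("img", " ", "src=\"a.png\""));
        setEmptyXmlTag(attributes, true);
        System.out.println(render(attributes));
    }
}
```

Because the rule only inspects the final character, an unquoted value that ends in a slash, such as href=dir/, would also be reported as an empty XML tag under this sketch.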
public void setEndTag(Tag end)
Set the end tag for this (composite) tag. For a non-composite tag this is a no-op.
- Parameters:
end
- The tag that terminates this composite tag, e.g. </HTML>.
public void setTagBegin(int tagBegin)
Sets the nodeBegin.
- Parameters:
tagBegin
- The nodeBegin to set
public void setTagEnd(int tagEnd)
Sets the nodeEnd.
- Parameters:
tagEnd
- The nodeEnd to set
public void setTagName(String name)
Set the name of this tag. This creates or replaces the first attribute of the tag (the zeroth element of the attribute vector).
- Specified by:
- setTagName in interface Tag
- Parameters:
name
- The tag name.
public void setText(String text)
Parses the given text to create the tag contents.
- Overrides:
- setText in class AbstractNode
- Parameters:
text
- A string of the form <TAGNAME xx="yy">.
public void setThisScanner(Scanner scanner)
Set the scanner associated with this tag.
- Specified by:
- setThisScanner in interface Tag
- Parameters:
scanner
- The scanner for this tag.
public String toHtml(boolean verbatim)
Render the tag as HTML. A call to a tag's toHtml() method will render it in HTML.
- Overrides:
- toHtml in class AbstractNode
- Parameters:
verbatim
- If true, return as close to the original page text as possible.
- Returns:
- The tag as an HTML fragment.
- See Also:
Node.toHtml()
public String toPlainTextString()
Get the plain text from this node.
- Specified by:
- toPlainTextString in interface Node
- Overrides:
- toPlainTextString in class AbstractNode
- Returns:
- An empty string (tag contents do not display in a browser). If you want this tag's HTML equivalent, use toHtml().
public String toString()
Print the contents of the tag.
- Overrides:
- toString in class AbstractNode
- Returns:
- A string describing the tag. For text that looks like HTML use toHtml().
© 2005 Derrick Oswald. May 08, 2008
HTML Parser is an open source library released under LGPL.