Package org.htmlparser.filters
The filters package contains example filters to select only desired nodes.
AndFilter | Accepts nodes matching all of its predicate filters (AND operation). |
CssSelectorNodeFilter | A NodeFilter that accepts nodes based on whether they match a CSS2 selector. |
HasAttributeFilter | This class accepts all tags that have a certain attribute and, optionally, a certain value for it. |
HasChildFilter | This class accepts all tags that have a child acceptable to the filter. |
HasParentFilter | This class accepts all tags that have a parent acceptable to another filter. |
HasSiblingFilter | This class accepts all tags that have a sibling acceptable to another filter. |
IsEqualFilter | This class accepts only one specific node. |
LinkRegexFilter | This class accepts tags of class LinkTag that contain a link matching a given regex pattern. |
LinkStringFilter | This class accepts tags of class LinkTag that contain a link matching a given pattern string. |
NodeClassFilter | This class accepts all tags of a given class. |
NotFilter | Accepts all nodes not acceptable to its predicate filter. |
OrFilter | Accepts nodes matching any of its predicate filters (OR operation). |
RegexFilter | This filter accepts all string nodes matching a regular expression. |
StringFilter | This class accepts all string nodes containing the given string. |
TagNameFilter | This class accepts all tags matching the tag name. |
XorFilter | Accepts nodes matching an odd number of its predicate filters (XOR operation). |
The filters package contains example filters to select only desired nodes.
For example, to display tags having the "id" attribute, you could use:
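Parser parser = new Parser ("http://yadda");
// parse() returns the matching nodes as a NodeList; print them as HTML
NodeList tagsWithId = parser.parse (new HasAttributeFilter ("id"));
System.out.println (tagsWithId.toHtml ());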
These filters can be combined to yield powerful extraction capabilities.
For example, to get a list of links whose content is an image, you could use:
NodeList list = new NodeList ();
NodeFilter filter =
    new AndFilter (
        new TagNameFilter ("A"),
        new HasChildFilter (
            new TagNameFilter ("IMG")));
for (NodeIterator e = parser.elements (); e.hasMoreNodes (); )
    e.nextNode ().collectInto (list, filter);
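The AND, OR, NOT and XOR rules that the combining filters apply can be illustrated without the library. The following is a minimal stand-alone sketch, assuming plain java.util.function.Predicate objects as stand-ins for NodeFilter and tag names (strings) as stand-ins for nodes. The class FilterLogicSketch and its methods are hypothetical names used only for illustration; they are not part of org.htmlparser, and the library's own classes may differ in details such as how an empty predicate list is treated.

```java
import java.util.function.Predicate;

/** Stand-alone illustration of the combinator rules; not part of org.htmlparser. */
final class FilterLogicSketch {

    /** AND: accept only if every predicate accepts the value. */
    @SafeVarargs
    static <T> Predicate<T> and(Predicate<T>... predicates) {
        return value -> {
            for (Predicate<T> p : predicates) {
                if (!p.test(value)) {
                    return false;
                }
            }
            return true;
        };
    }

    /** OR: accept if at least one predicate accepts the value. */
    @SafeVarargs
    static <T> Predicate<T> or(Predicate<T>... predicates) {
        return value -> {
            for (Predicate<T> p : predicates) {
                if (p.test(value)) {
                    return true;
                }
            }
            return false;
        };
    }

    /** NOT: accept exactly the values the predicate rejects. */
    static <T> Predicate<T> not(Predicate<T> predicate) {
        return value -> !predicate.test(value);
    }

    /** XOR: accept if an odd number of predicates accept the value. */
    @SafeVarargs
    static <T> Predicate<T> xor(Predicate<T>... predicates) {
        return value -> {
            int matches = 0;
            for (Predicate<T> p : predicates) {
                if (p.test(value)) {
                    matches++;
                }
            }
            return matches % 2 == 1;
        };
    }

    public static void main(String[] args) {
        // Tag names stand in for nodes; these two predicates mimic name-based filters.
        Predicate<String> isAnchor = name -> name.equalsIgnoreCase("A");
        Predicate<String> isImage = name -> name.equalsIgnoreCase("IMG");

        for (String name : new String[] {"A", "IMG", "P"}) {
            System.out.println(name
                + " or=" + or(isAnchor, isImage).test(name)
                + " xor=" + xor(isAnchor, isImage).test(name)
                + " not=" + not(isAnchor).test(name));
        }
    }
}
```

With two predicates, XOR and OR differ only when both predicates accept the same value; with three or more, XOR also accepts a value that all three match, because three is odd.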
© 2005 Derrick Oswald, May 08, 2008
HTML Parser is an open source library released under the LGPL.