org.htmlparser.filters

Class LinkRegexFilter

Implemented Interfaces:
Cloneable, NodeFilter, Serializable

public class LinkRegexFilter
extends Object
implements NodeFilter

This class accepts tags of class LinkTag that contain a link matching a given regex pattern. Use this filter to extract LinkTag nodes with URLs that match the desired regex pattern.

Field Summary

protected Pattern
mRegex
The regular expression to use on the link.

Constructor Summary

LinkRegexFilter(String regexPattern)
Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern.
LinkRegexFilter(String regexPattern, boolean caseSensitive)
Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern.

Method Summary

boolean
accept(Node node)
Accepts nodes that are LinkTag instances and whose URL matches the regex pattern supplied in the constructor.

Field Details

mRegex

protected Pattern mRegex
The regular expression to use on the link.

Constructor Details

LinkRegexFilter

public LinkRegexFilter(String regexPattern)
Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern. The match is case insensitive.
Parameters:
regexPattern - The pattern to match.

LinkRegexFilter

public LinkRegexFilter(String regexPattern,
                       boolean caseSensitive)
Creates a LinkRegexFilter that accepts LinkTag nodes containing a URL that matches the supplied regex pattern.
Parameters:
regexPattern - The regex pattern to match.
caseSensitive - true to make the match case sensitive, false to make it case insensitive.
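
The effect of the caseSensitive flag can be illustrated with java.util.regex alone. The following is a minimal sketch, not the library's implementation: the class CaseSensitivitySketch and its compile helper are hypothetical, and mapping the flag to Pattern.CASE_INSENSITIVE with partial-match (find) semantics is an assumption that this page does not confirm.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in; not part of org.htmlparser.
class CaseSensitivitySketch {

    // Assumption: a case-insensitive filter compiles its pattern with
    // Pattern.CASE_INSENSITIVE; a case-sensitive one uses no flags.
    static Pattern compile(String regexPattern, boolean caseSensitive) {
        return caseSensitive
            ? Pattern.compile(regexPattern)
            : Pattern.compile(regexPattern, Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        String url = "http://EXAMPLE.com/Docs/index.html";
        String regex = "example\\.com/docs";

        // Case sensitive: "example" does not match "EXAMPLE".
        System.out.println(compile(regex, true).matcher(url).find());  // false

        // Case insensitive: letter case is ignored.
        System.out.println(compile(regex, false).matcher(url).find()); // true
    }
}
```

Passing an explicit flag is mainly useful when URL paths on the target site differ only by letter case.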

Method Details

accept

public boolean accept(Node node)
Accepts nodes that are LinkTag instances and whose URL matches the regex pattern supplied in the constructor.
Specified by:
accept in interface NodeFilter
Parameters:
node - The node to check.
Returns:
true if the node is a LinkTag whose URL matches the pattern, false otherwise.
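
The two conditions described for accept (the node must be a link, and its URL must match) can be sketched without the library. Everything below is a hypothetical stand-in: SketchNode, SketchLinkTag, SketchTextNode and AcceptSketch are illustrative names, not library types, and the sketch assumes a case-insensitive partial match via Matcher.find(), since this page does not say whether the whole URL must match.

```java
import java.util.regex.Pattern;

// Hypothetical stand-ins for the library's Node and LinkTag types.
interface SketchNode {}

class SketchLinkTag implements SketchNode {
    private final String link;
    SketchLinkTag(String link) { this.link = link; }
    String getLink() { return link; }
}

// A non-link node, used to show the type check.
class SketchTextNode implements SketchNode {}

class AcceptSketch {
    private final Pattern regex;

    // Assumption: case-insensitive by default, as documented for the
    // one-argument constructor.
    AcceptSketch(String regexPattern) {
        regex = Pattern.compile(regexPattern, Pattern.CASE_INSENSITIVE);
    }

    boolean accept(SketchNode node) {
        // Condition 1: only link nodes are candidates.
        if (!(node instanceof SketchLinkTag)) {
            return false;
        }
        // Condition 2: the link's URL must contain a match for the pattern.
        String link = ((SketchLinkTag) node).getLink();
        return regex.matcher(link).find();
    }

    public static void main(String[] args) {
        AcceptSketch filter = new AcceptSketch("\\.pdf$");
        System.out.println(filter.accept(new SketchLinkTag("http://example.com/report.PDF"))); // true
        System.out.println(filter.accept(new SketchLinkTag("http://example.com/index.html"))); // false
        System.out.println(filter.accept(new SketchTextNode()));                               // false
    }
}
```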

HTML Parser is an open source library released under LGPL.