Methods which throw type org.htmlparser.util.ParserException | |
void | Perform the meaning of this tag. |
Methods which throw type org.htmlparser.util.ParserException | |
void | Perform the meaning of this tag. |
void | Perform the META tag semantic action. |
Methods with parameter type org.htmlparser.util.ParserException | |
void | Error message. |
Methods which throw type org.htmlparser.util.ParserException | |
void | Process nodes recursively on the DocumentHandler. |
Constructors which throw type org.htmlparser.util.ParserException | |
Creates a Parser object with the location of the resource (URL or file). | |
Creates a Parser object with the location of the resource (URL or file). You would typically create a DefaultHTMLParserFeedback object and pass it in. | |
Construct a parser using the provided URLConnection. | |
Constructor for custom HTTP access. |
Methods which throw type org.htmlparser.util.ParserException | |
Remark | Create a new remark node. |
Text | Create a new text node. |
Tag | Create a new tag node. |
void | Perform the meaning of this tag. |
NodeIterator | Returns an iterator (enumeration) over the HTML nodes. |
NodeList | Extract all nodes matching the given filter. |
NodeList | Parse the given resource, using the filter provided. |
void | Parser.postConnect(HttpURLConnection connection) Called just after calling connect. |
void | Parser.preConnect(HttpURLConnection connection) Called just prior to calling connect. |
void | Parser.setConnection(URLConnection connection) Set the connection for this parser. |
void | Parser.setEncoding(String encoding) Set the encoding for the page this parser is reading from. |
void | Parser.setInputHTML(String inputHTML) Initializes the parser with the given input HTML String. |
void | Parser.setResource(String resource) Set the HTML, a URL, or a file. |
void | Set the URL for this parser. |
void | Apply the given visitor to the current page. |
Methods which throw type org.htmlparser.util.ParserException | |
String | StringExtractor.extractStrings(boolean links) Extract the text from a page. |
boolean | SiteCapturer.isHtml(String link) Returns true if the link contains text/html content. |
void | Process a single page. |
Methods which throw type org.htmlparser.util.ParserException | |
String | Decode script encoded by the Microsoft obfuscator. |
Tag | Creates an end tag with the same name as the given tag. |
void | Finish off a tag. |
Tag | Collect the children. |
Tag | Scan the tag. |
Tag | Scan for script. |
Tag | Scan for style definitions. |
Tag | Scan the tag. |
Methods which throw type org.htmlparser.util.ParserException | |
NodeList | Apply each of the filters. |
URL[] | Internal routine to extract all the links from the parser. |
String | Extract the text from a page. |
Classes derived from org.htmlparser.util.ParserException | |
class | The encoding is changed invalidating already scanned characters. |
Methods with parameter type org.htmlparser.util.ParserException | |
void | Print an error message. |
void | |
void |
Methods which throw type org.htmlparser.util.ParserException | |
Parser | ParserUtils.createParserParsingAnInputString(String input) Create a Parser object that takes a String as input (instead of a URL or a string representing the URL location). |
boolean | Check if more nodes are available. |
boolean | Check if more nodes are available. |
Node | Get the next node. |
Node | Get the next node. |
String[] | ParserUtils.splitTags(String input, Class nodeType) Split the input string into a string array, using the tags as delimiters. |
String[] | ParserUtils.splitTags(String input, Class nodeType, boolean recursive, boolean insideTag) Split the input string into a string array, using the tags as delimiters. |
String[] | ParserUtils.splitTags(String input, String[] tags) Split the input string into a string array, using the tags as delimiters. |
String[] | ParserUtils.splitTags(String input, String[] tags, boolean recursive, boolean insideTag) Split the input string into a string array, using the tags as delimiters. |
String[] | Split the input string into a string array, using the tags as delimiters. |
String[] | Split the input string into a string array, using the tags as delimiters. |
String | ParserUtils.trimTags(String input, Class nodeType) Remove all matching tags from the input string and return the input without the tags and their content. |
String | ParserUtils.trimTags(String input, Class nodeType, boolean recursive, boolean insideTag) Remove all matching tags from the input string and return the input without the tags and, optionally, their content. |
String | ParserUtils.trimTags(String input, String[] tags) Remove all matching tags from the input string and return the input without the tags and their content. |
String | ParserUtils.trimTags(String input, String[] tags, boolean recursive, boolean insideTag) Remove all matching tags from the input string and return the input without the tags and, optionally, their content. |
String | Remove all matching tags from the input string and return the input without the tags and their content. |
String | Remove all matching tags from the input string and return the input without the tags and, optionally, their content. |
void | Utility to apply a visitor to a node list. |
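
The `trimTags` and `splitTags` summaries in the table above describe removing named tags together with their content, or using them as delimiters. The sketch below illustrates that behaviour with plain regular expressions so that it runs without the library on the classpath. It is a stand-in only: `TagTrimSketch` and its methods are hypothetical names, not the `ParserUtils` implementation, and it does not model the `recursive` or `insideTag` parameters or nested elements of the same name.

```java
import java.util.regex.Pattern;

/** Illustrative stand-in only; this is NOT the ParserUtils implementation. */
final class TagTrimSketch {

    /** Returns the input with every named element, including its content, removed. */
    static String trimTags(String input, String[] tags) {
        if (tags.length == 0) {
            return input;
        }
        return compile(tags).matcher(input).replaceAll("");
    }

    /** Splits the input, treating every named element (with its content) as a delimiter. */
    static String[] splitTags(String input, String[] tags) {
        if (tags.length == 0) {
            return new String[] {input};
        }
        return compile(tags).split(input);
    }

    /** Builds one alternation that matches <tag ...> ... </tag> for each tag name. */
    private static Pattern compile(String[] tags) {
        StringBuilder regex = new StringBuilder();
        for (String tag : tags) {
            String name = Pattern.quote(tag);
            if (regex.length() > 0) {
                regex.append('|');
            }
            // Reluctant match up to the first closing tag; nesting is not handled.
            regex.append("<").append(name).append("\\b[^>]*>.*?</")
                 .append(name).append("\\s*>");
        }
        return Pattern.compile(regex.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    }

    public static void main(String[] args) {
        System.out.println(trimTags("Hello <b>bold</b> world", new String[] {"b"}));
        for (String part : splitTags("a<b>x</b>c<i>y</i>d", new String[] {"b", "i"})) {
            System.out.println(part);
        }
    }
}
```

The library methods in the table are listed as throwing `ParserException`, presumably because they parse the input into nodes; this regex stand-in has no such failure mode, which is one reason it should not be treated as a drop-in replacement.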
Methods which throw type org.htmlparser.util.ParserException | |
URL[][] | Get the links of an element of a document. |
Methods which throw type org.htmlparser.util.ParserException | |
URLConnection | ConnectionManager.openConnection(String string) Opens a connection based on a given string. |
URLConnection | ConnectionManager.openConnection(URL url) Opens a connection using the given url. |
void | ConnectionMonitor.postConnect(HttpURLConnection connection) Called just after calling connect. |
void | ConnectionMonitor.preConnect(HttpURLConnection connection) Called just prior to calling connect. |
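
The `preConnect` and `postConnect` hooks listed above are described as being called just before and just after `connect`. The following sketch shows that calling order with hypothetical stand-in types, so it compiles without the library: `SketchMonitor`, `SketchParserException`, `OfflineConnection`, and `MonitorSketch` are all assumptions made for illustration, and the real `ConnectionManager` may behave differently in detail.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for org.htmlparser.util.ParserException. */
class SketchParserException extends Exception {
    SketchParserException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical interface shaped like the two ConnectionMonitor hooks. */
interface SketchMonitor {
    void preConnect(HttpURLConnection connection) throws SketchParserException;
    void postConnect(HttpURLConnection connection) throws SketchParserException;
}

/** Offline connection so the sketch runs without network access. */
final class OfflineConnection extends HttpURLConnection {
    OfflineConnection(URL url) { super(url); }
    @Override public void connect() { connected = true; }
    @Override public void disconnect() { connected = false; }
    @Override public boolean usingProxy() { return false; }
}

final class MonitorSketch {

    /** Calls the hooks immediately around connect(), wrapping I/O failures. */
    static void open(HttpURLConnection connection, SketchMonitor monitor)
            throws SketchParserException {
        monitor.preConnect(connection);
        try {
            connection.connect();
        } catch (IOException e) {
            throw new SketchParserException("connect failed", e);
        }
        monitor.postConnect(connection);
    }

    /** Opens one offline connection and returns the order of the hook calls. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        try {
            HttpURLConnection connection =
                    new OfflineConnection(URI.create("http://example.invalid/").toURL());
            open(connection, new SketchMonitor() {
                @Override public void preConnect(HttpURLConnection c) {
                    // Request headers can still be changed before connect().
                    c.setRequestProperty("User-Agent", "sketch");
                    events.add("pre");
                }
                @Override public void postConnect(HttpURLConnection c) {
                    events.add("post");
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because both hooks are listed as throwing `ParserException`, a monitor can presumably abort a request by throwing from `preConnect`; in the sketch, an exception thrown there propagates out of `open` before `connect()` is reached.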
Constructors which throw type org.htmlparser.util.ParserException | |
Creates a new instance of a Lexer. | |
Construct a page reading from a URL connection. |
Methods which throw type org.htmlparser.util.ParserException | |
char | Read the character at the given cursor position. |
void | Mainline for command line operation. |
Node | Lexer.makeRemark(int start, int end) Create a remark node based on the current cursor and the one provided. |
Node | Lexer.makeString(int start, int end) Create a string node based on the current cursor and the one provided. |
Node | Create a tag node based on the current cursor and the one provided. |
Node | Get the next node from the source. |
Node | Get the next node from the source. |
Node | Return CDATA as a text node. |
Node | Lexer.parseCDATA(boolean quotesmart) Return CDATA as a text node. |
Node | Parse a java server page node. |
Node | Parse an XML processing instruction. |
Node | Lexer.parseRemark(int start, boolean quotesmart) Parse a comment. |
Node | Lexer.parseString(int start, boolean quotesmart) Parse a string node. |
Node | Parse a tag. |
void | Advance the cursor through a JIS escape sequence. |
void | Page.setConnection(URLConnection connection) Set the URLConnection to be used by this page. |
void | InputStreamSource.setEncoding(String character_set) Begins reading from the source with the given character set. |
void | Page.setEncoding(String character_set) Begins reading from the source with the given character set. |
void | Source.setEncoding(String character_set) Set the encoding to the given character set. |
void | StringSource.setEncoding(String character_set) Set the encoding to the given character set. |
void | Return a character. |
© 2005 Derrick Oswald May 08, 2008 |
HTML Parser is an open source library released under LGPL. | |