org.htmlparser.parserapplications
Class StringExtractor
public class StringExtractor
Extract plaintext strings from a web page.
Illustrative program to gather the textual contents of a web page.
Uses a StringBean to accumulate the user-visible text (what a browser would display) into a single string.
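The delegation described above might look roughly like the following sketch. It assumes the `org.htmlparser.beans.StringBean` class with its `setLinks`, `setURL` and `getStrings` methods; the class name `StringBeanSketch` and the helper `visibleText` are hypothetical names used only for illustration, and this is not necessarily how StringExtractor itself is written.

```java
import org.htmlparser.beans.StringBean;

public class StringBeanSketch {
    /**
     * Hypothetical helper: collects the user-visible text of a resource
     * (a URL or a file name) into one string by way of a StringBean.
     */
    static String visibleText(String resource, boolean links) {
        StringBean bean = new StringBean();
        bean.setLinks(links);    // whether hyperlink targets appear in the output
        bean.setURL(resource);   // point the bean at the page to read
        return bean.getStrings();
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: StringBeanSketch <url-or-file>");
            return;
        }
        System.out.println(visibleText(args[0], false));
    }
}
```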
StringExtractor(String resource) - Construct a StringExtractor to read from the given resource.
String | extractStrings(boolean links) - Extract the text from a page.
static void | main(String[] args) - Mainline.
StringExtractor
public StringExtractor(String resource)
Construct a StringExtractor to read from the given resource.
resource - Either a URL or a file name.
extractStrings
public String extractStrings(boolean links)
throws ParserException
Extract the text from a page.
links - if true, include hyperlinks in output.
Returns: The textual contents of the page.
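As a usage illustration, the sketch below calls the constructor and `extractStrings` as documented here. It writes a small page to a temporary file so that no network access is needed. The class name `StringExtractorExample`, the helper `textOf` and the sample HTML are assumptions made for this example only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.htmlparser.parserapplications.StringExtractor;
import org.htmlparser.util.ParserException;

public class StringExtractorExample {
    /** Hypothetical helper: extracts the text of a resource (URL or file name). */
    static String textOf(String resource, boolean links) throws ParserException {
        StringExtractor extractor = new StringExtractor(resource);
        return extractor.extractStrings(links);
    }

    public static void main(String[] args) throws IOException, ParserException {
        // Sample page, made up for this example.
        String html = "<html><body><p>Hello, world</p>"
                + "<a href=\"http://example.com/\">a link</a></body></html>";
        Path page = Files.createTempFile("sample", ".html");
        try {
            Files.write(page, html.getBytes(StandardCharsets.UTF_8));
            // Text only.
            System.out.println(textOf(page.toString(), false));
            // Text with hyperlinks included.
            System.out.println(textOf(page.toString(), true));
        } finally {
            Files.delete(page);
        }
    }
}
```

Passing `true` for `links` is the only difference between the two calls, which makes it easy to compare the output with and without hyperlinks.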
main
public static void main(String[] args)
Mainline.
args - The command line arguments.
© 2005 Derrick Oswald, May 08, 2008
HTML Parser is an open source library released under LGPL.