org.htmlparser.lexerapplications.thumbelina

Class Thumbelina

Implemented Interfaces:
ChangeListener, ItemListener, ListSelectionListener, Runnable

public class Thumbelina
extends JPanel
implements Runnable, ItemListener, ChangeListener, ListSelectionListener

View images behind thumbnails.

Field Summary

protected static URL[][]
NONE
Value returned when no links are discovered.
static String
PROP_CURRENT_URL_PROPERTY
Property name for current URL binding.
static String
PROP_URL_QUEUE_PROPERTY
Property name for queue size binding.
static String
PROP_URL_VISITED_PROPERTY
Property name for visited URL size binding.
protected boolean
mActive
Activity state.
protected JCheckBox
mBackgroundToggle
Background thread checkbox in status bar.
protected String
mCurrentURL
The URL currently being examined.
protected boolean
mDiscardCGI
If true, does not follow links containing cgi calls.
protected boolean
mDiscardQueries
If true, does not follow links containing queries (?).
protected JList
mHistory
History list.
protected JScrollPane
mHistoryScroller
Scroller for the history list.
protected JSplitPane
mMainArea
Main panel in central area.
protected PicturePanel
mPicturePanel
The central area for pictures.
protected JScrollPane
mPicturePanelScroller
Scroller for the picture panel.
protected JPanel
mPowerBar
Status bar.
protected PropertyChangeSupport
mPropertySupport
Bound property support.
protected JProgressBar
mQueueProgress
Image request queue monitor in status bar.
protected JLabel
mQueueSize
URL queue size display in status bar.
protected JProgressBar
mReadyProgress
Image ready queue monitor in status bar.
protected HashMap
mRequested
Images requested.
protected JCheckBox
mRunToggle
Sequencer thread toggle in status bar.
protected Sequencer
mSequencer
The picture sequencer.
protected JSlider
mSpeedSlider
Sequencer speed slider in status bar.
protected Thread
mThread
Background thread.
protected HashMap
mTracked
Images being tracked currently.
protected JTextField
mUrlText
URL report in status bar.
protected HashMap
mVisited
URLs visited.
protected JLabel
mVisitedSize
URL visited count display in status bar.

Constructor Summary

Thumbelina()
Creates a new instance of Thumbelina.
Thumbelina(String url)
Creates a new instance of Thumbelina.
Thumbelina(URL url)
Creates a new instance of Thumbelina.

Method Summary

void
addHistory(String url)
Adds the given url to the history list.
void
addPropertyChangeListener(PropertyChangeListener listener)
Add a PropertyChangeListener to the listener list.
void
append(ArrayList list)
Append the given URLs to the queue.
void
append(URL url)
Append the given URL to the queue.
protected URL[][]
extractImageLinks(Lexer lexer, URL docbase)
Get the links of an element of a document.
protected void
fetch(URL[] images)
Fetch images.
protected ArrayList
filter(URL[] urls)
Filter URLs and add to queue.
boolean
getBackgroundThreadActive()
Gets the state of the background thread.
String
getCurrentURL()
Return the URL currently being examined.
boolean
getHistoryListVisible()
Gets the state of history list visibility.
protected URL[][]
getImageLinks(URL url)
Get the image links from the current URL.
PicturePanel
getPicturePanel()
Get the picture panel object encapsulated by this Thumbelina.
ArrayList
getQueue()
Getter for property queue.
int
getQueueSize()
Getter for the size of property queue.
boolean
getSequencerActive()
Gets the state of the sequencer thread.
int
getSpeed()
Get the sequencer delay time.
boolean
getStatusBarVisible()
Gets the state of status bar visibility.
protected static void
help()
Provide command line help.
boolean
isDiscardCGI()
Getter for property discardCGI.
boolean
isDiscardQueries()
Getter for property discardQueries.
protected boolean
isImage(String url)
Check if the url looks like an image.
void
itemStateChanged(ItemEvent event)
Handle checkbox events from the status bar.
static void
main(String[] args)
Mainline.
protected void
memCheck()
Check for low memory situation.
void
open(String ref)
Open a URL.
void
removePropertyChangeListener(PropertyChangeListener listener)
Remove a PropertyChangeListener from the listener list.
void
reset()
Reset this Thumbelina.
void
run()
The main processing loop.
void
setBackgroundThreadActive(boolean active)
Sets the state of the background thread activity.
protected void
setCurrentURL(String url)
Set the current URL being examined.
void
setDiscardCGI(boolean discard)
Setter for property discardCGI.
void
setDiscardQueries(boolean discard)
Setter for property discardQueries.
void
setHistoryListVisible(boolean visible)
Sets the history list visibility.
void
setSequencerActive(boolean active)
Sets the sequencer activity state.
void
setSpeed(int speed)
Set the sequencer delay time.
void
setStatusBarVisible(boolean visible)
Sets the status bar visibility.
void
stateChanged(ChangeEvent event)
Handles the speed slider events.
protected void
updateQueueSize(int original, int current)
Apply a change in 'to be examined' URL list size.
protected void
updateVisitedSize(int original, int current)
Apply a change in 'visited' URL list size.
void
valueChanged(ListSelectionEvent event)
Handles the history list events.

Field Details

NONE

protected static final URL[][] NONE
Value returned when no links are discovered.

PROP_CURRENT_URL_PROPERTY

public static final String PROP_CURRENT_URL_PROPERTY
Property name for current URL binding.

PROP_URL_QUEUE_PROPERTY

public static final String PROP_URL_QUEUE_PROPERTY
Property name for queue size binding.

PROP_URL_VISITED_PROPERTY

public static final String PROP_URL_VISITED_PROPERTY
Property name for visited URL size binding.

mActive

protected boolean mActive
Activity state. true means URLs are being processed, false means they are not.

mBackgroundToggle

protected JCheckBox mBackgroundToggle
Background thread checkbox in status bar.

mCurrentURL

protected String mCurrentURL
The URL currently being examined.

mDiscardCGI

protected boolean mDiscardCGI
If true, does not follow links containing cgi calls.

mDiscardQueries

protected boolean mDiscardQueries
If true, does not follow links containing queries (?).

mHistory

protected JList mHistory
History list.

mHistoryScroller

protected JScrollPane mHistoryScroller
Scroller for the history list.

mMainArea

protected JSplitPane mMainArea
Main panel in central area.

mPicturePanel

protected PicturePanel mPicturePanel
The central area for pictures.

mPicturePanelScroller

protected JScrollPane mPicturePanelScroller
Scroller for the picture panel.

mPowerBar

protected JPanel mPowerBar
Status bar.

mPropertySupport

protected PropertyChangeSupport mPropertySupport
Bound property support.

mQueueProgress

protected JProgressBar mQueueProgress
Image request queue monitor in status bar.

mQueueSize

protected JLabel mQueueSize
URL queue size display in status bar.

mReadyProgress

protected JProgressBar mReadyProgress
Image ready queue monitor in status bar.

mRequested

protected HashMap mRequested
Images requested.

mRunToggle

protected JCheckBox mRunToggle
Sequencer thread toggle in status bar.

mSequencer

protected Sequencer mSequencer
The picture sequencer.

mSpeedSlider

protected JSlider mSpeedSlider
Sequencer speed slider in status bar.

mThread

protected Thread mThread
Background thread.

mTracked

protected HashMap mTracked
Images being tracked currently.

mUrlText

protected JTextField mUrlText
URL report in status bar.

mVisited

protected HashMap mVisited
URLs visited.

mVisitedSize

protected JLabel mVisitedSize
URL visited count display in status bar.

Constructor Details

Thumbelina

public Thumbelina()
Creates a new instance of Thumbelina.

Thumbelina

public Thumbelina(String url)
            throws MalformedURLException
Creates a new instance of Thumbelina.
Parameters:
url - Single URL to enter into the 'to follow' list.

Thumbelina

public Thumbelina(URL url)
Creates a new instance of Thumbelina.
Parameters:
url - URL to enter into the 'to follow' list.

Method Details

addHistory

public void addHistory(String url)
Adds the given url to the history list. Also puts the URL in the url text of the status bar.
Parameters:
url - The URL to add to the history list.

addPropertyChangeListener

public void addPropertyChangeListener(PropertyChangeListener listener)
Add a PropertyChangeListener to the listener list. The listener is registered for all properties.
Parameters:
listener - The PropertyChangeListener to be added.
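A minimal sketch of what "registered for all properties" means in practice, using only java.beans.PropertyChangeSupport (the type of the mPropertySupport field). The class name ListenerSketch and the property names "queueSize" and "visitedSize" are hypothetical; the actual values of the PROP_* constants are not given on this page. One listener receives events for every property name and can tell them apart with getPropertyName().

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

final class ListenerSketch {
    // Hypothetical property names; the real PROP_* values are not shown in this document.
    static final String QUEUE = "queueSize";
    static final String VISITED = "visitedSize";

    /** Fires changes on two properties and returns what a single listener saw. */
    static List<String> demo() {
        PropertyChangeSupport support = new PropertyChangeSupport(new Object());
        List<String> seen = new ArrayList<>();
        PropertyChangeListener listener =
            event -> seen.add(event.getPropertyName() + "=" + event.getNewValue());

        // Registered without a property name, so it hears every property.
        support.addPropertyChangeListener(listener);
        support.firePropertyChange(QUEUE, 0, 3);
        support.firePropertyChange(VISITED, 0, 1);

        // After removal, further changes are no longer delivered.
        support.removePropertyChangeListener(listener);
        support.firePropertyChange(QUEUE, 3, 4);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a real Thumbelina instance, the same listener would presumably be passed to addPropertyChangeListener and compared against the PROP_* constants instead of the hypothetical names used here.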

append

public void append(ArrayList list)
Append the given URLs to the queue.
Parameters:
list - The list of URL objects to add.

append

public void append(URL url)
Append the given URL to the queue. Adds the url only if it isn't already in the queue, and notifies listeners about the addition.
Parameters:
url - The url to add.
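The "add only if absent, then notify" behaviour can be sketched as follows. This is an illustration under stated assumptions, not the class's actual code: QueueSketch and the property name "queueSize" are hypothetical, URLs are held as strings to keep the example free of network lookups (java.net.URL.equals may resolve host names), and no thread synchronization is shown.

```java
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

final class QueueSketch {
    private final List<String> queue = new ArrayList<>();
    final PropertyChangeSupport support = new PropertyChangeSupport(this);

    /** Appends url unless it is already queued; returns true if it was added. */
    boolean append(String url) {
        if (queue.contains(url)) {
            return false; // duplicate: queue unchanged, nobody notified
        }
        int before = queue.size();
        queue.add(url);
        // "queueSize" is a hypothetical name standing in for PROP_URL_QUEUE_PROPERTY.
        support.firePropertyChange("queueSize", before, queue.size());
        return true;
    }

    int size() {
        return queue.size();
    }
}
```

A second call with the same URL leaves the queue unchanged and fires no event, which keeps listeners such as a queue-size label from being refreshed needlessly.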

extractImageLinks

protected URL[][] extractImageLinks(Lexer lexer,
                                    URL docbase)
            throws IOException,
                   ParserException
Get the links of an element of a document. Only gets the links on IMG elements that reference another image. The latter is based on suffix (.jpg, .gif and .png).
Parameters:
lexer - The fully conditioned lexer, ready to rock.
docbase - The url to read.
Returns:
The URLs, targets of the IMG links.
Throws:
IOException - If there is a problem reading the url.
ParserException - If there is a problem parsing the url.

fetch

protected void fetch(URL[] images)
Fetch images. Ask the toolkit to make the image from a URL, and add a tracker to handle it when it's received. Add details to the requested and tracked lists and update the status bar.
Parameters:
images - The list of images to fetch.

filter

protected ArrayList filter(URL[] urls)
Filter URLs and add to queue. Removes already visited links and appends the rest (if any) to the visit pending list.
Parameters:
urls - The list of URLs to add to the 'to visit' list.
Returns:
The filtered list.

getBackgroundThreadActive

public boolean getBackgroundThreadActive()
Gets the state of the background thread.
Returns:
true if the thread is examining web pages.

getCurrentURL

public String getCurrentURL()
Return the URL currently being examined. This is a bound property. Notifications are available via the PROP_CURRENT_URL_PROPERTY property.
Returns:
The URL currently being examined.

getHistoryListVisible

public boolean getHistoryListVisible()
Gets the state of history list visibility.
Returns:
true if the history list is visible.

getImageLinks

protected URL[][] getImageLinks(URL url)
Get the image links from the current URL.
Parameters:
url - The URL to get the links from.
Returns:
An array of two URL arrays, index 0 is a list of images, index 1 is a list of links to possibly follow.
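As an illustration of the two-slot return shape described above, the sketch below builds a value of the same form and reads it back. The class LinkPairSketch, the constants IMAGES and LINKS, and the example.com addresses are all hypothetical; only the index meaning (0 for images, 1 for links to possibly follow) comes from the description.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

final class LinkPairSketch {
    // Named indices for the two slots; the names are hypothetical.
    static final int IMAGES = 0;
    static final int LINKS = 1;

    /** Builds a result shaped like the documented return value. */
    static URL[][] sample() {
        try {
            URL[] images = { URI.create("http://example.com/big/cat.jpg").toURL() };
            URL[] links = { URI.create("http://example.com/gallery2.html").toURL() };
            return new URL[][] { images, links };
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Summarizes a result without touching the network. */
    static String describe(URL[][] result) {
        return result[IMAGES].length + " image(s), " + result[LINKS].length + " link(s)";
    }

    public static void main(String[] args) {
        System.out.println(describe(sample()));
    }
}
```

A caller would typically hand the first slot to something like fetch and the second to something like filter, although this page does not spell out that wiring.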

getPicturePanel

public PicturePanel getPicturePanel()
Get the picture panel object encapsulated by this Thumbelina.
Returns:
The picture panel.

getQueue

public ArrayList getQueue()
Getter for property queue.
Returns:
List of URLs that are to be visited.

getQueueSize

public int getQueueSize()
Getter for the size of property queue. This is a bound property. Notifications are available via the PROP_URL_QUEUE_PROPERTY property.
Returns:
The size of the list of URLs that are to be visited.

getSequencerActive

public boolean getSequencerActive()
Gets the state of the sequencer thread.
Returns:
true if the thread is pumping images.

getSpeed

public int getSpeed()
Get the sequencer delay time.
Returns:
The number of milliseconds between image additions to the panel.

getStatusBarVisible

public boolean getStatusBarVisible()
Gets the state of status bar visibility.
Returns:
true if the status bar is visible.

help

protected static void help()
Provide command line help.

isDiscardCGI

public boolean isDiscardCGI()
Getter for property discardCGI.
Returns:
Value of property discardCGI.

isDiscardQueries

public boolean isDiscardQueries()
Getter for property discardQueries.
Returns:
Value of property discardQueries.

isImage

protected boolean isImage(String url)
Check if the url looks like an image.
Parameters:
url - The url to check for image characteristics.
Returns:
true if the url ends in a recognized image extension.
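A minimal sketch of a suffix test of this kind. The extension list (.jpg, .gif and .png) is taken from the extractImageLinks description; the class name ImageSuffixSketch and the case-insensitive comparison are assumptions, since this page does not say how the real method treats case or other extensions.

```java
import java.util.Locale;

final class ImageSuffixSketch {
    // Suffixes named in the extractImageLinks description.
    private static final String[] SUFFIXES = { ".jpg", ".gif", ".png" };

    /** Returns true if the url ends in one of the known image suffixes. */
    static boolean looksLikeImage(String url) {
        // Assumption: the comparison ignores case.
        String lower = url.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeImage("http://example.com/cat.JPG"));
        System.out.println(looksLikeImage("http://example.com/index.html"));
    }
}
```

A check like this looks only at the text of the link, so a URL such as image.png?size=large would not match; whether the real method handles that case is not stated here.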

itemStateChanged

public void itemStateChanged(ItemEvent event)
Handle checkbox events from the status bar. Based on the thread toggles, activates or deactivates the background thread processes.
Parameters:
event - The event describing the checkbox event.

main

public static void main(String[] args)
Mainline.
Parameters:
args - the command line arguments. Can be one or more forms of -help to get command line help, or a URL to prime the program with. Checks for JDK 1.4 and if not found runs in crippled mode (no ThumbelinaFrame).

memCheck

protected void memCheck()
Check for low memory situation. Report to the user a bad situation.

open

public void open(String ref)
Open a URL. Resets the urls list and appends the given url as the only item.
Parameters:
ref - The URL to add.

removePropertyChangeListener

public void removePropertyChangeListener(PropertyChangeListener listener)
Remove a PropertyChangeListener from the listener list. This removes a PropertyChangeListener that was registered for all properties.
Parameters:
listener - The PropertyChangeListener to be removed.

reset

public void reset()
Reset this Thumbelina. Clears the sequencer of pending images, resets the picture panel, empties the 'to be examined' list of URLs.

run

public void run()
The main processing loop. Pulls suspect URLs off the queue one at a time, fetches and parses each one, requests its images and enqueues further links.
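The shape of such a loop can be sketched without any networking. In this illustration an in-memory map stands in for "fetch and parse", and every name (CrawlLoopSketch, Page, crawl) is hypothetical; the real method also runs on a background thread and can be paused, which is not modelled here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CrawlLoopSketch {
    /** A parsed page reduced to what the loop needs. */
    static final class Page {
        final List<String> images;
        final List<String> links;

        Page(List<String> images, List<String> links) {
            this.images = images;
            this.links = links;
        }
    }

    /** Walks the queue until it is empty and returns the images requested, in order. */
    static List<String> crawl(String start, Map<String, Page> web) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        List<String> requested = new ArrayList<>();
        queue.addLast(start);
        while (!queue.isEmpty()) {
            String url = queue.removeFirst();   // pull one URL off the queue
            if (!visited.add(url)) {
                continue;                       // already examined
            }
            Page page = web.get(url);           // stands in for fetch + parse
            if (page == null) {
                continue;                       // unreachable page: skip it
            }
            requested.addAll(page.images);      // request its images
            for (String link : page.links) {    // enqueue further links
                if (!visited.contains(link) && !queue.contains(link)) {
                    queue.addLast(link);
                }
            }
        }
        return requested;
    }
}
```

Tracking visited URLs separately from the queue is what stops two pages that link to each other from being examined forever.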

setBackgroundThreadActive

public void setBackgroundThreadActive(boolean active)
Sets the state of the background thread activity. The background thread is responsible for examining URLs that are on the queue for thumbnails, and starting the image fetch operation.
Parameters:
active - If true, the background thread will be turned on.

setCurrentURL

protected void setCurrentURL(String url)
Set the current URL being examined.
Parameters:
url - The url that is being examined.

setDiscardCGI

public void setDiscardCGI(boolean discard)
Setter for property discardCGI.
Parameters:
discard - New value of property discardCGI.

setDiscardQueries

public void setDiscardQueries(boolean discard)
Setter for property discardQueries.
Parameters:
discard - New value of property discardQueries.
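How the two discard flags might gate a link can be sketched as below. This is not the class's actual rule: DiscardSketch and shouldFollow are hypothetical names, a query is detected by the '?' character mentioned in the field description, and a "cgi call" is approximated, purely as an assumption, by the presence of "/cgi-bin/" in the link.

```java
final class DiscardSketch {
    /** Returns true if the link should still be followed under the given flags. */
    static boolean shouldFollow(String link, boolean discardCgi, boolean discardQueries) {
        if (discardQueries && link.indexOf('?') >= 0) {
            return false; // link carries a query string
        }
        // Assumption: a "cgi call" is recognized by "/cgi-bin/" in the link.
        if (discardCgi && link.contains("/cgi-bin/")) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shouldFollow("http://example.com/list?page=2", false, true));
        System.out.println(shouldFollow("http://example.com/list?page=2", false, false));
    }
}
```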

setHistoryListVisible

public void setHistoryListVisible(boolean visible)
Sets the history list visibility.
Parameters:
visible - The new visibility state. If true, the history list will be unhidden.

setSequencerActive

public void setSequencerActive(boolean active)
Sets the sequencer activity state. The sequencer is the thread that moves images from the pending list to the picture panel on a timed basis.
Parameters:
active - The new activity state. If true, the sequencer will be turned on. This may alter the speed setting if it is set to zero.

setSpeed

public void setSpeed(int speed)
Set the sequencer delay time. The sequencer is the thread that moves images from the pending list to the picture panel on a timed basis. This value sets the number of milliseconds it waits between pictures. Setting it to zero toggles the running state off.
Parameters:
speed - The sequencer delay in milliseconds.
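The interplay between the delay and the running state, as described for setSpeed and setSequencerActive, can be sketched as a small state holder. The class SpeedSketch and the default of 500 milliseconds are hypothetical; the page says only that activating the sequencer "may alter the speed setting if it is set to zero", not which value is chosen.

```java
final class SpeedSketch {
    // Hypothetical fallback delay used when the sequencer is started at speed zero.
    static final int DEFAULT_DELAY_MS = 500;

    private int delayMs = DEFAULT_DELAY_MS;
    private boolean active;

    /** Sets the delay; a delay of zero also switches the sequencer off. */
    void setSpeed(int ms) {
        if (ms < 0) {
            throw new IllegalArgumentException("delay must not be negative");
        }
        delayMs = ms;
        if (ms == 0) {
            active = false;
        }
    }

    /** Switches the sequencer on or off, repairing a zero delay when switching on. */
    void setSequencerActive(boolean on) {
        if (on && delayMs == 0) {
            delayMs = DEFAULT_DELAY_MS; // assumption: fall back to a usable delay
        }
        active = on;
    }

    int getSpeed() {
        return delayMs;
    }

    boolean getSequencerActive() {
        return active;
    }
}
```

Keeping both rules in the setters means the two values cannot drift into the contradictory state "running with no delay", whichever control the user touches first.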

setStatusBarVisible

public void setStatusBarVisible(boolean visible)
Sets the status bar visibility.
Parameters:
visible - The new visibility state. If true, the status bar will be unhidden.

stateChanged

public void stateChanged(ChangeEvent event)
Handles the speed slider events.
Parameters:
event - The event describing the slider activity.

updateQueueSize

protected void updateQueueSize(int original,
                               int current)
Apply a change in 'to be examined' URL list size. Sends notification via the PROP_URL_QUEUE_PROPERTY property and updates the status bar.
Parameters:
original - The original size of the list.
current - The new size of the list.

updateVisitedSize

protected void updateVisitedSize(int original,
                                 int current)
Apply a change in 'visited' URL list size. Sends notification via the PROP_URL_VISITED_PROPERTY property and updates the status bar.
Parameters:
original - The original size of the list.
current - The new size of the list.

valueChanged

public void valueChanged(ListSelectionEvent event)
Handles the history list events.
Parameters:
event - The event describing the list activity.

HTML Parser is an open source library released under LGPL.