org.htmlparser.lexer

Class StringSource

Implemented Interfaces:
Serializable

public class StringSource
extends Source

A source of characters based on a String.

Field Summary

protected String
mEncoding
The encoding to report.
protected int
mMark
The bookmark.
protected int
mOffset
The current offset into the string.
protected String
mString
The source of characters.

Fields inherited from class org.htmlparser.lexer.Source

EOF

Constructor Summary

StringSource(String string)
Construct a source using the provided string.
StringSource(String string, String character_set)
Construct a source using the provided string and encoding.

Method Summary

int
available()
Get the number of available characters.
void
close()
Does nothing.
void
destroy()
Close the source.
char
getCharacter(int offset)
Retrieve a character again.
void
getCharacters(StringBuffer buffer, int offset, int length)
Append characters already read into a StringBuffer.
void
getCharacters(char[] array, int offset, int start, int end)
Retrieve characters again.
String
getEncoding()
Get the encoding being used to convert characters.
String
getString(int offset, int length)
Retrieve a string comprised of characters already read.
void
mark(int readAheadLimit)
Mark the present position in the source.
boolean
markSupported()
Tell whether this source supports the mark() operation.
int
offset()
Get the position (in characters).
int
read()
Read a single character.
int
read(char[] cbuf)
Read characters into an array.
int
read(char[] cbuf, int off, int len)
Read characters into a portion of an array.
boolean
ready()
Tell whether this source is ready to be read.
void
reset()
Reset the source.
void
setEncoding(String character_set)
Set the encoding to the given character set.
long
skip(long n)
Skip characters.
void
unread()
Undo the read of a single character.

Field Details

mEncoding

protected String mEncoding
The encoding to report.

mMark

protected int mMark
The bookmark.

mOffset

protected int mOffset
The current offset into the string.

mString

protected String mString
The source of characters.

Constructor Details

StringSource

public StringSource(String string)
Construct a source using the provided string. Until it is set, the encoding will be reported as ISO-8859-1.
Parameters:
string - The source of characters.

StringSource

public StringSource(String string,
                    String character_set)
Construct a source using the provided string and encoding.
Parameters:
string - The source of characters.
character_set - The encoding to report.

Method Details

available

public int available()
Get the number of available characters.
Overrides:
available in class Source
Returns:
The number of characters that can be read or zero if the source is closed.

close

public void close()
            throws IOException
Does nothing. Although this method is nominally meant to close the source, call destroy() to actually close it.
Overrides:
close in class Source
See Also:
destroy()

destroy

public void destroy()
            throws IOException
Close the source. Once a source has been closed, further read, ready, mark, reset, skip, unread, getCharacter or getString invocations will throw an IOException. Closing a previously-closed source, however, has no effect.
Overrides:
destroy in class Source
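
The difference between close() and destroy() can be illustrated with a minimal stand-in. The class LifecycleSketch below is hypothetical and is not the library's implementation; it only models the contract documented above, and it assumes that EOF has the value -1.

```java
import java.io.IOException;

// Hypothetical stand-in modelling the documented close()/destroy() contract.
// It is NOT org.htmlparser.lexer.StringSource.
class LifecycleSketch {
    static final int EOF = -1;   // assumption: the value of Source.EOF
    private String text;         // null once the source has been destroyed
    private int offset;

    LifecycleSketch(String text) { this.text = text; }

    // Documented as doing nothing.
    void close() { }

    // Releases the string; calling it a second time has no effect.
    void destroy() { text = null; }

    // Fails with an IOException once the source has been destroyed.
    int read() throws IOException {
        if (text == null) throw new IOException("source is closed");
        return offset < text.length() ? text.charAt(offset++) : EOF;
    }

    // Zero when closed, as documented for available().
    int available() { return text == null ? 0 : text.length() - offset; }

    // Returns true if read() fails with an IOException.
    static boolean readFails(LifecycleSketch s) {
        try { s.read(); return false; } catch (IOException e) { return true; }
    }
}
```

In this sketch a read after close() still succeeds, while a read after destroy() fails, which is the distinction the descriptions of close() and destroy() draw.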

getCharacter

public char getCharacter(int offset)
            throws IOException
Retrieve a character again.
Overrides:
getCharacter in class Source
Parameters:
offset - The offset of the character.
Returns:
The character at offset.

getCharacters

public void getCharacters(StringBuffer buffer,
                          int offset,
                          int length)
            throws IOException
Append characters already read into a StringBuffer.
Overrides:
getCharacters in class Source
Parameters:
buffer - The buffer to append to.
offset - The offset of the first character.
length - The number of characters to retrieve.

getCharacters

public void getCharacters(char[] array,
                          int offset,
                          int start,
                          int end)
            throws IOException
Retrieve characters again.
Overrides:
getCharacters in class Source
Parameters:
array - The array of characters.
offset - The starting position in the array where characters are to be placed.
start - The starting position, zero based.
end - The ending position (exclusive, i.e. the character at the ending position is not included), zero based.
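
The parameter order is easy to mix up: offset indexes the destination array, while start and end index the source characters. The JDK method String.getChars takes the same four values in a different order, so the sketch below uses it to illustrate the semantics described above. The helper name copyRange is hypothetical, and the mapping onto String.getChars is an assumption made for illustration, not a statement about how the library implements this method.

```java
// Hypothetical helper mirroring the documented parameter meanings.
class GetCharactersSketch {
    // array  - destination array
    // offset - index in the destination where the first character is placed
    // start  - first source index (inclusive, zero based)
    // end    - last source index (exclusive, zero based)
    static void copyRange(String source, char[] array, int offset, int start, int end) {
        // String.getChars(srcBegin, srcEnd, dst, dstBegin)
        source.getChars(start, end, array, offset);
    }
}
```

For example, calling copyRange("abcdef", dst, 1, 2, 5) on a five-character array of dashes copies "cde" into positions 1 to 3 and leaves "-cde-".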

getEncoding

public String getEncoding()
Get the encoding being used to convert characters.
Overrides:
getEncoding in class Source
Returns:
The current encoding.

getString

public String getString(int offset,
                        int length)
            throws IOException
Retrieve a string comprised of characters already read.
Overrides:
getString in class Source
Parameters:
offset - The offset of the first character.
length - The number of characters to retrieve.
Returns:
A string containing length characters starting at offset.

mark

public void mark(int readAheadLimit)
            throws IOException
Mark the present position in the source. Subsequent calls to reset() will attempt to reposition the source to this point.
Overrides:
mark in class Source
Parameters:
readAheadLimit - Not used.
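
The mark and reset pattern can be tried out with the JDK class java.io.StringReader, which offers the same two calls and also ignores the read-ahead limit. The sketch below uses StringReader only as an analogy; that StringSource behaves identically is an assumption, and the helper name peekAfterReset is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkResetSketch {
    // Reads `before` characters, marks the position, reads `ahead` more,
    // then resets and returns the next character read after the reset.
    static int peekAfterReset(String text, int before, int ahead) {
        try {
            StringReader reader = new StringReader(text);
            for (int i = 0; i < before; i++) reader.read();
            reader.mark(0);   // the limit is ignored, but must not be negative
            for (int i = 0; i < ahead; i++) reader.read();
            reader.reset();   // back to the marked position
            return reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the text "abcdef", marking after two characters and reading three more before the reset makes the next read return 'c' again.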

markSupported

public boolean markSupported()
Tell whether this source supports the mark() operation.
Overrides:
markSupported in class Source
Returns:
true.

offset

public int offset()
Get the position (in characters).
Overrides:
offset in class Source
Returns:
The number of characters that have already been read, or EOF if the source is closed.

read

public int read()
            throws IOException
Read a single character.
Overrides:
read in class Source
Returns:
The character read, as an integer in the range 0 to 65535 (0x00-0xffff), or EOF if the source is exhausted.
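
Because the return type is int rather than char, an end-of-input sentinel can be told apart from every valid character, including 0xffff. The sketch below shows the usual read loop against java.io.Reader, whose read() has the same documented range. It assumes that EOF is -1, as in java.io.Reader, and the helper name drain is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

class ReadLoopSketch {
    static final int EOF = -1; // assumption: same sentinel as java.io.Reader

    // Drains a reader one character at a time, stopping at EOF.
    static String drain(Reader reader) {
        StringBuilder out = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != EOF) {
                out.append((char) c); // safe: c is in 0..65535 here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Comparing the int against EOF before casting to char is what keeps a 0xffff character in the input from being mistaken for the end of the source.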

read

public int read(char[] cbuf)
            throws IOException
Read characters into an array.
Overrides:
read in class Source
Parameters:
cbuf - Destination buffer.
Returns:
The number of characters read, or EOF if the source is exhausted.

read

public int read(char[] cbuf,
                int off,
                int len)
            throws IOException
Read characters into a portion of an array.
Overrides:
read in class Source
Parameters:
cbuf - Destination buffer.
off - Offset at which to start storing characters.
len - Maximum number of characters to read.
Returns:
The number of characters read, or EOF if the source is exhausted.

ready

public boolean ready()
            throws IOException
Tell whether this source is ready to be read.
Overrides:
ready in class Source
Returns:
true if available() is non-zero, i.e. there are still characters to read.

reset

public void reset()
            throws IllegalStateException
Reset the source. Repositions the read point to begin at zero.
Overrides:
reset in class Source

setEncoding

public void setEncoding(String character_set)
            throws ParserException
Set the encoding to the given character set. This simply sets the encoding reported by getEncoding().
Overrides:
setEncoding in class Source
Parameters:
character_set - The character set to use to convert characters.
Throws:
ParserException - Not thrown.

skip

public long skip(long n)
            throws IOException,
                   IllegalArgumentException
Skip characters. Note: n is treated as an int.
Overrides:
skip in class Source
Parameters:
n - The number of characters to skip.
Returns:
The number of characters actually skipped.

unread

public void unread()
            throws IOException
Undo the read of a single character.
Overrides:
unread in class Source
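
A common use of unread() is to look at the next character without consuming it. The class UnreadSketch below is a hypothetical stand-in, not the library class. It assumes that EOF is -1 and that unreading at position zero is an error; the documentation above specifies neither.

```java
// Hypothetical stand-in modelling read()/unread() over a string.
// It is NOT org.htmlparser.lexer.StringSource.
class UnreadSketch {
    static final int EOF = -1; // assumption: the value of Source.EOF
    private final String text;
    private int offset;

    UnreadSketch(String text) { this.text = text; }

    // Returns the next character, or EOF when the string is exhausted.
    int read() { return offset < text.length() ? text.charAt(offset++) : EOF; }

    // Steps back one character (assumption: an error at position zero).
    void unread() {
        if (offset == 0) throw new IllegalStateException("nothing to unread");
        offset--;
    }

    // Looks at the next character without consuming it.
    int peek() {
        int c = read();
        if (c != EOF) unread(); // only step back if something was consumed
        return c;
    }
}
```

In this sketch peek() followed by read() returns the same character twice, and peek() at the end of the input returns EOF without moving the position.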

HTML Parser is an open source library released under LGPL.