@Deprecated public final class CJKTokenizer extends org.apache.lucene.analysis.Tokenizer
The tokens returned are overlapping pairs of adjacent characters (bigrams).
Example: "java C1C2C3C4" will be segmented to: "java" "C1C2" "C2C3" "C3C4".
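The overlap rule in the example can be illustrated with a small standalone sketch. This is not Lucene's implementation: the class name `BigramSketch`, the `isCjk` heuristic (a handful of Unicode blocks), and the choice to emit a lone CJK character as a single-character token are all assumptions made for illustration. Latin runs are kept as whole tokens, and none of the Latin-text normalization is modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of overlapping-bigram segmentation; not part of Lucene.
class BigramSketch {

    // Assumption: a char counts as "CJK" if it falls in one of these Unicode blocks.
    static boolean isCjk(char c) {
        Character.UnicodeBlock b = Character.UnicodeBlock.of(c);
        return b == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || b == Character.UnicodeBlock.HIRAGANA
            || b == Character.UnicodeBlock.KATAKANA
            || b == Character.UnicodeBlock.HANGUL_SYLLABLES;
    }

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (isCjk(c)) {
                // Consume the whole CJK run, then emit every adjacent pair.
                int start = i;
                while (i < text.length() && isCjk(text.charAt(i))) {
                    i++;
                }
                if (i - start == 1) {
                    // Assumption: a lone CJK character becomes its own token.
                    tokens.add(text.substring(start, i));
                } else {
                    for (int j = start; j + 1 < i; j++) {
                        tokens.add(text.substring(j, j + 2));
                    }
                }
            } else if (Character.isLetterOrDigit(c)) {
                // Non-CJK letters and digits form one token per run.
                int start = i;
                while (i < text.length()
                        && Character.isLetterOrDigit(text.charAt(i))
                        && !isCjk(text.charAt(i))) {
                    i++;
                }
                tokens.add(text.substring(start, i));
            } else {
                i++; // skip whitespace and punctuation
            }
        }
        return tokens;
    }
}
```

Calling `BigramSketch.tokenize` on "java" followed by four ideographs C1C2C3C4 returns "java", "C1C2", "C2C3", "C3C4", matching the example above. The sketch works on `char` values, so characters outside the Basic Multilingual Plane are not handled.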
Additionally, the following is applied to Latin text (such as English):
- Text is converted to lowercase.
- Numeric digits, '+', '#', and '_' are tokenized as letters.
- Full-width forms are converted to half-width forms.

| Constructor and Description |
|---|
| CJKTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in) Deprecated. |
| CJKTokenizer(org.apache.lucene.util.AttributeSource source, Reader in) Deprecated. |
| CJKTokenizer(Reader in) Deprecated. Construct a token stream processing the given input. |

| Modifier and Type | Method and Description |
|---|---|
| void | end() Deprecated. |
| boolean | incrementToken() Deprecated. Returns true for the next token in the stream, or false at EOS. |
| void | reset() Deprecated. |
| void | reset(Reader reader) Deprecated. |

Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString

public CJKTokenizer(Reader in)
Parameters: in - I/O reader

public CJKTokenizer(org.apache.lucene.util.AttributeSource source, Reader in)

public CJKTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in)

public boolean incrementToken() throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException - thrown when a read error occurs

public final void end()
Overrides: end in class org.apache.lucene.analysis.TokenStream

public void reset() throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public void reset(Reader reader) throws IOException
Overrides: reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException

Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.