Uses of Class
org.apache.lucene.analysis.Tokenizer
Packages that use Tokenizer:

  org.apache.lucene.analysis
    API and code to convert text into indexable/searchable tokens.
  org.apache.lucene.analysis.standard
    Standards-based analyzers implemented with JFlex.
Uses of Tokenizer in org.apache.lucene.analysis
Subclasses of Tokenizer in org.apache.lucene.analysis:

  class CharTokenizer
    An abstract base class for simple, character-oriented tokenizers.
  class KeywordTokenizer
    Emits the entire input as a single token.
  class LetterTokenizer
    A LetterTokenizer is a tokenizer that divides text at non-letters.
  class LowerCaseTokenizer
    LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
  class WhitespaceTokenizer
    A WhitespaceTokenizer is a tokenizer that divides text at whitespace.

Fields in org.apache.lucene.analysis declared as Tokenizer:

  protected Tokenizer ReusableAnalyzerBase.TokenStreamComponents.source
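The character-oriented subclasses above (LetterTokenizer, WhitespaceTokenizer, and friends) all share one idea: emit a token for each maximal run of "accepted" characters. The following is a minimal plain-Java sketch of that technique for illustration only; it is not Lucene's CharTokenizer implementation, and the class and method names are invented here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a character-oriented tokenizer in the spirit of
// CharTokenizer: scan the input and emit a token for each maximal run of
// characters accepted by isTokenChar(). Not Lucene code.
public class CharTokenizerSketch {

    // LetterTokenizer-style predicate: accept letters, split at everything else.
    static boolean isTokenChar(char c) {
        return Character.isLetter(c);
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);               // extend the current token
            } else if (current.length() > 0) {
                tokens.add(current.toString());  // run ended: emit the token
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());      // flush a trailing token
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Lucene 4.0, tokenizers!"));
        // Prints: [Lucene, tokenizers]
    }
}
```

Swapping the predicate for `!Character.isWhitespace(c)` would give WhitespaceTokenizer-like behavior, and lower-casing each emitted token would mimic LowerCaseTokenizer.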
Constructors in org.apache.lucene.analysis with parameters of type Tokenizer:

  TokenStreamComponents(Tokenizer source)
    Creates a new ReusableAnalyzerBase.TokenStreamComponents instance.
  TokenStreamComponents(Tokenizer source, TokenStream result)
    Creates a new ReusableAnalyzerBase.TokenStreamComponents instance.
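The two constructors capture a common pattern: the Tokenizer is the raw source of tokens, and the result is that source wrapped in zero or more filters; the one-argument constructor is the case where no filters are applied and source and result coincide. The stdlib-only sketch below models that pairing; the `Components` class is an invented analog, not Lucene's TokenStreamComponents.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative sketch (not Lucene code): pair a tokenizer ("source") with
// the fully wrapped filter chain ("result"), mirroring the two
// TokenStreamComponents constructors.
public class ComponentsSketch {

    static class Components {
        final Function<String, List<String>> source;  // the tokenizer
        final Function<String, List<String>> result;  // tokenizer + filters

        // Analog of TokenStreamComponents(Tokenizer source):
        // no filters, so the result IS the source.
        Components(Function<String, List<String>> source) {
            this(source, source);
        }

        // Analog of TokenStreamComponents(Tokenizer source, TokenStream result).
        Components(Function<String, List<String>> source,
                   Function<String, List<String>> result) {
            this.source = source;
            this.result = result;
        }
    }

    // A whitespace "tokenizer" serving as the source.
    static final Function<String, List<String>> WHITESPACE =
        s -> List.of(s.trim().split("\\s+"));

    public static void main(String[] args) {
        // Wrap the source in a lower-casing "filter" to form the result.
        Function<String, List<String>> lowercased =
            WHITESPACE.andThen(tokens -> tokens.stream()
                .map(String::toLowerCase)
                .collect(Collectors.toList()));

        Components c = new Components(WHITESPACE, lowercased);
        System.out.println(c.result.apply("Hello World"));
        // Prints: [hello, world]
    }
}
```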
Uses of Tokenizer in org.apache.lucene.analysis.standard
Subclasses of Tokenizer in org.apache.lucene.analysis.standard:

  class ClassicTokenizer
    A grammar-based tokenizer constructed with JFlex.
  class StandardTokenizer
    A grammar-based tokenizer constructed with JFlex.
  class UAX29URLEmailTokenizer
    This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs.
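The JDK's java.text.BreakIterator also follows the Unicode word-boundary rules, so it can be used to sketch the UAX #29 segmentation idea behind StandardTokenizer and UAX29URLEmailTokenizer. This is only an illustration of the word-break concept: it does not reproduce the JFlex grammars or the URL/email recognition those classes add.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch of Unicode (UAX #29-style) word segmentation using the JDK's
// BreakIterator. Illustration only; not Lucene's implementation.
public class Uax29Sketch {

    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE;
             start = end, end = it.next()) {
            String candidate = text.substring(start, end);
            // BreakIterator yields every boundary-delimited segment,
            // including whitespace and punctuation; keep only segments
            // that contain a letter or digit.
            if (candidate.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(candidate);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Send mail to test@example.com today."));
    }
}
```

Unlike UAX29URLEmailTokenizer, this sketch makes no attempt to keep URLs or email addresses together as single tokens; that is exactly the extension the Lucene class layers on top of the base word-break rules.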