Package org.apache.lucene.analysis
Class StopwordAnalyzerBase
- java.lang.Object
- org.apache.lucene.analysis.Analyzer
- org.apache.lucene.analysis.ReusableAnalyzerBase
- org.apache.lucene.analysis.StopwordAnalyzerBase
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
ArabicAnalyzer, ArmenianAnalyzer, BasqueAnalyzer, BrazilianAnalyzer, BulgarianAnalyzer, CatalanAnalyzer, CJKAnalyzer, ClassicAnalyzer, DanishAnalyzer, EnglishAnalyzer, FinnishAnalyzer, FrenchAnalyzer, GalicianAnalyzer, GermanAnalyzer, GreekAnalyzer, HindiAnalyzer, HungarianAnalyzer, IndonesianAnalyzer, IrishAnalyzer, ItalianAnalyzer, JapaneseAnalyzer, LatvianAnalyzer, NorwegianAnalyzer, PersianAnalyzer, PolishAnalyzer, PortugueseAnalyzer, RomanianAnalyzer, RussianAnalyzer, SpanishAnalyzer, StandardAnalyzer, StopAnalyzer, SwedishAnalyzer, ThaiAnalyzer, TurkishAnalyzer, UAX29URLEmailAnalyzer
public abstract class StopwordAnalyzerBase extends ReusableAnalyzerBase
Base class for Analyzers that need to make use of stopword sets.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase
ReusableAnalyzerBase.TokenStreamComponents
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
protected Version matchVersion
protected CharArraySet stopwords - An immutable stopword set
-
Constructor Summary
Constructors (Modifier, Constructor, Description):
protected StopwordAnalyzerBase(Version version) - Creates a new Analyzer with an empty stopword set
protected StopwordAnalyzerBase(Version version, Set<?> stopwords) - Creates a new instance initialized with the given stopword set
-
Method Summary
Methods (Modifier and Type, Method, Description):
Set<?> getStopwordSet() - Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends ReusableAnalyzerBase> aClass, String resource, String comment) - Creates a CharArraySet from a file resource associated with a class.
protected static CharArraySet loadStopwordSet(File stopwords, Version matchVersion) - Creates a CharArraySet from a file.
protected static CharArraySet loadStopwordSet(Reader stopwords, Version matchVersion) - Creates a CharArraySet from a reader.
-
Methods inherited from class org.apache.lucene.analysis.ReusableAnalyzerBase
createComponents, initReader, reusableTokenStream, tokenStream
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getPreviousTokenStream, setPreviousTokenStream
-
-
-
-
Field Detail
-
stopwords
protected final CharArraySet stopwords
An immutable stopword set
-
matchVersion
protected final Version matchVersion
-
-
Constructor Detail
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(Version version, Set<?> stopwords)
Creates a new instance initialized with the given stopword set.
- Parameters:
version - the Lucene version for cross-version compatibility
stopwords - the analyzer's stopword set
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(Version version)
Creates a new Analyzer with an empty stopword set.
- Parameters:
version - the Lucene version for cross-version compatibility
-
-
Method Detail
-
getStopwordSet
public Set<?> getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords.
- Returns:
the analyzer's stopword set or an empty set if the analyzer has no stopwords
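The contract described above (an immutable set, never null, empty when no stopwords were supplied) can be illustrated with a minimal stdlib-only sketch. It does not use Lucene: the class name StopwordHolderSketch is hypothetical, a plain Set<String> stands in for CharArraySet, and treating a null argument as "no stopwords" is an assumption of the sketch rather than something this page states.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the stopword-holding part of an analyzer base class. */
final class StopwordHolderSketch {

    /** Immutable stopword set; never null. */
    private final Set<String> stopwords;

    /** Mirrors the single-argument constructor: starts with an empty stopword set. */
    StopwordHolderSketch() {
        this(null);
    }

    /** Mirrors the constructor that takes a stopword set. */
    StopwordHolderSketch(Set<String> stopwords) {
        // Assumption of this sketch: null means "no stopwords".
        // A defensive copy keeps later changes to the caller's set from leaking in.
        this.stopwords = (stopwords == null)
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(new HashSet<String>(stopwords));
    }

    /** Returns the stopword set, or an empty set if there are no stopwords. */
    Set<String> getStopwordSet() {
        return stopwords;
    }
}
```

Callers of such a getter can iterate or test membership without a null check, and any attempt to modify the returned set fails with UnsupportedOperationException.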
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends ReusableAnalyzerBase> aClass, String resource, String comment) throws IOException
Creates a CharArraySet from a file resource associated with a class. (See Class.getResourceAsStream(String).)
- Parameters:
ignoreCase - true if the set should ignore the case of the stopwords, otherwise false
aClass - a class that is associated with the given resource
resource - name of the resource file associated with the given class
comment - comment string to ignore in the stopword file
- Returns:
a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
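To make the roles of the four parameters concrete, here is a rough stdlib-only sketch of resource-based loading. It is not Lucene's implementation: ResourceStopwordSketch, loadFromResource and parse are hypothetical names, a TreeSet stands in for CharArraySet, and the details (UTF-8 decoding, one word per line, trimming, skipping blank lines, treating a missing resource as an IOException) are assumptions chosen for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of loading a stopword list stored next to a class. */
final class ResourceStopwordSketch {

    /** Opens the resource relative to aClass (see Class.getResourceAsStream) and parses it. */
    static Set<String> loadFromResource(boolean ignoreCase, Class<?> aClass,
                                        String resource, String comment) throws IOException {
        InputStream in = aClass.getResourceAsStream(resource);
        if (in == null) {
            // Sketch-only choice: surface a missing resource as an IOException.
            throw new IOException("resource not found: " + resource);
        }
        // Assumption: the stopword file is UTF-8 encoded.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            return parse(reader, ignoreCase, comment);
        }
    }

    /** Reads one word per line, skipping blank lines and lines starting with the comment string. */
    static Set<String> parse(Reader reader, boolean ignoreCase, String comment) throws IOException {
        // A case-insensitive comparator plays the role of ignoreCase = true.
        Set<String> words = ignoreCase
                ? new TreeSet<String>(String.CASE_INSENSITIVE_ORDER)
                : new TreeSet<String>();
        BufferedReader lines = new BufferedReader(reader);
        String line;
        while ((line = lines.readLine()) != null) {
            String word = line.trim();
            if (word.isEmpty() || word.startsWith(comment)) {
                continue; // nothing to add for blank or comment lines
            }
            words.add(word); // a set keeps only distinct entries
        }
        return words;
    }
}
```

In this sketch a call such as loadFromResource(true, SomeAnalyzer.class, "stopwords.txt", "#") would resolve stopwords.txt relative to the package of the hypothetical SomeAnalyzer class, which is the usual reason for passing a class alongside the resource name.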
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(File stopwords, Version matchVersion) throws IOException
Creates a CharArraySet from a file.
- Parameters:
stopwords - the stopwords file to load
matchVersion - the Lucene version for cross-version compatibility
- Returns:
a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Reader stopwords, Version matchVersion) throws IOException
Creates a CharArraySet from a reader.
- Parameters:
stopwords - the stopwords reader to load
matchVersion - the Lucene version for cross-version compatibility
- Returns:
a CharArraySet containing the distinct stopwords from the given reader
- Throws:
IOException - if loading the stopwords throws an IOException
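The File and Reader overloads differ only in where the characters come from, so one natural arrangement is for the file variant to open a reader and delegate. The stdlib-only sketch below shows that shape. FileStopwordSketch is a hypothetical name, a LinkedHashSet stands in for CharArraySet, the matchVersion parameter is omitted, and UTF-8 decoding, one word per line, and closing the reader after loading are assumptions of the sketch, not documented behavior.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical sketch: a file-based loader that delegates to a reader-based one. */
final class FileStopwordSketch {

    /** Opens the file (assumed UTF-8) and delegates to the Reader variant. */
    static Set<String> load(File stopwords) throws IOException {
        return load(new InputStreamReader(new FileInputStream(stopwords), StandardCharsets.UTF_8));
    }

    /** Reads one stopword per line and closes the reader when done. */
    static Set<String> load(Reader stopwords) throws IOException {
        Set<String> words = new LinkedHashSet<String>();
        try (BufferedReader lines = new BufferedReader(stopwords)) {
            String line;
            while ((line = lines.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word); // duplicates collapse, leaving distinct stopwords
                }
            }
        }
        // Returned read-only, in the spirit of the immutable stopwords field.
        return Collections.unmodifiableSet(words);
    }
}
```

The Reader variant is also convenient for tests, because a java.io.StringReader can supply the word list without touching the file system.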
-
-