Package org.apache.lucene.analysis
Class StopwordAnalyzerBase
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
ArabicAnalyzer, ArmenianAnalyzer, BasqueAnalyzer, BengaliAnalyzer, BrazilianAnalyzer, BulgarianAnalyzer, CatalanAnalyzer, CJKAnalyzer, ClassicAnalyzer, CzechAnalyzer, DanishAnalyzer, EnglishAnalyzer, EstonianAnalyzer, FinnishAnalyzer, FrenchAnalyzer, GalicianAnalyzer, GermanAnalyzer, GreekAnalyzer, HindiAnalyzer, HungarianAnalyzer, IndonesianAnalyzer, IrishAnalyzer, ItalianAnalyzer, JapaneseAnalyzer, LatvianAnalyzer, LithuanianAnalyzer, NepaliAnalyzer, NorwegianAnalyzer, PersianAnalyzer, PolishAnalyzer, PortugueseAnalyzer, RomanianAnalyzer, RussianAnalyzer, SerbianAnalyzer, SoraniAnalyzer, SpanishAnalyzer, StandardAnalyzer, StopAnalyzer, SwedishAnalyzer, TamilAnalyzer, TeluguAnalyzer, ThaiAnalyzer, TurkishAnalyzer, UAX29URLEmailAnalyzer
Base class for Analyzers that need to make use of stopword sets.
- Since:
- 3.1
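As a rough illustration of how a subclass might use the stopword set, the sketch below extends StopwordAnalyzerBase, hands a CharArraySet to the protected constructor, and applies the inherited stopwords field in a StopFilter. The class name SimpleStopAnalyzerSketch, the analyze helper, and the field name "body" are hypothetical. The tokenizer and filter chain is just one reasonable choice, assuming a Lucene 9.x or later lucene-core on the classpath; it is not copied from any built-in analyzer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.StopwordAnalyzerBase;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical subclass: tokenizes, lowercases, then drops stopwords.
public final class SimpleStopAnalyzerSketch extends StopwordAnalyzerBase {

  public SimpleStopAnalyzerSketch(CharArraySet stopwords) {
    // The base class stores the set in the protected "stopwords" field.
    super(stopwords);
  }

  @Override
  protected TokenStreamComponents createComponents(String fieldName) {
    Tokenizer source = new StandardTokenizer();
    TokenStream result = new LowerCaseFilter(source);
    // "stopwords" is the field inherited from StopwordAnalyzerBase.
    result = new StopFilter(result, stopwords);
    return new TokenStreamComponents(source, result);
  }

  // Hypothetical helper: analyzes text and returns the surviving terms.
  public static List<String> analyze(String text, String... stopwords) {
    CharArraySet set = new CharArraySet(Arrays.asList(stopwords), true);
    List<String> terms = new ArrayList<>();
    try (SimpleStopAnalyzerSketch analyzer = new SimpleStopAnalyzerSketch(set);
        TokenStream stream = analyzer.tokenStream("body", text)) {
      CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
      stream.reset();
      while (stream.incrementToken()) {
        terms.add(term.toString());
      }
      stream.end();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return terms;
  }
}
```

With this chain, a call such as SimpleStopAnalyzerSketch.analyze("The quick fox and the hound", "the", "and") should yield the terms quick, fox, and hound.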
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
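Fields
Modifier and Type, Field, Description
protected final CharArraySet stopwords
An immutable stopword set
Fields inherited from class org.apache.lucene.analysis.Analyzer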
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY, storedValue
-
Constructor Summary
Constructors
Modifier, Constructor, Description
protected StopwordAnalyzerBase()
Creates a new Analyzer with an empty stopword set
protected StopwordAnalyzerBase(CharArraySet stopwords)
Creates a new instance initialized with the given stopword set
-
Method Summary
Modifier and Type, Method, Description
CharArraySet getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends Analyzer> aClass, String resource, String comment)
Deprecated, for removal: This API element is subject to removal in a future version.
protected static CharArraySet loadStopwordSet(Reader stopwords)
Creates a CharArraySet from a reader.
protected static CharArraySet loadStopwordSet(Path stopwords)
Creates a CharArraySet from a path.
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, createComponents, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, initReaderForNormalization, normalize, normalize, tokenStream, tokenStream
-
Field Details
-
stopwords
An immutable stopword set
-
-
Constructor Details
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase(CharArraySet stopwords)
Creates a new instance initialized with the given stopword set
- Parameters:
stopwords - the analyzer's stopword set
-
StopwordAnalyzerBase
protected StopwordAnalyzerBase()
Creates a new Analyzer with an empty stopword set
-
-
Method Details
-
getStopwordSet
public CharArraySet getStopwordSet()
Returns the analyzer's stopword set or an empty set if the analyzer has no stopwords
- Returns:
- the analyzer's stopword set or an empty set if the analyzer has no stopwords
-
loadStopwordSet
@Deprecated(forRemoval=true, since="9.1")
protected static CharArraySet loadStopwordSet(boolean ignoreCase, Class<? extends Analyzer> aClass, String resource, String comment) throws IOException
Deprecated, for removal: This API element is subject to removal in a future version.
Class.getResourceAsStream(String) is caller sensitive and cannot load resources across Java Modules. Please call getResourceAsStream() and WordlistLoader.getWordSet(Reader, String, CharArraySet), or other methods, directly.
Creates a CharArraySet from a file resource associated with a class. (See Class.getResourceAsStream(String).)
- Parameters:
ignoreCase - true if the set should ignore the case of the stopwords, otherwise false
aClass - a class that is associated with the given stopword resource
resource - name of the resource file associated with the given class
comment - comment string to ignore in the stopword file
- Returns:
- a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
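The replacement suggested by the deprecation note can be sketched roughly as follows: the class that owns the resource calls getResourceAsStream() on itself, so the lookup happens inside its own module, and passes the resulting reader to WordlistLoader.getWordSet(Reader, String, CharArraySet). The class name StopwordResourceSketch, the resource name stopwords.txt, the "#" comment marker, and the isStopword helper are all hypothetical, and UTF-8 is an assumed encoding.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.WordlistLoader;

public final class StopwordResourceSketch {
  // Hypothetical resource name and comment marker, chosen for illustration.
  private static final String RESOURCE = "stopwords.txt";
  private static final String COMMENT = "#";

  private StopwordResourceSketch() {}

  // The resource lookup is made by the owning class itself, which avoids
  // the cross-module problem described in the deprecation note.
  public static CharArraySet loadBundledStopwords() throws IOException {
    InputStream in = StopwordResourceSketch.class.getResourceAsStream(RESOURCE);
    if (in == null) {
      throw new FileNotFoundException("Missing resource: " + RESOURCE);
    }
    try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
      return parse(reader, true);
    }
  }

  // Delegates parsing to WordlistLoader; lines starting with COMMENT are skipped.
  static CharArraySet parse(Reader reader, boolean ignoreCase) throws IOException {
    return WordlistLoader.getWordSet(reader, COMMENT, new CharArraySet(16, ignoreCase));
  }

  // Hypothetical helper: parses in-memory text and checks membership.
  public static boolean isStopword(String fileContents, String candidate) {
    try {
      return parse(new StringReader(fileContents), true).contains(candidate);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Keeping the getResourceAsStream() call in the same class that ships the resource is the point of the design; a shared helper in another module that accepts an arbitrary Class argument would likely run into the same caller-sensitivity limitation.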
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Path stopwords) throws IOException
Creates a CharArraySet from a path.
- Parameters:
stopwords - the stopwords file to load
- Returns:
- a CharArraySet containing the distinct stopwords from the given file
- Throws:
IOException - if loading the stopwords throws an IOException
-
loadStopwordSet
protected static CharArraySet loadStopwordSet(Reader stopwords) throws IOException
Creates a CharArraySet from a reader.
- Parameters:
stopwords - the stopwords reader to load
- Returns:
- a CharArraySet containing the distinct stopwords from the given reader
- Throws:
IOException - if loading the stopwords throws an IOException
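Because both loadStopwordSet overloads are protected and static, they are only reachable from a subclass. A minimal sketch of that pattern, assuming hypothetical names (FileStopAnalyzerSketch, fromPath, fromReader, loadsWord) and a one-word-per-line stopword file, might look like this:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.StopwordAnalyzerBase;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Hypothetical analyzer whose stopwords come from a file or a reader.
public final class FileStopAnalyzerSketch extends StopwordAnalyzerBase {

  private FileStopAnalyzerSketch(CharArraySet stopwords) {
    super(stopwords);
  }

  // Uses the inherited protected static loadStopwordSet(Path).
  public static FileStopAnalyzerSketch fromPath(Path file) throws IOException {
    return new FileStopAnalyzerSketch(loadStopwordSet(file));
  }

  // Uses the inherited protected static loadStopwordSet(Reader).
  public static FileStopAnalyzerSketch fromReader(Reader reader) throws IOException {
    return new FileStopAnalyzerSketch(loadStopwordSet(reader));
  }

  @Override
  protected TokenStreamComponents createComponents(String fieldName) {
    Tokenizer source = new StandardTokenizer();
    return new TokenStreamComponents(source, new StopFilter(source, stopwords));
  }

  // Hypothetical helper: writes the contents to a temporary file, loads them
  // through both overloads, and reports whether each set contains the word.
  public static boolean loadsWord(String contents, String word) {
    try {
      Path file = Files.createTempFile("stopwords", ".txt");
      try {
        Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
        try (FileStopAnalyzerSketch fromFile = fromPath(file);
            FileStopAnalyzerSketch fromText = fromReader(new StringReader(contents))) {
          return fromFile.getStopwordSet().contains(word)
              && fromText.getStopwordSet().contains(word);
        }
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

These overloads take no ignoreCase or comment argument, so the loaded set is presumably case-sensitive and comment lines are not skipped; treat that as an assumption to check against the WordlistLoader documentation for the Lucene version in use.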
-