Class UniformSplitTerms
java.lang.Object
  org.apache.lucene.index.Terms
    org.apache.lucene.codecs.uniformsplit.UniformSplitTerms

All Implemented Interfaces:
Accountable
Direct Known Subclasses:
STUniformSplitTerms
public class UniformSplitTerms extends Terms implements Accountable
Terms based on the Uniform Split technique.
The index dictionary is lazily loaded only when TermsEnum.seekCeil(org.apache.lucene.util.BytesRef) or TermsEnum.seekExact(org.apache.lucene.util.BytesRef) is called (it is not loaded for a direct terms enumeration).
See Also:
UniformSplitTermsWriter
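The lazy-loading behavior described above can be illustrated with a minimal, standard-library-only sketch. It is not Lucene code: LazyDictionarySketch, its methods, and the use of String terms with a TreeSet as the "dictionary" are all hypothetical stand-ins chosen for illustration. Lucene compares terms as bytes, whereas this sketch relies on String ordering, which agrees only for ASCII terms. The sketch is also not thread-safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for a terms reader with a lazily loaded dictionary; not Lucene code. */
final class LazyDictionarySketch {
  private final List<String> sortedTerms; // stand-in for the term blocks
  private TreeSet<String> dictionary;     // stand-in for the index dictionary; null until first seek
  private int loadCount;                  // how many times the dictionary was built

  LazyDictionarySketch(List<String> terms) {
    List<String> copy = new ArrayList<>(terms);
    Collections.sort(copy);
    this.sortedTerms = Collections.unmodifiableList(copy);
  }

  /** Direct enumeration: walks the terms in order without touching the dictionary. */
  Iterator<String> iterator() {
    return sortedTerms.iterator();
  }

  /** Returns the smallest term >= target, or null if none; loads the dictionary on first use. */
  String seekCeil(String target) {
    return dictionary().ceiling(target);
  }

  /** Returns true if target is a term; loads the dictionary on first use. */
  boolean seekExact(String target) {
    return dictionary().contains(target);
  }

  boolean isDictionaryLoaded() {
    return dictionary != null;
  }

  int loadCount() {
    return loadCount;
  }

  // Builds the dictionary the first time a seek needs it, then reuses it.
  private TreeSet<String> dictionary() {
    if (dictionary == null) {
      dictionary = new TreeSet<>(sortedTerms);
      loadCount++;
    }
    return dictionary;
  }
}
```

In this sketch, a caller that only iterates never pays for the dictionary, while the first seekCeil or seekExact call builds it once and later seeks reuse it.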
-
-
Field Summary
Fields
Modifier and Type    Field
private static long    BASE_RAM_USAGE
protected BlockDecoder    blockDecoder
protected IndexInput    blockInput
protected IndexDictionary.BrowserSupplier    dictionaryBrowserSupplier
protected FieldMetadata    fieldMetadata
protected PostingsReaderBase    postingsReader
-
Fields inherited from class org.apache.lucene.index.Terms
EMPTY_ARRAY
-
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
-
Constructor Summary
Constructors
Modifier    Constructor
protected    UniformSplitTerms(IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder, IndexDictionary.BrowserSupplier dictionaryBrowserSupplier)
-
Method Summary
Modifier and Type    Method
    Description
protected void    checkIntersectAutomatonType(CompiledAutomaton automaton)
long    getDictionaryRamBytesUsed()
int    getDocCount()
    Returns the number of documents that have at least one term for this field.
BytesRef    getMax()
    Returns the largest term (in lexicographic order) in the field.
long    getSumDocFreq()
    Returns the sum of TermsEnum.docFreq() for all terms in this field.
long    getSumTotalTermFreq()
    Returns the sum of TermsEnum.totalTermFreq() for all terms in this field.
boolean    hasFreqs()
    Returns true if documents in this field store per-document term frequency (PostingsEnum.freq()).
boolean    hasOffsets()
    Returns true if documents in this field store offsets.
boolean    hasPayloads()
    Returns true if documents in this field store payloads.
boolean    hasPositions()
    Returns true if documents in this field store positions.
TermsEnum    intersect(CompiledAutomaton compiled, BytesRef startTerm)
    Returns a TermsEnum that iterates over all terms and documents that are accepted by the provided CompiledAutomaton.
TermsEnum    iterator()
    Returns an iterator that will step through all terms.
long    ramBytesUsed()
    Return the memory usage of this object in bytes.
long    ramBytesUsedWithoutDictionary()
long    size()
    Returns the number of terms for this field, or -1 if this measure isn't stored by the codec.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
-
-
-
Field Detail
-
BASE_RAM_USAGE
private static final long BASE_RAM_USAGE
-
blockInput
protected final IndexInput blockInput
-
fieldMetadata
protected final FieldMetadata fieldMetadata
-
postingsReader
protected final PostingsReaderBase postingsReader
-
blockDecoder
protected final BlockDecoder blockDecoder
-
dictionaryBrowserSupplier
protected final IndexDictionary.BrowserSupplier dictionaryBrowserSupplier
-
-
Constructor Detail
-
UniformSplitTerms
protected UniformSplitTerms(IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder, IndexDictionary.BrowserSupplier dictionaryBrowserSupplier)
Parameters:
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
-
-
Method Detail
-
iterator
public TermsEnum iterator() throws java.io.IOException
Description copied from class: Terms
Returns an iterator that will step through all terms. This method will not return null.
-
intersect
public TermsEnum intersect(CompiledAutomaton compiled, BytesRef startTerm) throws java.io.IOException
Description copied from class: Terms
Returns a TermsEnum that iterates over all terms and documents that are accepted by the provided CompiledAutomaton. If the startTerm is provided, then the returned enum will only return terms > startTerm, but you still must call next() first to get to the first term. Note that the provided startTerm must be accepted by the automaton.
This is an expert low-level API and will only work for NORMAL compiled automata. To handle any compiled automaton, use CompiledAutomaton.getTermsEnum(org.apache.lucene.index.Terms) instead.
NOTE: the returned TermsEnum cannot seek.
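The startTerm contract above can be sketched with standard-library types only. This is an illustration, not Lucene's implementation: IntersectSketch is a hypothetical class, a Predicate stands in for the CompiledAutomaton, a NavigableSet of Strings stands in for the field's terms, and the returned Iterator plays the role of the TermsEnum (its next() must be called to reach the first term). The description does not say what happens when startTerm is not accepted by the automaton; throwing IllegalArgumentException is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NavigableSet;
import java.util.function.Predicate;

/** Hypothetical illustration of the intersect(compiled, startTerm) contract; not Lucene code. */
final class IntersectSketch {
  private IntersectSketch() {}

  /**
   * Returns the accepted terms strictly greater than startTerm, in order.
   * A null startTerm means "start from the first term".
   */
  static Iterator<String> intersect(
      NavigableSet<String> terms, Predicate<String> accepts, String startTerm) {
    // Contract from the description: a provided startTerm must itself be accepted.
    // Rejecting it with an exception is an assumption of this sketch.
    if (startTerm != null && !accepts.test(startTerm)) {
      throw new IllegalArgumentException("startTerm must be accepted: " + startTerm);
    }
    // Exclusive lower bound: only terms > startTerm are candidates.
    NavigableSet<String> candidates =
        startTerm == null ? terms : terms.tailSet(startTerm, false);
    List<String> accepted = new ArrayList<>();
    for (String term : candidates) {
      if (accepts.test(term)) {
        accepted.add(term);
      }
    }
    return accepted.iterator();
  }
}
```

With the terms car, cat, cow, dog and a predicate that accepts terms starting with "c", passing "cat" as startTerm yields only cow, since cat itself is excluded by the strict comparison.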
-
checkIntersectAutomatonType
protected void checkIntersectAutomatonType(CompiledAutomaton automaton)
-
getMax
public BytesRef getMax()
Description copied from class: Terms
Returns the largest term (in lexicographic order) in the field. Note that, just like other term measures, this measure does not take deleted documents into account. This returns null when there are no terms.
-
size
public long size()
Description copied from class: Terms
Returns the number of terms for this field, or -1 if this measure isn't stored by the codec. Note that, just like other term measures, this measure does not take deleted documents into account.
-
getSumTotalTermFreq
public long getSumTotalTermFreq()
Description copied from class: Terms
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getSumTotalTermFreq in class Terms
-
getSumDocFreq
public long getSumDocFreq()
Description copied from class: Terms
Returns the sum of TermsEnum.docFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getSumDocFreq in class Terms
-
getDocCount
public int getDocCount()
Description copied from class: Terms
Returns the number of documents that have at least one term for this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getDocCount in class Terms
-
hasFreqs
public boolean hasFreqs()
Description copied from class: Terms
Returns true if documents in this field store per-document term frequency (PostingsEnum.freq()).
-
hasOffsets
public boolean hasOffsets()
Description copied from class: Terms
Returns true if documents in this field store offsets.
Specified by: hasOffsets in class Terms
-
hasPositions
public boolean hasPositions()
Description copied from class: Terms
Returns true if documents in this field store positions.
Specified by: hasPositions in class Terms
-
hasPayloads
public boolean hasPayloads()
Description copied from class: Terms
Returns true if documents in this field store payloads.
Specified by: hasPayloads in class Terms
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
Specified by: ramBytesUsed in interface Accountable
-
ramBytesUsedWithoutDictionary
public long ramBytesUsedWithoutDictionary()
-
getDictionaryRamBytesUsed
public long getDictionaryRamBytesUsed()
-
-