Package org.apache.lucene.codecs.memory
Class DirectPostingsFormat
java.lang.Object
org.apache.lucene.codecs.PostingsFormat
org.apache.lucene.codecs.memory.DirectPostingsFormat
- All Implemented Interfaces:
NamedSPILoader.NamedSPI
Wraps Lucene90PostingsFormat format for on-disk storage, but then at read time loads and stores all terms and postings directly in RAM as byte[], int[].
WARNING: This is exceptionally RAM intensive: it makes no effort to compress the postings data, storing terms as separate byte[] and postings as separate int[]. In return, it gives a substantial increase in search performance.
This postings format supports TermsEnum.ord() and TermsEnum.seekExact(long).
Because this holds all term bytes as a single byte[], you cannot have more than 2.1GB worth of term bytes in a single segment.
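The single-array term storage described above can be pictured with a small, JDK-only sketch. It is not Lucene's implementation: the class name FlatTermStore and its methods termAt and ordOf are hypothetical, and the sketch assumes the terms arrive already sorted in UTF-8 byte order. It shows why ordinal lookups are cheap when all term bytes sit in one byte[] with an int[] of offsets, and why Java's int array indexing caps the total at Integer.MAX_VALUE bytes (about 2.1GB).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: a hypothetical flat term store, not a Lucene class. */
final class FlatTermStore {
    private final byte[] termBytes;  // all terms concatenated in sorted order
    private final int[] termOffsets; // term #ord spans termOffsets[ord] .. termOffsets[ord + 1]

    /** Assumes sortedTerms is sorted in UTF-8 byte order (true for plain ASCII input). */
    FlatTermStore(String[] sortedTerms) {
        int[] offsets = new int[sortedTerms.length + 1];
        byte[][] encoded = new byte[sortedTerms.length][];
        long total = 0;
        for (int i = 0; i < sortedTerms.length; i++) {
            encoded[i] = sortedTerms[i].getBytes(StandardCharsets.UTF_8);
            total += encoded[i].length;
            // A Java array is indexed by int, hence the roughly 2.1GB ceiling.
            if (total > Integer.MAX_VALUE) {
                throw new IllegalStateException("term bytes exceed a single byte[]");
            }
            offsets[i + 1] = (int) total;
        }
        byte[] bytes = new byte[(int) total];
        for (int i = 0; i < encoded.length; i++) {
            System.arraycopy(encoded[i], 0, bytes, offsets[i], encoded[i].length);
        }
        this.termBytes = bytes;
        this.termOffsets = offsets;
    }

    int size() {
        return termOffsets.length - 1;
    }

    /** Ordinal to term: two array reads and a slice, analogous in spirit to seekExact(long). */
    String termAt(long ord) {
        int i = Math.toIntExact(ord);
        int start = termOffsets[i];
        return new String(termBytes, start, termOffsets[i + 1] - start, StandardCharsets.UTF_8);
    }

    /** Term to ordinal via binary search over the flat array; returns -1 if absent. */
    long ordOf(String term) {
        byte[] key = term.getBytes(StandardCharsets.UTF_8);
        int lo = 0;
        int hi = size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = Arrays.compareUnsigned(
                termBytes, termOffsets[mid], termOffsets[mid + 1], key, 0, key.length);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }
}
```

With the terms "apple", "banana", "cherry", termAt(1) yields "banana" and ordOf("cherry") yields 2. Nothing is decoded or decompressed on the lookup path, which is the trade-off the warning above describes: more RAM in exchange for faster access.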
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
private static final class
private static final class
private static final class
private static final class
private static final class
private static final class
private static final class
private static final class
-
Field Summary
Fields
Modifier and Type | Field
private static final int | DEFAULT_LOW_FREQ_CUTOFF
private static final int | DEFAULT_MIN_SKIP_COUNT
private final int | lowFreqCutoff
private final int | minSkipCount
Fields inherited from class org.apache.lucene.codecs.PostingsFormat
EMPTY
-
Constructor Summary
Constructors
Constructor | Description
DirectPostingsFormat(int minSkipCount, int lowFreqCutoff) | minSkipCount is how many terms in a row must have the same prefix before we put a skip pointer down.
-
Method Summary
Method | Description
fieldsConsumer(SegmentWriteState state) | Writes a new segment
fieldsProducer(SegmentReadState state) | Reads a segment.
Methods inherited from class org.apache.lucene.codecs.PostingsFormat
availablePostingsFormats, forName, getName, reloadPostingsFormats, toString
-
Field Details
-
minSkipCount
private final int minSkipCount
-
lowFreqCutoff
private final int lowFreqCutoff
-
DEFAULT_MIN_SKIP_COUNT
private static final int DEFAULT_MIN_SKIP_COUNT
-
DEFAULT_LOW_FREQ_CUTOFF
private static final int DEFAULT_LOW_FREQ_CUTOFF
-
-
Constructor Details
-
DirectPostingsFormat
public DirectPostingsFormat()
-
DirectPostingsFormat
public DirectPostingsFormat(int minSkipCount, int lowFreqCutoff)
minSkipCount is how many terms in a row must have the same prefix before we put a skip pointer down. Terms with docFreq <= lowFreqCutoff will use a single int[] to hold all docs, freqs, positions and offsets; terms with higher docFreq will use separate arrays.
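The lowFreqCutoff rule can be illustrated with a short, JDK-only sketch. It is an assumption-laden picture rather than Lucene's code: the class PostingLayouts, the methods usesSingleArray and packLowFreq, and the [doc, freq, doc, freq, ...] interleaving order are all hypothetical, and positions and offsets are left out for brevity.

```java
/** Illustrative only: hypothetical helpers, not part of Lucene. */
final class PostingLayouts {
    private PostingLayouts() {}

    /** Mirrors the documented rule: docFreq <= lowFreqCutoff selects the single-array layout. */
    static boolean usesSingleArray(int docFreq, int lowFreqCutoff) {
        return docFreq <= lowFreqCutoff;
    }

    /**
     * Packs doc IDs and freqs into one int[] as [doc0, freq0, doc1, freq1, ...].
     * The interleaving order is an assumption made for this sketch.
     */
    static int[] packLowFreq(int[] docIDs, int[] freqs) {
        if (docIDs.length != freqs.length) {
            throw new IllegalArgumentException("docIDs and freqs must have equal length");
        }
        int[] packed = new int[docIDs.length * 2];
        for (int i = 0; i < docIDs.length; i++) {
            packed[2 * i] = docIDs[i];
            packed[2 * i + 1] = freqs[i];
        }
        return packed;
    }
}
```

A term occurring in documents 3 and 7 with frequencies 1 and 4 packs to {3, 1, 7, 4}. For rare terms, one small array avoids the per-array overhead of several tiny allocations; for frequent terms, separate arrays let a consumer that needs only doc IDs scan them without stepping over the other data.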
-
-
Method Details
-
fieldsConsumer
Description copied from class: PostingsFormat
Writes a new segment
Specified by: fieldsConsumer in class PostingsFormat
Throws: IOException
-
fieldsProducer
Description copied from class: PostingsFormat
Reads a segment. NOTE: by the time this call returns, it must hold open any files it will need to use; else, those files may be deleted. Additionally, required files may be deleted during the execution of this call before there is a chance to open them. Under these circumstances an IOException should be thrown by the implementation. IOExceptions are expected and will automatically cause a retry of the segment opening logic with the newly revised segments.
Specified by: fieldsProducer in class PostingsFormat
Throws: IOException
-