Class TermsHashPerField

  • All Implemented Interfaces:
    java.lang.Comparable<TermsHashPerField>
    Direct Known Subclasses:
    FreqProxTermsWriterPerField, TermVectorsConsumerPerField

    abstract class TermsHashPerField
    extends java.lang.Object
    implements java.lang.Comparable<TermsHashPerField>
    This class stores streams of information per term without knowing the size of each stream ahead of time. Each stream typically encodes one level of information, such as term frequency per document or term proximity. Internally, this class allocates a linked list of slices for each term, which can be read by a ByteSliceReader. Terms are first deduplicated in a BytesRefHash; once this is done, internal data structures point to the current offset of each stream that can be written to.
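The idea of deduplicating terms and then appending to several per-term streams of unknown length can be illustrated with a deliberately simplified model. This is a sketch only, not Lucene's implementation: the class name `TermStreamsSketch` and all of its members are hypothetical, a `HashMap` stands in for BytesRefHash, and one `ByteArrayOutputStream` per stream stands in for the linked list of byte slices.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of "N growable streams per deduplicated term".
// It does not use Lucene's byte slices or BytesRefHash.
class TermStreamsSketch {
    private final int streamCount;
    // Deduplicates terms: term text -> term ID (stand-in for BytesRefHash).
    private final Map<String, Integer> termIds = new HashMap<>();
    // One growable buffer per (term, stream); index in the list is the term ID.
    private final List<ByteArrayOutputStream[]> streams = new ArrayList<>();

    TermStreamsSketch(int streamCount) {
        this.streamCount = streamCount;
    }

    /** Returns the term ID, allocating the term's streams the first time it is seen. */
    int add(String term) {
        Integer existing = termIds.get(term);
        if (existing != null) {
            return existing;
        }
        int termId = streams.size();
        ByteArrayOutputStream[] perTerm = new ByteArrayOutputStream[streamCount];
        for (int i = 0; i < streamCount; i++) {
            perTerm[i] = new ByteArrayOutputStream();
        }
        streams.add(perTerm);
        termIds.put(term, termId);
        return termId;
    }

    /** Appends one byte to the given stream of the given term. */
    void writeByte(int termId, int stream, byte b) {
        streams.get(termId)[stream].write(b);
    }

    /** Returns a copy of everything written so far to one stream of one term. */
    byte[] read(int termId, int stream) {
        return streams.get(termId)[stream].toByteArray();
    }

    int getNumTerms() {
        return streams.size();
    }
}
```

In this model a caller would create `new TermStreamsSketch(2)`, obtain a term ID from `add`, and then write to stream 0 or stream 1 of that term independently. The real class avoids per-term objects by packing all streams into shared byte blocks, which this sketch does not attempt.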
    • Field Detail

      • termStreamAddressBuffer

        private int[] termStreamAddressBuffer
      • streamAddressOffset

        private int streamAddressOffset
      • streamCount

        private final int streamCount
      • fieldName

        private final java.lang.String fieldName
      • lastDocID

        private int lastDocID
      • sortedTermIDs

        private int[] sortedTermIDs
      • doNextCall

        private boolean doNextCall
    • Method Detail

      • reset

        void reset()
      • initReader

        final void initReader​(ByteSliceReader reader,
                              int termID,
                              int stream)
      • sortTerms

        final void sortTerms()
        Collapses the hash table and sorts it in place; also sets this.sortedTermIDs to the result. This method must not be called twice unless reset() or reinitHash() was called in between.
      • getSortedTermIDs

        final int[] getSortedTermIDs()
        Returns the sorted term IDs. sortTerms() must be called before this method.
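The calling contract between sortTerms(), getSortedTermIDs() and reset() can be sketched as a small state machine. Everything here is illustrative: `SortedTermIdsSketch` is a hypothetical class, the explicit `IllegalStateException` checks are an assumption about how one might enforce the contract, and ordering terms by unsigned byte comparison of their UTF-8 bytes is likewise an assumption rather than something this page states.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical model of the sort contract; not the Lucene implementation.
class SortedTermIdsSketch {
    // Index in this list is the term ID; callers are assumed to add distinct terms.
    private final List<byte[]> termBytes = new ArrayList<>();
    // Null until sortTerms() has run.
    private int[] sortedTermIDs;

    int addTerm(String term) {
        termBytes.add(term.getBytes(StandardCharsets.UTF_8));
        return termBytes.size() - 1;
    }

    /** Sorts term IDs by their bytes (unsigned order, an assumption of this sketch). */
    void sortTerms() {
        if (sortedTermIDs != null) {
            throw new IllegalStateException("already sorted; call reset() first");
        }
        sortedTermIDs = IntStream.range(0, termBytes.size())
            .boxed()
            .sorted((a, b) -> Arrays.compareUnsigned(termBytes.get(a), termBytes.get(b)))
            .mapToInt(Integer::intValue)
            .toArray();
    }

    int[] getSortedTermIDs() {
        if (sortedTermIDs == null) {
            throw new IllegalStateException("sortTerms() must be called first");
        }
        return sortedTermIDs;
    }

    /** Clears all terms and allows sortTerms() to be called again. */
    void reset() {
        termBytes.clear();
        sortedTermIDs = null;
    }
}
```

The returned array holds term IDs, not terms: position 0 is the ID of the smallest term, and the IDs themselves keep the order in which terms were first added.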
      • reinitHash

        final void reinitHash()
      • add

        private void add​(int textStart,
                         int docID)
                  throws java.io.IOException
        Throws:
        java.io.IOException
      • initStreamSlices

        private void initStreamSlices​(int termID,
                                      int docID)
                               throws java.io.IOException
        Throws:
        java.io.IOException
      • assertDocId

        private boolean assertDocId​(int docId)
      • add

        void add​(BytesRef termBytes,
                 int docID)
          throws java.io.IOException
        Called once per inverted token. This is the primary entry point (for the first TermsHash); postings use this API.
        Throws:
        java.io.IOException
      • positionStreamSlice

        private int positionStreamSlice​(int termID,
                                        int docID)
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • writeByte

        final void writeByte​(int stream,
                             byte b)
      • writeBytes

        final void writeBytes​(int stream,
                              byte[] b,
                              int offset,
                              int len)
      • writeVInt

        final void writeVInt​(int stream,
                             int i)
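This page does not describe the byte layout that writeVInt produces. As a rough illustration, the sketch below uses the variable-length integer format documented for Lucene's DataOutput.writeVInt: seven payload bits per byte, low-order group first, with the high bit set on every byte except the last. That this method uses the same layout is an assumption, and `VIntSketch` with its two static helpers is hypothetical; it writes to a plain `ByteArrayOutputStream` rather than to a term stream.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper showing a 7-bits-per-byte variable-length int encoding.
class VIntSketch {
    /** Appends i to out using 1 to 5 bytes; smaller non-negative values take fewer bytes. */
    static void writeVInt(ByteArrayOutputStream out, int i) {
        // While more than 7 significant bits remain, emit the low 7 bits
        // with the continuation bit (0x80) set.
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        // Final byte: continuation bit clear.
        out.write(i);
    }

    /** Decodes one value starting at offset; the inverse of writeVInt. */
    static int readVInt(byte[] bytes, int offset) {
        int value = 0;
        int shift = 0;
        byte b;
        do {
            b = bytes[offset++];
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }
}
```

With this layout, values from 0 to 127 occupy a single byte, which suits the small deltas and frequencies that postings streams usually carry. Negative values always take five bytes.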
      • getFieldName

        final java.lang.String getFieldName()
      • finish

        void finish()
             throws java.io.IOException
        Finish adding all instances of this field to the current document.
        Throws:
        java.io.IOException
      • getNumTerms

        final int getNumTerms()
      • start

        boolean start​(IndexableField field,
                      boolean first)
        Start adding a new field instance; first is true if this is the first time this field name was seen in the document.
      • newTerm

        abstract void newTerm​(int termID,
                              int docID)
                       throws java.io.IOException
        Called when a term is seen for the first time.
        Throws:
        java.io.IOException
      • addTerm

        abstract void addTerm​(int termID,
                              int docID)
                       throws java.io.IOException
        Called when a previously seen term is seen again.
        Throws:
        java.io.IOException
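The split between newTerm and addTerm is a template-method pattern: the base class decides whether a term has been seen before, and the subclass decides what to record in each case. The sketch below shows that dispatch in isolation. `PerFieldSketch` and `TermCountSketch` are hypothetical names, a `HashMap` replaces the real term hash, and the subclass records only an occurrence count per term ID.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: owns term deduplication and dispatches to the hooks.
abstract class PerFieldSketch {
    private final Map<String, Integer> termIds = new HashMap<>();

    /** Called once per token; routes to newTerm or addTerm. */
    final void add(String term, int docID) {
        Integer termID = termIds.get(term);
        if (termID == null) {
            // First occurrence: assign the next free term ID.
            termID = termIds.size();
            termIds.put(term, termID);
            newTerm(termID, docID);
        } else {
            addTerm(termID, docID);
        }
    }

    /** Hook: the term is seen for the first time. */
    abstract void newTerm(int termID, int docID);

    /** Hook: a previously seen term is seen again. */
    abstract void addTerm(int termID, int docID);
}

// Hypothetical subclass that just counts occurrences per term ID.
class TermCountSketch extends PerFieldSketch {
    final Map<Integer, Integer> counts = new HashMap<>();

    @Override
    void newTerm(int termID, int docID) {
        counts.put(termID, 1);
    }

    @Override
    void addTerm(int termID, int docID) {
        counts.merge(termID, 1, Integer::sum);
    }
}
```

A subclass in this style never looks up terms itself; it only reacts to term IDs, which mirrors how the two direct known subclasses listed at the top of this page receive their callbacks.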
      • newPostingsArray

        abstract void newPostingsArray()
        Called when the postings array is initialized or resized.
      • createPostingsArray

        abstract ParallelPostingsArray createPostingsArray​(int size)
        Creates a new postings array of the specified size.