Class WANDScorer


  • final class WANDScorer
    extends Scorer
    This implements the WAND (Weak AND) algorithm for dynamic pruning described in "Efficient Query Evaluation using a Two-Level Retrieval Process" by Broder, Carmel, Herscovici, Soffer and Zien. Enhanced with techniques described in "Faster Top-k Document Retrieval Using Block-Max Indexes" by Ding and Suel. This scorer maintains a feedback loop with the collector in order to know at any time the minimum score that is required in order for a hit to be competitive. Then it leverages the max score from each scorer in order to know when it may call DocIdSetIterator.advance(int) rather than DocIdSetIterator.nextDoc() to move to the next competitive hit. Implementation is similar to MinShouldMatchSumScorer except that instead of enforcing that freq >= minShouldMatch, we enforce that ∑ max_score >= minCompetitiveScore.
    • Field Detail

      • scalingFactor

        private final int scalingFactor
      • minCompetitiveScore

        private long minCompetitiveScore
      • doc

        int doc
      • leadMaxScore

        long leadMaxScore
      • tailMaxScore

        long tailMaxScore
      • tailSize

        int tailSize
      • cost

        final long cost
      • upTo

        int upTo
      • minShouldMatch

        final int minShouldMatch
      • freq

        int freq
    • Constructor Detail

      • WANDScorer

        WANDScorer​(Weight weight,
                   java.util.Collection<Scorer> scorers,
                   int minShouldMatch)
            throws java.io.IOException
        Throws:
        java.io.IOException
    • Method Detail

      • scalingFactor

        static int scalingFactor​(float f)
        Return a scaling factor for the given float so that f x 2^scalingFactor would be in ]2^15, 2^16]. Special cases: scalingFactor(0) = scalingFactor(MIN_VALUE) - 1 scalingFactor(+Infty) = scalingFactor(MAX_VALUE) + 1
      • scaleMaxScore

        private static long scaleMaxScore​(float maxScore,
                                          int scalingFactor)
        Scale max scores in an unsigned integer to avoid overflows (only the lower 32 bits of the long are used) as well as floating-point arithmetic errors. Those are rounded up in order to make sure we do not miss any matches.
      • scaleMinScore

        private static long scaleMinScore​(float minScore,
                                          int scalingFactor)
        Scale min competitive scores the same way as max scores but this time by rounding down in order to make sure that we do not miss any matches.
      • ensureConsistent

        private boolean ensureConsistent()
      • setMinCompetitiveScore

        public void setMinCompetitiveScore​(float minScore)
                                    throws java.io.IOException
        Description copied from class: Scorable
        Optional method: Tell the scorer that its iterator may safely ignore all documents whose score is less than the given minScore. This is a no-op by default. This method may only be called from collectors that use ScoreMode.TOP_SCORES, and successive calls may only set increasing values of minScore.
        Overrides:
        setMinCompetitiveScore in class Scorable
        Throws:
        java.io.IOException
      • getChildren

        public final java.util.Collection<Scorable.ChildScorable> getChildren()
                                                                       throws java.io.IOException
        Description copied from class: Scorable
        Returns child sub-scorers positioned on the current document
        Overrides:
        getChildren in class Scorable
        Throws:
        java.io.IOException
      • iterator

        public DocIdSetIterator iterator()
        Description copied from class: Scorer
        Return a DocIdSetIterator over matching documents. The returned iterator will either be positioned on -1 if no documents have been scored yet, DocIdSetIterator.NO_MORE_DOCS if all documents have been scored already, or the last document id that has been scored otherwise. The returned iterator is a view: calling this method several times will return iterators that have the same state.
        Specified by:
        iterator in class Scorer
      • twoPhaseIterator

        public TwoPhaseIterator twoPhaseIterator()
        Description copied from class: Scorer
        Optional method: Return a TwoPhaseIterator view of this Scorer. A return value of null indicates that two-phase iteration is not supported. Note that the returned TwoPhaseIterator's approximation must advance synchronously with the Scorer.iterator(): advancing the approximation must advance the iterator and vice-versa. Implementing this method is typically useful on Scorers that have a high per-document overhead in order to confirm matches. The default implementation returns null.
        Overrides:
        twoPhaseIterator in class Scorer
      • addLead

        private void addLead​(DisiWrapper lead)
        Add a disi to the linked list of leads.
      • pushBackLeads

        private void pushBackLeads​(int target)
                            throws java.io.IOException
        Move disis that are in 'lead' back to the tail.
        Throws:
        java.io.IOException
      • advanceHead

        private void advanceHead​(int target)
                          throws java.io.IOException
        Make sure all disis in 'head' are on or after 'target'.
        Throws:
        java.io.IOException
      • advanceTail

        private void advanceTail​(DisiWrapper disi)
                          throws java.io.IOException
        Throws:
        java.io.IOException
      • advanceTail

        private void advanceTail()
                          throws java.io.IOException
        Pop the entry from the 'tail' that has the greatest score contribution, advance it to the current doc and then add it to 'lead' or 'head' depending on whether it matches.
        Throws:
        java.io.IOException
      • updateMaxScores

        private void updateMaxScores​(int target)
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • updateMaxScoresIfNecessary

        private void updateMaxScoresIfNecessary​(int target)
                                         throws java.io.IOException
        Update upTo and maximum scores of sub scorers so that upTo is greater than or equal to the next candidate after target, i.e. the top of `head`.
        Throws:
        java.io.IOException
      • moveToNextCandidate

        private void moveToNextCandidate​(int target)
                                  throws java.io.IOException
        Set 'doc' to the next potential match, and move all disis of 'head' that are on this doc into 'lead'.
        Throws:
        java.io.IOException
      • doNextCompetitiveCandidate

        private int doNextCompetitiveCandidate()
                                        throws java.io.IOException
        Move iterators to the tail until there is a potential match.
        Throws:
        java.io.IOException
      • advanceAllTail

        private void advanceAllTail()
                             throws java.io.IOException
        Advance all entries from the tail to know about all matches on the current doc.
        Throws:
        java.io.IOException
      • score

        public float score()
                    throws java.io.IOException
        Description copied from class: Scorable
        Returns the score of the current document matching the query.
        Specified by:
        score in class Scorable
        Throws:
        java.io.IOException
      • advanceShallow

        public int advanceShallow​(int target)
                           throws java.io.IOException
        Description copied from class: Scorer
        Advance to the block of documents that contains target in order to get scoring information about this block. This method is implicitly called by DocIdSetIterator.advance(int) and DocIdSetIterator.nextDoc() on the returned doc ID. Calling this method doesn't modify the current DocIdSetIterator.docID(). It returns a number that is greater than or equal to all documents contained in the current block, but less than any doc IDS of the next block. target must be >= Scorable.docID() as well as all targets that have been passed to Scorer.advanceShallow(int) so far.
        Overrides:
        advanceShallow in class Scorer
        Throws:
        java.io.IOException
      • getMaxScore

        public float getMaxScore​(int upTo)
                          throws java.io.IOException
        Description copied from class: Scorer
        Return the maximum score that documents between the last target that this iterator was shallow-advanced to included and upTo included.
        Specified by:
        getMaxScore in class Scorer
        Throws:
        java.io.IOException
      • docID

        public int docID()
        Description copied from class: Scorable
        Returns the doc ID that is currently being scored.
        Specified by:
        docID in class Scorable
      • insertTailWithOverFlow

        private DisiWrapper insertTailWithOverFlow​(DisiWrapper s)
        Insert an entry in 'tail' and evict the least-costly scorer if full.
      • addTail

        private void addTail​(DisiWrapper s)
        Add an entry to 'tail'. Fails if over capacity.
      • popTail

        private DisiWrapper popTail()
        Pop the least-costly scorer from 'tail'.
      • upHeapMaxScore

        private static void upHeapMaxScore​(DisiWrapper[] heap,
                                           int i)
        Heap helpers
      • downHeapMaxScore

        private static void downHeapMaxScore​(DisiWrapper[] heap,
                                             int size)
      • greaterMaxScore

        private static boolean greaterMaxScore​(DisiWrapper w1,
                                               DisiWrapper w2)
        In the tail, we want to get first entries that produce the maximum scores and in case of ties (eg. constant-score queries), those that have the least cost so that they are likely to advance further.