Class BM25Similarity.BM25Scorer

    • Field Summary

      Fields
      Modifier and Type       Field    Description
      private float           avgdl    The average document length.
      private float           b        b value for length normalization impact
      private float           boost    query boost
      private float[]         cache    precomputed norm[256] with k1 * ((1 - b) + b * dl / avgdl)
      private Explanation     idf      BM25's idf
      private float           k1       k1 value for scale factor
      private float           weight   weight (idf * boost)
    • Constructor Summary

      Constructors 
      Constructor Description
      BM25Scorer(float boost, float k1, float b, Explanation idf, float avgdl, float[] cache)
    • Field Detail

      • boost

        private final float boost
        query boost
      • k1

        private final float k1
        k1 value for scale factor
      • b

        private final float b
        b value for length normalization impact
      • idf

        private final Explanation idf
        BM25's idf
      • avgdl

        private final float avgdl
        The average document length.
      • cache

        private final float[] cache
        precomputed norm[256] with k1 * ((1 - b) + b * dl / avgdl)
      • weight

        private final float weight
        weight (idf * boost)
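        The cache and weight fields above can be illustrated with a minimal sketch. It assumes a hypothetical decodeLength helper that maps each of the 256 possible norm bytes to a document length (here simply the byte value, with 0 mapped to 1; the library's actual lossy encoding is not reproduced), and it uses a plain float for the idf value rather than an Explanation. The class and method names are illustrative only.

```java
public class Bm25CacheSketch {
    // Hypothetical stand-in for norm decoding: treats the byte value as the
    // document length. This is an assumption made for illustration only.
    static float decodeLength(int normByte) {
        return Math.max(1, normByte);
    }

    // Builds a 256-entry table holding k1 * ((1 - b) + b * dl / avgdl),
    // one entry per possible norm byte, following the field description.
    static float[] buildCache(float k1, float b, float avgdl) {
        float[] cache = new float[256];
        for (int i = 0; i < cache.length; i++) {
            float dl = decodeLength(i);
            cache[i] = k1 * ((1 - b) + b * dl / avgdl);
        }
        return cache;
    }

    // weight = idf * boost, as stated in the field description.
    static float weight(float idfValue, float boost) {
        return idfValue * boost;
    }

    public static void main(String[] args) {
        float[] cache = buildCache(1.2f, 0.75f, 10f);
        System.out.println("cache[10] = " + cache[10]);
        System.out.println("weight    = " + weight(2.0f, 1.5f));
    }
}
```

        With b = 0 every entry collapses to k1, which is one way to see that b controls how strongly document length affects the precomputed value.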
    • Constructor Detail

      • BM25Scorer

        BM25Scorer(float boost,
                   float k1,
                   float b,
                   Explanation idf,
                   float avgdl,
                   float[] cache)
    • Method Detail

      • score

        public float score(float freq,
                           long encodedNorm)
        Description copied from class: Similarity.SimScorer
        Score a single document. freq is the document-term sloppy frequency and must be finite and positive. norm (the encodedNorm parameter of this method) is the encoded normalization factor as computed by Similarity.computeNorm(FieldInvertState) at index time, or 1 if norms are disabled. norm is never 0.

        Score must not decrease when freq increases, i.e. if freq1 > freq2, then score(freq1, norm) >= score(freq2, norm) for any value of norm that may be produced by Similarity.computeNorm(FieldInvertState).

        Score must not increase when the unsigned norm increases, i.e. if Long.compareUnsigned(norm1, norm2) > 0, then score(freq, norm1) <= score(freq, norm2) for any legal freq.

        As a consequence, the maximum score that this scorer can produce is bounded by score(Float.MAX_VALUE, 1).

        Specified by:
        score in class Similarity.SimScorer
        Parameters:
        freq - sloppy term frequency, must be finite and positive
        encodedNorm - encoded normalization factor or 1 if norms are disabled
        Returns:
        document's score
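        The two monotonicity requirements above can be checked on a small sketch. The method body is not shown on this page, so the code assumes the textbook BM25 saturation form weight * freq / (freq + norm), where norm is a value of the kind stored in the cache field. It also takes a decoded document length directly instead of an encoded norm. All names here are hypothetical.

```java
public class Bm25ScoreSketch {
    // Length normalization term, matching the cache field description:
    // k1 * ((1 - b) + b * dl / avgdl).
    static float norm(float k1, float b, float dl, float avgdl) {
        return k1 * ((1 - b) + b * dl / avgdl);
    }

    // Assumed scoring form (not taken from this page):
    // weight * freq / (freq + norm). The ratio is computed in double so that
    // freq + norm cannot overflow when freq is Float.MAX_VALUE.
    static float score(float weight, float freq, float norm) {
        return weight * (float) (freq / ((double) freq + norm));
    }

    public static void main(String[] args) {
        float weight = 3.0f;
        float shortDoc = norm(1.2f, 0.75f, 5f, 10f);
        float longDoc = norm(1.2f, 0.75f, 50f, 10f);

        // A higher frequency must not lower the score.
        System.out.println(score(weight, 2f, shortDoc) >= score(weight, 1f, shortDoc));
        // A longer document (larger norm) must not raise the score.
        System.out.println(score(weight, 2f, longDoc) <= score(weight, 2f, shortDoc));
        // The score saturates and never exceeds the weight.
        System.out.println(score(weight, Float.MAX_VALUE, shortDoc) <= weight);
    }
}
```

        Under this assumed form the score approaches weight as freq grows, which is consistent with the upper bound being reached at the largest frequency and the smallest norm.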
      • explainConstantFactors

        private java.util.List<Explanation> explainConstantFactors()