Class PorterStemmer


  • class PorterStemmer
    extends java.lang.Object
    Stemmer implementing the Porter Stemming Algorithm. The class transforms a word into its root form. The input word can be provided one character at a time (by calling add()), or all at once by calling one of the stem(...) overloads.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private char[] b  
      private boolean dirty  
      private int i  
      private static int INITIAL_SIZE  
      private int j  
      private int k  
      private int k0  
    • Constructor Summary

      Constructors 
      Constructor Description
      PorterStemmer()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void add(char ch)
      Add a character to the word being stemmed.
      private boolean cons(int i)  
      private boolean cvc(int i)  
      private boolean doublec(int j)  
      private boolean ends(java.lang.String s)  
      char[] getResultBuffer()
      Returns a reference to a character buffer containing the results of the stemming process.
      int getResultLength()
      Returns the length of the word resulting from the stemming process.
      private int m()  
      (package private) void r(java.lang.String s)  
      void reset()
      Resets the stemmer so it can stem another word.
      (package private) void setto(java.lang.String s)  
      boolean stem()
      Stem the word placed into the Stemmer buffer through calls to add().
      boolean stem(char[] word)
      Stem a word contained in a char[].
      boolean stem(char[] word, int wordLen)
      Stem a word contained in a leading portion of a char[] array.
      boolean stem(char[] wordBuffer, int offset, int wordLen)
      Stem a word contained in a portion of a char[] array.
      boolean stem(int i0)  
      java.lang.String stem(java.lang.String s)
      Stem a word provided as a String.
      private void step1()  
      private void step2()  
      private void step3()  
      private void step4()  
      private void step5()  
      private void step6()  
      java.lang.String toString()
      Returns the stemmed word as a String. Alternatively, getResultBuffer() and getResultLength() give access to the internal buffer, which is generally more efficient.
      private boolean vowelinstem()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
    • Field Detail

      • b

        private char[] b
      • i

        private int i
      • j

        private int j
      • k

        private int k
      • k0

        private int k0
      • dirty

        private boolean dirty
    • Constructor Detail

      • PorterStemmer

        public PorterStemmer()
    • Method Detail

      • reset

        public void reset()
        Resets the stemmer so it can stem another word. If you feed the stemmer through add(char) followed by stem(), you must call reset() before starting the next word.
      • add

        public void add(char ch)
        Add a character to the word being stemmed. When you have finished adding characters, call stem() to process the word.
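The add(char), stem(), reset() cycle described above can be sketched as follows. Because this page does not say which package PorterStemmer lives in, the sketch declares a small stand-in class, DemoStemmer (hypothetical, not part of this API), that exposes the same method signatures. It only drops one trailing 's' and is not the Porter algorithm; the helper stemAll is likewise an illustrative name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AddStemResetSketch {

    /** Hypothetical stand-in, NOT the Porter algorithm: it only drops one trailing 's'. */
    static class DemoStemmer {
        private char[] b = new char[50]; // arbitrary starting capacity for this sketch
        private int len = 0;

        void reset() { len = 0; }

        void add(char ch) {
            if (len == b.length) b = Arrays.copyOf(b, len * 2); // grow when full
            b[len++] = ch;
        }

        boolean stem() {
            if (len > 2 && b[len - 1] == 's') { len--; return true; }
            return false;
        }

        @Override public String toString() { return new String(b, 0, len); }
    }

    /** Stems each word by feeding it one character at a time. */
    static List<String> stemAll(String[] words) {
        DemoStemmer stemmer = new DemoStemmer();
        List<String> out = new ArrayList<>();
        for (String w : words) {
            stemmer.reset();                       // required before each new word
            for (int i = 0; i < w.length(); i++) {
                stemmer.add(w.charAt(i));
            }
            stemmer.stem();                        // process the buffered characters
            out.add(stemmer.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(stemAll(new String[] {"cats", "dog"}));
    }
}
```

With the real class, the loop body should stay the same; only the stand-in would be replaced by a PorterStemmer instance.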
      • toString

        public java.lang.String toString()
        Returns the stemmed word as a String. Alternatively, getResultBuffer() and getResultLength() give access to the internal buffer, which is generally more efficient.
        Overrides:
        toString in class java.lang.Object
      • getResultLength

        public int getResultLength()
        Returns the length of the word resulting from the stemming process.
      • getResultBuffer

        public char[] getResultBuffer()
        Returns a reference to a character buffer containing the results of the stemming process. You also need to consult getResultLength() to determine the length of the result.
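A minimal sketch of reading the result through getResultBuffer() and getResultLength() without creating an intermediate String. DemoStemmer is a hypothetical stand-in with the same signatures (it only drops one trailing 's' and is not the Porter algorithm), and appendStem is an illustrative helper name, not part of this API. The sketch assumes the returned buffer may be longer than the result, which is why the length is consulted.

```java
class ResultBufferSketch {

    /** Hypothetical stand-in, NOT the Porter algorithm: it only drops one trailing 's'. */
    static class DemoStemmer {
        private char[] b = new char[50]; // arbitrary capacity for this sketch
        private int len = 0;

        boolean stem(char[] word) {
            if (word.length > b.length) b = new char[word.length];
            System.arraycopy(word, 0, b, 0, word.length);
            len = word.length;
            if (len > 2 && b[len - 1] == 's') { len--; return true; }
            return false;
        }

        char[] getResultBuffer() { return b; }

        int getResultLength() { return len; }
    }

    /** Appends the stem of word to sink and reports whether stemming changed the word. */
    static boolean appendStem(DemoStemmer stemmer, char[] word, StringBuilder sink) {
        boolean changed = stemmer.stem(word);
        // Only the first getResultLength() chars are the result; anything after
        // that in the buffer is spare capacity or leftovers from earlier words.
        sink.append(stemmer.getResultBuffer(), 0, stemmer.getResultLength());
        return changed;
    }

    public static void main(String[] args) {
        DemoStemmer stemmer = new DemoStemmer();
        StringBuilder sink = new StringBuilder();
        appendStem(stemmer, "houses".toCharArray(), sink);
        sink.append(' ');
        appendStem(stemmer, "tree".toCharArray(), sink);
        System.out.println(sink);
    }
}
```

Since the buffer is returned by reference, it is safest to copy out the characters you need before stemming the next word.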
      • cons

        private final boolean cons(int i)
      • m

        private final int m()
      • vowelinstem

        private final boolean vowelinstem()
      • doublec

        private final boolean doublec(int j)
      • cvc

        private final boolean cvc(int i)
      • ends

        private final boolean ends(java.lang.String s)
      • setto

        void setto(java.lang.String s)
      • r

        void r(java.lang.String s)
      • step1

        private final void step1()
      • step2

        private final void step2()
      • step3

        private final void step3()
      • step4

        private final void step4()
      • step5

        private final void step5()
      • step6

        private final void step6()
      • stem

        public java.lang.String stem(java.lang.String s)
        Stem a word provided as a String. Returns the result as a String.
      • stem

        public boolean stem(char[] word)
        Stem a word contained in a char[]. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
      • stem

        public boolean stem(char[] wordBuffer,
                            int offset,
                            int wordLen)
        Stem a word contained in a portion of a char[] array. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
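The offset/length overload lets a caller stem words in place inside a larger text buffer instead of copying each word into its own array. The sketch below illustrates that calling pattern. DemoStemmer is a hypothetical stand-in with the same signature (it only drops one trailing 's' and is not the Porter algorithm), and stemTokens with its letter-run tokenization is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetStemSketch {

    /** Hypothetical stand-in, NOT the Porter algorithm: it only drops one trailing 's'. */
    static class DemoStemmer {
        private char[] b = new char[50]; // arbitrary capacity for this sketch
        private int len = 0;

        boolean stem(char[] wordBuffer, int offset, int wordLen) {
            if (wordLen > b.length) b = new char[wordLen];
            System.arraycopy(wordBuffer, offset, b, 0, wordLen);
            len = wordLen;
            if (len > 2 && b[len - 1] == 's') { len--; return true; }
            return false;
        }

        @Override public String toString() { return new String(b, 0, len); }
    }

    /** Stems every run of letters in text, passing (offset, length) rather than a copy. */
    static List<String> stemTokens(char[] text) {
        DemoStemmer stemmer = new DemoStemmer();
        List<String> stems = new ArrayList<>();
        int i = 0;
        while (i < text.length) {
            // Skip separators, then measure the next run of letters.
            while (i < text.length && !Character.isLetter(text[i])) i++;
            int start = i;
            while (i < text.length && Character.isLetter(text[i])) i++;
            if (i > start) {
                stemmer.stem(text, start, i - start);
                stems.add(stemmer.toString());
            }
        }
        return stems;
    }

    public static void main(String[] args) {
        System.out.println(stemTokens("red cars, blue boats".toCharArray()));
    }
}
```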
      • stem

        public boolean stem(char[] word,
                            int wordLen)
        Stem a word contained in a leading portion of a char[] array. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
      • stem

        public boolean stem()
        Stem the word placed into the Stemmer buffer through calls to add(). Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
      • stem

        public boolean stem(int i0)