Package org.apache.lucene.analysis.icu
Class ICUNormalizer2CharFilter
java.lang.Object
  java.io.Reader
    org.apache.lucene.analysis.CharFilter
      org.apache.lucene.analysis.charfilter.BaseCharFilter
        org.apache.lucene.analysis.icu.ICUNormalizer2CharFilter
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, java.lang.Readable
public final class ICUNormalizer2CharFilter extends BaseCharFilter
Normalizes input text with ICU's Normalizer2 before tokenization.
-
-
Field Summary
Fields
Modifier and Type                                Field
private boolean                                  afterQuickCheckYes
private int                                      charCount
private int                                      checkedInputBoundary
private java.lang.StringBuilder                  inputBuffer
private boolean                                  inputFinished
private com.ibm.icu.text.Normalizer2             normalizer
private java.lang.StringBuilder                  resultBuffer
private CharacterUtils.CharacterBuffer           tmpBuffer
-
Fields inherited from class org.apache.lucene.analysis.CharFilter
input
-
-
Constructor Summary
Constructors
ICUNormalizer2CharFilter(java.io.Reader in)
    Creates a new ICUNormalizer2CharFilter that combines NFKC normalization, case folding, and removal of Default Ignorables (NFKC_Casefold).
ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer)
    Creates a new ICUNormalizer2CharFilter with the specified Normalizer2.
ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer, int bufferSize)
-
Method Summary
Modifier and Type    Method
private int          normalizeInputUpto(int length)
private int          outputFromResultBuffer(char[] cbuf, int begin, int len)
int                  read(char[] cbuf, int off, int len)
private int          readAndNormalizeFromInput()
private int          readFromInputWhileSpanQuickCheckYes()
private int          readFromIoNormalizeUptoBoundary()
private void         readInputToBuffer()
private void         recordOffsetDiff(int inputLength, int outputLength)
-
Methods inherited from class org.apache.lucene.analysis.charfilter.BaseCharFilter
addOffCorrectMap, correct, getLastCumulativeDiff
-
Methods inherited from class org.apache.lucene.analysis.CharFilter
close, correctOffset
-
-
-
-
Field Detail
-
normalizer
private final com.ibm.icu.text.Normalizer2 normalizer
-
inputBuffer
private final java.lang.StringBuilder inputBuffer
-
resultBuffer
private final java.lang.StringBuilder resultBuffer
-
inputFinished
private boolean inputFinished
-
afterQuickCheckYes
private boolean afterQuickCheckYes
-
checkedInputBoundary
private int checkedInputBoundary
-
charCount
private int charCount
-
tmpBuffer
private final CharacterUtils.CharacterBuffer tmpBuffer
-
-
Constructor Detail
-
ICUNormalizer2CharFilter
public ICUNormalizer2CharFilter(java.io.Reader in)
Creates a new ICUNormalizer2CharFilter that combines NFKC normalization, case folding, and removal of Default Ignorables (NFKC_Casefold).
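To give a feel for what the default NFKC_Casefold normalization does to text, the following is a rough, JDK-only approximation. It is not the ICU implementation: `NfkcFoldSketch`, `fold`, and `isIgnorable` are hypothetical names, `java.text.Normalizer` supplies only the NFKC step, lowercasing stands in for true case folding (for example, it leaves "ß" unchanged where case folding yields "ss"), and only a small hand-picked subset of Default Ignorable code points is removed.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Lucene or ICU.
final class NfkcFoldSketch {
    private NfkcFoldSketch() {}

    // Approximates NFKC_Casefold using JDK classes only.
    static String fold(String text) {
        // Step 1: compatibility decomposition followed by canonical composition.
        String nfkc = Normalizer.normalize(text, Normalizer.Form.NFKC);
        // Step 2: locale-independent lowercasing as a stand-in for case folding.
        String lowered = nfkc.toLowerCase(Locale.ROOT);
        // Step 3: drop a few Default Ignorable code points.
        StringBuilder kept = new StringBuilder(lowered.length());
        for (int i = 0; i < lowered.length(); i++) {
            char c = lowered.charAt(i);
            if (!isIgnorable(c)) {
                kept.append(c);
            }
        }
        // Step 4: normalize again, since the earlier steps may leave
        // the text in a non-normalized state.
        return Normalizer.normalize(kept, Normalizer.Form.NFKC);
    }

    // Assumption: a tiny subset (soft hyphen, zero width space/non-joiner/joiner,
    // zero width no-break space); the real property covers many more code points.
    private static boolean isIgnorable(char c) {
        return c == '\u00AD' || c == '\u200B' || c == '\u200C'
            || c == '\u200D' || c == '\uFEFF';
    }
}
```

With the Lucene ICU module on the classpath, the same effect would presumably be obtained on a character stream by wrapping the source reader with the single-argument constructor documented above.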
-
ICUNormalizer2CharFilter
public ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer)
Creates a new ICUNormalizer2CharFilter with the specified Normalizer2.
- Parameters:
in - text
normalizer - normalizer to use
-
ICUNormalizer2CharFilter
ICUNormalizer2CharFilter(java.io.Reader in, com.ibm.icu.text.Normalizer2 normalizer, int bufferSize)
-
-
Method Detail
-
read
public int read(char[] cbuf, int off, int len) throws java.io.IOException
- Specified by:
read in class java.io.Reader
- Throws:
java.io.IOException
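Because this class is a java.io.Reader, callers consume it through the usual `read(char[], int, int)` contract: each call returns the number of chars written into the buffer, or -1 at end of stream. A minimal sketch of that consumption loop follows. `ReaderDrainSketch` and `drain` are hypothetical names, and a plain `StringReader` can stand in for the filter so the example runs with the JDK alone.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only; not part of Lucene.
final class ReaderDrainSketch {
    private ReaderDrainSketch() {}

    // Reads the whole stream in chunks of at most chunkSize chars.
    static String drain(Reader reader, int chunkSize) {
        if (chunkSize <= 0) {
            // A zero-length read returns 0 and would make the loop spin forever.
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        StringBuilder out = new StringBuilder();
        char[] buf = new char[chunkSize];
        try {
            int n;
            // read(...) returns -1 once the stream is exhausted.
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

The loop appends only the first `n` chars of the buffer on each pass, since a reader is allowed to return fewer chars than requested.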
-
readInputToBuffer
private void readInputToBuffer() throws java.io.IOException
- Throws:
java.io.IOException
-
readAndNormalizeFromInput
private int readAndNormalizeFromInput()
-
readFromInputWhileSpanQuickCheckYes
private int readFromInputWhileSpanQuickCheckYes()
-
readFromIoNormalizeUptoBoundary
private int readFromIoNormalizeUptoBoundary()
-
normalizeInputUpto
private int normalizeInputUpto(int length)
-
recordOffsetDiff
private void recordOffsetDiff(int inputLength, int outputLength)
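This page gives only the signature of `recordOffsetDiff`, so the following is a guess at the general idea rather than the actual implementation. When normalization changes the length of a span, offsets in the filtered output no longer line up with the original input, and BaseCharFilter lets a subclass record cumulative differences (`addOffCorrectMap`) that are later applied when an offset is corrected. The class `OffsetCorrectionSketch` and its methods `recordSpan` and `correct` are hypothetical, simplified stand-ins written with JDK collections.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of offset correction; not Lucene's code.
final class OffsetCorrectionSketch {
    // Maps an output offset to the cumulative (input - output) length difference
    // that applies from that offset onward.
    private final TreeMap<Integer, Integer> cumulativeDiffAt = new TreeMap<>();
    private int outputOffset = 0;
    private int cumulativeDiff = 0;

    // Records that inputLength chars of input produced outputLength chars of output.
    void recordSpan(int inputLength, int outputLength) {
        outputOffset += outputLength;
        int diff = inputLength - outputLength;
        if (diff != 0) {
            cumulativeDiff += diff;
            cumulativeDiffAt.put(outputOffset, cumulativeDiff);
        }
    }

    // Maps an offset in the normalized output back to an offset in the original input.
    int correct(int offset) {
        Map.Entry<Integer, Integer> entry = cumulativeDiffAt.floorEntry(offset);
        return entry == null ? offset : offset + entry.getValue();
    }
}
```

As a worked case, take the input "e", U+0301 (combining acute), "x": composition turns the first two chars into a single "é", so the span is recorded as `recordSpan(2, 1)` and the unchanged "x" as `recordSpan(1, 1)`. Output offset 1 (the "x") then corrects to input offset 2.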
-
outputFromResultBuffer
private int outputFromResultBuffer(char[] cbuf, int begin, int len)
-
-