Package com.ibm.icu.impl.breakiter
Class LSTMBreakEngine
java.lang.Object
com.ibm.icu.impl.breakiter.DictionaryBreakEngine
com.ibm.icu.impl.breakiter.LSTMBreakEngine
- All Implemented Interfaces:
LanguageBreakEngine
-
Nested Class Summary
Nested Classes
Modifier and Type   Class   Description
(package private) class
static enum
(package private) class
static enum
static class
(package private) class
Nested classes/interfaces inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
DictionaryBreakEngine.DequeI, DictionaryBreakEngine.PossibleWord
-
Field Summary
Fields
Modifier and Type   Field   Description
private final LSTMBreakEngine.LSTMData fData
private int fScript
private final LSTMBreakEngine.Vectorizer fVectorizer
private static final byte MIN_WORD
private static final byte MIN_WORD_SPAN
Fields inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
fSet
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type   Method   Description
private static void addDotProductTo(float[] a, float[][] b, float[] result)
private static void addHadamardProductTo(float[] a, float[] b, float[] result)
private static void addTo(float[] a, float[] result)
private float[] compute(float[][] W, float[][] U, float[] B, float[] x, float[] h, float[] c)
static LSTMBreakEngine create(int script, LSTMBreakEngine.LSTMData data)
static LSTMBreakEngine.LSTMData createData(int script)
static LSTMBreakEngine.LSTMData createData(UResourceBundle bundle)
private static String defaultLSTM(int script)
int divideUpDictionaryRange(CharacterIterator fIter, int rangeStart, int rangeEnd, DictionaryBreakEngine.DequeI foundBreaks, boolean isPhraseBreaking)
    Divide up a range of known dictionary characters handled by this break engine.
private static void hadamardProductTo(float[] a, float[] result)
boolean handles(int c)
int hashCode()
private static float[] make1DArray(int[] data, int start, int d1)
private static float[][] make2DArray(int[] data, int start, int d1, int d2)
private LSTMBreakEngine.Vectorizer makeVectorizer
private static int maxIndex(float[] data)
private static void sigmoid(float[] result, int start, int length)
private static void tanh(float[] result, int start, int length)
Methods inherited from class com.ibm.icu.impl.breakiter.DictionaryBreakEngine
findBreaks, setCharacters
-
Field Details
-
MIN_WORD
private static final byte MIN_WORD
-
MIN_WORD_SPAN
private static final byte MIN_WORD_SPAN
-
fData
-
fScript
private int fScript -
fVectorizer
-
-
Constructor Details
-
LSTMBreakEngine
-
-
Method Details
-
make2DArray
private static float[][] make2DArray(int[] data, int start, int d1, int d2) -
make1DArray
private static float[] make1DArray(int[] data, int start, int d1) -
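The two array factories above take an int[] plus an offset and return float arrays. A minimal sketch of how such helpers could work is shown below, assuming (the signatures do not confirm this) that each int holds the raw IEEE 754 bit pattern of one float and that 2D data is stored in row-major order. The class name ArrayUnpackSketch is hypothetical and is not part of ICU.

```java
// Hypothetical stand-alone sketch; not the ICU implementation.
final class ArrayUnpackSketch {

    // Reads d1 floats starting at data[start].
    // Assumption: each int is the bit pattern of one float.
    static float[] make1DArray(int[] data, int start, int d1) {
        float[] result = new float[d1];
        for (int i = 0; i < d1; i++) {
            result[i] = Float.intBitsToFloat(data[start + i]);
        }
        return result;
    }

    // Reads a d1 x d2 matrix starting at data[start].
    // Assumption: values are laid out row by row (row-major).
    static float[][] make2DArray(int[] data, int start, int d1, int d2) {
        float[][] result = new float[d1][d2];
        for (int i = 0; i < d1; i++) {
            for (int j = 0; j < d2; j++) {
                result[i][j] = Float.intBitsToFloat(data[start + i * d2 + j]);
            }
        }
        return result;
    }
}
```

Packing weights as int bit patterns would let a model be stored in an integer resource array and unpacked without parsing text, which is one plausible reason for the int[] parameter type.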
makeVectorizer
-
hashCode
public int hashCode() -
handles
public boolean handles(int c)
- Specified by:
handles in interface LanguageBreakEngine
- Overrides:
handles in class DictionaryBreakEngine
- Parameters:
c - A Unicode code point value
- Returns:
true if the engine can handle this character, false otherwise
-
addDotProductTo
private static void addDotProductTo(float[] a, float[][] b, float[] result) -
addTo
private static void addTo(float[] a, float[] result) -
hadamardProductTo
private static void hadamardProductTo(float[] a, float[] result) -
addHadamardProductTo
private static void addHadamardProductTo(float[] a, float[] b, float[] result) -
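The four vector helpers above are undocumented, so the following is only a sketch of the semantics their names and signatures suggest: each one accumulates into, or scales, the result array in place. The class name VectorOpsSketch and the exact index conventions (for example, that b is indexed as b[input][output]) are assumptions.

```java
// Hypothetical stand-alone sketch; not the ICU implementation.
final class VectorOpsSketch {

    // result[j] += sum over i of a[i] * b[i][j]  (vector-matrix product, accumulated)
    static void addDotProductTo(float[] a, float[][] b, float[] result) {
        for (int j = 0; j < result.length; j++) {
            for (int i = 0; i < a.length; i++) {
                result[j] += a[i] * b[i][j];
            }
        }
    }

    // result[i] += a[i]
    static void addTo(float[] a, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] += a[i];
        }
    }

    // result[i] *= a[i]  (element-wise, or Hadamard, product)
    static void hadamardProductTo(float[] a, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] *= a[i];
        }
    }

    // result[i] += a[i] * b[i]
    static void addHadamardProductTo(float[] a, float[] b, float[] result) {
        for (int i = 0; i < result.length; i++) {
            result[i] += a[i] * b[i];
        }
    }
}
```

Writing into a caller-supplied result array avoids allocating a new array per operation, which matters when the same operations run once per character of input.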
sigmoid
private static void sigmoid(float[] result, int start, int length) -
tanh
private static void tanh(float[] result, int start, int length) -
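Both activation helpers take a start index and a length, which suggests they transform a slice of the array in place and leave the rest untouched. The sketch below illustrates that reading; the class name ActivationSketch is hypothetical and the in-place slice behavior is an assumption drawn from the signatures.

```java
// Hypothetical stand-alone sketch; not the ICU implementation.
final class ActivationSketch {

    // Applies the logistic function 1 / (1 + e^-x) to result[start .. start+length-1].
    static void sigmoid(float[] result, int start, int length) {
        for (int i = start; i < start + length; i++) {
            result[i] = (float) (1.0 / (1.0 + Math.exp(-result[i])));
        }
    }

    // Applies the hyperbolic tangent to result[start .. start+length-1].
    static void tanh(float[] result, int start, int length) {
        for (int i = start; i < start + length; i++) {
            result[i] = (float) Math.tanh(result[i]);
        }
    }
}
```

Operating on slices would allow several gate activations to share one contiguous buffer, with a different activation applied to each segment.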
maxIndex
private static int maxIndex(float[] data) -
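Judging by its name, maxIndex is presumably an argmax: it returns the position of the largest element, for example to select the highest-scoring output class. A minimal sketch follows, assuming ties resolve to the first maximal index and that the input is non-empty; the class name ArgMaxSketch is hypothetical.

```java
// Hypothetical stand-alone sketch; not the ICU implementation.
final class ArgMaxSketch {

    // Returns the index of the largest value; on ties, the first such index.
    // Assumption: data has at least one element.
    static int maxIndex(float[] data) {
        int index = 0;
        for (int i = 1; i < data.length; i++) {
            if (data[i] > data[index]) {
                index = i;
            }
        }
        return index;
    }
}
```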
compute
private float[] compute(float[][] W, float[][] U, float[] B, float[] x, float[] h, float[] c) -
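The parameter names of compute (W, U, B, x, h, c) match the usual notation for one step of an LSTM cell: input weights, recurrent weights, bias, input vector, hidden state, and cell state. The sketch below shows a textbook LSTM step under that reading. It is not taken from the ICU source; the gate order (input, forget, candidate, output), the in-place update of c, and the class name LstmStepSketch are all assumptions.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of a standard LSTM step; not the ICU implementation.
final class LstmStepSketch {

    private static float sigmoid(float v) {
        return (float) (1.0 / (1.0 + Math.exp(-v)));
    }

    // W: [inputSize][4 * hunits], U: [hunits][4 * hunits], B: [4 * hunits].
    // Updates the cell state c in place and returns the new hidden state.
    static float[] step(float[][] W, float[][] U, float[] B,
                        float[] x, float[] h, float[] c) {
        int hunits = B.length / 4;

        // Pre-activations for all four gates: z = B + x*W + h*U
        float[] z = Arrays.copyOf(B, B.length);
        for (int j = 0; j < z.length; j++) {
            for (int i = 0; i < x.length; i++) {
                z[j] += x[i] * W[i][j];
            }
            for (int i = 0; i < h.length; i++) {
                z[j] += h[i] * U[i][j];
            }
        }

        float[] hNext = new float[hunits];
        for (int k = 0; k < hunits; k++) {
            float in = sigmoid(z[k]);                              // input gate
            float forget = sigmoid(z[hunits + k]);                 // forget gate
            float cand = (float) Math.tanh(z[2 * hunits + k]);     // candidate cell value
            float out = sigmoid(z[3 * hunits + k]);                // output gate
            c[k] = forget * c[k] + in * cand;                      // new cell state
            hNext[k] = out * (float) Math.tanh(c[k]);              // new hidden state
        }
        return hNext;
    }
}
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate to 0, so a cell state of 2 becomes 1 after one step, which makes a convenient sanity check for the equations.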
divideUpDictionaryRange
public int divideUpDictionaryRange(CharacterIterator fIter, int rangeStart, int rangeEnd, DictionaryBreakEngine.DequeI foundBreaks, boolean isPhraseBreaking)
Description copied from class: DictionaryBreakEngine
Divide up a range of known dictionary characters handled by this break engine.
- Specified by:
divideUpDictionaryRange in class DictionaryBreakEngine
- Parameters:
fIter - A CharacterIterator representing the text
rangeStart - The start of the range of dictionary characters
rangeEnd - The end of the range of dictionary characters
foundBreaks - Output of break positions. Positions are pushed. Pre-existing contents of the output stack are unaltered.
- Returns:
The number of breaks found
-
createData
-
defaultLSTM
-
createData
-
create
-