Package | Description |
---|---|
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
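
As a minimal usage sketch (assuming a Lucene release in which StandardTokenizer exposes a no-argument constructor and setReader, as in recent versions), the tokenizer can be driven directly as a TokenStream:

```java
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class StandardTokenizerDemo {
  public static void main(String[] args) throws IOException {
    // Split a short string into tokens according to the UAX #29 word break rules.
    try (StandardTokenizer tokenizer = new StandardTokenizer()) {
      tokenizer.setReader(new StringReader("Unicode text segmentation splits text into words."));
      CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
      tokenizer.reset();
      while (tokenizer.incrementToken()) {
        System.out.println(term.toString());
      }
      tokenizer.end();
    }
  }
}
```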
Modifier and Type | Class and Description |
---|---|
class | WordBreakTestUnicode_6_3_0. This class was automatically generated by generateJavaUnicodeWordBreakTest.pl from http://www.unicode.org/Public/6.3.0/ucd/auxiliary/WordBreakTest.txt. WordBreakTest.txt indicates the points in the provided character sequences at which conforming implementations must and must not break words. |
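
For orientation only, and not the Lucene test harness itself: each line of WordBreakTest.txt annotates a character sequence with the positions where a conforming implementation must and must not break. A hedged JDK-only sketch of enumerating word-break positions for a sample string with java.text.BreakIterator is shown below; the boundaries it reports depend on the JDK's break rules and may differ from the Lucene-generated test data.

```java
import java.text.BreakIterator;
import java.util.Locale;

public class WordBreakPointsDemo {
  public static void main(String[] args) {
    String text = "Unicode text segmentation";
    // Enumerate word-break boundary positions and print the resulting segments.
    BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
    boundaries.setText(text);
    int start = boundaries.first();
    for (int end = boundaries.next(); end != BreakIterator.DONE; start = end, end = boundaries.next()) {
      System.out.println("[" + start + ", " + end + ") -> \"" + text.substring(start, end) + "\"");
    }
  }
}
```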
Copyright © 2000–2018 The Apache Software Foundation. All rights reserved.