dk.brics.automaton

Class Datatypes

public final class Datatypes extends Object

Basic automata for representing common datatypes related to Unicode, XML, and XML Schema.
Method Summary
static boolean exists(String name)
Checks whether a given automaton is available.
static Automaton get(String name)
Returns a pre-built automaton.
static boolean isUnicodeBlockName(String name)
Checks whether the given string is the name of a Unicode block (see get).
static boolean isUnicodeCategoryName(String name)
Checks whether the given string is the name of a Unicode category (see get).
static boolean isXMLName(String name)
Checks whether the given string is the name of an XML / XML Schema automaton (see get).
static void main(String[] args)
Invoke during compilation to pre-build automata.

Method Detail

exists

public static boolean exists(String name)
Checks whether a given automaton is available.

Parameters: name - automaton name

Returns: true if the automaton is available
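
This page does not say what get does for a name that is not available, so a cautious caller may want to check exists first. The following is a minimal sketch of that pattern, not part of the library: GuardedLookup and its lookup method are hypothetical names, and the availability check and the lookup are passed in as functions so the example compiles without the library.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper (not part of dk.brics.automaton): performs a lookup
// only after an availability check has succeeded.
final class GuardedLookup {
    private GuardedLookup() {}

    // Returns the looked-up value, or an empty Optional when the name is
    // null, is reported as unavailable, or the lookup itself yields null.
    static <T> Optional<T> lookup(String name,
                                  Predicate<String> exists,
                                  Function<String, T> get) {
        if (name == null || !exists.test(name)) {
            return Optional.empty();
        }
        return Optional.ofNullable(get.apply(name));
    }
}
```

With the library on the classpath, the helper could presumably be called as GuardedLookup.lookup("NCName", Datatypes::exists, Datatypes::get), since both methods take a single String argument.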

get

public static Automaton get(String name)
Returns a pre-built automaton. Automata are loaded as resources from the class loader of the Datatypes class. (Typically, the pre-built automata are stored in the same jar as this class.)

The following automata are available:

Name - Description
NCName - NCName from XML Namespaces 1.0
QName - QName from XML Namespaces 1.0
Char - Char from XML 1.0
NameChar - NameChar from XML 1.0
URI - URI from RFC2396 with amendments from RFC2373
anyname - optional URI enclosed by brackets, followed by NCName
noap - strings not containing '@' and '%'
whitespace - optional S from XML 1.0
whitespacechar - a single whitespace character from XML 1.0
string - string from XML Schema Part 2
boolean - boolean from XML Schema Part 2
decimal - decimal from XML Schema Part 2
float - float from XML Schema Part 2
integer - integer from XML Schema Part 2
duration - duration from XML Schema Part 2
dateTime - dateTime from XML Schema Part 2
time - time from XML Schema Part 2
date - date from XML Schema Part 2
gYearMonth - gYearMonth from XML Schema Part 2
gYear - gYear from XML Schema Part 2
gMonthDay - gMonthDay from XML Schema Part 2
gDay - gDay from XML Schema Part 2
hexBinary - hexBinary from XML Schema Part 2
base64Binary - base64Binary from XML Schema Part 2
NCName2 - NCName from XML Schema Part 2
NCNames - list of NCNames from XML Schema Part 2
QName2 - QName from XML Schema Part 2
Nmtoken2 - NMTOKEN from XML Schema Part 2
Nmtokens - NMTOKENS from XML Schema Part 2
Name2 - Name from XML Schema Part 2
Names - list of Names from XML Schema Part 2
language - language from XML Schema Part 2
BasicLatin - BasicLatin block from Unicode 3.1
Latin-1Supplement - Latin-1Supplement block from Unicode 3.1
LatinExtended-A - LatinExtended-A block from Unicode 3.1
LatinExtended-B - LatinExtended-B block from Unicode 3.1
IPAExtensions - IPAExtensions block from Unicode 3.1
SpacingModifierLetters - SpacingModifierLetters block from Unicode 3.1
CombiningDiacriticalMarks - CombiningDiacriticalMarks block from Unicode 3.1
Greek - Greek block from Unicode 3.1
Cyrillic - Cyrillic block from Unicode 3.1
Armenian - Armenian block from Unicode 3.1
Hebrew - Hebrew block from Unicode 3.1
Arabic - Arabic block from Unicode 3.1
Syriac - Syriac block from Unicode 3.1
Thaana - Thaana block from Unicode 3.1
Devanagari - Devanagari block from Unicode 3.1
Bengali - Bengali block from Unicode 3.1
Gurmukhi - Gurmukhi block from Unicode 3.1
Gujarati - Gujarati block from Unicode 3.1
Oriya - Oriya block from Unicode 3.1
Tamil - Tamil block from Unicode 3.1
Telugu - Telugu block from Unicode 3.1
Kannada - Kannada block from Unicode 3.1
Malayalam - Malayalam block from Unicode 3.1
Sinhala - Sinhala block from Unicode 3.1
Thai - Thai block from Unicode 3.1
Lao - Lao block from Unicode 3.1
Tibetan - Tibetan block from Unicode 3.1
Myanmar - Myanmar block from Unicode 3.1
Georgian - Georgian block from Unicode 3.1
HangulJamo - HangulJamo block from Unicode 3.1
Ethiopic - Ethiopic block from Unicode 3.1
Cherokee - Cherokee block from Unicode 3.1
UnifiedCanadianAboriginalSyllabics - UnifiedCanadianAboriginalSyllabics block from Unicode 3.1
Ogham - Ogham block from Unicode 3.1
Runic - Runic block from Unicode 3.1
Khmer - Khmer block from Unicode 3.1
Mongolian - Mongolian block from Unicode 3.1
LatinExtendedAdditional - LatinExtendedAdditional block from Unicode 3.1
GreekExtended - GreekExtended block from Unicode 3.1
GeneralPunctuation - GeneralPunctuation block from Unicode 3.1
SuperscriptsandSubscripts - SuperscriptsandSubscripts block from Unicode 3.1
CurrencySymbols - CurrencySymbols block from Unicode 3.1
CombiningMarksforSymbols - CombiningMarksforSymbols block from Unicode 3.1
LetterlikeSymbols - LetterlikeSymbols block from Unicode 3.1
NumberForms - NumberForms block from Unicode 3.1
Arrows - Arrows block from Unicode 3.1
MathematicalOperators - MathematicalOperators block from Unicode 3.1
MiscellaneousTechnical - MiscellaneousTechnical block from Unicode 3.1
ControlPictures - ControlPictures block from Unicode 3.1
OpticalCharacterRecognition - OpticalCharacterRecognition block from Unicode 3.1
EnclosedAlphanumerics - EnclosedAlphanumerics block from Unicode 3.1
BoxDrawing - BoxDrawing block from Unicode 3.1
BlockElements - BlockElements block from Unicode 3.1
GeometricShapes - GeometricShapes block from Unicode 3.1
MiscellaneousSymbols - MiscellaneousSymbols block from Unicode 3.1
Dingbats - Dingbats block from Unicode 3.1
BraillePatterns - BraillePatterns block from Unicode 3.1
CJKRadicalsSupplement - CJKRadicalsSupplement block from Unicode 3.1
KangxiRadicals - KangxiRadicals block from Unicode 3.1
IdeographicDescriptionCharacters - IdeographicDescriptionCharacters block from Unicode 3.1
CJKSymbolsandPunctuation - CJKSymbolsandPunctuation block from Unicode 3.1
Hiragana - Hiragana block from Unicode 3.1
Katakana - Katakana block from Unicode 3.1
Bopomofo - Bopomofo block from Unicode 3.1
HangulCompatibilityJamo - HangulCompatibilityJamo block from Unicode 3.1
Kanbun - Kanbun block from Unicode 3.1
BopomofoExtended - BopomofoExtended block from Unicode 3.1
EnclosedCJKLettersandMonths - EnclosedCJKLettersandMonths block from Unicode 3.1
CJKCompatibility - CJKCompatibility block from Unicode 3.1
CJKUnifiedIdeographsExtensionA - CJKUnifiedIdeographsExtensionA block from Unicode 3.1
CJKUnifiedIdeographs - CJKUnifiedIdeographs block from Unicode 3.1
YiSyllables - YiSyllables block from Unicode 3.1
YiRadicals - YiRadicals block from Unicode 3.1
HangulSyllables - HangulSyllables block from Unicode 3.1
CJKCompatibilityIdeographs - CJKCompatibilityIdeographs block from Unicode 3.1
AlphabeticPresentationForms - AlphabeticPresentationForms block from Unicode 3.1
ArabicPresentationForms-A - ArabicPresentationForms-A block from Unicode 3.1
CombiningHalfMarks - CombiningHalfMarks block from Unicode 3.1
CJKCompatibilityForms - CJKCompatibilityForms block from Unicode 3.1
SmallFormVariants - SmallFormVariants block from Unicode 3.1
ArabicPresentationForms-B - ArabicPresentationForms-B block from Unicode 3.1
Specials - Specials block from Unicode 3.1
HalfwidthandFullwidthForms - HalfwidthandFullwidthForms block from Unicode 3.1
Specials - Specials block from Unicode 3.1
OldItalic - OldItalic block from Unicode 3.1
Gothic - Gothic block from Unicode 3.1
Deseret - Deseret block from Unicode 3.1
ByzantineMusicalSymbols - ByzantineMusicalSymbols block from Unicode 3.1
MusicalSymbols - MusicalSymbols block from Unicode 3.1
MathematicalAlphanumericSymbols - MathematicalAlphanumericSymbols block from Unicode 3.1
CJKUnifiedIdeographsExtensionB - CJKUnifiedIdeographsExtensionB block from Unicode 3.1
CJKCompatibilityIdeographsSupplement - CJKCompatibilityIdeographsSupplement block from Unicode 3.1
Tags - Tags block from Unicode 3.1
Lu - Lu category from Unicode 3.1
Ll - Ll category from Unicode 3.1
Lt - Lt category from Unicode 3.1
Lm - Lm category from Unicode 3.1
Lo - Lo category from Unicode 3.1
L - L category from Unicode 3.1
Mn - Mn category from Unicode 3.1
Mc - Mc category from Unicode 3.1
Me - Me category from Unicode 3.1
M - M category from Unicode 3.1
Nd - Nd category from Unicode 3.1
Nl - Nl category from Unicode 3.1
No - No category from Unicode 3.1
N - N category from Unicode 3.1
Pc - Pc category from Unicode 3.1
Pd - Pd category from Unicode 3.1
Ps - Ps category from Unicode 3.1
Pe - Pe category from Unicode 3.1
Pi - Pi category from Unicode 3.1
Pf - Pf category from Unicode 3.1
Po - Po category from Unicode 3.1
P - P category from Unicode 3.1
Zs - Zs category from Unicode 3.1
Zl - Zl category from Unicode 3.1
Zp - Zp category from Unicode 3.1
Z - Z category from Unicode 3.1
Sm - Sm category from Unicode 3.1
Sc - Sc category from Unicode 3.1
Sk - Sk category from Unicode 3.1
So - So category from Unicode 3.1
S - S category from Unicode 3.1
Cc - Cc category from Unicode 3.1
Cf - Cf category from Unicode 3.1
Co - Co category from Unicode 3.1
Cn - Cn category from Unicode 3.1
C - C category from Unicode 3.1

Loaded automata are cached in memory.

Parameters: name - name of automaton

Returns: automaton
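
The description above says that an automaton is loaded on demand and then cached in memory. The following is a minimal sketch of that load-once behaviour, not the library's actual implementation: NamedCacheSketch is a hypothetical class, and the loading step (a class loader resource read in the library) is replaced here by an arbitrary function.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of "load on first request, then reuse".
final class NamedCacheSketch<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> loader;

    NamedCacheSketch(Function<String, T> loader) {
        this.loader = loader;
    }

    // Invokes the loader only when no value is cached for the name;
    // later calls with the same name return the cached instance.
    T get(String name) {
        return cache.computeIfAbsent(name, loader);
    }
}
```

Under this scheme, repeated requests for the same name pay the loading cost once. Whether the library shares one instance between callers in exactly this way is an assumption of the sketch, not something this page states.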

isUnicodeBlockName

public static boolean isUnicodeBlockName(String name)
Checks whether the given string is the name of a Unicode block (see get).

isUnicodeCategoryName

public static boolean isUnicodeCategoryName(String name)
Checks whether the given string is the name of a Unicode category (see get).

isXMLName

public static boolean isXMLName(String name)
Checks whether the given string is the name of an XML / XML Schema automaton (see get).
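
The three name checks can be combined to find out which group of the table under get a name belongs to. The following is a minimal sketch under that reading: NameKindSketch, Kind, and classify are hypothetical names, and the three checks are passed in as predicates so the example compiles without the library.

```java
import java.util.function.Predicate;

// Hypothetical helper: maps a name to the group of automata it belongs to.
final class NameKindSketch {
    enum Kind { XML, UNICODE_BLOCK, UNICODE_CATEGORY, UNKNOWN }

    private NameKindSketch() {}

    // Applies the checks in order and returns the first kind that matches,
    // or UNKNOWN when none of them accepts the name.
    static Kind classify(String name,
                         Predicate<String> isXml,
                         Predicate<String> isBlock,
                         Predicate<String> isCategory) {
        if (isXml.test(name)) return Kind.XML;
        if (isBlock.test(name)) return Kind.UNICODE_BLOCK;
        if (isCategory.test(name)) return Kind.UNICODE_CATEGORY;
        return Kind.UNKNOWN;
    }
}
```

With the library on the classpath, the predicates could presumably be supplied as Datatypes::isXMLName, Datatypes::isUnicodeBlockName, and Datatypes::isUnicodeCategoryName.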

main

public static void main(String[] args)
Invoke during compilation to pre-build automata. Automata are stored in the directory specified by the system property dk.brics.automaton.datatypes. (Default: build, relative to the current working directory.)
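
The output directory rule above (a system property with a default of build, resolved against the working directory) can be sketched with standard Java calls. This is an illustration only, not the library's code: OutputDirSketch and resolveOutputDir are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of resolving the output directory.
final class OutputDirSketch {
    // Property name taken from the description of main above.
    static final String PROPERTY = "dk.brics.automaton.datatypes";

    private OutputDirSketch() {}

    // Reads the property, falls back to "build" when it is unset, and
    // resolves a relative value against the current working directory.
    static Path resolveOutputDir() {
        String dir = System.getProperty(PROPERTY, "build");
        return Paths.get(dir).toAbsolutePath().normalize();
    }
}
```

On the command line, a system property is normally set with the standard -D option of the java launcher, for example -Ddk.brics.automaton.datatypes=out, placed before the class name.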
Copyright © 2001-2010 Anders Møller.