Class TransliteratorIDParser

java.lang.Object
com.ibm.icu.text.TransliteratorIDParser

class TransliteratorIDParser extends Object
Parsing component for transliterator IDs. This class contains only static members; it cannot be instantiated. Methods in this class parse various ID formats, including the following:

  • A basic ID, which contains source, target, and variant, but no filter and no explicit inverse. Examples include "Latin-Greek/UNGEGN" and "Null".
  • A single ID, which is a basic ID plus optional filter and optional explicit inverse. Examples include "[a-zA-Z] Latin-Greek" and "Lower (Upper)".
  • A compound ID, which is a sequence of one or more single IDs, separated by semicolons, with optional forward and reverse global filters. The global filters are UnicodeSet patterns prepended or appended to the IDs, separated by semicolons. An appended filter must be enclosed in parentheses and applies in the reverse direction.

  • Field Details

  • Constructor Details

    • TransliteratorIDParser

      TransliteratorIDParser()
  • Method Details

    • parseFilterID

      public static TransliteratorIDParser.SingleID parseFilterID(String id, int[] pos)
      Parse a filter ID, that is, an ID of the general form "[f1] s1-t1/v1", with the filters optional, and the variants optional.
      Parameters:
      id - the id to be parsed
      pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
      Returns:
      a SingleID object or null if the parse fails
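
The `pos` INPUT-OUTPUT convention used by the parse methods can be illustrated with a minimal, self-contained sketch. `PosConventionSketch` and `parseToken` are hypothetical names invented for this illustration and are not part of ICU. The sketch also assumes the position is left unchanged on failure, which this page states explicitly only for the private parseFilterID overload.

```java
final class PosConventionSketch {
    /**
     * Hypothetical helper: skips whitespace, then reads a run of letters or digits.
     * On success, advances pos[0] past the token and returns it.
     * On failure, returns null and leaves pos[0] unchanged.
     */
    static String parseToken(String id, int[] pos) {
        int i = pos[0];
        while (i < id.length() && Character.isWhitespace(id.charAt(i))) {
            i++;
        }
        int tokenStart = i;
        while (i < id.length() && Character.isLetterOrDigit(id.charAt(i))) {
            i++;
        }
        if (i == tokenStart) {
            return null; // nothing parsed; pos[0] is untouched
        }
        pos[0] = i; // report the position after the last character parsed
        return id.substring(tokenStart, i);
    }

    public static void main(String[] args) {
        int[] pos = {0};
        String token = parseToken("Latin-Greek", pos);
        System.out.println(token + " " + pos[0]);
    }
}
```

A single-element `int[]` stands in for a by-reference integer, so one call can return a parse result and also report how far it read.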
    • parseSingleID

      public static TransliteratorIDParser.SingleID parseSingleID(String id, int[] pos, int dir)
      Parse a single ID, that is, an ID of the general form "[f1] s1-t1/v1 ([f2] s2-t2/v2)", with the parenthesized element optional, the filters optional, and the variants optional.
      Parameters:
      id - the id to be parsed
      pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
      dir - the direction. If the direction is REVERSE then the SingleID is constructed for the reverse direction.
      Returns:
      a SingleID object or null if the parse fails
    • parseGlobalFilter

      public static UnicodeSet parseGlobalFilter(String id, int[] pos, int dir, int[] withParens, StringBuffer canonID)
      Parse a global filter of the form "[f]" or "([f])", depending on 'withParens'.
      Parameters:
      id - the pattern to parse
      pos - INPUT-OUTPUT parameter. On input, the position of the first character to parse. On output, the position after the last character parsed.
      dir - the direction.
      withParens - INPUT-OUTPUT parameter. On entry, if withParens[0] is 0, then parens are disallowed. If it is 1, then parens are required. If it is -1, then parens are optional, and on return withParens[0] will be set to 0 or 1 to indicate whether parens were seen.
      canonID - OUTPUT parameter. The pattern for the filter added to the canonID, either at the end, if dir is FORWARD, or at the start, if dir is REVERSE. The pattern will be enclosed in parentheses if appropriate, and will be suffixed with an ID_DELIM character. May be null.
      Returns:
      a UnicodeSet object or null. A non-null result indicates a successful parse, regardless of whether the filter applies to the given direction. The caller should discard it when the parenthesization does not match the direction, that is, if (withParens[0] == 1) != (dir == REVERSE).
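
The three-state `withParens` contract can be sketched as follows. This is a simplified illustration, not ICU's implementation: `WithParensSketch` and `parseFilterPattern` are hypothetical names, the sketch returns the pattern text instead of a UnicodeSet, and it finds the closing bracket naively (no nested sets or escapes).

```java
final class WithParensSketch {
    private static int skipWhitespace(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }

    /**
     * Hypothetical parser for "[f]" or "([f])".
     * withParens[0]: 0 = parens disallowed, 1 = parens required,
     * -1 = parens optional (set to 0 or 1 on success).
     * Returns the bracketed pattern text, or null with pos[0] unchanged.
     */
    static String parseFilterPattern(String id, int[] pos, int[] withParens) {
        int i = skipWhitespace(id, pos[0]);
        boolean sawParen = i < id.length() && id.charAt(i) == '(';
        if (sawParen && withParens[0] == 0) {
            return null; // parens present but disallowed
        }
        if (!sawParen && withParens[0] == 1) {
            return null; // parens required but missing
        }
        if (sawParen) {
            i = skipWhitespace(id, i + 1);
        }
        if (i >= id.length() || id.charAt(i) != '[') {
            return null;
        }
        int close = id.indexOf(']', i); // naive: assumes no nested sets
        if (close < 0) {
            return null;
        }
        String pattern = id.substring(i, close + 1);
        i = close + 1;
        if (sawParen) {
            i = skipWhitespace(id, i);
            if (i >= id.length() || id.charAt(i) != ')') {
                return null;
            }
            i++;
        }
        if (withParens[0] == -1) {
            withParens[0] = sawParen ? 1 : 0; // report what was seen
        }
        pos[0] = i;
        return pattern;
    }

    public static void main(String[] args) {
        int[] pos = {0};
        int[] withParens = {-1};
        String pattern = parseFilterPattern("([A-Z])", pos, withParens);
        System.out.println(pattern + " withParens=" + withParens[0]);
    }
}
```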
    • parseCompoundID

      public static boolean parseCompoundID(String id, int dir, StringBuffer canonID, List<TransliteratorIDParser.SingleID> list, UnicodeSet[] globalFilter)
      Parse a compound ID, consisting of an optional forward global filter, a separator, one or more single IDs delimited by separators, and an optional reverse global filter. The separator is a semicolon. The global filters are UnicodeSet patterns. The reverse global filter must be enclosed in parentheses.
      Parameters:
      id - the pattern to parse
      dir - the direction.
      canonID - OUTPUT parameter that receives the canonical ID, consisting of canonical IDs for all elements, as returned by parseSingleID(), separated by semicolons. Previous contents are discarded.
      list - OUTPUT parameter that receives a list of SingleID objects representing the parsed IDs. Previous contents are discarded.
      globalFilter - OUTPUT parameter that receives a reference to a newly created global filter for this ID in this direction, or null if there is none.
      Returns:
      true if the parse succeeds, that is, if the entire id is consumed without syntax error.
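
The shape of a compound ID described above can be sketched with a naive splitter. `CompoundIdSketch` is a hypothetical class written for this illustration. It splits on every semicolon and recognizes filters only by their surrounding brackets, so it would mishandle set patterns that themselves contain semicolons; the real parser works incrementally through the methods listed on this page.

```java
import java.util.ArrayList;
import java.util.List;

final class CompoundIdSketch {
    String forwardFilter; // e.g. "[:Latin:]", or null if absent
    String reverseFilter; // pattern without the enclosing parens, or null
    final List<String> singleIds = new ArrayList<>();

    /** Naively splits a compound ID into global filters and single IDs. */
    static CompoundIdSketch split(String id) {
        CompoundIdSketch result = new CompoundIdSketch();
        List<String> parts = new ArrayList<>();
        for (String piece : id.split(";")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        // A leading element that is only a set pattern is the forward global filter.
        if (!parts.isEmpty() && parts.get(0).startsWith("[") && parts.get(0).endsWith("]")) {
            result.forwardFilter = parts.remove(0);
        }
        // A trailing parenthesized set pattern is the reverse global filter.
        if (!parts.isEmpty()) {
            String last = parts.get(parts.size() - 1);
            if (last.startsWith("([") && last.endsWith("])")) {
                result.reverseFilter = last.substring(1, last.length() - 1);
                parts.remove(parts.size() - 1);
            }
        }
        result.singleIds.addAll(parts);
        return result;
    }

    public static void main(String[] args) {
        CompoundIdSketch r = split("[:Latin:]; Latin-Greek; NFC; ([:Greek:])");
        System.out.println(r.forwardFilter + " | " + r.singleIds + " | " + r.reverseFilter);
    }
}
```

Note that an element such as "[a-zA-Z] Latin-Greek" starts with a bracket but does not end with one, so the sketch keeps it as a single ID with its own filter.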
    • instantiateList

      static List<Transliterator> instantiateList(List<TransliteratorIDParser.SingleID> ids)
      Returns the list of Transliterator objects for the given list of SingleID objects.
      Parameters:
      ids - list of SingleID objects.
      Returns:
      Actual transliterators for the list of SingleIDs
    • IDtoSTV

      public static String[] IDtoSTV(String id)
      Parse an ID into pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
      Parameters:
      id - the id string, in any of several forms
      Returns:
      an array of 4 strings: source, target, variant, and isSourcePresent. If the source is not present, ANY will be given as the source, and isSourcePresent will be null. Otherwise isSourcePresent will be non-null. The target may be empty if the id is not well-formed. The variant may be empty.
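
The decomposition described above can be sketched in a few lines. This is an illustrative reimplementation based only on the documented behavior, not ICU's source: `IdToStvSketch` and `idToStv` are hypothetical names, and the sketch assumes the ANY source is spelled "Any", as the STVtoID description suggests.

```java
final class IdToStvSketch {
    /**
     * Splits an ID of the form T, T/V, S-T, S-T/V, or S/V-T into
     * { source, target, variant, isSourcePresent }, where the last element
     * is null when the source was absent and non-null otherwise.
     */
    static String[] idToStv(String id) {
        String source = "Any"; // assumed spelling of ANY
        String target;
        String variant;
        boolean sourcePresent = false;
        int dash = id.indexOf('-');
        int slash = id.indexOf('/');
        if (slash < 0) {
            slash = id.length();
        }
        if (dash < 0) {
            // Form: T or T/V
            target = id.substring(0, slash);
            variant = id.substring(slash);
        } else if (dash < slash) {
            // Form: S-T or S-T/V
            if (dash > 0) {
                source = id.substring(0, dash);
                sourcePresent = true;
            }
            target = id.substring(dash + 1, slash);
            variant = id.substring(slash);
        } else {
            // Form: S/V-T
            if (slash > 0) {
                source = id.substring(0, slash);
                sourcePresent = true;
            }
            variant = id.substring(slash, dash);
            target = id.substring(dash + 1);
        }
        if (!variant.isEmpty()) {
            variant = variant.substring(1); // drop the leading '/'
        }
        return new String[] { source, target, variant, sourcePresent ? "" : null };
    }

    public static void main(String[] args) {
        String[] stv = idToStv("Latin-Greek/UNGEGN");
        System.out.println(stv[0] + ", " + stv[1] + ", " + stv[2]);
    }
}
```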
    • STVtoID

      public static String STVtoID(String source, String target, String variant)
      Given source, target, and variant strings, concatenate them into a full ID. If the source is empty, then "Any" will be used for the source, so the ID will always be of the form s-t/v or s-t.
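
A minimal sketch of the concatenation rule, written from the description above and not taken from ICU's source. `StvToIdSketch` and `stvToId` are hypothetical names; treating a null variant the same as an empty one is an assumption of this sketch.

```java
final class StvToIdSketch {
    /** Builds "s-t/v" or "s-t", substituting "Any" for an empty source. */
    static String stvToId(String source, String target, String variant) {
        StringBuilder id = new StringBuilder(source);
        if (id.length() == 0) {
            id.append("Any");
        }
        id.append('-').append(target);
        if (variant != null && !variant.isEmpty()) { // null handling is an assumption
            id.append('/').append(variant);
        }
        return id.toString();
    }

    public static void main(String[] args) {
        System.out.println(stvToId("", "NFC", ""));
        System.out.println(stvToId("Latin", "Greek", "UNGEGN"));
    }
}
```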
    • registerSpecialInverse

      public static void registerSpecialInverse(String target, String inverseTarget, boolean bidirectional)
      Register two targets as being inverses of one another. For example, calling registerSpecialInverse("NFC", "NFD", true) causes Transliterator to form the following inverse relationships:
      NFC => NFD
       Any-NFC => Any-NFD
       NFD => NFC
       Any-NFD => Any-NFC
      (Without the special inverse registration, the inverse of NFC would be NFC-Any.) Note that NFD is shorthand for Any-NFD, but that the presence or absence of "Any-" is preserved.

      The relationship is symmetrical; registering (a, b) is equivalent to registering (b, a).

      The relevant IDs must still be registered separately as factories or classes.

      Only the targets are specified. Special inverses always have the form Any-Target1 <=> Any-Target2. The target should have canonical casing (the casing desired to be produced when an inverse is formed) and should contain no whitespace or other extraneous characters.

      Parameters:
      target - the target against which to register the inverse
      inverseTarget - the inverse of target, that is Any-target.getInverse() => Any-inverseTarget
      bidirectional - if true, register the reverse relation as well, that is, Any-inverseTarget.getInverse() => Any-target
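
The inverse relationships listed above can be modeled with a small lookup table. This is a sketch under stated assumptions, not ICU's registry: `SpecialInverseSketch`, `register`, and `inverseOf` are hypothetical names, and the case-insensitive lookup is an assumption inferred from the note about canonical casing.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SpecialInverseSketch {
    private static final String ANY_PREFIX = "Any-";
    // Keys are upper-cased targets (assumed case-insensitive lookup);
    // values keep the canonical casing passed at registration.
    private static final Map<String, String> INVERSES = new HashMap<>();

    static void register(String target, String inverseTarget, boolean bidirectional) {
        INVERSES.put(target.toUpperCase(Locale.ROOT), inverseTarget);
        if (bidirectional && !target.equalsIgnoreCase(inverseTarget)) {
            INVERSES.put(inverseTarget.toUpperCase(Locale.ROOT), target);
        }
    }

    /** Returns the special inverse of "T" or "Any-T", or null if none is registered. */
    static String inverseOf(String id) {
        boolean hasAny = id.regionMatches(true, 0, ANY_PREFIX, 0, ANY_PREFIX.length());
        String target = hasAny ? id.substring(ANY_PREFIX.length()) : id;
        String inverse = INVERSES.get(target.toUpperCase(Locale.ROOT));
        if (inverse == null) {
            return null;
        }
        // The presence or absence of "Any-" is preserved in the result.
        return hasAny ? ANY_PREFIX + inverse : inverse;
    }

    public static void main(String[] args) {
        register("NFC", "NFD", true);
        System.out.println(inverseOf("NFC"));
        System.out.println(inverseOf("Any-NFD"));
    }
}
```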
    • parseFilterID

      private static TransliteratorIDParser.Specs parseFilterID(String id, int[] pos, boolean allowFilter)
      Parse an ID into component pieces. Take IDs of the form T, T/V, S-T, S-T/V, or S/V-T. If the source is missing, return a source of ANY.
      Parameters:
      id - the id string, in any of several forms
      pos - INPUT-OUTPUT parameter. On input, pos[0] is the offset of the first character to parse in id. On output, pos[0] is the offset after the last parsed character. If the parse failed, pos[0] will be unchanged.
      allowFilter - if true, a UnicodeSet pattern is allowed at any location between specs or delimiters, and is returned in the Specs object.
      Returns:
      a Specs object, or null if the parse failed for any reason. If neither source nor target was seen in the parsed id, then the parse fails. If allowFilter is true, then the parsed filter pattern is returned in the Specs object; otherwise the returned filter reference is null.
    • specsToID

      private static TransliteratorIDParser.SingleID specsToID(TransliteratorIDParser.Specs specs, int dir)
      Given a Specs object, convert it to a SingleID object. The Specs object is a less processed parse result. The SingleID object contains information about canonical and basic IDs.
      Returns:
      a SingleID; never returns null. Returned object always has 'filter' field of null.
    • specsToSpecialInverse

      private static TransliteratorIDParser.SingleID specsToSpecialInverse(TransliteratorIDParser.Specs specs)
      Given a Specs object, return a SingleID representing the special inverse of that ID. If there is no special inverse then return null.
      Returns:
      a SingleID or null. Returned object always has 'filter' field of null.