Class XMLDocumentFragmentScannerImpl

  • All Implemented Interfaces:
    XMLEntityHandler, org.apache.xerces.xni.parser.XMLComponent, org.apache.xerces.xni.parser.XMLDocumentScanner, org.apache.xerces.xni.parser.XMLDocumentSource
    Direct Known Subclasses:
    XMLDocumentScannerImpl

    public class XMLDocumentFragmentScannerImpl
    extends XMLScanner
    implements org.apache.xerces.xni.parser.XMLDocumentScanner, org.apache.xerces.xni.parser.XMLComponent, XMLEntityHandler
    This class is responsible for scanning the structure and content of document fragments. The scanner acts as the source for the document information which is communicated to the document handler.

    This component requires the following features and properties from the component manager that uses it:

    • http://xml.org/sax/features/validation
    • http://apache.org/xml/features/scanner/notify-char-refs
    • http://apache.org/xml/features/scanner/notify-builtin-refs
    • http://apache.org/xml/properties/internal/symbol-table
    • http://apache.org/xml/properties/internal/error-reporter
    • http://apache.org/xml/properties/internal/entity-manager
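
    As a rough illustration of how a component might read these settings from its manager during reset, the sketch below uses a hypothetical map-backed SimpleComponentManager. It is not the Xerces XMLComponentManager interface; only the identifier strings come from the list above, and the default of false for unset features is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a component manager; NOT the Xerces API.
class SimpleComponentManager {
    private final Map<String, Boolean> features = new HashMap<>();
    private final Map<String, Object> properties = new HashMap<>();

    void setFeature(String id, boolean state) { features.put(id, state); }
    void setProperty(String id, Object value) { properties.put(id, value); }

    // Returns the feature state, or the supplied default when the feature is unset.
    boolean getFeature(String id, boolean defaultState) {
        Boolean state = features.get(id);
        return state != null ? state : defaultState;
    }

    // Returns the property value, or null when the property is unset.
    Object getProperty(String id) { return properties.get(id); }
}

// Hypothetical component that caches the settings listed above.
class ScannerSettingsSketch {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String NOTIFY_CHAR_REFS = "http://apache.org/xml/features/scanner/notify-char-refs";
    static final String NOTIFY_BUILTIN_REFS = "http://apache.org/xml/features/scanner/notify-builtin-refs";
    static final String SYMBOL_TABLE = "http://apache.org/xml/properties/internal/symbol-table";
    static final String ERROR_REPORTER = "http://apache.org/xml/properties/internal/error-reporter";
    static final String ENTITY_MANAGER = "http://apache.org/xml/properties/internal/entity-manager";

    boolean validation;
    boolean notifyCharRefs;
    boolean notifyBuiltInRefs;
    Object symbolTable;
    Object errorReporter;
    Object entityManager;

    // Re-reads every setting from the manager (assumed default: false).
    void reset(SimpleComponentManager manager) {
        validation = manager.getFeature(VALIDATION, false);
        notifyCharRefs = manager.getFeature(NOTIFY_CHAR_REFS, false);
        notifyBuiltInRefs = manager.getFeature(NOTIFY_BUILTIN_REFS, false);
        symbolTable = manager.getProperty(SYMBOL_TABLE);
        errorReporter = manager.getProperty(ERROR_REPORTER);
        entityManager = manager.getProperty(ENTITY_MANAGER);
    }
}
```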

    INTERNAL:

    Usage of this class is not supported. It may be altered or removed at any time.
    Version:
    $Id: XMLDocumentFragmentScannerImpl.java 572055 2007-09-02 17:55:43Z mrglavas $
    Author:
    Glenn Marcy, IBM, Andy Clark, IBM, Arnaud Le Hors, IBM, Eric Ye, IBM
    • Field Detail

      • SCANNER_STATE_START_OF_MARKUP

        protected static final int SCANNER_STATE_START_OF_MARKUP
        Scanner state: start of markup.
        See Also:
        Constant Field Values
      • SCANNER_STATE_COMMENT

        protected static final int SCANNER_STATE_COMMENT
        Scanner state: comment.
        See Also:
        Constant Field Values
      • SCANNER_STATE_PI

        protected static final int SCANNER_STATE_PI
        Scanner state: processing instruction.
        See Also:
        Constant Field Values
      • SCANNER_STATE_DOCTYPE

        protected static final int SCANNER_STATE_DOCTYPE
        Scanner state: DOCTYPE.
        See Also:
        Constant Field Values
      • SCANNER_STATE_ROOT_ELEMENT

        protected static final int SCANNER_STATE_ROOT_ELEMENT
        Scanner state: root element.
        See Also:
        Constant Field Values
      • SCANNER_STATE_CONTENT

        protected static final int SCANNER_STATE_CONTENT
        Scanner state: content.
        See Also:
        Constant Field Values
      • SCANNER_STATE_REFERENCE

        protected static final int SCANNER_STATE_REFERENCE
        Scanner state: reference.
        See Also:
        Constant Field Values
      • SCANNER_STATE_END_OF_INPUT

        protected static final int SCANNER_STATE_END_OF_INPUT
        Scanner state: end of input.
        See Also:
        Constant Field Values
      • SCANNER_STATE_TERMINATED

        protected static final int SCANNER_STATE_TERMINATED
        Scanner state: terminated.
        See Also:
        Constant Field Values
      • SCANNER_STATE_CDATA

        protected static final int SCANNER_STATE_CDATA
        Scanner state: CDATA section.
        See Also:
        Constant Field Values
      • SCANNER_STATE_TEXT_DECL

        protected static final int SCANNER_STATE_TEXT_DECL
        Scanner state: Text declaration.
        See Also:
        Constant Field Values
      • NAMESPACES

        protected static final java.lang.String NAMESPACES
        Feature identifier: namespaces.
        See Also:
        Constant Field Values
      • NOTIFY_BUILTIN_REFS

        protected static final java.lang.String NOTIFY_BUILTIN_REFS
        Feature identifier: notify built-in references.
        See Also:
        Constant Field Values
      • ENTITY_RESOLVER

        protected static final java.lang.String ENTITY_RESOLVER
        Property identifier: entity resolver.
        See Also:
        Constant Field Values
      • DEBUG_CONTENT_SCANNING

        protected static final boolean DEBUG_CONTENT_SCANNING
        Debug content dispatcher scanning.
        See Also:
        Constant Field Values
      • fDocumentHandler

        protected org.apache.xerces.xni.XMLDocumentHandler fDocumentHandler
        Document handler.
      • fEntityStack

        protected int[] fEntityStack
        Entity stack.
      • fMarkupDepth

        protected int fMarkupDepth
        Markup depth.
      • fScannerState

        protected int fScannerState
        Scanner state.
      • fInScanContent

        protected boolean fInScanContent
        SubScanner state: inside scanContent method.
      • fHasExternalDTD

        protected boolean fHasExternalDTD
        Has external DTD.
      • fStandalone

        protected boolean fStandalone
        Standalone.
      • fIsEntityDeclaredVC

        protected boolean fIsEntityDeclaredVC
        True if [Entity Declared] is a VC; false if it is a WFC.
      • fCurrentElement

        protected org.apache.xerces.xni.QName fCurrentElement
        Current element.
      • fNotifyBuiltInRefs

        protected boolean fNotifyBuiltInRefs
        Notify built-in references.
      • fElementQName

        protected final org.apache.xerces.xni.QName fElementQName
        Element QName.
      • fAttributeQName

        protected final org.apache.xerces.xni.QName fAttributeQName
        Attribute QName.
      • fTempString

        protected final org.apache.xerces.xni.XMLString fTempString
        Temporary string.
      • fTempString2

        protected final org.apache.xerces.xni.XMLString fTempString2
        Second temporary string.
    • Constructor Detail

      • XMLDocumentFragmentScannerImpl

        public XMLDocumentFragmentScannerImpl()
        Default constructor.
    • Method Detail

      • setInputSource

        public void setInputSource​(org.apache.xerces.xni.parser.XMLInputSource inputSource)
                            throws java.io.IOException
        Sets the input source.
        Specified by:
        setInputSource in interface org.apache.xerces.xni.parser.XMLDocumentScanner
        Parameters:
        inputSource - The input source.
        Throws:
        java.io.IOException - Thrown on I/O error.
      • scanDocument

        public boolean scanDocument​(boolean complete)
                             throws java.io.IOException,
                                    org.apache.xerces.xni.XNIException
        Scans a document.
        Specified by:
        scanDocument in interface org.apache.xerces.xni.parser.XMLDocumentScanner
        Parameters:
        complete - True if the scanner should scan the document completely, pushing all events to the registered document handler. A value of false indicates that the scanner should only scan the next portion of the document and return. A scanner instance is permitted to completely scan a document if it does not support this "pull" scanning model.
        Returns:
        True if there is more to scan, false otherwise.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
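
        The difference between complete and incremental ("pull") scanning can be sketched with a toy class. ToyPullScanner below is hypothetical and only mimics the scanDocument(boolean) contract described above; its events are plain strings and its delivered list stands in for the registered document handler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy scanner: it only mimics the scanDocument(boolean) contract.
class ToyPullScanner {
    private final String[] events;
    private int next = 0;
    // Stands in for the registered document handler.
    final List<String> delivered = new ArrayList<>();

    ToyPullScanner(String... events) {
        this.events = events;
    }

    // complete == true: deliver every remaining event before returning.
    // complete == false: deliver only the next event, then return.
    // Returns true if there is more to scan, false otherwise.
    boolean scanDocument(boolean complete) {
        do {
            if (next >= events.length) {
                return false;
            }
            delivered.add(events[next++]);
        } while (complete);
        return next < events.length;
    }

    public static void main(String[] args) {
        ToyPullScanner scanner = new ToyPullScanner("startElement", "characters", "endElement");
        boolean more;
        do {
            // The caller regains control after each portion of the document.
            more = scanner.scanDocument(false);
            System.out.println("delivered so far: " + scanner.delivered);
        } while (more);
    }
}
```

        Calling scanDocument(true) once on the same toy scanner would deliver all events in a single call and return false.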
      • reset

        public void reset​(org.apache.xerces.xni.parser.XMLComponentManager componentManager)
                   throws org.apache.xerces.xni.parser.XMLConfigurationException
        Resets the component. The component can query the component manager about any features and properties that affect the operation of the component.
        Specified by:
        reset in interface org.apache.xerces.xni.parser.XMLComponent
        Overrides:
        reset in class XMLScanner
        Parameters:
        componentManager - The component manager.
        Throws:
        org.apache.xerces.xni.parser.XMLConfigurationException - Thrown by the component on initialization error. For example, if a feature or property is required for the operation of the component, the component manager may report that it is not recognized or not supported.
      • getRecognizedFeatures

        public java.lang.String[] getRecognizedFeatures()
        Returns a list of feature identifiers that are recognized by this component. This method may return null if no features are recognized by this component.
        Specified by:
        getRecognizedFeatures in interface org.apache.xerces.xni.parser.XMLComponent
      • setFeature

        public void setFeature​(java.lang.String featureId,
                               boolean state)
                        throws org.apache.xerces.xni.parser.XMLConfigurationException
        Sets the state of a feature. This method is called by the component manager any time after reset when a feature changes state.

        Note: Components should silently ignore features that do not affect the operation of the component.

        Specified by:
        setFeature in interface org.apache.xerces.xni.parser.XMLComponent
        Overrides:
        setFeature in class XMLScanner
        Parameters:
        featureId - The feature identifier.
        state - The state of the feature.
        Throws:
        SAXNotRecognizedException - The component should not throw this exception.
        SAXNotSupportedException - The component should not throw this exception.
        org.apache.xerces.xni.parser.XMLConfigurationException - Thrown for configuration error. In general, components should only throw this exception if it is really a critical error.
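
        The "silently ignore" rule in the note above might look like the following in a component. FeatureAwareSketch is a hypothetical class, not part of Xerces; only the feature identifier string is taken from this page.

```java
// Hypothetical component that reacts to one feature and ignores all others.
class FeatureAwareSketch {
    static final String NOTIFY_BUILTIN_REFS =
            "http://apache.org/xml/features/scanner/notify-builtin-refs";

    boolean notifyBuiltInRefs;

    void setFeature(String featureId, boolean state) {
        if (NOTIFY_BUILTIN_REFS.equals(featureId)) {
            notifyBuiltInRefs = state;
        }
        // Any other identifier does not affect this component,
        // so it is ignored without raising an error.
    }

    public static void main(String[] args) {
        FeatureAwareSketch component = new FeatureAwareSketch();
        component.setFeature("http://example.org/unrelated-feature", true);
        component.setFeature(NOTIFY_BUILTIN_REFS, true);
        System.out.println(component.notifyBuiltInRefs);
    }
}
```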
      • getRecognizedProperties

        public java.lang.String[] getRecognizedProperties()
        Returns a list of property identifiers that are recognized by this component. This method may return null if no properties are recognized by this component.
        Specified by:
        getRecognizedProperties in interface org.apache.xerces.xni.parser.XMLComponent
      • setProperty

        public void setProperty​(java.lang.String propertyId,
                                java.lang.Object value)
                         throws org.apache.xerces.xni.parser.XMLConfigurationException
        Sets the value of a property. This method is called by the component manager any time after reset when a property changes value.

        Note: Components should silently ignore properties that do not affect the operation of the component.

        Specified by:
        setProperty in interface org.apache.xerces.xni.parser.XMLComponent
        Overrides:
        setProperty in class XMLScanner
        Parameters:
        propertyId - The property identifier.
        value - The value of the property.
        Throws:
        SAXNotRecognizedException - The component should not throw this exception.
        SAXNotSupportedException - The component should not throw this exception.
        org.apache.xerces.xni.parser.XMLConfigurationException - Thrown for configuration error. In general, components should only throw this exception if it is really a critical error.
      • getFeatureDefault

        public java.lang.Boolean getFeatureDefault​(java.lang.String featureId)
        Returns the default state for a feature, or null if this component does not want to report a default value for this feature.
        Specified by:
        getFeatureDefault in interface org.apache.xerces.xni.parser.XMLComponent
        Parameters:
        featureId - The feature identifier.
        Since:
        Xerces 2.2.0
      • getPropertyDefault

        public java.lang.Object getPropertyDefault​(java.lang.String propertyId)
        Returns the default state for a property, or null if this component does not want to report a default value for this property.
        Specified by:
        getPropertyDefault in interface org.apache.xerces.xni.parser.XMLComponent
        Parameters:
        propertyId - The property identifier.
        Since:
        Xerces 2.2.0
      • setDocumentHandler

        public void setDocumentHandler​(org.apache.xerces.xni.XMLDocumentHandler documentHandler)
        Sets the document handler.
        Specified by:
        setDocumentHandler in interface org.apache.xerces.xni.parser.XMLDocumentSource
        Parameters:
        documentHandler - The document handler.
      • getDocumentHandler

        public org.apache.xerces.xni.XMLDocumentHandler getDocumentHandler()
        Returns the document handler.
        Specified by:
        getDocumentHandler in interface org.apache.xerces.xni.parser.XMLDocumentSource
      • startEntity

        public void startEntity​(java.lang.String name,
                                org.apache.xerces.xni.XMLResourceIdentifier identifier,
                                java.lang.String encoding,
                                org.apache.xerces.xni.Augmentations augs)
                         throws org.apache.xerces.xni.XNIException
        This method notifies of the start of an entity. The DTD has the pseudo-name "[dtd]"; parameter entity names start with '%'; and general entities are specified by their name alone.
        Specified by:
        startEntity in interface XMLEntityHandler
        Overrides:
        startEntity in class XMLScanner
        Parameters:
        name - The name of the entity.
        identifier - The resource identifier.
        encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
        augs - Additional information that may include infoset augmentations
        Throws:
        org.apache.xerces.xni.XNIException - Thrown by handler to signal an error.
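
        The naming convention described above can be expressed as a small classifier. This is an illustrative sketch only: EntityNameSketch and its Kind enum are hypothetical names, and the code covers just the three cases the description mentions.

```java
// Hypothetical helper that classifies an entity name by the convention above.
class EntityNameSketch {
    enum Kind { DTD, PARAMETER, GENERAL }

    static Kind classify(String name) {
        if ("[dtd]".equals(name)) {
            return Kind.DTD;          // pseudo-name of the DTD
        }
        if (name.startsWith("%")) {
            return Kind.PARAMETER;    // parameter entity names start with '%'
        }
        return Kind.GENERAL;          // general entities use their plain name
    }

    public static void main(String[] args) {
        System.out.println(classify("[dtd]"));
        System.out.println(classify("%common"));
        System.out.println(classify("copyright"));
    }
}
```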
      • endEntity

        public void endEntity​(java.lang.String name,
                              org.apache.xerces.xni.Augmentations augs)
                       throws org.apache.xerces.xni.XNIException
        This method notifies of the end of an entity. The DTD has the pseudo-name "[dtd]"; parameter entity names start with '%'; and general entities are specified by their name alone.
        Specified by:
        endEntity in interface XMLEntityHandler
        Overrides:
        endEntity in class XMLScanner
        Parameters:
        name - The name of the entity.
        augs - Additional information that may include infoset augmentations
        Throws:
        org.apache.xerces.xni.XNIException - Thrown by handler to signal an error.
      • scanXMLDeclOrTextDecl

        protected void scanXMLDeclOrTextDecl​(boolean scanningTextDecl)
                                      throws java.io.IOException,
                                             org.apache.xerces.xni.XNIException
        Scans an XML or text declaration.

         [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
         [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
         [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )
         [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
         [32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
                         | ('"' ('yes' | 'no') '"'))
        
         [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
         
        Parameters:
        scanningTextDecl - True if a text declaration is to be scanned instead of an XML declaration.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
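
        To make production [81] concrete, the sketch below checks a candidate encoding name against it with a regular expression. EncNameSketch is a hypothetical helper and says nothing about how the scanner itself validates the name.

```java
import java.util.regex.Pattern;

// Hypothetical check of production [81]: EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
class EncNameSketch {
    private static final Pattern ENC_NAME =
            Pattern.compile("[A-Za-z][A-Za-z0-9._\\-]*");

    static boolean isValidEncName(String candidate) {
        // matches() requires the whole string to satisfy the production.
        return ENC_NAME.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidEncName("UTF-8"));
        System.out.println(isValidEncName("8859-1"));
    }
}
```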
      • scanPIData

        protected void scanPIData​(java.lang.String target,
                                  org.apache.xerces.xni.XMLString data)
                           throws java.io.IOException,
                                  org.apache.xerces.xni.XNIException
        Scans processing instruction data. This is needed to handle the situation where a document starts with a processing instruction whose target name starts with "xml" (e.g. xmlfoo).
        Overrides:
        scanPIData in class XMLScanner
        Parameters:
        target - The PI target
        data - The string to fill in with the data
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
      • scanComment

        protected void scanComment()
                            throws java.io.IOException,
                                   org.apache.xerces.xni.XNIException
        Scans a comment.

         [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
         

        Note: Called after scanning past '<!--'

        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
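
        Production [15] implies two restrictions on the text between '<!--' and '-->': it may not contain "--", and it may not end with '-'. The hypothetical helper below checks only those two restrictions; it does not verify that every character satisfies the Char production.

```java
// Hypothetical check of the hyphen rules implied by production [15].
class CommentBodySketch {
    // body is the text between "<!--" and "-->".
    static boolean isValidCommentBody(String body) {
        // Every '-' must be followed by a non-'-' character, so "--" is
        // forbidden anywhere and a trailing '-' is forbidden as well.
        return !body.contains("--") && !body.endsWith("-");
    }

    public static void main(String[] args) {
        System.out.println(isValidCommentBody(" a - b "));
        System.out.println(isValidCommentBody(" a -- b "));
    }
}
```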
      • scanStartElement

        protected boolean scanStartElement()
                                    throws java.io.IOException,
                                           org.apache.xerces.xni.XNIException
        Scans a start element. This method will handle the binding of namespace information and notifying the handler of the start of the element.

         [44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
         [40] STag ::= '<' Name (S Attribute)* S? '>'
         

        Note: This method assumes that the leading '<' character has been consumed.

        Note: This method uses the fElementQName and fAttributes variables. The contents of these variables will be destroyed. The caller should copy important information out of these variables before calling this method.

        Returns:
        True if the element is empty (i.e. it matches production [44]).
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
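
        The warning about fElementQName being overwritten describes a common pattern: the scanner reuses one mutable object across calls. The sketch below reproduces that hazard with hypothetical types (MutableName stands in for QName and scanName for the scanning method); it is not the Xerces code.

```java
// Hypothetical demonstration of why callers must copy a reused result object.
class ReusedNameSketch {
    static final class MutableName {
        String rawname;

        MutableName copy() {
            MutableName c = new MutableName();
            c.rawname = rawname;
            return c;
        }
    }

    // One shared instance, overwritten on every call.
    private final MutableName shared = new MutableName();

    MutableName scanName(String raw) {
        shared.rawname = raw;
        return shared;
    }

    public static void main(String[] args) {
        ReusedNameSketch scanner = new ReusedNameSketch();
        MutableName first = scanner.scanName("outer");
        MutableName saved = first.copy();   // copy before the next scan
        scanner.scanName("inner");          // destroys the contents of 'first'
        System.out.println(first.rawname + " vs " + saved.rawname);
    }
}
```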
      • scanStartElementName

        protected void scanStartElementName()
                                     throws java.io.IOException,
                                            org.apache.xerces.xni.XNIException
        Scans the name of an element in a start or empty tag.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
        See Also:
        scanStartElement()
      • scanStartElementAfterName

        protected boolean scanStartElementAfterName()
                                             throws java.io.IOException,
                                                    org.apache.xerces.xni.XNIException
        Scans the remainder of a start or empty tag after the element name.
        Returns:
        True if element is empty.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
        See Also:
        scanStartElement()
      • scanAttribute

        protected void scanAttribute​(org.apache.xerces.xni.XMLAttributes attributes)
                              throws java.io.IOException,
                                     org.apache.xerces.xni.XNIException
        Scans an attribute.

         [41] Attribute ::= Name Eq AttValue
         

        Note: This method assumes that the next character on the stream is the first character of the attribute name.

        Note: This method uses the fAttributeQName and fQName variables. The contents of these variables will be destroyed.

        Parameters:
        attributes - The attributes list for the scanned attribute.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
      • scanContent

        protected int scanContent()
                           throws java.io.IOException,
                                  org.apache.xerces.xni.XNIException
        Scans element content.
        Returns:
        The next character on the stream.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
      • scanCDATASection

        protected boolean scanCDATASection​(boolean complete)
                                    throws java.io.IOException,
                                           org.apache.xerces.xni.XNIException
        Scans a CDATA section.

        Note: This method uses the fTempString and fStringBuffer variables.

        Parameters:
        complete - True if the CDATA section is to be scanned completely.
        Returns:
        True if CDATA is completely scanned.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
      • scanEndElement

        protected int scanEndElement()
                              throws java.io.IOException,
                                     org.apache.xerces.xni.XNIException
        Scans an end element.

         [42] ETag ::= '</' Name S? '>'
         

        Note: This method uses the fElementQName variable. The contents of this variable will be destroyed. The caller should copy the needed information out of this variable before calling this method.

        Returns:
        The element depth.
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
      • scanCharReference

        protected void scanCharReference()
                                  throws java.io.IOException,
                                         org.apache.xerces.xni.XNIException
        Scans a character reference.

         [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
         
        Throws:
        java.io.IOException
        org.apache.xerces.xni.XNIException
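
        Production [66] can be exercised with a small decoder. CharRefSketch is a hypothetical helper, written as a sketch: it checks the syntax and the Unicode code point range, but not whether the result is a legal XML Char.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical decoder for production [66]:
// CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
class CharRefSketch {
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    // Returns the referenced code point, or -1 if the text is not a
    // syntactically valid reference or lies outside the Unicode range.
    static int decode(String ref) {
        Matcher m = CHAR_REF.matcher(ref);
        if (!m.matches()) {
            return -1;
        }
        try {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)   // hexadecimal form
                    : Integer.parseInt(m.group(2), 10);  // decimal form
            return Character.isValidCodePoint(codePoint) ? codePoint : -1;
        } catch (NumberFormatException e) {
            return -1; // too many digits to fit in an int
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("&#65;"));
        System.out.println(decode("&#x41;"));
    }
}
```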
      • scanEntityReference

        protected void scanEntityReference()
                                    throws java.io.IOException,
                                           org.apache.xerces.xni.XNIException
        Scans an entity reference.
        Throws:
        java.io.IOException - Thrown if i/o error occurs.
        org.apache.xerces.xni.XNIException - Thrown if handler throws exception upon notification.
      • handleEndElement

        protected int handleEndElement​(org.apache.xerces.xni.QName element,
                                       boolean isEmpty)
                                throws org.apache.xerces.xni.XNIException
        Handles the end element. This method will make sure that the end element name matches the current element and notify the handler about the end of the element and the end of any relevant prefix mappings.

        Note: This method uses the fQName variable. The contents of this variable will be destroyed.

        Parameters:
        element - The element.
        isEmpty - True if the element is empty.
        Returns:
        The element depth.
        Throws:
        org.apache.xerces.xni.XNIException - Thrown if the handler throws a SAX exception upon notification.
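
        The name check and the returned element depth can be pictured with a stack of open elements. ElementDepthSketch is a hypothetical simplification: it uses plain strings instead of QName, throws IllegalStateException instead of reporting through the error reporter, and leaves out prefix mappings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical tracker for open elements and their nesting depth.
class ElementDepthSketch {
    private final Deque<String> open = new ArrayDeque<>();

    void startElement(String name) {
        open.push(name);
    }

    // Verifies that the end tag matches the current element, closes it,
    // and returns the depth that remains afterwards.
    int endElement(String name) {
        String current = open.peek();
        if (current == null || !current.equals(name)) {
            throw new IllegalStateException(
                    "End tag </" + name + "> does not match open element " + current);
        }
        open.pop();
        return open.size();
    }

    public static void main(String[] args) {
        ElementDepthSketch tracker = new ElementDepthSketch();
        tracker.startElement("root");
        tracker.startElement("child");
        System.out.println(tracker.endElement("child"));
        System.out.println(tracker.endElement("root"));
    }
}
```

        In this sketch a returned depth of 0 means the outermost element has just been closed.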
      • setScannerState

        protected final void setScannerState​(int state)
        Sets the scanner state.
        Parameters:
        state - The new scanner state.
      • getScannerStateName

        protected java.lang.String getScannerStateName​(int state)
        Returns the scanner state name.