public interface HTMLParser
| Modifier and Type | Method and Description |
|---|---|
| `java.lang.String` | `getCleanedText(java.lang.String string)` Removes any string artifacts placed in the text by the parser. |
| `void` | `parse(java.net.URL baseURL, java.lang.String pageText, DocumentAdapter adapter)` Parses the specified text string as a Document, registering it in the HTMLPage. |
| `boolean` | `supportsForceTagCase()` Returns true if this parser supports forcing the upper/lower case of tag and attribute names. |
| `boolean` | `supportsParserWarnings()` Returns true if this parser can display parser warnings. |
| `boolean` | `supportsPreserveTagCase()` Returns true if this parser supports preservation of the case of tag and attribute names. |
| `boolean` | `supportsReturnHTMLDocument()` Returns true if this parser can return an HTMLDocument object. |
void parse(java.net.URL baseURL, java.lang.String pageText, DocumentAdapter adapter) throws java.io.IOException, org.xml.sax.SAXException
Throws: java.io.IOException, org.xml.sax.SAXException
java.lang.String getCleanedText(java.lang.String string)
boolean supportsPreserveTagCase()
boolean supportsForceTagCase()
boolean supportsReturnHTMLDocument()
boolean supportsParserWarnings()
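
The signatures above can be exercised with a minimal sketch like the following. It is an illustration under stated assumptions, not the library's implementation. To stay self-contained it declares a local `HTMLParser` interface that mirrors the listed signatures. The `DocumentAdapter` type shown here is a hypothetical stand-in with an invented `setDocumentText` method, since this page does not list the real adapter's methods. `PassThroughParser` and `describe` are likewise illustrative names.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import org.xml.sax.SAXException;

public class HtmlParserSketch {

    // Hypothetical stand-in: the real DocumentAdapter's methods are not shown on this page.
    interface DocumentAdapter {
        void setDocumentText(String text);
    }

    // Local mirror of the method signatures listed in the reference above.
    interface HTMLParser {
        void parse(URL baseURL, String pageText, DocumentAdapter adapter)
                throws IOException, SAXException;
        String getCleanedText(String string);
        boolean supportsPreserveTagCase();
        boolean supportsForceTagCase();
        boolean supportsReturnHTMLDocument();
        boolean supportsParserWarnings();
    }

    // Illustrative stub: hands the raw text to the adapter and advertises no optional features.
    static class PassThroughParser implements HTMLParser {
        @Override
        public void parse(URL baseURL, String pageText, DocumentAdapter adapter)
                throws IOException, SAXException {
            if (pageText == null) {
                throw new IOException("No page text supplied for " + baseURL);
            }
            adapter.setDocumentText(pageText);
        }

        @Override
        public String getCleanedText(String string) {
            // This stub inserts no artifacts, so there is nothing to strip.
            return string == null ? "" : string;
        }

        @Override public boolean supportsPreserveTagCase() { return false; }
        @Override public boolean supportsForceTagCase() { return false; }
        @Override public boolean supportsReturnHTMLDocument() { return false; }
        @Override public boolean supportsParserWarnings() { return false; }
    }

    // Summarizes the capability flags so a caller can decide which options are safe to enable.
    static String describe(HTMLParser parser) {
        return "preserveTagCase=" + parser.supportsPreserveTagCase()
                + ", forceTagCase=" + parser.supportsForceTagCase()
                + ", returnHTMLDocument=" + parser.supportsReturnHTMLDocument()
                + ", parserWarnings=" + parser.supportsParserWarnings();
    }

    public static void main(String[] args) throws Exception {
        HTMLParser parser = new PassThroughParser();
        StringBuilder captured = new StringBuilder();

        // The lambda plays the role of the hypothetical adapter.
        URL base = URI.create("http://example.com/").toURL();
        parser.parse(base, "<p>hello</p>", captured::append);

        System.out.println(parser.getCleanedText(captured.toString()));
        System.out.println(describe(parser));
    }
}
```

A caller would typically consult the `supports...` methods before turning on an option such as tag-case preservation or parser warnings, because an implementation is free to return false for any of them.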