Class HTMLElements
- java.lang.Object
-
- HTMLElements
-
public final class HTMLElements extends java.lang.Object
Contains static methods which group HTML element names by the characteristics of their associated elements.
An HTML element is a normal element with a name that matches one of the HTML element names (ignoring case). This type of element spans the logical HTML element as described in the HTML 4.01 specification section 3.2.1, which may be implicitly terminated if it specifies an optional end tag.
The term Non-HTML element refers to a normal element with a name that does not match one of the HTML element names. This type of element must be either a single tag element or explicitly terminated.
All of the sets returned by the methods in this class may be modified to customise the behaviour of the parser. Care must be taken however to ensure that the sets only contain tag names in lower case.
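The lower-case requirement can be enforced with a small helper. The following is a minimal sketch only: `NameSets`, `addName` and `containsName` are hypothetical names that are not part of this class, and the `TreeSet` in `main` is a stand-in for a set returned by one of the methods here, so that the example runs without the library.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

class NameSets {
    // Adds a tag name to a name set, converting it to lower case first.
    // Returns true if the set changed.
    static boolean addName(Set<String> names, String tagName) {
        return names.add(tagName.toLowerCase(Locale.ROOT));
    }

    // Case-insensitive membership test against a set of lower-case names.
    static boolean containsName(Set<String> names, String tagName) {
        return names.contains(tagName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-in for a set such as the one returned by getBlockLevelElementNames().
        Set<String> blockLevel = new TreeSet<>(Arrays.asList("div", "p", "table"));
        addName(blockLevel, "MyWidget");
        System.out.println(blockLevel.contains("mywidget")); // true
        System.out.println(containsName(blockLevel, "DIV")); // true
    }
}
```

With the library on the classpath, the same helper could presumably be applied directly to a returned set, for example `NameSets.addName(HTMLElements.getBlockLevelElementNames(), "MyWidget")`.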
Below is a table summarising the default characteristics of each HTML element. See also the index of elements in the HTML 4.01 specification and draft HTML5 specification for the official tables containing similar information.
- See Also:
HTMLElementName, Element
-
-
Method Summary
static java.util.Set<java.lang.String> getBlockLevelElementNames()
Returns a set containing the names of all the block-level elements.
static java.util.Set<java.lang.String> getDeprecatedElementNames()
Returns a set containing the names of all deprecated elements in HTML 4.01.
static java.util.List<java.lang.String> getElementNames()
Returns a list containing all of the HTML element names.
static java.util.Set<java.lang.String> getEndTagForbiddenElementNames()
Returns a set containing the names of all of the HTML elements for which the end tag is forbidden.
static java.util.Set<java.lang.String> getEndTagOptionalElementNames()
Returns a set containing the names of all of the HTML elements for which the end tag is optional.
static java.util.Set<java.lang.String> getEndTagRequiredElementNames()
Returns a set containing the names of all of the HTML elements for which the end tag is required.
static java.util.Set<java.lang.String> getInlineLevelElementNames()
Returns a set containing the names of all the inline-level elements.
static java.util.Set<java.lang.String> getNestingForbiddenElementNames()
Returns a set containing the names of all of the HTML elements which should never contain elements of the same name, either as direct or indirect descendants.
static java.util.Set<java.lang.String> getNonterminatingElementNames(java.lang.String endTagOptionalElementName)
Returns the names of elements that do NOT implicitly terminate an HTML element with the specified name.
static java.util.Set<java.lang.String> getStartTagOptionalElementNames()
Returns a set containing the names of all of the HTML elements for which the start tag is optional.
static java.util.Set<java.lang.String> getTerminatingEndTagNames(java.lang.String endTagOptionalElementName)
Returns the names of end tags that implicitly terminate an HTML element with the specified name.
static java.util.Set<java.lang.String> getTerminatingStartTagNames(java.lang.String endTagOptionalElementName)
Returns the names of start tags that implicitly terminate an HTML element with the specified name.
-
-
-
Method Detail
-
getElementNames
public static final java.util.List<java.lang.String> getElementNames()
Returns a list containing all of the HTML element names.
The returned list is in alphabetical order.
- Returns:
- a list containing all of the HTML element names.
-
getBlockLevelElementNames
public static java.util.Set<java.lang.String> getBlockLevelElementNames()
Returns a set containing the names of all the block-level elements.
The element names contained in this set are:
ADDRESS, article, aside, BLOCKQUOTE, CENTER, details, DIR, DIV, DL, FIELDSET, footer, FORM, H1, H2, H3, H4, H5, H6, header, hgroup, HR, ISINDEX, MENU, nav, NOFRAMES, NOSCRIPT, OL, P, PRE, section, TABLE, UL
This set is defined in the HTML 4.01 Transitional DTD, but more detailed information can be found in the HTML 4.01 specification section 7.5.3 - Block-level and inline elements and the CSS2 specification section 9.2.1 - Block-level elements and block boxes.
The CSS2 display property can be used to override the normal box type of an element.
- Returns:
- a set containing the names of all the block-level elements.
- See Also:
getInlineLevelElementNames()
-
getInlineLevelElementNames
public static java.util.Set<java.lang.String> getInlineLevelElementNames()
Returns a set containing the names of all the inline-level elements.
The element names contained in this set are:
A, ABBR, ACRONYM, APPLET, B, BASEFONT, bdi, BDO, BIG, BR, BUTTON, CITE, CODE, DEL, DFN, EM, FONT, I, IFRAME, IMG, INPUT, INS, KBD, keygen, LABEL, MAP, mark, meter, OBJECT, output, progress, Q, rp, rt, ruby, S, SAMP, SCRIPT, SELECT, SMALL, SPAN, STRIKE, STRONG, SUB, SUP, TEXTAREA, time, TT, U, VAR, wbr
This set is defined in the HTML 4.01 Transitional DTD, but more detailed information can be found in the HTML 4.01 specification section 7.5.3 - Block-level and inline elements and the CSS2 specification section 9.2.2 - Inline-level elements and inline boxes.
The CSS2 display property can be used to override the normal box type of an element.
The HTML Document Type Definitions forbid the presence of block-level elements inside inline-level elements, but it is tolerated by all popular browsers in various situations, even in XHTML documents. The most notorious example of this is the common inclusion of block-level elements inside FONT elements.
- Returns:
- a set containing the names of all the inline-level elements.
- See Also:
getBlockLevelElementNames()
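The block-level and inline-level sets can be combined to classify an arbitrary tag name. The sketch below assumes the two sets are passed in as parameters; `DisplayLevels`, `Level` and `classify` are hypothetical names, and the small sets in `main` are stand-ins holding a few of the names listed above rather than the full sets returned by the library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class DisplayLevels {
    enum Level { BLOCK, INLINE, OTHER }

    // Classifies a tag name against two sets of lower-case element names.
    static Level classify(String tagName, Set<String> blockNames, Set<String> inlineNames) {
        String name = tagName.toLowerCase(Locale.ROOT);
        if (blockNames.contains(name)) return Level.BLOCK;
        if (inlineNames.contains(name)) return Level.INLINE;
        return Level.OTHER; // in neither set
    }

    public static void main(String[] args) {
        // Stand-ins for getBlockLevelElementNames() and getInlineLevelElementNames().
        Set<String> block = new HashSet<>(Arrays.asList("div", "p", "table"));
        Set<String> inline = new HashSet<>(Arrays.asList("span", "em", "font"));
        System.out.println(classify("DIV", block, inline));  // BLOCK
        System.out.println(classify("Font", block, inline)); // INLINE
        System.out.println(classify("td", block, inline));   // OTHER
    }
}
```

Because the sets may be modified by the caller, a classification of this kind reflects the current contents of the sets rather than the fixed defaults.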
-
getDeprecatedElementNames
public static java.util.Set<java.lang.String> getDeprecatedElementNames()
Returns a set containing the names of all deprecated elements in HTML 4.01.
- Returns:
- a set containing the names of all deprecated elements in HTML 4.01.
-
getEndTagForbiddenElementNames
public static java.util.Set<java.lang.String> getEndTagForbiddenElementNames()
Returns a set containing the names of all of the HTML elements for which the end tag is forbidden.
See the element parsing rules for HTML elements with forbidden end tags for more information.
The index of elements in the HTML 4.01 specification includes the letter 'F' in the "End Tag" column for elements whose end tag is forbidden.
- Returns:
- a set containing the names of all of the HTML elements for which the end tag is forbidden.
- See Also:
getEndTagOptionalElementNames()
,getEndTagRequiredElementNames()
-
getEndTagOptionalElementNames
public static java.util.Set<java.lang.String> getEndTagOptionalElementNames()
Returns a set containing the names of all of the HTML elements for which the end tag is optional.
Elements with these names may be implicitly terminated by a subsequent terminating start tag or terminating end tag. A list of these terminating tags, and the names of non-terminating elements that can be nested within the element, can be found in the documentation of each relevant element in the HTMLElementName class.
See the element parsing rules for HTML elements with optional end tags for more information.
The index of elements in the HTML 4.01 specification includes the letter 'O' in the "End Tag" column for elements whose end tag is optional.
- Returns:
- a set containing the names of all of the HTML elements for which the end tag is optional.
- See Also:
getEndTagForbiddenElementNames()
,getEndTagRequiredElementNames()
-
getEndTagRequiredElementNames
public static java.util.Set<java.lang.String> getEndTagRequiredElementNames()
Returns a set containing the names of all of the HTML elements for which the end tag is required.
See the element parsing rules for HTML elements with required end tags for more information.
The index of elements in the HTML 4.01 specification leaves the "End Tag" column blank for elements whose end tag is required.
- Returns:
- a set containing the names of all of the HTML elements for which the end tag is required.
- See Also:
getEndTagForbiddenElementNames()
,getEndTagOptionalElementNames()
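The three end tag sets can be consulted together to look up the rule that applies to a given name. The following is a sketch under stated assumptions: `EndTagRules`, `Rule` and `ruleFor` are hypothetical names, the sets are passed in as parameters, and the sets in `main` are small stand-ins (based on the HTML 4.01 index of elements) rather than the sets returned by the library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class EndTagRules {
    enum Rule { FORBIDDEN, OPTIONAL, REQUIRED, UNLISTED }

    // Looks up which end tag set, if any, contains the given tag name.
    static Rule ruleFor(String tagName, Set<String> forbidden,
                        Set<String> optional, Set<String> required) {
        String name = tagName.toLowerCase(Locale.ROOT);
        if (forbidden.contains(name)) return Rule.FORBIDDEN;
        if (optional.contains(name)) return Rule.OPTIONAL;
        if (required.contains(name)) return Rule.REQUIRED;
        return Rule.UNLISTED; // name is in none of the supplied sets
    }

    public static void main(String[] args) {
        // Stand-ins for the three getEndTag...ElementNames() sets.
        Set<String> forbidden = new HashSet<>(Arrays.asList("br", "img", "hr"));
        Set<String> optional = new HashSet<>(Arrays.asList("p", "li", "td"));
        Set<String> required = new HashSet<>(Arrays.asList("div", "span", "table"));
        System.out.println(ruleFor("BR", forbidden, optional, required));     // FORBIDDEN
        System.out.println(ruleFor("li", forbidden, optional, required));     // OPTIONAL
        System.out.println(ruleFor("Div", forbidden, optional, required));    // REQUIRED
        System.out.println(ruleFor("custom", forbidden, optional, required)); // UNLISTED
    }
}
```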
-
getStartTagOptionalElementNames
public static java.util.Set<java.lang.String> getStartTagOptionalElementNames()
Returns a set containing the names of all of the HTML elements for which the start tag is optional.
Elements with optional start tags must be present in the document object model (DOM) in certain locations, either forming part of the structure of the HTML document as a whole (e.g. the HTML, HEAD, and BODY elements), or forming part of the structure of a TABLE element (e.g. the TBODY element). The location of an omitted start tag in the document's object model can be inferred from the surrounding elements.
This library does not use this property in any way when parsing documents, and does not construct a document object model from the source, so no implied element is created where an optional start tag is omitted.
When the start tag has been omitted in the document text, the corresponding end tag should also be omitted.
The index of elements in the HTML 4.01 specification includes the letter 'O' in the "Start Tag" column for elements whose start tag is optional.
- Returns:
- a set containing the names of all of the HTML elements for which the start tag is optional.
-
getTerminatingStartTagNames
public static java.util.Set<java.lang.String> getTerminatingStartTagNames(java.lang.String endTagOptionalElementName)
Returns the names of start tags that implicitly terminate an HTML element with the specified name.
This method is only relevant to HTML elements for which the end tag is optional. It returns null if getEndTagOptionalElementNames().contains(endTagOptionalElementName.toLowerCase()) == false.
- Parameters:
endTagOptionalElementName - the name of an element for which the end tag is optional.
- Returns:
- the names of start tags that implicitly terminate an HTML element with the specified name, or null if the name does not identify an element for which the end tag is optional.
- See Also:
getTerminatingEndTagNames(String endTagOptionalElementName)
,getNonterminatingElementNames(String endTagOptionalElementName)
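Because this lookup returns null for names that do not identify an end-tag-optional element, callers may want to wrap it. The sketch below is illustrative only: `TerminatingTags` and `namesOrEmpty` are hypothetical names, and the map in `main` is a stand-in for the library lookup with a single assumed entry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class TerminatingTags {
    // Calls a lookup that may return null and substitutes an empty set,
    // so that callers can iterate or test membership without a null check.
    static Set<String> namesOrEmpty(Function<String, Set<String>> lookup, String elementName) {
        Set<String> names = lookup.apply(elementName.toLowerCase(Locale.ROOT));
        return names != null ? names : Collections.<String>emptySet();
    }

    public static void main(String[] args) {
        // Stand-in data (assumed for illustration): an LI start tag terminates an open LI.
        Map<String, Set<String>> standIn = new HashMap<>();
        standIn.put("li", Collections.singleton("li"));
        Function<String, Set<String>> lookup = standIn::get;

        System.out.println(namesOrEmpty(lookup, "LI"));  // [li]
        System.out.println(namesOrEmpty(lookup, "div")); // []
    }
}
```

Since the lookup is passed as a `Function<String, Set<String>>`, a method reference such as `HTMLElements::getTerminatingStartTagNames` should fit the same parameter when the library is available.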
-
getTerminatingEndTagNames
public static java.util.Set<java.lang.String> getTerminatingEndTagNames(java.lang.String endTagOptionalElementName)
Returns the names of end tags that implicitly terminate an HTML element with the specified name.
This method is only relevant to HTML elements for which the end tag is optional. It returns null if getEndTagOptionalElementNames().contains(endTagOptionalElementName.toLowerCase()) == false.
Note that removing the tag name matching the specified element has no effect on the behaviour of the parser, as it is always assumed that a start tag is terminated by an end tag with a matching name.
- Parameters:
endTagOptionalElementName - the name of an element for which the end tag is optional.
- Returns:
- the names of end tags that implicitly terminate an HTML element with the specified name, or null if the name does not identify an element for which the end tag is optional.
- See Also:
getTerminatingStartTagNames(String endTagOptionalElementName)
,getNonterminatingElementNames(String endTagOptionalElementName)
-
getNonterminatingElementNames
public static java.util.Set<java.lang.String> getNonterminatingElementNames(java.lang.String endTagOptionalElementName)
Returns the names of elements that do NOT implicitly terminate an HTML element with the specified name. Neither can any tag nested inside any of these elements implicitly terminate the specified element, even if it is listed as one of the terminating start tags or terminating end tags.
This method is only relevant to HTML elements for which the end tag is optional. It returns null if getEndTagOptionalElementNames().contains(endTagOptionalElementName.toLowerCase()) == false.
- Parameters:
endTagOptionalElementName - the name of an element for which the end tag is optional.
- Returns:
- the names of elements that do NOT implicitly terminate an HTML element with the specified name, or null if the name does not identify an element for which the end tag is optional.
- See Also:
getTerminatingStartTagNames(String endTagOptionalElementName)
,getTerminatingEndTagNames(String endTagOptionalElementName)
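The interaction between terminating start tags and non-terminating elements can be sketched as follows. This is not the parser's actual implementation: `ImplicitTermination` and `terminates` are hypothetical names, the sets are passed in as parameters, and the `li`, `ul` and `ol` data in `main` is assumed for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class ImplicitTermination {
    // Decides whether a start tag implicitly terminates an open element.
    // openInside lists the elements currently open inside that element.
    static boolean terminates(String startTagName, List<String> openInside,
                              Set<String> terminatingStartTags, Set<String> nonterminating) {
        // A null set means the element does not have an optional end tag.
        if (terminatingStartTags == null || nonterminating == null) return false;
        // A tag nested inside a non-terminating element never terminates the outer element.
        for (String open : openInside) {
            if (nonterminating.contains(open.toLowerCase(Locale.ROOT))) return false;
        }
        return terminatingStartTags.contains(startTagName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Stand-in sets for an open LI element (assumed for illustration).
        Set<String> terminating = Collections.singleton("li");
        Set<String> nonterminating = new HashSet<>(Arrays.asList("ul", "ol"));

        // <li> directly following an open <li>: terminates it.
        System.out.println(terminates("li", Collections.<String>emptyList(),
                terminating, nonterminating)); // true
        // <li> inside a nested <ul>: does not terminate the outer <li>.
        System.out.println(terminates("li", Arrays.asList("ul"),
                terminating, nonterminating)); // false
    }
}
```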
-
getNestingForbiddenElementNames
public static java.util.Set<java.lang.String> getNestingForbiddenElementNames()
Returns a set containing the names of all of the HTML elements which should never contain elements of the same name, either as direct or indirect descendants.
- Returns:
- a set containing the names of all of the HTML elements which should never contain elements of the same name.
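One way such a set might be used is to detect a start tag that would nest an element inside another of the same name. The sketch below assumes the caller tracks the currently open ancestors itself; `NestingCheck` and `isForbiddenNesting` are hypothetical names, and the set contents in `main` are stand-ins rather than the set returned by the library.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class NestingCheck {
    // Returns true if tagName is in the nesting-forbidden set and an element
    // of the same name is already open as a direct or indirect ancestor.
    static boolean isForbiddenNesting(String tagName, Iterable<String> openAncestors,
                                      Set<String> nestingForbidden) {
        String name = tagName.toLowerCase(Locale.ROOT);
        if (!nestingForbidden.contains(name)) return false;
        for (String ancestor : openAncestors) {
            if (ancestor.toLowerCase(Locale.ROOT).equals(name)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for getNestingForbiddenElementNames() (contents assumed).
        Set<String> forbidden = new HashSet<>(Arrays.asList("a", "form"));
        Deque<String> open = new ArrayDeque<>(Arrays.asList("body", "a", "span"));

        System.out.println(isForbiddenNesting("A", open, forbidden));    // true
        System.out.println(isForbiddenNesting("span", open, forbidden)); // false
        System.out.println(isForbiddenNesting("form", open, forbidden)); // false
    }
}
```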
-
-