Class EndTag
All Implemented Interfaces: java.lang.CharSequence, java.lang.Comparable<Segment>
public final class EndTag extends Tag
Represents the end tag of an element in a specific source document.
An end tag always has a type that is a subclass of EndTagType, meaning it always starts with the characters '</'.
EndTag instances are obtained using one of the following methods:
Element.getEndTag()
Tag.getNextTag()
Tag.getPreviousTag()
Source.getPreviousEndTag(int pos)
Source.getPreviousEndTag(int pos, String name)
Source.getPreviousTag(int pos)
Source.getPreviousTag(int pos, TagType)
Source.getNextEndTag(int pos)
Source.getNextEndTag(int pos, String name)
Source.getNextEndTag(int pos, String name, EndTagType)
Source.getNextTag(int pos)
Source.getNextTag(int pos, TagType)
Source.getEnclosingTag(int pos)
Source.getEnclosingTag(int pos, TagType)
Source.getTagAt(int pos)
Segment.getAllTags()
Segment.getAllTags(TagType)
The Tag superclass defines the getName() method used to get the name of this end tag.
See also the XML 1.0 specification for end tags.
Method Summary
Modifier and Type         Method                                   Description
static java.lang.String   generateHTML(java.lang.String tagName)   Generates the HTML text of a normal end tag with the specified tag name.
java.lang.String          getDebugInfo()                           Returns a string representation of this object useful for debugging purposes.
Element                   getElement()                             Returns the element that is ended by this end tag.
EndTagType                getEndTagType()                          Returns the type of this end tag.
TagType                   getTagType()                             Returns the type of this tag.
boolean                   isUnregistered()                         Indicates whether this tag has a syntax that does not match any of the registered tag types.
java.lang.String          tidy()                                   Returns an XML representation of this end tag.
Methods inherited from class net.htmlparser.jericho.Tag
getName, getNameSegment, getNextTag, getPreviousTag, getUserData, isXMLName, isXMLNameChar, isXMLNameStartChar, setUserData
-
Methods inherited from class net.htmlparser.jericho.Segment
charAt, compareTo, encloses, encloses, equals, getAllCharacterReferences, getAllElements, getAllElements, getAllElements, getAllElements, getAllElements, getAllElementsByClass, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTags, getAllStartTagsByClass, getAllTags, getAllTags, getBegin, getChildElements, getEnd, getFirstElement, getFirstElement, getFirstElement, getFirstElement, getFirstElementByClass, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTag, getFirstStartTagByClass, getFormControls, getFormFields, getMaxDepthIndicator, getNodeIterator, getRenderer, getRowColumnVector, getSource, getStyleURISegments, getTextExtractor, getURIAttributes, hashCode, ignoreWhenParsing, isWhiteSpace, isWhiteSpace, length, parseAttributes, subSequence, toString
Method Detail
-
getElement
public Element getElement()
Returns the element that is ended by this end tag.
Returns null if this end tag is not properly matched to any start tag in the source document.
This method is much less efficient than the StartTag.getElement() method.
IMPLEMENTATION NOTE: This method is relatively inefficient because more than one start tag type can have the same corresponding end tag type, so it is not possible to know for certain which type of start tag this end tag is matched to (see EndTagType.getCorrespondingStartTagType() for more explanation). Because of this uncertainty, the implementation of this method must check every start tag preceding this end tag, calling its StartTag.getElement() method to see whether it is terminated by this end tag.
- Specified by: getElement in class Tag
- Returns: the element that is ended by this end tag.
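The cost described in the implementation note can be illustrated with a small stand-alone model. This is only a sketch and not the library's code: StartTagModel, findOwner and the character positions are hypothetical names and values introduced here, and scanning from the nearest preceding start tag backwards is an assumption about the search order.

```java
import java.util.List;

final class EndTagLookupSketch {
    // Hypothetical stand-in for a start tag: its name, where it begins, and where
    // the end tag that terminates its element begins (-1 if it has no end tag).
    static final class StartTagModel {
        final String name;
        final int begin;
        final int matchedEndTagBegin;

        StartTagModel(String name, int begin, int matchedEndTagBegin) {
            this.name = name;
            this.begin = begin;
            this.matchedEndTagBegin = matchedEndTagBegin;
        }
    }

    // Checks every start tag that precedes the end tag, nearest first, and returns
    // the one whose element is terminated by that end tag, or null if none is.
    static StartTagModel findOwner(List<StartTagModel> startTagsInDocumentOrder, int endTagBegin) {
        for (int i = startTagsInDocumentOrder.size() - 1; i >= 0; i--) {
            StartTagModel candidate = startTagsInDocumentOrder.get(i);
            if (candidate.begin >= endTagBegin) {
                continue; // this start tag does not precede the end tag
            }
            if (candidate.matchedEndTagBegin == endTagBegin) {
                return candidate;
            }
        }
        return null; // the end tag is not matched to any start tag
    }

    public static void main(String[] args) {
        // Models "<div><p>text</p></div>": <div> at 0, <p> at 5, </p> at 12, </div> at 16.
        List<StartTagModel> tags = List.of(
                new StartTagModel("div", 0, 16),
                new StartTagModel("p", 5, 12));
        System.out.println(findOwner(tags, 16).name); // prints "div"
        System.out.println(findOwner(tags, 12).name); // prints "p"
        System.out.println(findOwner(tags, 99));      // prints "null"
    }
}
```

The loop is linear in the number of preceding start tags, which is why obtaining the element from the start tag is preferable when the start tag is already at hand.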
-
getEndTagType
public EndTagType getEndTagType()
Returns the type of this end tag.
This is equivalent to (EndTagType)getTagType().
- Returns: the type of this end tag.
-
getTagType
public TagType getTagType()
Description copied from class: Tag
Returns the type of this tag.
- Specified by: getTagType in class Tag
- Returns: the type of this tag.
-
isUnregistered
public boolean isUnregistered()
Description copied from class: Tag
Indicates whether this tag has a syntax that does not match any of the registered tag types.
The only requirement of an unregistered tag type is that it starts with '<' and there is a closing '>' character at some position after it in the source document.
The absence or presence of a '/' character after the initial '<' determines whether an unregistered tag is respectively a StartTag with a type of StartTagType.UNREGISTERED or an EndTag with a type of EndTagType.UNREGISTERED.
There are no restrictions on the characters that might appear between these delimiters, including other '<' characters. This may result in a '>' character that is identified as the closing delimiter of two separate tags, one an unregistered tag, and the other a tag of any type that begins in the middle of the unregistered tag. As explained below, unregistered tags are usually only found when specifically looking for them, so it is up to the user to detect and deal with any such nonsensical results.
Unregistered tags are only returned by the Source.getTagAt(int pos) method, named search methods, where the specified name matches the first characters inside the tag, and by tag type search methods, where the specified tagType is either StartTagType.UNREGISTERED or EndTagType.UNREGISTERED.
Open tag searches and other searches always ignore unregistered tags, although every discovery of an unregistered tag is logged by the parser.
The logic behind this design is that unregistered tag types are usually the result of a '<' character in the text that was mistakenly left unencoded, or a less-than operator inside a script, or some other occurrence which is of no interest to the user. By returning unregistered tags in named and tag type search methods, the library allows the user to specifically search for tags with a certain syntax that does not match any existing TagType. This expediency feature avoids the need for the user to create a custom tag type to define the syntax before searching for these tags. By not returning unregistered tags in the less specific search methods, it is providing only the information that most users are interested in.
- Specified by: isUnregistered in class Tag
- Returns: true if this tag has a syntax that does not match any of the registered tag types, otherwise false.
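The start/end distinction for unregistered tags can be sketched as a stand-alone check. This is an illustration only, not the parser's implementation: UnregisteredTagSketch, Kind and classify are hypothetical names, and the sketch assumes the text has already been found not to match any registered tag type, which it does not verify itself.

```java
final class UnregisteredTagSketch {
    enum Kind { START_TAG, END_TAG, NOT_A_TAG }

    // Applies only the two rules stated above: the text must start with '<' and
    // contain a '>' somewhere after it, and a '/' directly after the '<' marks
    // an end tag. Nothing else about the content is restricted.
    static Kind classify(String text) {
        if (text.isEmpty() || text.charAt(0) != '<' || text.indexOf('>', 1) < 0) {
            return Kind.NOT_A_TAG;
        }
        return text.charAt(1) == '/' ? Kind.END_TAG : Kind.START_TAG;
    }

    public static void main(String[] args) {
        System.out.println(classify("</#custom>"));      // prints "END_TAG"
        System.out.println(classify("<#custom a < b>")); // prints "START_TAG"
        System.out.println(classify("a < b"));           // prints "NOT_A_TAG"
    }
}
```

The second call shows why nested '<' characters are tolerated: the rule looks only at the first character, the character after it, and the existence of a later '>'.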
-
tidy
public java.lang.String tidy()
Returns an XML representation of this end tag.
The tidying of the tag is carried out as follows:
- if this end tag is a NORMAL end tag then any white space before the closing angle bracket is removed.
- otherwise the original source text of the entire tag is returned.
- Specified by: tidy in class Tag
- Returns: an XML representation of this end tag.
- See Also: StartTag.tidy()
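The two tidying rules can be modelled with a short stand-alone method. This is a sketch, not the library's implementation: EndTagTidySketch and its tidy method are hypothetical, the caller supplies whether the tag is a NORMAL end tag, and treating "white space" as the characters matched by the regular expression class \s is an assumption.

```java
final class EndTagTidySketch {
    // Mirrors the two rules above: a normal end tag loses any white space that
    // sits directly before its closing '>', any other end tag is returned as is.
    static String tidy(String endTagText, boolean isNormalEndTag) {
        if (!isNormalEndTag) {
            return endTagText;
        }
        return endTagText.replaceAll("\\s+>$", ">");
    }

    public static void main(String[] args) {
        System.out.println(tidy("</div  >", true));  // prints "</div>"
        System.out.println(tidy("</div>", true));    // prints "</div>"
        System.out.println(tidy("</div  >", false)); // prints "</div  >"
    }
}
```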
-
generateHTML
public static java.lang.String generateHTML(java.lang.String tagName)
Generates the HTML text of a normal end tag with the specified tag name.
- Example:
The following method call:
EndTag.generateHTML("INPUT")
returns the following output:
</INPUT>
- Parameters: tagName - the name of the end tag.
- Returns: the HTML text of a normal end tag with the specified tag name.
- See Also: StartTag.generateHTML(String tagName, Map attributesMap, boolean emptyElementTag)
-
getDebugInfo
public java.lang.String getDebugInfo()
Description copied from class: Segment
Returns a string representation of this object useful for debugging purposes.
- Overrides: getDebugInfo in class Segment
- Returns: a string representation of this object useful for debugging purposes.