Package | Description |
---|---|
org.htmlparser |
The basic API classes which will be used by most developers when working with
the HTML Parser.
|
org.htmlparser.beans |
The beans package contains Java Beans using the HTML Parser.
|
org.htmlparser.lexer |
The lexer package is the base level I/O subsystem.
|
org.htmlparser.nodeDecorators |
The nodeDecorators package contains classes that use the Decorator pattern.
|
org.htmlparser.nodes |
The nodes package has the concrete node implementations.
|
org.htmlparser.tags |
The tags package contains specific tags.
|
org.htmlparser.visitors |
The visitors package contains classes that use the Visitor pattern.
|
Modifier and Type | Field and Description |
---|---|
protected Text |
PrototypicalNodeFactory.mText
The prototypical text node.
|
Modifier and Type | Method and Description |
---|---|
Text |
PrototypicalNodeFactory.createStringNode(Page page,
int start,
int end)
Create a new string node.
|
Text |
StringNodeFactory.createStringNode(Page page,
int start,
int end)
Deprecated.
Create a new string node.
|
Text |
NodeFactory.createStringNode(Page page,
int start,
int end)
Create a new text node.
|
Text |
PrototypicalNodeFactory.getTextPrototype()
Get the object that is cloned to generate text nodes.
|
Modifier and Type | Method and Description |
---|---|
void |
PrototypicalNodeFactory.setTextPrototype(Text text)
Set the object to be used to generate text nodes.
|
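The `getTextPrototype` / `setTextPrototype` descriptions above say that the factory clones a prototype object to generate each text node. The following is a minimal, self-contained sketch of that cloning idea only. `SketchText`, `SpaceNormalizingText`, and `SketchFactory` are hypothetical stand-ins invented for illustration; they are not the library's `Text`, `TextNode`, or `PrototypicalNodeFactory`, and the real classes may differ in detail.

```java
public class PrototypeSketch {

    /** Hypothetical stand-in for a text node (not the library's Text interface). */
    static class SketchText implements Cloneable {
        private String source = "";
        private int start;
        private int end;

        /** Point this node at a character range of the source string. */
        void setRange(String source, int start, int end) {
            this.source = source;
            this.start = start;
            this.end = end;
        }

        String getText() {
            return source.substring(start, end);
        }

        @Override
        public SketchText clone() {
            try {
                // Object.clone() preserves the runtime class, so a subclass
                // prototype yields subclass nodes.
                return (SketchText) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** A direct subclass that alters behaviour: non-breaking spaces become plain spaces. */
    static class SpaceNormalizingText extends SketchText {
        @Override
        String getText() {
            return super.getText().replace('\u00a0', ' ');
        }
    }

    /** Hypothetical stand-in for a prototype-based node factory. */
    static class SketchFactory {
        private SketchText prototype = new SketchText();

        SketchText getTextPrototype() {
            return prototype;
        }

        void setTextPrototype(SketchText text) {
            prototype = text;
        }

        /** Clone the prototype, then fill in the position of the new node. */
        SketchText createStringNode(String page, int start, int end) {
            SketchText node = prototype.clone();
            node.setRange(page, start, end);
            return node;
        }
    }

    public static void main(String[] args) {
        SketchFactory factory = new SketchFactory();
        factory.setTextPrototype(new SpaceNormalizingText());
        SketchText node = factory.createStringNode("a\u00a0b", 0, 3);
        // prints "SpaceNormalizingText: a b"
        System.out.println(node.getClass().getSimpleName() + ": " + node.getText());
    }
}
```

Because every node is a clone of whatever prototype was set, swapping the prototype for a subclass changes the behaviour of all nodes created afterwards without touching the factory itself.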
Modifier and Type | Method and Description |
---|---|
void |
StringBean.visitStringNode(Text string)
Appends the text to the output.
|
Modifier and Type | Method and Description |
---|---|
Text |
Lexer.createStringNode(Page page,
int start,
int end)
Create a new string node.
|
Modifier and Type | Class and Description |
---|---|
class |
AbstractNodeDecorator
Deprecated.
Use direct subclasses or dynamic proxies instead.
Use either direct subclasses of the appropriate node and set them on the
Here is an example of how to use dynamic proxies to accomplish the same effect as using decorators to wrap Text nodes:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;

    import org.htmlparser.Parser;
    import org.htmlparser.PrototypicalNodeFactory;
    import org.htmlparser.Text;
    import org.htmlparser.nodes.TextNode;
    import org.htmlparser.util.ParserException;

    public class TextProxy implements InvocationHandler {
        protected Object mObject;

        public static Object newInstance (Object object) {
            Class cls;

            cls = object.getClass ();
            return (Proxy.newProxyInstance (
                cls.getClassLoader (),
                cls.getInterfaces (),
                new TextProxy (object)));
        }

        private TextProxy (Object object) {
            mObject = object;
        }

        public Object invoke (Object proxy, Method m, Object[] args) throws Throwable {
            Object result;
            String name;

            try {
                result = m.invoke (mObject, args);
                name = m.getName ();
                if (name.equals ("clone"))
                    result = newInstance (result); // wrap the cloned object
                else if (name.equals ("doSemanticAction")) // or other methods
                    System.out.println (mObject); // do the needful on the TextNode
            } catch (InvocationTargetException e) {
                throw e.getTargetException ();
            } catch (Exception e) {
                throw new RuntimeException ("unexpected invocation exception: " + e.getMessage());
            } finally {
            }

            return (result);
        }

        public static void main (String[] args) throws ParserException {
            // create the wrapped text node and set it as the prototype
            Text text = (Text) TextProxy.newInstance (new TextNode (null, 0, 0));
            PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
            factory.setTextPrototype (text);
            // perform the parse
            Parser parser = new Parser (args[0]);
            parser.setNodeFactory (factory);
            parser.parse (null);
        }
    }
|
class |
DecodingNode
Deprecated.
Use direct subclasses or dynamic proxies instead.
Use either direct subclasses of the appropriate node and set them on the
|
class |
EscapeCharacterRemovingNode
Deprecated.
Use direct subclasses or dynamic proxies instead.
Use either direct subclasses of the appropriate node and set them on the
|
class |
NonBreakingSpaceConvertingNode
Deprecated.
Use direct subclasses or dynamic proxies instead.
Use either direct subclasses of the appropriate node and set them on the
|
Modifier and Type | Field and Description |
---|---|
protected Text |
AbstractNodeDecorator.delegate
Deprecated.
|
Constructor and Description |
---|
AbstractNodeDecorator(Text delegate)
Deprecated.
|
DecodingNode(Text node)
Deprecated.
|
EscapeCharacterRemovingNode(Text newDelegate)
Deprecated.
|
NonBreakingSpaceConvertingNode(Text newDelegate)
Deprecated.
|
Modifier and Type | Class and Description |
---|---|
class |
TextNode
Normal text in the HTML document is represented by this class.
|
Modifier and Type | Method and Description |
---|---|
Text[] |
CompositeTag.digupStringNode(java.lang.String searchText)
Finds a text node, however embedded it might be, and returns
it.
|
Modifier and Type | Method and Description |
---|---|
void |
StringFindingVisitor.visitStringNode(Text stringNode) |
void |
NodeVisitor.visitStringNode(Text string)
Called for each StringNode visited. |
void |
UrlModifyingVisitor.visitStringNode(Text stringNode) |
void |
TextExtractingVisitor.visitStringNode(Text stringNode) |
HTML Parser is an open source library released under LGPL.