org.apache.shale.clay.parser
Class Parser

java.lang.Object
  extended by org.apache.shale.clay.parser.Parser

public class Parser
extends Object

Parses the document into a tree of nodes using the NodeTokenizer. Each node is defined by a Token, which marks an offset range in the document. Attributes in beginning nodes are also parsed into token offsets by the AttributeTokenizer.

A document tree is built representing nodes in the target document. The document can be an HTML fragment that is not well-formed or an XML fragment of an XHTML document.


Field Summary
static org.apache.shale.clay.parser.Parser.Rule[] BEGIN_CDATA_RULES
          Declare an array of Parser.Rules that validate a begin CDATA Token.
static org.apache.shale.clay.parser.Parser.Rule[] BEGIN_COMMENT_TAG_RULES
          Declare an array of Parser.Rules that validate a begin comment Token.
static org.apache.shale.clay.parser.Parser.Rule[] BEGIN_TAG_RULES
          Declare an array of Parser.Rules that validate a beginning Token.
static org.apache.shale.clay.parser.Parser.Rule[] DOCTYPE_TAG_RULES
          Declare an array of Parser.Rules that validate a document type Token.
static org.apache.shale.clay.parser.Parser.Rule[] END_CDATA_RULES
          Declare an array of Parser.Rules that validate an end CDATA Token.
static String END_CHARSET_TOKEN
          The end of the comment token used to override the template encoding type.
static org.apache.shale.clay.parser.Parser.Rule[] END_COMMENT_TAG_RULES
          Declare an array of Parser.Rules that validate an end comment Token.
static String START_CHARSET_TOKEN
          The start of the comment token used to override the template encoding type.
 
Constructor Summary
Parser()
           
 
Method Summary
protected  Node buildNode(Token token)
          This is a factory method that builds a Node from a Token.
protected  void discoverNodeAttributes(Node node)
          If the Node is a starting tag and not a comment, use the AttributeTokenizer to realize the node attributes.
protected  void discoverNodeName(Node node)
          Extracts the node name from the Token if the Node is a starting or ending tag.
protected  void discoverNodeOverrides(Node node)
          Explicitly sets the isEnd Node property to true for self terminating tags.
protected  void discoverNodeShape(Node node)
          Determine if the Node is a starting, ending, or body text tag.
protected  Node findBeginingNode(Node current, Node node)
           
protected  boolean isNodeNameEqual(Node node1, Node node2)
          Compares two Node instances by name.
protected  boolean isOptionalEndingTag(String nodeName)
           Determines if an HTML nodeName is a type of tag that can optionally have an ending tag.
protected  boolean isSelfTerminating(String nodeName)
           Checks to see if the nodeName is within the SELF_TERMINATING table of values.
protected  boolean isValidOptionalEndingTagParent(String nodeName, String parentNodeName)
           Checks to see if an optional ending tag has a valid parent.
 List parse(StringBuffer document)
           Parse a document fragment into graphs of Node.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

START_CHARSET_TOKEN

public static final String START_CHARSET_TOKEN

The start of the comment token used to override the template encoding type.

See Also:
Constant Field Values

END_CHARSET_TOKEN

public static final String END_CHARSET_TOKEN

The end of the comment token used to override the template encoding type.

See Also:
Constant Field Values

BEGIN_CDATA_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_CDATA_RULES

Declare an array of Parser.Rules that validate a begin CDATA Token.


END_CDATA_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] END_CDATA_RULES

Declare an array of Parser.Rules that validate an end CDATA Token.


BEGIN_COMMENT_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_COMMENT_TAG_RULES

Declare an array of Parser.Rules that validate a begin comment Token.


END_COMMENT_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] END_COMMENT_TAG_RULES

Declare an array of Parser.Rules that validate an end comment Token.


DOCTYPE_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] DOCTYPE_TAG_RULES

Declare an array of Parser.Rules that validate a document type Token.


BEGIN_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_TAG_RULES

Declare an array of Parser.Rules that validate a beginning Token.

Constructor Detail

Parser

public Parser()
Method Detail

isOptionalEndingTag

protected boolean isOptionalEndingTag(String nodeName)

Determines if an HTML nodeName is a type of tag that can optionally have an ending tag.

Parameters:
nodeName - the name of the HTML node
Returns:
true if the nodeName is in the OPTIONAL-ENDING_TAG array; otherwise, false is returned

isValidOptionalEndingTagParent

protected boolean isValidOptionalEndingTagParent(String nodeName,
                                                 String parentNodeName)

Checks to see if an optional ending tag has a valid parent. This is used to detect an implicit ending tag.

Parameters:
nodeName - name of the optional ending tag
parentNodeName - name of the parent
Returns:
true if the parentNodeName is a valid parent for the nodeName; otherwise, a false value is returned
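
The two checks above can be illustrated with a small lookup table. This is a minimal sketch, not the Shale implementation: the class name OptionalEndTagSketch, the table contents, the case-insensitive lookup, and the closesImplicitly helper are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the table below is illustrative and is NOT Shale's actual data.
class OptionalEndTagSketch {

    // Maps a tag whose ending tag is optional to the parents it may appear in.
    private static final Map<String, Set<String>> VALID_PARENTS =
            new HashMap<String, Set<String>>();

    static {
        VALID_PARENTS.put("li", new HashSet<String>(Arrays.asList("ul", "ol")));
        VALID_PARENTS.put("td", new HashSet<String>(Arrays.asList("tr")));
        VALID_PARENTS.put("tr", new HashSet<String>(
                Arrays.asList("table", "tbody", "thead", "tfoot")));
        VALID_PARENTS.put("option", new HashSet<String>(
                Arrays.asList("select", "optgroup")));
    }

    // True if the tag may omit its ending tag (assumed to be case-insensitive).
    static boolean isOptionalEndingTag(String nodeName) {
        return nodeName != null
                && VALID_PARENTS.containsKey(nodeName.toLowerCase(Locale.ROOT));
    }

    // True if parentNodeName is listed as a valid parent for nodeName.
    static boolean isValidOptionalEndingTagParent(String nodeName, String parentNodeName) {
        if (nodeName == null || parentNodeName == null) {
            return false;
        }
        Set<String> parents = VALID_PARENTS.get(nodeName.toLowerCase(Locale.ROOT));
        return parents != null && parents.contains(parentNodeName.toLowerCase(Locale.ROOT));
    }

    // Assumed rule: an open optional-ending tag ends implicitly when the next
    // optional-ending tag cannot legally be nested inside it.
    static boolean closesImplicitly(String openName, String nextName) {
        return isOptionalEndingTag(openName)
                && isOptionalEndingTag(nextName)
                && !isValidOptionalEndingTagParent(nextName, openName);
    }
}
```

Under these assumptions, an open li followed by another li is treated as implicitly ended, while an open tr followed by a td is not, because tr is a valid parent of td.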

findBeginingNode

protected Node findBeginingNode(Node current,
                                Node node)

Finds the beginning Node that matches the given ending Node, searching from the top of the stack.

Parameters:
current - top of the stack
node - ending node
Returns:
beginning node
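
One way to picture this lookup is a walk up the chain of open nodes until a name matches the ending node. The sketch below is an assumption about the general technique, not the Shale source: SketchNode is a hypothetical stand-in for Node, and both the case-insensitive comparison and the null result for an unmatched ending tag are choices made for the example.

```java
// Hypothetical sketch of matching an ending tag against the stack of open nodes.
class BeginNodeSearchSketch {

    // Stand-in for Node: only a name and a link to the enclosing open node.
    static final class SketchNode {
        final String name;
        final SketchNode parent;

        SketchNode(String name, SketchNode parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Compares two nodes by name (case-insensitive here, which is an assumption).
    static boolean isNodeNameEqual(SketchNode node1, SketchNode node2) {
        return node1.name != null && node1.name.equalsIgnoreCase(node2.name);
    }

    // Walks from the top of the stack toward the root and returns the first
    // open node whose name matches the ending node, or null if none matches.
    static SketchNode findBeginningNode(SketchNode current, SketchNode endNode) {
        for (SketchNode n = current; n != null; n = n.parent) {
            if (isNodeNameEqual(n, endNode)) {
                return n;
            }
        }
        return null;
    }
}
```

With open nodes div, p, and b (innermost last), an ending p resolves to the open p and skips the unclosed b; an ending span finds no match.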

parse

public List parse(StringBuffer document)

Parse a document fragment into graphs of Node. The resulting type is a list because the fragment might not be well-formed.

Parameters:
document - input source
Returns:
collection of Node
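
To show why a list is returned rather than a single root, here is a self-contained sketch that builds node graphs from a fragment. It is a simplified stand-in and makes several assumptions: it uses a regular expression instead of the NodeTokenizer, SketchNode is a hypothetical replacement for Node, and comments, CDATA, attributes, and optional ending tags are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the Shale parser.
class FragmentParseSketch {

    // Stand-in for Node: an element (name set) or body text (text set).
    static final class SketchNode {
        final String name;
        final String text;
        final List<SketchNode> children = new ArrayList<SketchNode>();
        SketchNode parent;

        SketchNode(String name, String text) {
            this.name = name;
            this.text = text;
        }
    }

    // Group 1: "/" for an ending tag; group 2: tag name; group 3: "/" when self-closed.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>");

    static List<SketchNode> parse(StringBuffer document) {
        List<SketchNode> roots = new ArrayList<SketchNode>();
        SketchNode current = null; // innermost open element
        int last = 0;
        Matcher m = TAG.matcher(document);
        while (m.find()) {
            if (m.start() > last) {
                attach(roots, current, new SketchNode(null, document.substring(last, m.start())));
            }
            last = m.end();
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (m.group(1).length() > 0) {
                // Ending tag: close the nearest open element with the same name.
                SketchNode n = current;
                while (n != null && !n.name.equals(name)) {
                    n = n.parent;
                }
                if (n != null) {
                    current = n.parent; // an unmatched ending tag is ignored
                }
            } else {
                SketchNode node = new SketchNode(name, null);
                attach(roots, current, node);
                if (m.group(3).length() == 0) {
                    current = node; // stays open until its ending tag
                }
            }
        }
        if (last < document.length()) {
            attach(roots, current, new SketchNode(null, document.substring(last)));
        }
        return roots;
    }

    // Adds the child under its parent, or to the root list when nothing is open.
    private static void attach(List<SketchNode> roots, SketchNode parent, SketchNode child) {
        child.parent = parent;
        if (parent == null) {
            roots.add(child);
        } else {
            parent.children.add(child);
        }
    }
}
```

For the fragment `<p>one</p><p>two`, this sketch returns two root nodes. The second p is never closed, yet its body text is still attached to it, which is the kind of input a single-root tree could not represent.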

isNodeNameEqual

protected boolean isNodeNameEqual(Node node1,
                                  Node node2)

Compares two Node instances by name. This method is used to match a beginning tag with an ending tag while building the document stack. Returns true if the node name properties are the same.

Parameters:
node1 - first node
node2 - second node
Returns:
true if they are the same

isSelfTerminating

protected boolean isSelfTerminating(String nodeName)

Checks to see if the nodeName is within the SELF_TERMINATING table of values.

Parameters:
nodeName - to check for self termination
Returns:
true if the nodeName is self terminating; otherwise, false

buildNode

protected Node buildNode(Token token)

This is a factory method that builds a Node from a Token.

Parameters:
token - node offset in the document
Returns:
node that describes the structure of the token

discoverNodeShape

protected void discoverNodeShape(Node node)

Determine if the Node is a starting, ending, or body text tag. The array of Parser.Shapes is used to determine the type of Node the Token represents.

Parameters:
node - target node

discoverNodeName

protected void discoverNodeName(Node node)

Extracts the node name from the Token if the Node is a starting or ending tag.

Parameters:
node - target

discoverNodeAttributes

protected void discoverNodeAttributes(Node node)

If the Node is a starting tag and not a comment, use the AttributeTokenizer to realize the node attributes.

Parameters:
node - target

discoverNodeOverrides

protected void discoverNodeOverrides(Node node)

Explicitly sets the isEnd Node property to true for self terminating tags. Sets the Node's isWellFormed property to true if the isStart and isEnd Node properties are true.

Parameters:
node - target
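
The override logic described here, together with the isSelfTerminating lookup, might be sketched as follows. This is an illustration under assumptions rather than the Shale code: SketchNode is a hypothetical stand-in for Node with public flags, and the contents of the SELF_TERMINATING set are guesses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not the Shale implementation.
class NodeOverridesSketch {

    // Assumed contents; the real SELF_TERMINATING table may differ.
    private static final Set<String> SELF_TERMINATING = new HashSet<String>(
            Arrays.asList("br", "hr", "img", "input", "meta", "link"));

    // Stand-in for Node carrying only the properties the description mentions.
    static final class SketchNode {
        final String name;
        boolean start;
        boolean end;
        boolean wellFormed;

        SketchNode(String name, boolean start, boolean end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // True if the name is in the table (case-insensitive here, an assumption).
    static boolean isSelfTerminating(String nodeName) {
        return nodeName != null
                && SELF_TERMINATING.contains(nodeName.toLowerCase(Locale.ROOT));
    }

    static void discoverNodeOverrides(SketchNode node) {
        // A self terminating tag such as <br> is treated as already ended.
        if (isSelfTerminating(node.name)) {
            node.end = true;
        }
        // A node that both starts and ends is marked well-formed.
        if (node.start && node.end) {
            node.wellFormed = true;
        }
    }
}
```

In this sketch a starting br node ends up with both end and wellFormed set, while a starting p node with no ending tag keeps both flags false.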


Copyright © 2004-2007 Apache Software Foundation. All Rights Reserved.