Parser (Shale Clay 1.0.4 API)

org.apache.shale.clay.parser
Class Parser

java.lang.Object
  org.apache.shale.clay.parser.Parser

public class Parser
extends Object

Parses the document into a tree of nodes using the NodeTokenizer. Each node is defined by a Token, an offset range within the document. Attributes in beginning nodes are also parsed into token offsets by the AttributeTokenizer.

A document tree is built representing the nodes in the target document. The document can be an HTML fragment that is not well-formed or an XML fragment of an XHTML document.
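
The idea of describing each node by an offset range rather than a copied string can be illustrated with a minimal, self-contained sketch. `OffsetToken` and `OffsetTokenizerSketch` are hypothetical stand-ins, not the library's `Token` or `NodeTokenizer`, and the scan is deliberately naive (it does not handle comments, CDATA, or `>` inside attribute values).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token: an offset range into the document buffer.
final class OffsetToken {
    final int begin; // inclusive offset
    final int end;   // exclusive offset

    OffsetToken(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    // Resolves the range against the document only when the text is needed.
    String text(CharSequence doc) {
        return doc.subSequence(begin, end).toString();
    }
}

final class OffsetTokenizerSketch {
    // Splits the document into markup tokens ("<...>") and the text runs between them.
    static List<OffsetToken> tokenize(CharSequence doc) {
        List<OffsetToken> tokens = new ArrayList<OffsetToken>();
        int i = 0;
        while (i < doc.length()) {
            int start = i;
            if (doc.charAt(i) == '<') {
                // Scan to the closing '>' (or to the end of a truncated fragment).
                while (i < doc.length() && doc.charAt(i) != '>') {
                    i++;
                }
                if (i < doc.length()) {
                    i++; // include the '>'
                }
            } else {
                // Body text runs until the next '<'.
                while (i < doc.length() && doc.charAt(i) != '<') {
                    i++;
                }
            }
            tokens.add(new OffsetToken(start, i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        StringBuffer doc = new StringBuffer("<p>Hello<br>world");
        for (OffsetToken t : tokenize(doc)) {
            System.out.println(t.begin + "-" + t.end + ": " + t.text(doc));
        }
    }
}
```

Keeping only offsets means the document buffer is shared by all tokens, and text is materialized only on demand.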

Field Summary
`static org.apache.shale.clay.parser.Parser.Rule[]`	`BEGIN_CDATA_RULES` Declare an array of `Parser.Rule`s that validate a begin CDATA `Token`.
`static org.apache.shale.clay.parser.Parser.Rule[]`	`BEGIN_COMMENT_TAG_RULES` Declare an array of `Parser.Rule`s that validate a begin comment `Token`.
`static org.apache.shale.clay.parser.Parser.Rule[]`	`BEGIN_TAG_RULES` Declare an array of `Parser.Rule`s that validate a beginning `Token`.
`static org.apache.shale.clay.parser.Parser.Rule[]`	`DOCTYPE_TAG_RULES` Declare an array of `Parser.Rule`s that validate a document type `Token`.
`static org.apache.shale.clay.parser.Parser.Rule[]`	`END_CDATA_RULES` Declare an array of `Parser.Rule`s that validate an end CDATA `Token`.
`static String`	`END_CHARSET_TOKEN` The end of the comment token used to override the template encoding type.
`static org.apache.shale.clay.parser.Parser.Rule[]`	`END_COMMENT_TAG_RULES` Declare an array of `Parser.Rule`s that validate an end comment `Token`.
`static String`	`START_CHARSET_TOKEN` The start of the comment token used to override the template encoding type.

Constructor Summary
`Parser()`

Method Summary
`protected Node`	`buildNode(Token token)` This is a factory method that builds a `Node` from a `Token`.
`protected void`	`discoverNodeAttributes(Node node)` If the `Node` is a starting tag and not a comment, use the `AttributeTokenizer` to realize the node attributes.
`protected void`	`discoverNodeName(Node node)` Extracts the node name from the `Token` if the `Node` is a starting or ending tag.
`protected void`	`discoverNodeOverrides(Node node)` Explicitly sets the `isEnd` `Node` property to `true` for self terminating tags.
`protected void`	`discoverNodeShape(Node node)` Determine if the `Node` is a starting, ending, or body text tag.
`protected Node`	`findBeginingNode(Node current, Node node)`
`protected boolean`	`isNodeNameEqual(Node node1, Node node2)` Compares two `Node` instances by `name`.
`protected boolean`	`isOptionalEndingTag(String nodeName)` Determines if an HTML nodeName is a type of tag that can optionally have an ending tag.
`protected boolean`	`isSelfTerminating(String nodeName)` Checks to see if the nodeName is within the `SELF_TERMINATING` table of values.
`protected boolean`	`isValidOptionalEndingTagParent(String nodeName, String parentNodeName)` Checks to see if an optional ending tag has a valid parent.
`List`	`parse(StringBuffer document)` Parse a document fragment into graphs of `Node`.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

START_CHARSET_TOKEN

public static final String START_CHARSET_TOKEN

The start of the comment token used to override the template encoding type.

See Also: Constant Field Values

END_CHARSET_TOKEN

public static final String END_CHARSET_TOKEN

The end of the comment token used to override the template encoding type.

See Also: Constant Field Values

BEGIN_CDATA_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_CDATA_RULES

Declare an array of Parser.Rules that validate a begin CDATA Token.

END_CDATA_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] END_CDATA_RULES

Declare an array of Parser.Rules that validate an end CDATA Token.

BEGIN_COMMENT_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_COMMENT_TAG_RULES

Declare an array of Parser.Rules that validate a begin comment Token.

END_COMMENT_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] END_COMMENT_TAG_RULES

Declare an array of Parser.Rules that validate an end comment Token.

DOCTYPE_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] DOCTYPE_TAG_RULES

Declare an array of Parser.Rules that validate a document type Token.

BEGIN_TAG_RULES

public static final org.apache.shale.clay.parser.Parser.Rule[] BEGIN_TAG_RULES

Declare an array of Parser.Rules that validate a beginning Token.

Constructor Detail

Parser

public Parser()

Method Detail

isOptionalEndingTag

protected boolean isOptionalEndingTag(String nodeName)

Determines if an HTML nodeName is a type of tag that can optionally have an ending tag.

Parameters: nodeName - the name of the HTML node
Returns: true if the nodeName is in the OPTIONAL_ENDING_TAG array; otherwise, false

isValidOptionalEndingTagParent

protected boolean isValidOptionalEndingTagParent(String nodeName,
                                                 String parentNodeName)

Checks to see if an optional ending tag has a valid parent. This is used to detect an implicit ending tag.

Parameters: nodeName - name of the optional ending tag; parentNodeName - name of the parent
Returns: true if the parentNodeName is a valid parent for the nodeName; otherwise, false
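
As a rough illustration of the two checks above, the following self-contained sketch uses a lookup table keyed by tag name. The table contents, the case-insensitive comparison, and the class name `OptionalEndingTagSketch` are assumptions made for the example; the library's own tables are not shown in this documentation and may differ.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class OptionalEndingTagSketch {
    // Hypothetical table: optional-ending tag -> parents it may legally appear in.
    private static final Map<String, List<String>> VALID_PARENTS =
            new HashMap<String, List<String>>();

    static {
        VALID_PARENTS.put("li", Arrays.asList("ul", "ol"));
        VALID_PARENTS.put("td", Arrays.asList("tr"));
        VALID_PARENTS.put("tr", Arrays.asList("table", "tbody", "thead", "tfoot"));
        VALID_PARENTS.put("option", Arrays.asList("select", "optgroup"));
    }

    // True when the tag is one whose ending tag may be omitted (per the table above).
    static boolean isOptionalEndingTag(String nodeName) {
        return nodeName != null
                && VALID_PARENTS.containsKey(nodeName.toLowerCase(Locale.ROOT));
    }

    // True when parentNodeName is listed as a valid parent for nodeName.
    static boolean isValidOptionalEndingTagParent(String nodeName, String parentNodeName) {
        if (!isOptionalEndingTag(nodeName) || parentNodeName == null) {
            return false;
        }
        List<String> parents = VALID_PARENTS.get(nodeName.toLowerCase(Locale.ROOT));
        return parents.contains(parentNodeName.toLowerCase(Locale.ROOT));
    }
}
```

With a table like this, a parser that meets a second `li` while the first is still open can treat the first as implicitly closed, because `li` is not a valid parent of `li`.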

findBeginingNode

protected Node findBeginingNode(Node current,
                                Node node)

Parameters: current - top of the stack; node - ending node
Returns: beginning node
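
The method carries no description here, but the parameter names suggest a search from the top of the stack for the beginning node that matches an ending node. A minimal sketch of that reading follows; `StackNode`, its `parent` link, and the case-insensitive name match are assumptions for illustration, not the library's `Node` API.

```java
// Hypothetical node with a name and a link to the enclosing (parent) node.
final class StackNode {
    final String name;
    final StackNode parent;

    StackNode(String name, StackNode parent) {
        this.name = name;
        this.parent = parent;
    }
}

final class FindBeginningSketch {
    // Walks from the top of the stack toward the root and returns the first
    // open node whose name matches the ending tag, or null if none matches.
    static StackNode findBeginningNode(StackNode current, String endingName) {
        for (StackNode n = current; n != null; n = n.parent) {
            if (n.name.equalsIgnoreCase(endingName)) {
                return n;
            }
        }
        return null;
    }
}
```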

parse

public List parse(StringBuffer document)

Parse a document fragment into graphs of Node. The resulting type is a list because the fragment might not be well-formed.

Parameters: document - input source
Returns: collection of Node
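
Why the result is a list rather than a single root can be seen in a small stack-based sketch. It works on pre-split token strings and uses the hypothetical classes `TreeNode` and `FragmentTreeSketch`; it is not the library's implementation and ignores comments, CDATA, and optional ending tags.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;

// Hypothetical tree node holding the raw token text and its children.
final class TreeNode {
    final String text;
    final List<TreeNode> children = new ArrayList<TreeNode>();

    TreeNode(String text) {
        this.text = text;
    }
}

final class FragmentTreeSketch {
    // Extracts the tag name from "<name ...>" or "</name>".
    static String nameOf(String tag) {
        int start = tag.startsWith("</") ? 2 : 1;
        int end = start;
        while (end < tag.length() && Character.isLetterOrDigit(tag.charAt(end))) {
            end++;
        }
        return tag.substring(start, end).toLowerCase(Locale.ROOT);
    }

    // Builds a forest: every node seen while no tag is open becomes a new root.
    static List<TreeNode> build(List<String> tokens) {
        List<TreeNode> roots = new ArrayList<TreeNode>();
        Deque<TreeNode> open = new ArrayDeque<TreeNode>();
        for (String token : tokens) {
            if (token.startsWith("</")) {
                String name = nameOf(token);
                boolean matched = false;
                for (TreeNode n : open) {
                    if (nameOf(n.text).equals(name)) {
                        matched = true;
                        break;
                    }
                }
                if (matched) {
                    // Pop unclosed inner tags until the matching begin tag is popped.
                    while (!nameOf(open.pop().text).equals(name)) {
                        // nothing to do; the node stays attached to its parent
                    }
                }
                // A stray end tag with no open match is ignored in this sketch.
            } else {
                TreeNode node = new TreeNode(token);
                if (open.isEmpty()) {
                    roots.add(node);
                } else {
                    open.peek().children.add(node);
                }
                // Only begin tags that are not self-closed stay open.
                if (token.startsWith("<") && !token.endsWith("/>")) {
                    open.push(node);
                }
            }
        }
        return roots;
    }
}
```

A fragment such as `<p>one<b>bold</p><p>two` has two top-level `p` elements and no enclosing element, so a single root node could not represent it.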

isNodeNameEqual

protected boolean isNodeNameEqual(Node node1,
                                  Node node2)

Compares two Node instances by name. This method is used to match a beginning tag with an ending tag while building the document stack. Returns true if the node name properties are the same.

Parameters: node1 - first node; node2 - second node
Returns: true if they are the same

isSelfTerminating

protected boolean isSelfTerminating(String nodeName)

Checks to see if the nodeName is within the SELF_TERMINATING table of values.

Parameters: nodeName - to check for self termination
Returns: true if the nodeName is self-terminating; otherwise, false

buildNode

protected Node buildNode(Token token)

This is a factory method that builds a Node from a Token.

Parameters: token - node offset in the document
Returns: node that describes the structure of the token

discoverNodeShape

protected void discoverNodeShape(Node node)

Determine if the Node is a starting, ending, or body text tag. The array of Parser.Shapes is used to determine the type of Node the Token represents.

Parameters: node - target node
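
One way to picture shape discovery is a classification of the raw token text by its delimiters. The sketch below is an assumption-laden illustration: the `Shape` enum, its constants, and the prefix/suffix tests are hypothetical and are not the library's `Parser.Shape` or `Parser.Rule` definitions.

```java
final class NodeShapeSketch {
    // Hypothetical shapes; the library's Parser.Shape values are not documented here.
    enum Shape { COMMENT, END, START_AND_END, START, TEXT }

    // Classifies a token by its leading and trailing delimiters.
    // Order matters: the more specific prefixes are tested first.
    static Shape shapeOf(String token) {
        if (token.startsWith("<!--")) {
            return Shape.COMMENT;
        }
        if (token.startsWith("</")) {
            return Shape.END;
        }
        if (token.startsWith("<") && token.endsWith("/>")) {
            return Shape.START_AND_END;
        }
        if (token.startsWith("<") && token.endsWith(">")) {
            return Shape.START;
        }
        return Shape.TEXT;
    }
}
```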

discoverNodeName

protected void discoverNodeName(Node node)

Extracts the node name from the Token if the Node is a starting or ending tag.

Parameters: node - target

discoverNodeAttributes

protected void discoverNodeAttributes(Node node)

If the Node is a starting tag and not a comment, use the AttributeTokenizer to realize the node attributes.

Parameters: node - target

discoverNodeOverrides

protected void discoverNodeOverrides(Node node)

Explicitly sets the isEnd Node property to true for self-terminating tags. Sets the Node's isWellFormed property to true if both the isStart and isEnd Node properties are true.

Parameters: node - target
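
The two override rules described above can be sketched as follows. `FlagNode`, its public fields, and the contents of the `SELF_TERMINATING` set are assumptions for illustration; the library's `Node` class exposes these as properties, and its actual table of values is not listed in this documentation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical node carrying only the flags discussed above.
final class FlagNode {
    String name;
    boolean start;
    boolean end;
    boolean wellFormed;

    FlagNode(String name, boolean start, boolean end) {
        this.name = name;
        this.start = start;
        this.end = end;
    }
}

final class NodeOverridesSketch {
    // Assumed contents; the library's SELF_TERMINATING table may differ.
    private static final Set<String> SELF_TERMINATING =
            new HashSet<String>(Arrays.asList("br", "hr", "img", "input", "meta", "link"));

    static boolean isSelfTerminating(String nodeName) {
        return nodeName != null
                && SELF_TERMINATING.contains(nodeName.toLowerCase(Locale.ROOT));
    }

    static void applyOverrides(FlagNode node) {
        // Rule 1: a self-terminating tag is treated as already ended.
        if (isSelfTerminating(node.name)) {
            node.end = true;
        }
        // Rule 2: a node that both starts and ends is well-formed.
        if (node.start && node.end) {
            node.wellFormed = true;
        }
    }
}
```

Under these assumptions, `<br>` in an HTML fragment ends up well-formed without an explicit `/>`, while an unclosed `<div>` does not.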