java.lang.Object

org.htmlunit.html.DomNode

org.htmlunit.SgmlPage

All Implemented Interfaces: Serializable, Cloneable, Page, Document, Node

Direct Known Subclasses: HtmlPage, XmlPage

public abstract class SgmlPage extends DomNode implements Page, Document

Base class for pages based on Standard Generalized Markup Language (SGML), e.g. HTML and XML.

Nested Class Summary

Nested classes/interfaces inherited from class org.htmlunit.html.DomNode
DomNode.ChildIterator, DomNode.DescendantDomElementsIterator, DomNode.DescendantDomNodesIterator, DomNode.DescendantElementsIterator<T extends DomNode>, DomNode.DescendantHtmlElementsIterator
Field Summary

Fields inherited from class org.htmlunit.html.DomNode
PROPERTY_ELEMENT, READY_STATE_COMPLETE, READY_STATE_INTERACTIVE, READY_STATE_LOADED, READY_STATE_LOADING, READY_STATE_UNINITIALIZED

Fields inherited from interface org.w3c.dom.Node
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_POSITION_CONTAINED_BY, DOCUMENT_POSITION_CONTAINS, DOCUMENT_POSITION_DISCONNECTED, DOCUMENT_POSITION_FOLLOWING, DOCUMENT_POSITION_IMPLEMENTATION_SPECIFIC, DOCUMENT_POSITION_PRECEDING, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NOTATION_NODE, PROCESSING_INSTRUCTION_NODE, TEXT_NODE
Constructor Summary

Constructors

Constructor

Description

SgmlPage(WebResponse webResponse, WebWindow webWindow)

Creates an instance of SgmlPage.
Method Summary

Modifier and Type

Method

Description

String

asXml()

Returns a string representation, as an XML document, of this element and all its children (recursively).
The charset named in the XML header is the current page encoding, but the result is still a string.

void

characterDataChangeListenerAdded()

Informs about the use of a characterDataChangeListener.

void

cleanUp()

Clean up this page.

void

clearComputedStyles()

INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles.

void

clearComputedStyles(DomElement element)

INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element.

void

clearComputedStylesUpToRoot(DomElement element)

INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element and all parent elements.

protected SgmlPage

clone()

Creates a clone of this instance.

DomAttr

createAttribute(String name)

CDATASection

createCDATASection(String data)

Comment

createComment(String data)

DomDocumentFragment

createDocumentFragment()

Creates an empty DomDocumentFragment object.

DomNodeIterator

createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion)

Create a new NodeIterator over the subtree rooted at the specified node.

Text

createTextNode(String data)

void

domChangeListenerAdded()

Informs about the use of a domChangeListener.

String

getCanonicalXPath()

Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".

abstract Charset

getCharset()

Returns the encoding.

abstract String

getContentType()

Returns the content type of this page.

final DocumentType

getDoctype()

Returns the document type.

DomElement

getDocumentElement()

Returns the document element.

DomNodeList<DomElement>

getElementsByTagName(String tagName)

DomNodeList<DomElement>

getElementsByTagNameNS(String namespaceURI, String localName)

WebWindow

getEnclosingWindow()

Returns the window that this page is sitting inside.

String

getNodeName()

Gets the name for the current node.

short

getNodeType()

Gets the type of the current node.

SgmlPage

getPage()

Returns the page that contains this node.

URL

getUrl()

Returns the URL of this page.

WebClient

getWebClient()

Returns the WebClient that originally loaded this page.

WebResponse

getWebResponse()

Returns the web response that was originally used to create this page.

abstract boolean

hasCaseSensitiveTagNames()

Returns true if this page has case-sensitive tag names, false otherwise.

boolean

isCharacterDataChangeListenerInUse()

boolean

isDomChangeListenerInUse()

boolean

isHtmlPage()

Returns true if this page is an HtmlPage.

boolean

isPrinting()

void

normalizeDocument()

The current implementation just calls DomNode.normalize() on the document element.

protected void

setDocumentType(DocumentType type)

Sets the document type.

void

setEnclosingWindow(WebWindow window)

Sets the window that contains this page.

void

setPrinting(boolean printing)

Methods inherited from class org.htmlunit.html.DomNode
addCharacterDataChangeListener, addDomChangeListener, appendChild, asNormalizedText, basicRemove, checkChildHierarchy, cloneNode, closest, compareDocumentPosition, detach, fireCharacterDataChanged, fireNodeAdded, fireNodeDeleted, getAncestors, getAttributes, getBaseURI, getByXPath, getByXPath, getChildNodes, getChildren, getDescendants, getDomElementDescendants, getEndColumnNumber, getEndLineNumber, getFeature, getFirstByXPath, getFirstByXPath, getFirstChild, getHtmlElementDescendants, getHtmlPageOrNull, getIndex, getLastChild, getLocalName, getNamespaceURI, getNextElementSibling, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousElementSibling, getPreviousSibling, getReadyState, getScriptableObject, getSelectorList, getStartColumnNumber, getStartLineNumber, getTextContent, getUserData, getVisibleText, handles, hasAttributes, hasChildNodes, hasFeature, insertBefore, insertBefore, isAncestorOf, isAncestorOfAny, isAttachedToPage, isDefaultNamespace, isDisplayed, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, mayBeDisplayed, normalize, notifyIncorrectness, onAddedToDocumentFragment, onAddedToPage, onAllChildrenAddedToPage, parseHtmlSnippet, printChildrenAsXml, printXml, processImportNode, querySelector, querySelectorAll, quietlyRemoveAndMoveChildrenTo, remove, removeAllChildren, removeCharacterDataChangeListener, removeChild, removeDomChangeListener, replace, replaceChild, setEndLocation, setParentNode, setReadyState, setScriptableObject, setStartLocation, setTextContent, setUserData

Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.w3c.dom.Document
adoptNode, createAttributeNS, createElement, createElementNS, createEntityReference, createProcessingInstruction, getDocumentURI, getDomConfig, getElementById, getImplementation, getInputEncoding, getStrictErrorChecking, getXmlEncoding, getXmlStandalone, getXmlVersion, importNode, renameNode, setDocumentURI, setStrictErrorChecking, setXmlStandalone, setXmlVersion

Methods inherited from interface org.w3c.dom.Node
appendChild, cloneNode, compareDocumentPosition, getAttributes, getBaseURI, getChildNodes, getFeature, getFirstChild, getLastChild, getLocalName, getNamespaceURI, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousSibling, getTextContent, getUserData, hasAttributes, hasChildNodes, insertBefore, isDefaultNamespace, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, normalize, removeChild, replaceChild, setNodeValue, setPrefix, setTextContent, setUserData

Methods inherited from interface org.htmlunit.Page
initialize

Constructor Details
- SgmlPage
  
  public SgmlPage(WebResponse webResponse, WebWindow webWindow)
  
  Creates an instance of SgmlPage.
  
  Parameters:
  
  webResponse - the web response that was used to create this page
  
  webWindow - the window that this page is being loaded into
Method Details
- cleanUp
  
  public void cleanUp()
  
  Clean up this page. This method gets called by the web client when another page is loaded in the window, and you should probably never need to call it directly.
  
  Specified by:
  
  cleanUp in interface Page
- getWebResponse
  
  public WebResponse getWebResponse()
  
  Returns the web response that was originally used to create this page.
  
  Specified by:
  
  getWebResponse in interface Page
  
  Returns:
  
  the web response
- getNodeName
  
  public String getNodeName()
  
  Gets the name for the current node.
  
  Specified by:
  
  getNodeName in interface Node
  
  Returns:
  
  the node name
- getNodeType
  
  public short getNodeType()
  
  Gets the type of the current node.
  
  Specified by:
  
  getNodeType in interface Node
  
  Returns:
  
  the node type
- getEnclosingWindow
  
  public WebWindow getEnclosingWindow()
  
  Returns the window that this page is sitting inside.
  
  Specified by:
  
  getEnclosingWindow in interface Page
  
  Returns:
  
  the enclosing frame or null if this page isn't inside a frame
- setEnclosingWindow
  
  public void setEnclosingWindow(WebWindow window)
  
  Sets the window that contains this page.
  
  Parameters:
  
  window - the new frame or null if this page is being removed from a frame
- getWebClient
  
  public WebClient getWebClient()
  
  Returns the WebClient that originally loaded this page.
  
  Returns:
  
  the WebClient that originally loaded this page
- createDocumentFragment
  
  public DomDocumentFragment createDocumentFragment()
  
  Creates an empty DomDocumentFragment object.
  
  Specified by:
  
  createDocumentFragment in interface Document
  
  Returns:
  
  a newly created DomDocumentFragment
- getDoctype
  
  public final DocumentType getDoctype()
  
  Returns the document type.
  
  Specified by:
  
  getDoctype in interface Document
  
  Returns:
  
  the document type
- setDocumentType
  
  protected void setDocumentType(DocumentType type)
  
  Sets the document type.
  
  Parameters:
  
  type - the document type
- getPage
  
  public SgmlPage getPage()
  
  Returns the page that contains this node.
  
  Overrides:
  
  getPage in class DomNode
  
  Returns:
  
  the page that contains this node
- getCharset
  
  public abstract Charset getCharset()
  
  Returns the encoding.
  
  Returns:
  
  the encoding
- getDocumentElement
  
  public DomElement getDocumentElement()
  
  Returns the document element.
  
  Specified by:
  
  getDocumentElement in interface Document
  
  Returns:
  
  the document element
- clone
  
  protected SgmlPage clone()
  
  Creates a clone of this instance.
  
  Overrides:
  
  clone in class Object
  
  Returns:
  
  a clone of this instance
- asXml
  
  public String asXml()
  
  Returns a string representation, as an XML document, of this element and all its children (recursively).
  The charset named in the XML header is the current page encoding, but the result is still a string. You have to make sure to use the correct (in fact the same) encoding if you write this to a file.
  This serializes the current state of the DOM tree. As a consequence, the content of noscript tags is usually serialized as a string, because that content is converted during parsing (if JavaScript was enabled at that time).
  
  Overrides:
  
  asXml in class DomNode
  
  Returns:
  
  the XML string
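
The encoding note above can be illustrated with a minimal sketch that uses only the JDK. The class and method names (`XmlFileWriterSketch`, `encode`, `writeXml`) are hypothetical helpers, and the literal XML string stands in for the values that `asXml()` and `getCharset()` would supply on a real page.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of HtmlUnit.
final class XmlFileWriterSketch {

    /** Encodes the XML string with the charset that its XML header names. */
    static byte[] encode(String xml, Charset charset) {
        return xml.getBytes(charset);
    }

    /** Writes the encoded bytes, so the file content matches the header. */
    static void writeXml(Path target, String xml, Charset charset) throws IOException {
        Files.write(target, encode(xml, charset));
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for page.asXml() and page.getCharset() on an SgmlPage.
        Charset charset = StandardCharsets.ISO_8859_1;
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<p>caf\u00e9</p>";

        Path target = Files.createTempFile("page", ".xml");
        writeXml(target, xml, charset);

        // Reading back with the same charset round-trips the text.
        String readBack = new String(Files.readAllBytes(target), charset);
        System.out.println(readBack.equals(xml)); // prints "true"
        Files.delete(target);
    }
}
```

Writing the same string with a different charset (for example UTF-8) would produce a file whose bytes contradict the encoding declared in its own header.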
- hasCaseSensitiveTagNames
  
  public abstract boolean hasCaseSensitiveTagNames()
  
  Returns true if this page has case-sensitive tag names, false otherwise. In general, XML has case-sensitive tag names, and HTML doesn't. This is especially important during XPath matching.
  
  Returns:
  
  true if this page has case-sensitive tag names, false otherwise
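
To show why case sensitivity matters for XPath matching, the following sketch evaluates two expressions against a small XML document. It uses the JDK's own DOM parser and XPath engine rather than HtmlUnit, so it only demonstrates the XML side of the statement above; `CaseSensitiveXPathSketch` and `countMatches` are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper built on the JDK's XML APIs, not on HtmlUnit.
final class CaseSensitiveXPathSketch {

    /** Parses the XML and returns how many nodes the XPath expression selects. */
    static int countMatches(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Catalog><Item/><Item/></Catalog>";
        // The expression must match the tag names exactly, including case.
        System.out.println(countMatches(xml, "/Catalog/Item")); // prints 2
        System.out.println(countMatches(xml, "/catalog/item")); // prints 0
    }
}
```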
- normalizeDocument
  
  public void normalizeDocument()
  
  The current implementation just calls DomNode.normalize() on the document element.
  
  Specified by:
  
  normalizeDocument in interface Document
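
As a rough illustration of what normalizing the document element does, the sketch below uses the JDK's DOM implementation (not HtmlUnit) and the standard `Node.normalize()` contract: adjacent text nodes are merged into one. `NormalizeSketch` and its method are hypothetical names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo built on the JDK's DOM, not on HtmlUnit.
final class NormalizeSketch {

    /** Returns the child count of the root element before and after normalize(). */
    static int[] childCountsBeforeAndAfter() {
        Document doc = newDocument();
        Element root = doc.createElement("p");
        doc.appendChild(root);

        // Two adjacent text nodes, as DOM manipulation often leaves behind.
        root.appendChild(doc.createTextNode("Hello, "));
        root.appendChild(doc.createTextNode("world"));
        int before = root.getChildNodes().getLength();

        // Mirrors the described behaviour: normalize the document element.
        doc.getDocumentElement().normalize();
        int after = root.getChildNodes().getLength();

        return new int[] {before, after};
    }

    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCountsBeforeAndAfter();
        System.out.println(counts[0] + " -> " + counts[1]); // prints "2 -> 1"
    }
}
```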
- getCanonicalXPath
  
  public String getCanonicalXPath()
  
  Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
  
  WARNING: This sort of automated XPath expression is often quite bad at identifying a node, as it is highly sensitive to changes in the DOM tree.
  
  Overrides:
  
  getCanonicalXPath in class DomNode
  
  Returns:
  
  the canonical XPath expression which identifies this node
  
  See Also:
  
  DomNode.getByXPath(String)
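
The warning above can be made concrete with a small sketch. It does not reproduce HtmlUnit's implementation; `PositionalXPathSketch` is a hypothetical helper that builds a positional path of the same shape for a plain `org.w3c.dom.Element`, assuming an index is appended only when an element has same-named siblings. Inserting a single sibling is enough to change the path.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper on the JDK's DOM; not HtmlUnit's algorithm.
final class PositionalXPathSketch {

    /** Builds a path such as /html/body/p[2]/a by walking up to the root. */
    static String positionalXPath(Element element) {
        StringBuilder path = new StringBuilder();
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            path.insert(0, "/" + step((Element) n));
        }
        return path.toString();
    }

    /** One path step: the tag name, plus a 1-based index if siblings share it. */
    private static String step(Element element) {
        String name = element.getNodeName();
        Node parent = element.getParentNode();
        if (parent == null) {
            return name;
        }
        int total = 0;
        int position = 0;
        for (Node s = parent.getFirstChild(); s != null; s = s.getNextSibling()) {
            if (s instanceof Element && s.getNodeName().equals(name)) {
                total++;
                if (s.isSameNode(element)) {
                    position = total;
                }
            }
        }
        return total > 1 ? name + "[" + position + "]" : name;
    }

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><p/><p><a/></p></body></html>");
        Element link = (Element) doc.getElementsByTagName("a").item(0);
        System.out.println(positionalXPath(link)); // prints "/html/body/p[2]/a"

        // One unrelated insertion earlier in the tree invalidates the old path.
        Node body = doc.getElementsByTagName("body").item(0);
        body.insertBefore(doc.createElement("p"), body.getFirstChild());
        System.out.println(positionalXPath(link)); // prints "/html/body/p[3]/a"
    }
}
```

An expression stored before the insertion would now select a different node or none at all, which is why hand-written expressions keyed on ids or attributes tend to be more robust.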
- createAttribute
  
  public DomAttr createAttribute(String name)
  
  Specified by:
  
  createAttribute in interface Document
- getUrl
  
  public URL getUrl()
  
  Returns the URL of this page.
  
  Specified by:
  
  getUrl in interface Page
  
  Returns:
  
  the URL of this page
- isHtmlPage
  
  public boolean isHtmlPage()
  
  Description copied from interface: Page
  
  Returns true if this page is an HtmlPage.
  
  Specified by:
  
  isHtmlPage in interface Page
  
  Returns:
  
  true or false
- getElementsByTagName
  
  public DomNodeList<DomElement> getElementsByTagName(String tagName)
  
  Specified by:
  
  getElementsByTagName in interface Document
- getElementsByTagNameNS
  
  public DomNodeList<DomElement> getElementsByTagNameNS(String namespaceURI, String localName)
  
  Specified by:
  
  getElementsByTagNameNS in interface Document
- createCDATASection
  
  public CDATASection createCDATASection(String data)
  
  Specified by:
  
  createCDATASection in interface Document
- createTextNode
  
  public Text createTextNode(String data)
  
  Specified by:
  
  createTextNode in interface Document
- createComment
  
  public Comment createComment(String data)
  
  Specified by:
  
  createComment in interface Document
- createNodeIterator
  
  public DomNodeIterator createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion) throws DOMException
  
  Create a new NodeIterator over the subtree rooted at the specified node.
  
  Parameters:
  
  root - The node which will be iterated together with its children. The NodeIterator is initially positioned just before this node. The whatToShow flags and the filter, if any, are not considered when setting this position. The root must not be null.
  
  whatToShow - This flag specifies which node types may appear in the logical view of the tree presented by the NodeIterator. See the description of NodeFilter for the set of possible SHOW_ values. These flags can be combined using OR.
  
  filter - The NodeFilter to be used with this NodeIterator, or null to indicate no filter.
  
  entityReferenceExpansion - The value of this flag determines whether entity reference nodes are expanded.
  
  Returns:
  
  The newly created NodeIterator.
  
  Throws:
  
  DOMException - NOT_SUPPORTED_ERR: Raised if the specified root is null.
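
The interplay of `whatToShow` and `filter` is easiest to see in a small example. The sketch below uses the JDK's own DOM, whose document object is assumed to implement `org.w3c.dom.traversal.DocumentTraversal` (the code checks this at runtime); the method there takes the same four parameters as the one documented above. `NodeIteratorSketch` and `visit` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

// Hypothetical demo built on the JDK's DOM, not on HtmlUnit.
final class NodeIteratorSketch {

    /** Returns the node names visited under the document element, in order. */
    static List<String> visit(String xml, int whatToShow, NodeFilter filter) {
        Document doc = parse(xml);
        if (!(doc instanceof DocumentTraversal)) {
            throw new IllegalStateException("DOM implementation lacks traversal support");
        }
        NodeIterator iterator = ((DocumentTraversal) doc)
                .createNodeIterator(doc.getDocumentElement(), whatToShow, filter, false);

        List<String> names = new ArrayList<>();
        // The iterator starts just before the root, so the root comes first.
        for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
            names.add(n.getNodeName());
        }
        iterator.detach();
        return names;
    }

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><a/><script/><b>text</b></root>";

        // Elements only, with a filter that drops <script>.
        NodeFilter noScripts = node -> "script".equals(node.getNodeName())
                ? NodeFilter.FILTER_REJECT
                : NodeFilter.FILTER_ACCEPT;
        System.out.println(visit(xml, NodeFilter.SHOW_ELEMENT, noScripts));
        // prints "[root, a, b]"

        // SHOW_ flags combined with OR, and no filter at all.
        System.out.println(visit(xml, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT, null));
        // prints "[root, a, script, b, #text]"
    }
}
```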
- getContentType
  
  public abstract String getContentType()
  
  Returns the content type of this page.
  
  Returns:
  
  the content type of this page
- clearComputedStyles
  
  public void clearComputedStyles()
  
  INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
  Clears the computed styles.
- clearComputedStyles
  
  public void clearComputedStyles(DomElement element)
  
  INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
  Clears the computed styles for a specific Element.
  
  Parameters:
  
  element - the element whose cached computed styles are to be cleared
- clearComputedStylesUpToRoot
  
  public void clearComputedStylesUpToRoot(DomElement element)
  
  INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
  Clears the computed styles for a specific Element and all parent elements.
  
  Parameters:
  
  element - the element whose cached computed styles are to be cleared, together with those of its parent elements
- isPrinting
  
  public boolean isPrinting()
  
  Returns:
  
  whether or not this is currently printing
- setPrinting
  
  public void setPrinting(boolean printing)
  
  Parameters:
  
  printing - the printing state to set
- domChangeListenerAdded
  
  public void domChangeListenerAdded()
  
  Informs about the use of a domChangeListener.
- isDomChangeListenerInUse
  
  public boolean isDomChangeListenerInUse()
  
  Returns:
  
  true if at least one domChangeListener was registered.
- characterDataChangeListenerAdded
  
  public void characterDataChangeListenerAdded()
  
  Informs about the use of a characterDataChangeListener.
- isCharacterDataChangeListenerInUse
  
  public boolean isCharacterDataChangeListenerInUse()
  
  Returns:
  
  true if at least one characterDataChangeListener was registered.
