Package org.htmlunit
Class SgmlPage
java.lang.Object
org.htmlunit.html.DomNode
org.htmlunit.SgmlPage
- All Implemented Interfaces: Serializable, Cloneable, Page, Document, Node
A basic class for Standard Generalized Markup Language (SGML) pages, e.g. HTML and XML.
-
Nested Class Summary
Nested classes/interfaces inherited from class org.htmlunit.html.DomNode
DomNode.ChildIterator, DomNode.DescendantDomElementsIterator, DomNode.DescendantDomNodesIterator, DomNode.DescendantElementsIterator<T extends DomNode>, DomNode.DescendantHtmlElementsIterator
-
Field Summary
Fields inherited from class org.htmlunit.html.DomNode
PROPERTY_ELEMENT, READY_STATE_COMPLETE, READY_STATE_INTERACTIVE, READY_STATE_LOADED, READY_STATE_LOADING, READY_STATE_UNINITIALIZED
Fields inherited from interface org.w3c.dom.Node
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_POSITION_CONTAINED_BY, DOCUMENT_POSITION_CONTAINS, DOCUMENT_POSITION_DISCONNECTED, DOCUMENT_POSITION_FOLLOWING, DOCUMENT_POSITION_IMPLEMENTATION_SPECIFIC, DOCUMENT_POSITION_PRECEDING, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NOTATION_NODE, PROCESSING_INSTRUCTION_NODE, TEXT_NODE
-
Constructor Summary
Constructors
Constructor, followed by its description:
SgmlPage(WebResponse webResponse, WebWindow webWindow)
    Creates an instance of SgmlPage.
-
Method Summary
Modifier and type (where shown) and method, followed by its description:
asXml()
    Returns a string representation, as an XML document, of this element and all its children (recursively). The charset used in the XML header is the current page encoding, but the result is still a string.
void characterDataChangeListenerAdded()
    Informs about the use of a characterDataChangeListener.
void cleanUp()
    Clean up this page.
void clearComputedStyles()
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles.
void clearComputedStyles(DomElement element)
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles for a specific Element.
void clearComputedStylesUpToRoot(DomElement element)
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles for a specific Element and all parent elements.
protected SgmlPage clone()
    Creates a clone of this instance.
createAttribute(String name)
createCDATASection(String data)
createComment(String data)
createDocumentFragment()
    Creates an empty DomDocumentFragment object.
DomNodeIterator createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion)
    Create a new NodeIterator over the subtree rooted at the specified node.
createTextNode(String data)
void domChangeListenerAdded()
    Informs about the use of a domChangeListener.
getCanonicalXPath()
    Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
abstract Charset getCharset()
    Returns the encoding.
abstract String getContentType()
    Returns the content type of this page.
final DocumentType getDoctype()
    Returns the document type.
getDocumentElement()
    Returns the document element.
getElementsByTagName(String tagName)
getElementsByTagNameNS(String namespaceURI, String localName)
getEnclosingWindow()
    Returns the window that this page is sitting inside.
getNodeName()
    Gets the name for the current node.
short getNodeType()
    Gets the type of the current node.
getPage()
    Returns the page that contains this node.
getUrl()
    Returns the URL of this page.
getWebClient()
    Returns the WebClient that originally loaded this page.
getWebResponse()
    Returns the web response that was originally used to create this page.
abstract boolean hasCaseSensitiveTagNames()
    Returns true if this page has case-sensitive tag names, false otherwise.
boolean isCharacterDataChangeListenerInUse()
boolean isDomChangeListenerInUse()
boolean isHtmlPage()
    Returns true if this page is an HtmlPage.
boolean isPrinting()
void normalizeDocument()
    The current implementation just calls DomNode.normalize() on the document element.
protected void setDocumentType(DocumentType type)
    Sets the document type.
void setEnclosingWindow(WebWindow window)
    Sets the window that contains this page.
void setPrinting(boolean printing)
Methods inherited from class org.htmlunit.html.DomNode
addCharacterDataChangeListener, addDomChangeListener, appendChild, asNormalizedText, basicRemove, checkChildHierarchy, cloneNode, closest, compareDocumentPosition, detach, fireCharacterDataChanged, fireNodeAdded, fireNodeDeleted, getAncestors, getAttributes, getBaseURI, getByXPath, getByXPath, getChildNodes, getChildren, getDescendants, getDomElementDescendants, getEndColumnNumber, getEndLineNumber, getFeature, getFirstByXPath, getFirstByXPath, getFirstChild, getHtmlElementDescendants, getHtmlPageOrNull, getIndex, getLastChild, getLocalName, getNamespaceURI, getNextElementSibling, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousElementSibling, getPreviousSibling, getReadyState, getScriptableObject, getSelectorList, getStartColumnNumber, getStartLineNumber, getTextContent, getUserData, getVisibleText, handles, hasAttributes, hasChildNodes, hasFeature, insertBefore, insertBefore, isAncestorOf, isAncestorOfAny, isAttachedToPage, isDefaultNamespace, isDisplayed, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, mayBeDisplayed, normalize, notifyIncorrectness, onAddedToDocumentFragment, onAddedToPage, onAllChildrenAddedToPage, parseHtmlSnippet, printChildrenAsXml, printXml, processImportNode, querySelector, querySelectorAll, quietlyRemoveAndMoveChildrenTo, remove, removeAllChildren, removeCharacterDataChangeListener, removeChild, removeDomChangeListener, replace, replaceChild, setEndLocation, setParentNode, setReadyState, setScriptableObject, setStartLocation, setTextContent, setUserData
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.w3c.dom.Document
adoptNode, createAttributeNS, createElement, createElementNS, createEntityReference, createProcessingInstruction, getDocumentURI, getDomConfig, getElementById, getImplementation, getInputEncoding, getStrictErrorChecking, getXmlEncoding, getXmlStandalone, getXmlVersion, importNode, renameNode, setDocumentURI, setStrictErrorChecking, setXmlStandalone, setXmlVersion
Methods inherited from interface org.w3c.dom.Node
appendChild, cloneNode, compareDocumentPosition, getAttributes, getBaseURI, getChildNodes, getFeature, getFirstChild, getLastChild, getLocalName, getNamespaceURI, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousSibling, getTextContent, getUserData, hasAttributes, hasChildNodes, insertBefore, isDefaultNamespace, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, normalize, removeChild, replaceChild, setNodeValue, setPrefix, setTextContent, setUserData
Methods inherited from interface org.htmlunit.Page
initialize
-
Constructor Details
-
SgmlPage
Creates an instance of SgmlPage.
- Parameters:
webResponse - the web response that was used to create this page
webWindow - the window that this page is being loaded into
-
-
Method Details
-
cleanUp
public void cleanUp()
Clean up this page. This method is called by the web client when another page is loaded in the window; you should probably never need to call it directly.
-
getWebResponse
Returns the web response that was originally used to create this page.
- Specified by: getWebResponse in interface Page
- Returns: the web response
-
getNodeName
Gets the name for the current node.
- Specified by: getNodeName in interface Node
- Returns: the node name
-
getNodeType
public short getNodeType()
Gets the type of the current node.
- Specified by: getNodeType in interface Node
- Returns: the node type
-
getEnclosingWindow
Returns the window that this page is sitting inside.
- Specified by: getEnclosingWindow in interface Page
- Returns: the enclosing frame or null if this page isn't inside a frame
-
setEnclosingWindow
Sets the window that contains this page.
- Parameters:
window - the new frame or null if this page is being removed from a frame
-
getWebClient
Returns the WebClient that originally loaded this page.
- Returns: the WebClient that originally loaded this page
-
createDocumentFragment
Creates an empty DomDocumentFragment object.
- Specified by: createDocumentFragment in interface Document
- Returns: a newly created DomDocumentFragment
-
getDoctype
Returns the document type.
- Specified by: getDoctype in interface Document
- Returns: the document type
-
setDocumentType
Sets the document type.
- Parameters:
type - the document type
-
getPage
Returns the page that contains this node.
-
getCharset
Returns the encoding.
- Returns: the encoding
-
getDocumentElement
Returns the document element.
- Specified by: getDocumentElement in interface Document
- Returns: the document element
-
clone
Creates a clone of this instance.
-
asXml
Returns a string representation, as an XML document, of this element and all its children (recursively).
The charset used in the XML header is the current page encoding, but the result is still a string. If you write it to a file, make sure to use that same encoding.
This serializes the current state of the DOM tree. This implies that the content of noscript tags is usually serialized as a string, because the content is converted during parsing (if JavaScript was enabled at that time).
-
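The encoding note for asXml can be illustrated with a minimal sketch that uses only the JDK. The class AsXmlEncodingSketch and its helper methods are hypothetical and not part of HtmlUnit; in real use the xml argument would presumably be the result of asXml() and the charset the result of getCharset().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class AsXmlEncodingSketch {

    // Writes the serialized XML with an explicit charset.
    // Assumption: 'xml' stands in for page.asXml(), 'charset' for page.getCharset().
    static void writeXml(Path target, String xml, Charset charset) {
        try {
            Files.write(target, xml.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back, decoding with the given charset.
    static String readXml(Path source, Charset charset) {
        try {
            return new String(Files.readAllBytes(source), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reports whether the text survives a write/read round trip via a temp file.
    static boolean roundTrips(String xml, Charset writeCharset, Charset readCharset) {
        try {
            Path tmp = Files.createTempFile("page", ".xml");
            try {
                writeXml(tmp, xml, writeCharset);
                return xml.equals(readXml(tmp, readCharset));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample document whose header declares ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<p>caf\u00e9</p>";
        // Written in the declared charset: a consumer honoring the header reads it intact.
        System.out.println(roundTrips(xml, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1));
        // Written as UTF-8 although the header says ISO-8859-1: the text is corrupted.
        System.out.println(roundTrips(xml, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    }
}
```

The second call models a consumer that trusts the XML header while the file was written in a different charset, which is the mismatch the note warns about.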
hasCaseSensitiveTagNames
public abstract boolean hasCaseSensitiveTagNames()
Returns true if this page has case-sensitive tag names, false otherwise. In general, XML has case-sensitive tag names, and HTML doesn't. This is especially important during XPath matching.
- Returns: true if this page has case-sensitive tag names, false otherwise
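To make the XPath remark concrete, here is a small sketch of case-sensitive matching on an XML document. It uses the JDK's own DOM parser and XPath engine rather than HtmlUnit, so it only illustrates the XML side of the statement; the class CaseSensitiveTagsSketch and the sample markup are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CaseSensitiveTagsSketch {

    // Counts the nodes matched by an XPath expression in the given XML text.
    static int count(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Double n = (Double) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate("count(" + xpath + ")", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // In XML, <Item> and <item> are different element names.
        String xml = "<root><Item/><item/><item/></root>";
        System.out.println(count(xml, "//Item")); // 1
        System.out.println(count(xml, "//item")); // 2
        System.out.println(count(xml, "//ITEM")); // 0
    }
}
```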
-
normalizeDocument
public void normalizeDocument()
The current implementation just calls DomNode.normalize() on the document element.
- Specified by: normalizeDocument in interface Document
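As a rough illustration of what normalizing the document element does, the sketch below uses the JDK's standard org.w3c.dom implementation, where Node.normalize() merges adjacent text nodes. It does not use HtmlUnit, and the class NormalizeSketch is hypothetical; whether DomNode.normalize() behaves identically in every detail is an assumption.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class NormalizeSketch {

    // Returns the child count of the document element before and after normalize().
    static int[] childCountsBeforeAndAfter() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("p");
            doc.appendChild(root);
            // Two adjacent text nodes, as can happen after programmatic edits.
            root.appendChild(doc.createTextNode("Hello, "));
            root.appendChild(doc.createTextNode("world"));
            int before = root.getChildNodes().getLength();
            // Normalize only the document element, mirroring the description above.
            doc.getDocumentElement().normalize();
            int after = root.getChildNodes().getLength();
            return new int[] {before, after};
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCountsBeforeAndAfter();
        System.out.println(counts[0] + " -> " + counts[1]); // 2 -> 1
    }
}
```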
-
getCanonicalXPath
Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
WARNING: This sort of automated XPath expression is often quite bad at identifying a node, as it is highly sensitive to changes in the DOM tree.
- Overrides: getCanonicalXPath in class DomNode
- Returns: the canonical XPath expression which identifies this node
-
createAttribute
- Specified by: createAttribute in interface Document
-
getUrl
Returns the URL of this page.
-
isHtmlPage
public boolean isHtmlPage()
Description copied from interface: Page
Returns true if this page is an HtmlPage.
- Specified by: isHtmlPage in interface Page
- Returns: true or false
-
getElementsByTagName
- Specified by: getElementsByTagName in interface Document
-
getElementsByTagNameNS
- Specified by: getElementsByTagNameNS in interface Document
-
createCDATASection
- Specified by: createCDATASection in interface Document
-
createTextNode
- Specified by: createTextNode in interface Document
-
createComment
- Specified by: createComment in interface Document
-
createNodeIterator
public DomNodeIterator createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion) throws DOMException
Create a new NodeIterator over the subtree rooted at the specified node.
- Parameters:
root - The node which will be iterated together with its children. The NodeIterator is initially positioned just before this node. The whatToShow flags and the filter, if any, are not considered when setting this position. The root must not be null.
whatToShow - This flag specifies which node types may appear in the logical view of the tree presented by the NodeIterator. See the description of NodeFilter for the set of possible SHOW_ values. These flags can be combined using OR.
filter - The NodeFilter to be used with this NodeIterator, or null to indicate no filter.
entityReferenceExpansion - The value of this flag determines whether entity reference nodes are expanded.
- Returns: The newly created NodeIterator.
- Throws: DOMException - NOT_SUPPORTED_ERR: Raised if the specified root is null.
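The parameters above follow the W3C DOM traversal interface, so their effect can be sketched with the JDK's bundled DOM implementation. This is an illustration under assumptions: the class NodeIteratorSketch is hypothetical, and the cast to DocumentTraversal relies on the JDK's default parser returning a document that implements that interface. With HtmlUnit, createNodeIterator would instead be called on the page itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

public class NodeIteratorSketch {

    // Collects node names in document order, starting at the document element.
    static List<String> nodeNames(String xml, int whatToShow) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumption: the JDK's default DOM implementation supports traversal.
            DocumentTraversal traversal = (DocumentTraversal) doc;
            NodeIterator it = traversal.createNodeIterator(
                    doc.getDocumentElement(), whatToShow, null, true);
            List<String> names = new ArrayList<>();
            // The iterator starts just before the root, so the first
            // nextNode() call returns the root itself (if whatToShow admits it).
            for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
                names.add(n.getNodeName());
            }
            it.detach();
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b/>text<c><d/></c></a>";
        // Elements only.
        System.out.println(nodeNames(xml, NodeFilter.SHOW_ELEMENT));
        // SHOW_ flags combined with bitwise OR: elements and text nodes.
        System.out.println(nodeNames(xml, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT));
    }
}
```

Passing null as the filter, as done here, means every node admitted by whatToShow is returned.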
-
getContentType
Returns the content type of this page.
- Returns: the content type of this page
-
clearComputedStyles
public void clearComputedStyles()
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles.
-
clearComputedStyles
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element.
- Parameters:
element - the element whose cache is to be cleared
-
clearComputedStylesUpToRoot
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element and all parent elements.
- Parameters:
element - the element whose cache is to be cleared
-
isPrinting
public boolean isPrinting()
- Returns: whether or not this is currently printing
-
setPrinting
public void setPrinting(boolean printing)
- Parameters:
printing - the printing state to set
-
domChangeListenerAdded
public void domChangeListenerAdded()
Informs about the use of a domChangeListener.
-
isDomChangeListenerInUse
public boolean isDomChangeListenerInUse()
- Returns: true if at least one domChangeListener was registered.
-
characterDataChangeListenerAdded
public void characterDataChangeListenerAdded()
Informs about the use of a characterDataChangeListener.
-
isCharacterDataChangeListenerInUse
public boolean isCharacterDataChangeListenerInUse()
- Returns: true if at least one characterDataChangeListener was registered.
-