Package org.htmlunit
Class SgmlPage
java.lang.Object
org.htmlunit.html.DomNode
org.htmlunit.SgmlPage
- All Implemented Interfaces:
Serializable, Cloneable, Page, Document, Node
A basic class for Standard Generalized Markup Language (SGML) pages, e.g. HTML and XML.
Nested Class Summary
Nested classes/interfaces inherited from class org.htmlunit.html.DomNode
DomNode.ChildIterator, DomNode.DescendantDomElementsIterator, DomNode.DescendantDomNodesIterator, DomNode.DescendantElementsIterator<T extends DomNode>, DomNode.DescendantHtmlElementsIterator
-
Field Summary
Fields inherited from class org.htmlunit.html.DomNode
PROPERTY_ELEMENT, READY_STATE_COMPLETE, READY_STATE_INTERACTIVE, READY_STATE_LOADED, READY_STATE_LOADING, READY_STATE_UNINITIALIZED
Fields inherited from interface org.w3c.dom.Node
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_POSITION_CONTAINED_BY, DOCUMENT_POSITION_CONTAINS, DOCUMENT_POSITION_DISCONNECTED, DOCUMENT_POSITION_FOLLOWING, DOCUMENT_POSITION_IMPLEMENTATION_SPECIFIC, DOCUMENT_POSITION_PRECEDING, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NOTATION_NODE, PROCESSING_INSTRUCTION_NODE, TEXT_NODE
-
Constructor Summary
Constructor / Description
SgmlPage(WebResponse webResponse, WebWindow webWindow) - Creates an instance of SgmlPage.
-
Method Summary
Modifier and Type / Method / Description
asXml() - Returns a string representation as an XML document of this element and all its children (recursively). The charset used in the XML header is the current page encoding, but the result is still a string.
void characterDataChangeListenerAdded() - Informs about the use of a characterDataChangeListener.
void cleanUp() - Clean up this page.
void clearComputedStyles() - INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles.
void clearComputedStyles(DomElement element) - INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles for a specific Element.
void clearComputedStylesUpToRoot(DomElement element) - INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles for a specific Element and all parent elements.
protected SgmlPage clone() - Creates a clone of this instance.
createAttribute(String name)
createCDATASection(String data)
createComment(String data)
createDocumentFragment() - Creates an empty DomDocumentFragment object.
createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion) - Create a new NodeIterator over the subtree rooted at the specified node.
createTextNode(String data)
void domChangeListenerAdded() - Informs about the use of a domChangeListener.
getCanonicalXPath() - Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
abstract Charset getCharset() - Returns the encoding.
abstract String getContentType() - Returns the content type of this page.
final DocumentType getDoctype() - Returns the document type.
getDocumentElement() - Returns the document element.
getElementsByTagName(String tagName)
getElementsByTagNameNS(String namespaceURI, String localName)
getEnclosingWindow() - Returns the window that this page is sitting inside.
getNodeName() - Gets the name for the current node.
short getNodeType() - Gets the type of the current node.
getPage() - Returns the page that contains this node.
getUrl() - Returns the URL of this page.
getWebClient() - Returns the WebClient that originally loaded this page.
getWebResponse() - Returns the web response that was originally used to create this page.
abstract boolean hasCaseSensitiveTagNames() - Returns true if this page has case-sensitive tag names, false otherwise.
boolean isCharacterDataChangeListenerInUse()
boolean isDomChangeListenerInUse()
boolean isHtmlPage() - Returns true if this page is an HtmlPage.
boolean isPrinting()
void normalizeDocument() - The current implementation just calls DomNode.normalize() on the document element.
protected void setDocumentType(DocumentType type) - Sets the document type.
void setEnclosingWindow(WebWindow window) - Sets the window that contains this page.
void setPrinting(boolean printing)
Methods inherited from class org.htmlunit.html.DomNode
addCharacterDataChangeListener, addDomChangeListener, appendChild, asNormalizedText, basicRemove, checkChildHierarchy, cloneNode, closest, compareDocumentPosition, detach, fireCharacterDataChanged, fireNodeAdded, fireNodeDeleted, getAncestors, getAttributes, getBaseURI, getByXPath, getByXPath, getChildNodes, getChildren, getDescendants, getDomElementDescendants, getEndColumnNumber, getEndLineNumber, getFeature, getFirstByXPath, getFirstByXPath, getFirstChild, getHtmlElementDescendants, getHtmlPageOrNull, getIndex, getLastChild, getLocalName, getNamespaceURI, getNextElementSibling, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousElementSibling, getPreviousSibling, getReadyState, getScriptableObject, getSelectorList, getStartColumnNumber, getStartLineNumber, getTextContent, getUserData, getVisibleText, handles, hasAttributes, hasChildNodes, hasFeature, insertBefore, insertBefore, isAncestorOf, isAncestorOfAny, isAttachedToPage, isDefaultNamespace, isDisplayed, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, mayBeDisplayed, normalize, notifyIncorrectness, onAddedToDocumentFragment, onAddedToPage, onAllChildrenAddedToPage, parseHtmlSnippet, printChildrenAsXml, printXml, processImportNode, querySelector, querySelectorAll, quietlyRemoveAndMoveChildrenTo, remove, removeAllChildren, removeCharacterDataChangeListener, removeChild, removeDomChangeListener, replace, replaceChild, setEndLocation, setParentNode, setReadyState, setScriptableObject, setStartLocation, setTextContent, setUserData
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.w3c.dom.Document
adoptNode, createAttributeNS, createElement, createElementNS, createEntityReference, createProcessingInstruction, getDocumentURI, getDomConfig, getElementById, getImplementation, getInputEncoding, getStrictErrorChecking, getXmlEncoding, getXmlStandalone, getXmlVersion, importNode, renameNode, setDocumentURI, setStrictErrorChecking, setXmlStandalone, setXmlVersion
Methods inherited from interface org.w3c.dom.Node
appendChild, cloneNode, compareDocumentPosition, getAttributes, getBaseURI, getChildNodes, getFeature, getFirstChild, getLastChild, getLocalName, getNamespaceURI, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousSibling, getTextContent, getUserData, hasAttributes, hasChildNodes, insertBefore, isDefaultNamespace, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, normalize, removeChild, replaceChild, setNodeValue, setPrefix, setTextContent, setUserData
Methods inherited from interface org.htmlunit.Page
initialize
-
Constructor Details
-
SgmlPage
Creates an instance of SgmlPage.
- Parameters:
webResponse - the web response that was used to create this page
webWindow - the window that this page is being loaded into
-
-
Method Details
-
cleanUp
public void cleanUp()
Cleans up this page. This method gets called by the web client when another page is loaded in the window; you should probably never need to call it directly.
-
getWebResponse
Returns the web response that was originally used to create this page.
- Specified by:
getWebResponse in interface Page
- Returns:
- the web response
-
getNodeName
Gets the name for the current node.
- Specified by:
getNodeName in interface Node
- Returns:
- the node name
-
getNodeType
public short getNodeType()
Gets the type of the current node.
- Specified by:
getNodeType in interface Node
- Returns:
- the node type
-
getEnclosingWindow
Returns the window that this page is sitting inside.
- Specified by:
getEnclosingWindow in interface Page
- Returns:
- the enclosing frame or null if this page isn't inside a frame
-
setEnclosingWindow
Sets the window that contains this page.
- Parameters:
window - the new frame or null if this page is being removed from a frame
-
getWebClient
Returns the WebClient that originally loaded this page.
- Returns:
- the WebClient that originally loaded this page
-
createDocumentFragment
Creates an empty DomDocumentFragment object.
- Specified by:
createDocumentFragment in interface Document
- Returns:
- a newly created DomDocumentFragment
-
getDoctype
Returns the document type.
- Specified by:
getDoctype in interface Document
- Returns:
- the document type
-
setDocumentType
Sets the document type.
- Parameters:
type - the document type
-
getPage
Returns the page that contains this node.
-
getCharset
Returns the encoding.
- Returns:
- the encoding
-
getDocumentElement
Returns the document element.
- Specified by:
getDocumentElement in interface Document
- Returns:
- the document element
-
clone
Creates a clone of this instance.
-
asXml
Returns a string representation as an XML document of this element and all its children (recursively).
The charset used in the XML header is the current page encoding, but the result is still a string. You have to make sure to use the correct (in fact the same) encoding if you write this to a file.
This serializes the current state of the DomTree. This implies that the content of noscript tags is usually serialized as a string, because the content is converted during parsing (if JavaScript was enabled at that time).
-
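The encoding caveat for asXml() can be sketched with plain JDK classes. This is only an illustration: the XML string below is a hand-written stand-in for what asXml() might return (an assumption, not real HtmlUnit output), and writeXml and readXml are hypothetical helper names. The point is that the charset used to write the file should match the one named in the XML header, which in HtmlUnit would presumably come from getCharset().

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class AsXmlEncodingSketch {

    // Hypothetical helper: writes the XML text using the charset its header declares.
    static Path writeXml(String xml, Charset declaredCharset) {
        try {
            Path file = Files.createTempFile("page", ".xml");
            Files.write(file, xml.getBytes(declaredCharset));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: reads the file back, decoding the bytes with the given charset.
    static String readXml(Path file, Charset charset) {
        try {
            return new String(Files.readAllBytes(file), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for page.asXml() and page.getCharset() (assumed values, not HtmlUnit output).
        Charset pageCharset = StandardCharsets.ISO_8859_1;
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<p>caf\u00e9</p>";

        Path file = writeXml(xml, pageCharset);
        // Decoding with the same charset round-trips the text.
        System.out.println(readXml(file, pageCharset).equals(xml));
        // Decoding with a different charset corrupts the non-ASCII character.
        System.out.println(readXml(file, StandardCharsets.UTF_8).equals(xml));
    }
}
```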
hasCaseSensitiveTagNames
public abstract boolean hasCaseSensitiveTagNames()
Returns true if this page has case-sensitive tag names, false otherwise. In general, XML has case-sensitive tag names, and HTML doesn't. This is especially important during XPath matching.
- Returns:
- true if this page has case-sensitive tag names, false otherwise
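To illustrate why hasCaseSensitiveTagNames() matters for XPath matching, here is a minimal sketch of the case-sensitive (XML) side. It uses the JDK's built-in DOM parser and XPath engine rather than HtmlUnit, and the class name and count helper are hypothetical. For a page where tag names are not case-sensitive, the description above suggests that the differently cased paths would not be told apart in this way.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CaseSensitiveXPathSketch {

    // Hypothetical helper: counts the nodes that the XPath expression selects in the XML text.
    static int count(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // In XML, <Item> and <item> are two different element names.
        String xml = "<Catalog><Item/><item/></Catalog>";
        System.out.println(count(xml, "/Catalog/Item")); // matches only <Item>
        System.out.println(count(xml, "/Catalog/item")); // matches only <item>
        System.out.println(count(xml, "/catalog/item")); // no <catalog> root, so nothing matches
    }
}
```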
-
normalizeDocument
public void normalizeDocument()
The current implementation just calls DomNode.normalize() on the document element.
- Specified by:
normalizeDocument in interface Document
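As a rough picture of what normalizing the document element does, the sketch below uses the JDK's own DOM implementation and the W3C Node.normalize() method. It assumes that DomNode.normalize() follows the same W3C contract (merging adjacent text nodes), and the class and helper names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {

    // Hypothetical helper: builds a <p> root that holds two adjacent text nodes.
    static Element paragraphWithSplitText() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            doc.appendChild(p);
            p.appendChild(doc.createTextNode("Hello, "));
            p.appendChild(doc.createTextNode("world"));
            return p;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element p = paragraphWithSplitText();
        System.out.println(p.getChildNodes().getLength()); // child count before normalizing
        p.normalize();                                      // merges the adjacent text nodes
        System.out.println(p.getChildNodes().getLength()); // child count after normalizing
        System.out.println(p.getTextContent());            // the text itself is unchanged
    }
}
```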
-
getCanonicalXPath
Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
WARNING: This sort of automated XPath expression is often quite bad at identifying a node, as it is highly sensitive to changes in the DOM tree.
- Overrides:
getCanonicalXPath in class DomNode
- Returns:
- the canonical XPath expression which identifies this node
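The warning about canonical XPath expressions can be made concrete with a small sketch. It uses the JDK's DOM and XPath classes on hand-written markup (not HtmlUnit, and not real getCanonicalXPath() output); the class and helper names are hypothetical. The same positional path selects a different node once one extra element is added ahead of it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PositionalXPathSketch {

    // Hypothetical helper: parses a small, well-formed markup snippet into a DOM document.
    static Document parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: returns the text of the first node selected by the XPath expression.
    static String textAt(Document doc, String xpath) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A position-based path in the style of a canonical XPath expression.
        String path = "/html/body/p[2]";

        Document before = parse("<html><body><p>intro</p><p>price</p></body></html>");
        Document after = parse("<html><body><p>banner</p><p>intro</p><p>price</p></body></html>");

        System.out.println(textAt(before, path)); // the second <p> holds "price"
        System.out.println(textAt(after, path));  // after the insertion, the second <p> holds "intro"
    }
}
```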
createAttribute
- Specified by:
createAttribute in interface Document
-
getUrl
Returns the URL of this page.
-
isHtmlPage
public boolean isHtmlPage()
Description copied from interface: Page
Returns true if this page is an HtmlPage.
- Specified by:
isHtmlPage in interface Page
- Returns:
- true or false
-
getElementsByTagName
- Specified by:
getElementsByTagName in interface Document
-
getElementsByTagNameNS
- Specified by:
getElementsByTagNameNS in interface Document
-
createCDATASection
- Specified by:
createCDATASection in interface Document
-
createTextNode
- Specified by:
createTextNode in interface Document
-
createComment
- Specified by:
createComment in interface Document
-
createNodeIterator
public DomNodeIterator createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion) throws DOMException
Create a new NodeIterator over the subtree rooted at the specified node.
- Parameters:
root - The node which will be iterated together with its children. The NodeIterator is initially positioned just before this node. The whatToShow flags and the filter, if any, are not considered when setting this position. The root must not be null.
whatToShow - This flag specifies which node types may appear in the logical view of the tree presented by the NodeIterator. See the description of NodeFilter for the set of possible SHOW_ values. These flags can be combined using OR.
filter - The NodeFilter to be used with this NodeIterator, or null to indicate no filter.
entityReferenceExpansion - The value of this flag determines whether entity reference nodes are expanded.
- Returns:
- The newly created NodeIterator.
- Throws:
DOMException - NOT_SUPPORTED_ERR: Raised if the specified root is null.
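The whatToShow and filter parameters of createNodeIterator can be sketched against the W3C DocumentTraversal interface. The example below runs on the JDK's built-in DOM implementation, not on an SgmlPage, so it only illustrates the W3C contract that the parameter descriptions above refer to. The class name, the names helper and the ONLY_LI filter are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

class NodeIteratorSketch {

    // Hypothetical filter: accept only <li> elements and skip everything else.
    static final NodeFilter ONLY_LI =
            node -> "li".equals(node.getNodeName()) ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP;

    // Hypothetical helper: collects the names of the nodes the iterator visits, in order.
    static List<String> names(String xml, int whatToShow, NodeFilter filter) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The JDK's default Document implementation also implements DocumentTraversal.
            NodeIterator it = ((DocumentTraversal) doc)
                    .createNodeIterator(doc.getDocumentElement(), whatToShow, filter, false);
            List<String> names = new ArrayList<>();
            for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
                names.add(n.getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ul><li>a</li><!-- note --><li>b</li></ul>";
        // SHOW_ flags are combined with bitwise OR; text nodes are left out here.
        int show = NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT;
        System.out.println(names(xml, show, null));    // elements and comments, no filter
        System.out.println(names(xml, show, ONLY_LI)); // additionally narrowed by the filter
    }
}
```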
-
getContentType
Returns the content type of this page.
- Returns:
- the content type of this page
-
clearComputedStyles
public void clearComputedStyles()
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles.
-
clearComputedStyles
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element.
- Parameters:
element - the element whose cache is cleared
-
clearComputedStylesUpToRoot
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element and all parent elements.
- Parameters:
element - the element whose cache is cleared
-
isPrinting
public boolean isPrinting()
- Returns:
- whether or not this page is currently printing
-
setPrinting
public void setPrinting(boolean printing)
- Parameters:
printing - the printing state to set
-
domChangeListenerAdded
public void domChangeListenerAdded()
Informs about the use of a domChangeListener.
-
isDomChangeListenerInUse
public boolean isDomChangeListenerInUse()
- Returns:
- true if at least one domChangeListener was registered.
-
characterDataChangeListenerAdded
public void characterDataChangeListenerAdded()
Informs about the use of a characterDataChangeListener.
-
isCharacterDataChangeListenerInUse
public boolean isCharacterDataChangeListenerInUse()
- Returns:
- true if at least one characterDataChangeListener was registered.
-