Package org.htmlunit
Class SgmlPage
- java.lang.Object
-
- org.htmlunit.html.DomNode
-
- org.htmlunit.SgmlPage
-
- All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Page, org.w3c.dom.Document, org.w3c.dom.Node
public abstract class SgmlPage extends DomNode implements Page, org.w3c.dom.Document
A basic class for Standard Generalized Markup Language (SGML) pages, e.g. HTML and XML.
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.htmlunit.html.DomNode
DomNode.ChildIterator, DomNode.DescendantElementsIterator<T extends DomNode>
-
-
Field Summary
-
Fields inherited from class org.htmlunit.html.DomNode
PROPERTY_ELEMENT, READY_STATE_COMPLETE, READY_STATE_INTERACTIVE, READY_STATE_LOADED, READY_STATE_LOADING, READY_STATE_UNINITIALIZED
-
Fields inherited from interface org.w3c.dom.Node
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_POSITION_CONTAINED_BY, DOCUMENT_POSITION_CONTAINS, DOCUMENT_POSITION_DISCONNECTED, DOCUMENT_POSITION_FOLLOWING, DOCUMENT_POSITION_IMPLEMENTATION_SPECIFIC, DOCUMENT_POSITION_PRECEDING, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NOTATION_NODE, PROCESSING_INSTRUCTION_NODE, TEXT_NODE
-
-
Constructor Summary
Constructors
SgmlPage(WebResponse webResponse, WebWindow webWindow)
    Creates an instance of SgmlPage.
-
Method Summary
Modifier and Type, Method, Description
java.lang.String asXml()
    Returns a string representation of the XML document from this element and all its children (recursively).
void cleanUp()
    Cleans up this page.
void clearComputedStyles()
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles.
void clearComputedStyles(DomElement element)
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles for a specific Element.
void clearComputedStylesUpToRoot(DomElement element)
    INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK. Clears the computed styles for a specific Element and all parent elements.
protected SgmlPage clone()
    Creates a clone of this instance.
DomAttr createAttribute(java.lang.String name)
org.w3c.dom.CDATASection createCDATASection(java.lang.String data)
org.w3c.dom.Comment createComment(java.lang.String data)
DomDocumentFragment createDocumentFragment()
    Creates an empty DomDocumentFragment object.
DomNodeIterator createNodeIterator(org.w3c.dom.Node root, int whatToShow, org.w3c.dom.traversal.NodeFilter filter, boolean entityReferenceExpansion)
    Creates a new NodeIterator over the subtree rooted at the specified node.
org.w3c.dom.Text createTextNode(java.lang.String data)
java.lang.String getCanonicalXPath()
    Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
abstract java.nio.charset.Charset getCharset()
    Returns the encoding.
abstract java.lang.String getContentType()
    Returns the content type of this page.
org.w3c.dom.DocumentType getDoctype()
    Returns the document type.
DomElement getDocumentElement()
    Returns the document element.
DomNodeList<DomElement> getElementsByTagName(java.lang.String tagName)
DomNodeList<DomElement> getElementsByTagNameNS(java.lang.String namespaceURI, java.lang.String localName)
WebWindow getEnclosingWindow()
    Returns the window that this page is sitting inside.
java.lang.String getNodeName()
    Gets the name for the current node.
short getNodeType()
    Gets the type of the current node.
SgmlPage getPage()
    Returns the page that contains this node.
java.net.URL getUrl()
    Returns the URL of this page.
WebClient getWebClient()
    Returns the WebClient that originally loaded this page.
WebResponse getWebResponse()
    Returns the web response that was originally used to create this page.
abstract boolean hasCaseSensitiveTagNames()
    Returns true if this page has case-sensitive tag names, false otherwise.
boolean isHtmlPage()
    Returns true if this page is an HtmlPage.
boolean isPrinting()
void normalizeDocument()
    The current implementation just calls DomNode.normalize() on the document element.
protected void setDocumentType(org.w3c.dom.DocumentType type)
    Sets the document type.
void setEnclosingWindow(WebWindow window)
    Sets the window that contains this page.
void setPrinting(boolean printing)
-
Methods inherited from class org.htmlunit.html.DomNode
addCharacterDataChangeListener, addDomChangeListener, appendChild, asNormalizedText, basicRemove, checkChildHierarchy, cloneNode, closest, compareDocumentPosition, detach, fireCharacterDataChanged, fireNodeAdded, fireNodeDeleted, getAncestors, getAttributes, getBaseURI, getByXPath, getByXPath, getChildNodes, getChildren, getDescendants, getDomElementDescendants, getEndColumnNumber, getEndLineNumber, getFeature, getFirstByXPath, getFirstByXPath, getFirstChild, getHtmlElementDescendants, getHtmlPageOrNull, getIndex, getLastChild, getLocalName, getNamespaceURI, getNextElementSibling, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousElementSibling, getPreviousSibling, getReadyState, getScriptableObject, getSelectorList, getStartColumnNumber, getStartLineNumber, getTextContent, getUserData, getVisibleText, handles, hasAttributes, hasChildNodes, hasFeature, insertBefore, insertBefore, isAncestorOf, isAncestorOfAny, isAttachedToPage, isDefaultNamespace, isDisplayed, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, mayBeDisplayed, normalize, notifyIncorrectness, onAddedToDocumentFragment, onAddedToPage, onAllChildrenAddedToPage, parseHtmlSnippet, printChildrenAsXml, printXml, processImportNode, querySelector, querySelectorAll, quietlyRemoveAndMoveChildrenTo, remove, removeAllChildren, removeCharacterDataChangeListener, removeChild, removeDomChangeListener, replace, replaceChild, setEndLocation, setParentNode, setReadyState, setScriptableObject, setStartLocation, setTextContent, setUserData
-
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.w3c.dom.Document
adoptNode, createAttributeNS, createElement, createElementNS, createEntityReference, createProcessingInstruction, getDocumentURI, getDomConfig, getElementById, getImplementation, getInputEncoding, getStrictErrorChecking, getXmlEncoding, getXmlStandalone, getXmlVersion, importNode, renameNode, setDocumentURI, setStrictErrorChecking, setXmlStandalone, setXmlVersion
-
Methods inherited from interface org.w3c.dom.Node
appendChild, cloneNode, compareDocumentPosition, getAttributes, getBaseURI, getChildNodes, getFeature, getFirstChild, getLastChild, getLocalName, getNamespaceURI, getNextSibling, getNodeValue, getOwnerDocument, getParentNode, getPrefix, getPreviousSibling, getTextContent, getUserData, hasAttributes, hasChildNodes, insertBefore, isDefaultNamespace, isEqualNode, isSameNode, isSupported, lookupNamespaceURI, lookupPrefix, normalize, removeChild, replaceChild, setNodeValue, setPrefix, setTextContent, setUserData
-
Methods inherited from interface org.htmlunit.Page
initialize
-
-
-
-
Constructor Detail
-
SgmlPage
public SgmlPage(WebResponse webResponse, WebWindow webWindow)
Creates an instance of SgmlPage.
- Parameters:
webResponse - the web response that was used to create this page
webWindow - the window that this page is being loaded into
-
-
Method Detail
-
cleanUp
public void cleanUp()
Cleans up this page. The web client calls this method when another page is loaded in the window; you should probably never need to call it directly.
-
getWebResponse
public WebResponse getWebResponse()
Returns the web response that was originally used to create this page.
- Specified by: getWebResponse in interface Page
- Returns: the web response
-
getNodeName
public java.lang.String getNodeName()
Gets the name for the current node.
- Specified by: getNodeName in interface org.w3c.dom.Node
- Returns: the node name
-
getNodeType
public short getNodeType()
Gets the type of the current node.
- Specified by: getNodeType in interface org.w3c.dom.Node
- Returns: the node type
-
getEnclosingWindow
public WebWindow getEnclosingWindow()
Returns the window that this page is sitting inside.
- Specified by: getEnclosingWindow in interface Page
- Returns: the enclosing frame or null if this page isn't inside a frame
-
setEnclosingWindow
public void setEnclosingWindow(WebWindow window)
Sets the window that contains this page.
- Parameters:
window - the new frame or null if this page is being removed from a frame
-
getWebClient
public WebClient getWebClient()
Returns the WebClient that originally loaded this page.
- Returns: the WebClient that originally loaded this page
-
createDocumentFragment
public DomDocumentFragment createDocumentFragment()
Creates an empty DomDocumentFragment object.
- Specified by: createDocumentFragment in interface org.w3c.dom.Document
- Returns: a newly created DomDocumentFragment
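Because createDocumentFragment is specified by org.w3c.dom.Document, the usual fragment pattern can be sketched against that interface. The example below is a minimal sketch that uses the JDK's built-in DOM implementation rather than HtmlUnit, so it only illustrates the interface contract; the class name FragmentSketch and its method are hypothetical, and an SgmlPage would return the more specific DomDocumentFragment.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;

// Hypothetical helper, not part of HtmlUnit.
final class FragmentSketch {

    /** Collects three <li> elements in a fragment, appends them in one call, and returns the <ul> child count. */
    static int appendViaFragment() {
        Document doc = newDocument();
        Element list = doc.createElement("ul");
        doc.appendChild(list);

        // The fragment starts out empty and acts as a temporary container.
        DocumentFragment fragment = doc.createDocumentFragment();
        for (String text : new String[] {"one", "two", "three"}) {
            Element item = doc.createElement("li");
            item.setTextContent(text);
            fragment.appendChild(item);
        }

        // Appending a fragment moves its children; the fragment node itself is not inserted.
        list.appendChild(fragment);
        return list.getChildNodes().getLength();
    }

    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("children of <ul>: " + appendViaFragment());
    }
}
```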
-
getDoctype
public final org.w3c.dom.DocumentType getDoctype()
Returns the document type.
- Specified by: getDoctype in interface org.w3c.dom.Document
- Returns: the document type
-
setDocumentType
protected void setDocumentType(org.w3c.dom.DocumentType type)
Sets the document type.
- Parameters:
type - the document type
-
getPage
public SgmlPage getPage()
Returns the page that contains this node.
-
getCharset
public abstract java.nio.charset.Charset getCharset()
Returns the encoding.
- Returns: the encoding
-
getDocumentElement
public DomElement getDocumentElement()
Returns the document element.
- Specified by: getDocumentElement in interface org.w3c.dom.Document
- Returns: the document element
-
clone
protected SgmlPage clone()
Creates a clone of this instance.
- Overrides: clone in class java.lang.Object
- Returns: a clone of this instance
-
asXml
public java.lang.String asXml()
Returns a string representation of the XML document from this element and all its children (recursively). The charset used is the current page encoding.
-
hasCaseSensitiveTagNames
public abstract boolean hasCaseSensitiveTagNames()
Returns true if this page has case-sensitive tag names, false otherwise. In general, XML has case-sensitive tag names, and HTML doesn't. This is especially important during XPath matching.
- Returns: true if this page has case-sensitive tag names, false otherwise
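The XML half of this statement can be illustrated with the JDK's own XML parser. This is a minimal sketch that does not use HtmlUnit: the class name TagCaseSketch and the sample markup are assumptions made for the example. It only shows that an XML DOM treats Item and item as different tag names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of HtmlUnit.
final class TagCaseSketch {

    /** Parses the XML with the JDK parser and counts the elements whose tag name matches exactly. */
    static int countByTagName(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tagName).getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><Item/><item/><item/></root>";
        // In XML, each spelling is a distinct tag name.
        System.out.println("Item: " + countByTagName(xml, "Item"));
        System.out.println("item: " + countByTagName(xml, "item"));
        System.out.println("ITEM: " + countByTagName(xml, "ITEM"));
    }
}
```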
-
normalizeDocument
public void normalizeDocument()
The current implementation just calls DomNode.normalize() on the document element.
- Specified by: normalizeDocument in interface org.w3c.dom.Document
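What normalizing the document element means in DOM terms can be sketched with the standard org.w3c.dom.Node.normalize() contract, which merges adjacent text nodes. The example below is a minimal sketch that uses the JDK's DOM implementation, not HtmlUnit's DomNode; the class name NormalizeSketch is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of HtmlUnit.
final class NormalizeSketch {

    /** Returns the document element's child count before and after normalize(). */
    static int[] childCounts() {
        Document doc = newDocument();
        Element root = doc.createElement("p");
        doc.appendChild(root);

        // Two adjacent text nodes, as can occur after programmatic DOM edits.
        root.appendChild(doc.createTextNode("Hello, "));
        root.appendChild(doc.createTextNode("world"));
        int before = root.getChildNodes().getLength();

        // Normalize only the document element, mirroring the behavior described above.
        doc.getDocumentElement().normalize();
        int after = root.getChildNodes().getLength();

        return new int[] {before, after};
    }

    private static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCounts();
        System.out.println("text nodes before: " + counts[0] + ", after: " + counts[1]);
    }
}
```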
-
getCanonicalXPath
public java.lang.String getCanonicalXPath()
Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".
WARNING: This sort of automated XPath expression is often quite bad at identifying a node, as it is highly sensitive to changes in the DOM tree.
- Overrides: getCanonicalXPath in class DomNode
- Returns: the canonical XPath expression which identifies this node
- See Also: DomNode.getByXPath(String)
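The sensitivity described in the warning can be reproduced with any positional XPath. The sketch below uses the JDK's javax.xml.xpath API rather than HtmlUnit's getByXPath, and the markup, the expression, and the class name PositionalXPathSketch are all assumptions made for illustration. It evaluates the same positional expression before and after one sibling is inserted.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of HtmlUnit.
final class PositionalXPathSketch {

    // A positional expression of the kind a canonical XPath contains.
    static final String POSITIONAL = "/html/body/p[2]";

    /** Returns the text selected by POSITIONAL before and after a new first paragraph is inserted. */
    static String[] textBeforeAndAfterInsert() {
        Document doc = parse("<html><body><p>intro</p><p>target</p></body></html>");
        String before = evaluate(doc, POSITIONAL);

        // A small DOM change: one extra paragraph at the top of the body.
        Element body = (Element) doc.getElementsByTagName("body").item(0);
        Element banner = doc.createElement("p");
        banner.setTextContent("banner");
        body.insertBefore(banner, body.getFirstChild());

        // The expression is unchanged, but every index after the insert has shifted by one.
        String after = evaluate(doc, POSITIONAL);
        return new String[] {before, after};
    }

    private static String evaluate(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = textBeforeAndAfterInsert();
        System.out.println("before insert: " + result[0]);
        System.out.println("after insert:  " + result[1]);
    }
}
```

An expression that selects by a stable attribute, where the page offers one, is usually less brittle than a purely positional path.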
-
createAttribute
public DomAttr createAttribute(java.lang.String name)
- Specified by: createAttribute in interface org.w3c.dom.Document
-
getUrl
public java.net.URL getUrl()
Returns the URL of this page.
-
isHtmlPage
public boolean isHtmlPage()
Description copied from interface: Page
Returns true if this page is an HtmlPage.
- Specified by: isHtmlPage in interface Page
- Returns: true or false
-
getElementsByTagName
public DomNodeList<DomElement> getElementsByTagName(java.lang.String tagName)
- Specified by: getElementsByTagName in interface org.w3c.dom.Document
-
getElementsByTagNameNS
public DomNodeList<DomElement> getElementsByTagNameNS(java.lang.String namespaceURI, java.lang.String localName)
- Specified by: getElementsByTagNameNS in interface org.w3c.dom.Document
-
createCDATASection
public org.w3c.dom.CDATASection createCDATASection(java.lang.String data)
- Specified by: createCDATASection in interface org.w3c.dom.Document
-
createTextNode
public org.w3c.dom.Text createTextNode(java.lang.String data)
- Specified by: createTextNode in interface org.w3c.dom.Document
-
createComment
public org.w3c.dom.Comment createComment(java.lang.String data)
- Specified by: createComment in interface org.w3c.dom.Document
-
createNodeIterator
public DomNodeIterator createNodeIterator(org.w3c.dom.Node root, int whatToShow, org.w3c.dom.traversal.NodeFilter filter, boolean entityReferenceExpansion) throws org.w3c.dom.DOMException
Creates a new NodeIterator over the subtree rooted at the specified node.
- Parameters:
root - The node which will be iterated together with its children. The NodeIterator is initially positioned just before this node. The whatToShow flags and the filter, if any, are not considered when setting this position. The root must not be null.
whatToShow - This flag specifies which node types may appear in the logical view of the tree presented by the NodeIterator. See the description of NodeFilter for the set of possible SHOW_ values. These flags can be combined using OR.
filter - The NodeFilter to be used with this NodeIterator, or null to indicate no filter.
entityReferenceExpansion - The value of this flag determines whether entity reference nodes are expanded.
- Returns: The newly created NodeIterator.
- Throws: org.w3c.dom.DOMException - NOT_SUPPORTED_ERR: Raised if the specified root is null.
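The parameters above follow the DOM Traversal contract, which can be tried out with the JDK's own DOM implementation. This is a minimal sketch, not HtmlUnit code: it assumes the JDK's default Document can be cast to org.w3c.dom.traversal.DocumentTraversal, and the class name NodeIteratorSketch and the sample markup are hypothetical. It shows the root being visited first and SHOW_ flags being combined with a bitwise OR.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of HtmlUnit.
final class NodeIteratorSketch {

    /** Iterates the subtree below the document element and returns the names of the visible nodes. */
    static List<String> nodeNames(String xml, int whatToShow) {
        Document doc = parse(xml);
        // Assumption: the JDK's default DOM implementation supports the Traversal module.
        DocumentTraversal traversal = (DocumentTraversal) doc;
        NodeIterator iterator = traversal.createNodeIterator(
                doc.getDocumentElement(), whatToShow, null /* no filter */, true);

        List<String> names = new ArrayList<>();
        // The iterator starts just before the root, so the first nextNode() can return the root itself.
        for (Node node = iterator.nextNode(); node != null; node = iterator.nextNode()) {
            names.add(node.getNodeName());
        }
        iterator.detach();
        return names;
    }

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b>text</b><!-- note --><c><d/></c></a>";
        System.out.println(nodeNames(xml, NodeFilter.SHOW_ELEMENT));
        // Flags are combined with a bitwise OR to widen the logical view.
        System.out.println(nodeNames(xml, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT));
    }
}
```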
-
getContentType
public abstract java.lang.String getContentType()
Returns the content type of this page.
- Returns: the content type of this page
-
clearComputedStyles
public void clearComputedStyles()
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles.
-
clearComputedStyles
public void clearComputedStyles(DomElement element)
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element.
- Parameters:
element - the element whose cache is cleared
-
clearComputedStylesUpToRoot
public void clearComputedStylesUpToRoot(DomElement element)
INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
Clears the computed styles for a specific Element and all parent elements.
- Parameters:
element - the element whose cache is cleared
-
isPrinting
public boolean isPrinting()
- Returns: whether or not this is currently printing
-
setPrinting
public void setPrinting(boolean printing)
- Parameters:
printing - the printing state to set
-
-