Package org.htmlunit

Class SgmlPage

All Implemented Interfaces:
Serializable, Cloneable, Page, Document, Node
Direct Known Subclasses:
HtmlPage, XmlPage

public abstract class SgmlPage extends DomNode implements Page, Document
A base class for pages written in markup based on the Standard Generalized Markup Language (SGML), e.g. HTML and XML.
  • Constructor Details

    • SgmlPage

      public SgmlPage(WebResponse webResponse, WebWindow webWindow)
      Creates an instance of SgmlPage.
      Parameters:
      webResponse - the web response that was used to create this page
      webWindow - the window that this page is being loaded into
  • Method Details

    • cleanUp

      public void cleanUp()
      Cleans up this page. This method is called by the web client when another page is loaded in the window; you should probably never need to call it directly.
      Specified by:
      cleanUp in interface Page
    • getWebResponse

      public WebResponse getWebResponse()
      Returns the web response that was originally used to create this page.
      Specified by:
      getWebResponse in interface Page
      Returns:
      the web response
    • getNodeName

      public String getNodeName()
      Gets the name for the current node.
      Specified by:
      getNodeName in interface Node
      Returns:
      the node name
    • getNodeType

      public short getNodeType()
      Gets the type of the current node.
      Specified by:
      getNodeType in interface Node
      Returns:
      the node type
    • getEnclosingWindow

      public WebWindow getEnclosingWindow()
      Returns the window that this page is sitting inside.
      Specified by:
      getEnclosingWindow in interface Page
      Returns:
      the enclosing frame or null if this page isn't inside a frame
    • setEnclosingWindow

      public void setEnclosingWindow(WebWindow window)
      Sets the window that contains this page.
      Parameters:
      window - the new frame or null if this page is being removed from a frame
    • getWebClient

      public WebClient getWebClient()
      Returns the WebClient that originally loaded this page.
      Returns:
      the WebClient that originally loaded this page
    • createDocumentFragment

      public DomDocumentFragment createDocumentFragment()
      Creates an empty DomDocumentFragment object.
      Specified by:
      createDocumentFragment in interface Document
      Returns:
      a newly created DomDocumentFragment
    • getDoctype

      public final DocumentType getDoctype()
      Returns the document type.
      Specified by:
      getDoctype in interface Document
      Returns:
      the document type
    • setDocumentType

      protected void setDocumentType(DocumentType type)
      Sets the document type.
      Parameters:
      type - the document type
    • getPage

      public SgmlPage getPage()
      Returns the page that contains this node.
      Overrides:
      getPage in class DomNode
      Returns:
      the page that contains this node
    • getCharset

      public abstract Charset getCharset()
      Returns the character encoding (charset) of this page.
      Returns:
      the encoding
    • getDocumentElement

      public DomElement getDocumentElement()
      Returns the document element.
      Specified by:
      getDocumentElement in interface Document
      Returns:
      the document element
    • clone

      protected SgmlPage clone()
      Creates a clone of this instance.
      Overrides:
      clone in class Object
      Returns:
      a clone of this instance
    • asXml

      public String asXml()
      Returns a string representation of this element and all its children (recursively) as an XML document.
      The charset named in the XML header is the current page encoding, but the result is still a Java string. If you write it to a file, make sure to use that same encoding.
      This serializes the current state of the DOM tree. As a consequence, the content of noscript tags is usually serialized as a string, because that content is converted during parsing (if JavaScript was enabled at that time).
      Overrides:
      asXml in class DomNode
      Returns:
      the XML string
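      The encoding note for asXml can be illustrated with a minimal sketch that uses only the JDK. It assumes the string came from asXml() and the charset from getCharset(); the class name, the method name and the sample XML are hypothetical stand-ins, not HtmlUnit API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class AsXmlEncodingSketch {

    // Writes the XML text to a temporary file using the given charset,
    // then reads it back with the same charset and returns the result.
    static String writeAndReadBack(String xml, Charset charset) {
        try {
            Path file = Files.createTempFile("page", ".xml");
            try {
                Files.write(file, xml.getBytes(charset));
                return new String(Files.readAllBytes(file), charset);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the output of asXml() on an ISO-8859-1 page.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<p>caf\u00e9</p>";

        // Same charset as the header: the text survives the round trip.
        String same = writeAndReadBack(xml, StandardCharsets.ISO_8859_1);
        System.out.println("round trip intact: " + same.equals(xml));

        // Bytes written as ISO-8859-1 but decoded as UTF-8 no longer match.
        String wrong = new String(xml.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        System.out.println("mismatched decode intact: " + wrong.equals(xml));
    }
}
```

      The point of the sketch is only that the charset used for writing has to agree with the one declared in the XML header; how the file is produced is up to the caller.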
    • hasCaseSensitiveTagNames

      public abstract boolean hasCaseSensitiveTagNames()
      Returns true if this page has case-sensitive tag names, false otherwise. In general, XML has case-sensitive tag names, and HTML doesn't. This is especially important during XPath matching.
      Returns:
      true if this page has case-sensitive tag names, false otherwise
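      To show why case sensitivity matters for XPath matching, here is a small sketch that uses the JDK's own XML parser and XPath engine as a stand-in for an XML page. It does not call HtmlUnit; the class name, method name and sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CaseSensitiveXPathSketch {

    // Parses the XML text and returns how many nodes the XPath expression selects.
    static int countMatches(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Catalog><Item/><Item/></Catalog>";
        // Exact case matches both Item elements.
        System.out.println(countMatches(xml, "/Catalog/Item"));
        // Lower-cased names match nothing in a case-sensitive document.
        System.out.println(countMatches(xml, "/catalog/item"));
    }
}
```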
    • normalizeDocument

      public void normalizeDocument()
      The current implementation only calls DomNode.normalize() on the document element.
      Specified by:
      normalizeDocument in interface Document
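      As a rough illustration of what normalizing the document element does, the sketch below uses the JDK's W3C DOM implementation rather than HtmlUnit. It assumes HtmlUnit's DomNode.normalize() follows the usual org.w3c.dom.Node.normalize() contract of merging adjacent text nodes; all names in the sketch are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class NormalizeSketch {

    // Returns the number of child nodes of the root element
    // before and after normalize() is called on the document element.
    static int[] childCountsBeforeAndAfter() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("root");
        doc.appendChild(root);

        // Two adjacent text nodes, as can happen after programmatic edits.
        root.appendChild(doc.createTextNode("Hello, "));
        root.appendChild(doc.createTextNode("world"));
        int before = root.getChildNodes().getLength();

        // Normalizing the document element merges the adjacent text nodes.
        doc.getDocumentElement().normalize();
        int after = root.getChildNodes().getLength();

        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = childCountsBeforeAndAfter();
        System.out.println("before=" + counts[0] + ", after=" + counts[1]);
    }
}
```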
    • getCanonicalXPath

      public String getCanonicalXPath()

      Returns the canonical XPath expression which identifies this node, for instance "/html/body/table[3]/tbody/tr[5]/td[2]/span/a[3]".

      WARNING: This sort of automated XPath expression is often quite bad at identifying a node, as it is highly sensitive to changes in the DOM tree.

      Overrides:
      getCanonicalXPath in class DomNode
      Returns:
      the canonical XPath expression which identifies this node
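      The warning about positional XPath expressions can be made concrete with a sketch built on the JDK's DOM and XPath APIs, used here as a stand-in for an HtmlUnit page. The document, the element ids and the class and method names are hypothetical; the expression mimics the shape of a canonical XPath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class CanonicalXPathSketch {

    private static final String POSITIONAL_XPATH = "/html/body/table[3]";

    // Evaluates the expression and returns the id attribute of the selected element.
    private static String idAt(Document doc, String expression) throws Exception {
        Element hit = (Element) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
        return hit.getAttribute("id");
    }

    // Returns the id selected by the positional XPath before and after
    // a new table is inserted at the start of the body.
    static String[] idsBeforeAndAfterInsert() {
        try {
            String xml = "<html><body>"
                    + "<table id=\"t1\"/><table id=\"t2\"/><table id=\"t3\"/>"
                    + "</body></html>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            String before = idAt(doc, POSITIONAL_XPATH);

            // A small DOM change: one more table in front of the others.
            Element body = (Element) doc.getDocumentElement().getFirstChild();
            Element extra = doc.createElement("table");
            extra.setAttribute("id", "t0");
            body.insertBefore(extra, body.getFirstChild());

            String after = idAt(doc, POSITIONAL_XPATH);
            return new String[] {before, after};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] ids = idsBeforeAndAfterInsert();
        System.out.println("before=" + ids[0] + ", after=" + ids[1]);
    }
}
```

      The same expression selects a different table after the insert, which is why an expression keyed on a stable attribute is usually a safer way to find a node again.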
    • createAttribute

      public DomAttr createAttribute(String name)
      Specified by:
      createAttribute in interface Document
    • getUrl

      public URL getUrl()
      Returns the URL of this page.
      Specified by:
      getUrl in interface Page
      Returns:
      the URL of this page
    • isHtmlPage

      public boolean isHtmlPage()
      Description copied from interface: Page
      Returns true if this page is an HtmlPage.
      Specified by:
      isHtmlPage in interface Page
      Returns:
      true or false
    • getElementsByTagName

      public DomNodeList<DomElement> getElementsByTagName(String tagName)
      Specified by:
      getElementsByTagName in interface Document
    • getElementsByTagNameNS

      public DomNodeList<DomElement> getElementsByTagNameNS(String namespaceURI, String localName)
      Specified by:
      getElementsByTagNameNS in interface Document
    • createCDATASection

      public CDATASection createCDATASection(String data)
      Specified by:
      createCDATASection in interface Document
    • createTextNode

      public Text createTextNode(String data)
      Specified by:
      createTextNode in interface Document
    • createComment

      public Comment createComment(String data)
      Specified by:
      createComment in interface Document
    • createNodeIterator

      public DomNodeIterator createNodeIterator(Node root, int whatToShow, NodeFilter filter, boolean entityReferenceExpansion) throws DOMException
      Create a new NodeIterator over the subtree rooted at the specified node.
      Parameters:
      root - The node which will be iterated together with its children. The NodeIterator is initially positioned just before this node. The whatToShow flags and the filter, if any, are not considered when setting this position. The root must not be null.
      whatToShow - This flag specifies which node types may appear in the logical view of the tree presented by the NodeIterator. See the description of NodeFilter for the set of possible SHOW_ values. These flags can be combined using OR.
      filter - The NodeFilter to be used with this NodeIterator, or null to indicate no filter.
      entityReferenceExpansion - The value of this flag determines whether entity reference nodes are expanded.
      Returns:
      The newly created NodeIterator.
      Throws:
      DOMException - NOT_SUPPORTED_ERR: Raised if the specified root is null.
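      The parameters of createNodeIterator follow the W3C DOM traversal API, so the sketch below uses the JDK's DOM implementation as a stand-in for an SgmlPage. It assumes the document returned by the JDK's bundled parser can be cast to DocumentTraversal; the class name, method name and sample XML are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;

final class NodeIteratorSketch {

    // Iterates over the subtree rooted at the document element and
    // returns a label for every node that the whatToShow flags let through.
    static List<String> visit(String xml, int whatToShow) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumption: the JDK's default Document implements DocumentTraversal.
            DocumentTraversal traversal = (DocumentTraversal) doc;
            NodeIterator iterator = traversal.createNodeIterator(
                    doc.getDocumentElement(), whatToShow, null, true);

            List<String> seen = new ArrayList<>();
            for (Node node = iterator.nextNode(); node != null; node = iterator.nextNode()) {
                seen.add(node.getNodeType() == Node.TEXT_NODE
                        ? "#text:" + node.getNodeValue()
                        : node.getNodeName());
            }
            return seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b>x</b><c/></a>";
        // Elements only.
        System.out.println(visit(xml, NodeFilter.SHOW_ELEMENT));
        // Flags combined with OR: elements and text nodes.
        System.out.println(visit(xml, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT));
    }
}
```

      The iterator starts just before the root, so the first call to nextNode() returns the root itself when the flags accept it.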
    • getContentType

      public abstract String getContentType()
      Returns the content type of this page.
      Returns:
      the content type of this page
    • clearComputedStyles

      public void clearComputedStyles()
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Clears the computed styles.
    • clearComputedStyles

      public void clearComputedStyles(DomElement element)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Clears the computed styles for a specific Element.
      Parameters:
      element - the element whose cached computed styles are cleared
    • clearComputedStylesUpToRoot

      public void clearComputedStylesUpToRoot(DomElement element)
      INTERNAL API - SUBJECT TO CHANGE AT ANY TIME - USE AT YOUR OWN RISK.
      Clears the computed styles for a specific Element and all parent elements.
      Parameters:
      element - the element whose cached computed styles are cleared, together with those of its parent elements
    • isPrinting

      public boolean isPrinting()
      Returns:
      whether or not this is currently printing
    • setPrinting

      public void setPrinting(boolean printing)
      Parameters:
      printing - the printing state to set
    • domChangeListenerAdded

      public void domChangeListenerAdded()
      Informs about the use of a domChangeListener.
    • isDomChangeListenerInUse

      public boolean isDomChangeListenerInUse()
      Returns:
      true if at least one domChangeListener was registered.
    • characterDataChangeListenerAdded

      public void characterDataChangeListenerAdded()
      Informs about the use of a characterDataChangeListener.
    • isCharacterDataChangeListenerInUse

      public boolean isCharacterDataChangeListenerInUse()
      Returns:
      true if at least one characterDataChangeListener was registered.