Class EncodingSniffer


  • public final class EncodingSniffer
    extends java.lang.Object
    Sniffs encoding settings from HTML, XML or other content. The HTML encoding sniffing algorithm is based on the HTML5 encoding sniffing algorithm.
    • Method Summary

      Modifier and Type Method Description
      static java.nio.charset.Charset sniffEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
      If the specified content is HTML content, this method sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.
      static java.nio.charset.Charset sniffEncodingFromHttpHeaders(java.util.List<NameValuePair> headers)
      Attempts to sniff an encoding from the specified HTTP headers.
      static java.nio.charset.Charset sniffHtmlEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
      Sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.
      static java.nio.charset.Charset sniffUnknownContentTypeEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
      Sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.
      static java.nio.charset.Charset sniffXmlEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
      Sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.
      static java.nio.charset.Charset toCharset(java.lang.String charsetName)
      Returns the Charset for the specified charset name if it is supported on this platform.
      static java.lang.String translateEncodingLabel(java.nio.charset.Charset encodingLabel)
      Translates the given encoding label into a normalized form according to Reference.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • sniffEncoding

        public static java.nio.charset.Charset sniffEncoding(java.util.List<NameValuePair> headers,
                                                             java.io.InputStream content)
                                                      throws java.io.IOException

        If the specified content is HTML content, this method sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.

        If the specified content is XML content, this method sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.

        Otherwise, this method sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.

        Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

        Parameters:
        headers - the HTTP response headers sent back with the content to be sniffed
        content - the content to be sniffed
        Returns:
        the encoding sniffed from the specified content and/or the corresponding HTTP headers, or null if the encoding could not be determined
        Throws:
        java.io.IOException - if an IO error occurs
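
        The three-way choice described above can be illustrated with a minimal sketch. It assumes the kind of content is decided from the Content-Type header, which this page does not state. SniffDispatchSketch, Kind and classify are hypothetical names, and Map.Entry stands in for NameValuePair so that the example needs only the JDK.

```java
import java.util.List;
import java.util.Map;
import java.util.Locale;

// Hypothetical sketch, not the library's implementation: picks a sniffing
// strategy from the Content-Type header. Map.Entry stands in for NameValuePair.
final class SniffDispatchSketch {

    enum Kind { HTML, XML, UNKNOWN }

    static Kind classify(List<Map.Entry<String, String>> headers) {
        for (Map.Entry<String, String> header : headers) {
            if (!"content-type".equalsIgnoreCase(header.getKey())) {
                continue;
            }
            // Drop parameters such as "; charset=utf-8" and normalize case.
            String value = header.getValue();
            int semicolon = value.indexOf(';');
            String mime = (semicolon >= 0 ? value.substring(0, semicolon) : value)
                    .trim().toLowerCase(Locale.ROOT);
            if (mime.equals("text/html")) {
                return Kind.HTML;
            }
            // Assumption: these media types are treated as XML.
            if (mime.equals("text/xml") || mime.equals("application/xml")
                    || mime.endsWith("+xml")) {
                return Kind.XML;
            }
            return Kind.UNKNOWN;
        }
        // No Content-Type header at all.
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify(
                List.of(Map.entry("Content-Type", "text/html; charset=utf-8"))));
    }
}
```

        Because sniffEncoding may return null, callers typically need a fallback of their own choosing (for example a default charset) when no encoding is determined.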
      • sniffHtmlEncoding

        public static java.nio.charset.Charset sniffHtmlEncoding(java.util.List<NameValuePair> headers,
                                                                 java.io.InputStream content)
                                                          throws java.io.IOException

        Sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.

        Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

        Parameters:
        headers - the HTTP response headers sent back with the HTML content to be sniffed
        content - the HTML content to be sniffed
        Returns:
        the encoding sniffed from the specified HTML content and/or the corresponding HTTP headers, or null if the encoding could not be determined
        Throws:
        java.io.IOException - if an IO error occurs
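
        One step of HTML encoding sniffing is a prescan of the leading bytes for a meta element that declares a charset. The sketch below is a deliberately simplified, regex-based illustration of that idea. It is neither the full HTML5 algorithm nor this class's implementation: it ignores comments and the other rules of the real prescan. MetaCharsetSketch, findMetaCharset and the 1024-byte limit are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified sketch of a <meta> charset prescan.
final class MetaCharsetSketch {

    // Matches charset=... inside a <meta ...> tag, quoted or unquoted.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\b[^>]*?charset\\s*=\\s*[\"']?\\s*([^\\s\"'/>;]+)",
            Pattern.CASE_INSENSITIVE);

    // Number of leading bytes to scan; an arbitrary choice for this sketch.
    private static final int PRESCAN_LIMIT = 1024;

    /** Returns the declared charset label, or null if none is found. */
    static String findMetaCharset(byte[] content) {
        int length = Math.min(content.length, PRESCAN_LIMIT);
        // ISO-8859-1 maps every byte to exactly one char, so decoding cannot fail.
        String head = new String(content, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = META_CHARSET.matcher(head);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] html = "<html><head><meta charset=\"utf-8\"></head></html>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(findMetaCharset(html));
    }
}
```

        The returned label is only a string; it would still have to be resolved to a Charset supported on the platform.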
      • sniffXmlEncoding

        public static java.nio.charset.Charset sniffXmlEncoding(java.util.List<NameValuePair> headers,
                                                                java.io.InputStream content)
                                                         throws java.io.IOException

        Sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.

        Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

        Parameters:
        headers - the HTTP response headers sent back with the XML content to be sniffed
        content - the XML content to be sniffed
        Returns:
        the encoding sniffed from the specified XML content and/or the corresponding HTTP headers, or null if the encoding could not be determined
        Throws:
        java.io.IOException - if an IO error occurs
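
        The custom XML algorithm is not specified on this page. One plausible ingredient is reading the encoding pseudo-attribute of the XML declaration, sketched below. XmlDeclarationSketch, findDeclaredEncoding and the 200-byte limit are hypothetical, and the sketch assumes an ASCII-compatible encoding with no Byte Order Mark in front of the declaration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extracts encoding="..." from a leading XML declaration.
final class XmlDeclarationSketch {

    // The declaration must start the document; [^>] keeps the match inside it.
    private static final Pattern XML_ENCODING = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Number of leading bytes to inspect; an arbitrary choice for this sketch.
    private static final int SCAN_LIMIT = 200;

    /** Returns the declared encoding name, or null if none is declared. */
    static String findDeclaredEncoding(byte[] content) {
        int length = Math.min(content.length, SCAN_LIMIT);
        // ISO-8859-1 maps every byte to exactly one char, so decoding cannot fail.
        String head = new String(content, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = XML_ENCODING.matcher(head);
        if (!matcher.find()) {
            return null;
        }
        // Group 1 holds a double-quoted value, group 2 a single-quoted one.
        return matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(findDeclaredEncoding(xml));
    }
}
```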
      • sniffUnknownContentTypeEncoding

        public static java.nio.charset.Charset sniffUnknownContentTypeEncoding(java.util.List<NameValuePair> headers,
                                                                               java.io.InputStream content)
                                                                        throws java.io.IOException

        Sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.

        Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

        Parameters:
        headers - the HTTP response headers sent back with the content to be sniffed
        content - the content to be sniffed
        Returns:
        the encoding sniffed from the specified content and/or the corresponding HTTP headers, or null if the encoding could not be determined
        Throws:
        java.io.IOException - if an IO error occurs
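
        Byte Order Mark detection compares the first bytes of the content against known signatures. The following is a minimal sketch of that comparison for UTF-8 and UTF-16. It works on a byte array holding the leading bytes rather than on an InputStream. BomSketch and sniffBom are hypothetical names, and the sketch does not claim to cover the same set of marks as this class. Because it checks only two bytes for UTF-16LE, a UTF-32LE mark (FF FE 00 00) would be reported as UTF-16LE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: maps a leading Byte Order Mark to a Charset.
final class BomSketch {

    /** Returns the charset indicated by a leading BOM, or null if there is none. */
    static Charset sniffBom(byte[] bytes) {
        if (startsWith(bytes, 0xEF, 0xBB, 0xBF)) {
            return StandardCharsets.UTF_8;
        }
        if (startsWith(bytes, 0xFE, 0xFF)) {
            return StandardCharsets.UTF_16BE;
        }
        if (startsWith(bytes, 0xFF, 0xFE)) {
            return StandardCharsets.UTF_16LE;
        }
        return null;
    }

    private static boolean startsWith(byte[] bytes, int... prefix) {
        if (bytes.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed.
            if ((bytes[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] content = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x3C};
        System.out.println(sniffBom(content));
    }
}
```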
      • sniffEncodingFromHttpHeaders

        public static java.nio.charset.Charset sniffEncodingFromHttpHeaders(java.util.List<NameValuePair> headers)
        Attempts to sniff an encoding from the specified HTTP headers.
        Parameters:
        headers - the HTTP headers to examine
        Returns:
        the encoding sniffed from the specified HTTP headers, or null if the encoding could not be determined
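
        As an illustration of what sniffing from headers can involve, the sketch below extracts the charset parameter from a Content-Type header. It assumes that header is the one examined, which this page does not state. HeaderCharsetSketch and charsetFromHeaders are hypothetical names, Map.Entry stands in for NameValuePair, and the sketch returns the raw label rather than a Charset.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: reads the charset parameter of a Content-Type header.
final class HeaderCharsetSketch {

    /** Returns the charset label from Content-Type, or null if absent. */
    static String charsetFromHeaders(List<Map.Entry<String, String>> headers) {
        for (Map.Entry<String, String> header : headers) {
            if (!"content-type".equalsIgnoreCase(header.getKey())) {
                continue;
            }
            // Parameters follow the media type, separated by ';'.
            String[] parts = header.getValue().split(";");
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim();
                int equals = param.indexOf('=');
                if (equals < 0) {
                    continue;
                }
                String name = param.substring(0, equals).trim();
                if (!name.equalsIgnoreCase("charset")) {
                    continue;
                }
                String value = param.substring(equals + 1).trim();
                // Strip optional double quotes around the value.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetFromHeaders(
                List.of(Map.entry("Content-Type", "text/html; charset=UTF-8"))));
    }
}
```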
      • toCharset

        public static java.nio.charset.Charset toCharset(java.lang.String charsetName)
        Returns the Charset for the specified charset name if it is supported on this platform.
        Parameters:
        charsetName - the charset name to check
        Returns:
        the Charset for the specified charset name, if it is supported on this platform
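
        A lookup of this kind can be written with the standard java.nio.charset API, as in the sketch below. It assumes that an unsupported or malformed name yields null, in line with the notes on the sniffing methods. This page does not state what toCharset returns in that case. CharsetLookupSketch and lookup are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical sketch: resolves a charset name, tolerating unknown names.
final class CharsetLookupSketch {

    /** Returns the named Charset, or null if the name is illegal or unsupported. */
    static Charset lookup(String charsetName) {
        if (charsetName == null) {
            return null;
        }
        try {
            return Charset.forName(charsetName.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Assumption: treat both failure modes as "not supported".
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("UTF-8"));
        System.out.println(lookup("no-such-charset-xyz"));
    }
}
```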
      • translateEncodingLabel

        public static java.lang.String translateEncodingLabel(java.nio.charset.Charset encodingLabel)
        Translates the given encoding label into a normalized form according to Reference.
        Parameters:
        encodingLabel - the label to translate
        Returns:
        the normalized encoding name, or null if not found