Class EncodingSniffer
java.lang.Object
    org.htmlunit.util.EncodingSniffer

public final class EncodingSniffer extends java.lang.Object
Sniffs encoding settings from HTML, XML or other content. The HTML encoding sniffing algorithm is based on the HTML5 encoding sniffing algorithm.
-
Method Summary
Modifier and Type, Method, Description

static java.nio.charset.Charset sniffEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
If the specified content is HTML content, this method sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.

static java.nio.charset.Charset sniffEncodingFromHttpHeaders(java.util.List<NameValuePair> headers)
Attempts to sniff an encoding from the specified HTTP headers.

static java.nio.charset.Charset sniffHtmlEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
Sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.

static java.nio.charset.Charset sniffUnknownContentTypeEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
Sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.

static java.nio.charset.Charset sniffXmlEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content)
Sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.

static java.nio.charset.Charset toCharset(java.lang.String charsetName)
Returns Charset if the specified charset name is supported on this platform.

static java.lang.String translateEncodingLabel(java.nio.charset.Charset encodingLabel)
Translates the given encoding label into a normalized form according to Reference.
-
Method Detail
-
sniffEncoding
public static java.nio.charset.Charset sniffEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content) throws java.io.IOException
If the specified content is HTML content, this method sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.
If the specified content is XML content, this method sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.
Otherwise, this method sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.
Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.
Parameters:
headers - the HTTP response headers sent back with the content to be sniffed
content - the content to be sniffed
Returns:
the encoding sniffed from the specified content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
java.io.IOException - if an IO error occurs
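The three-way dispatch described for sniffEncoding can be pictured with a small, self-contained sketch. It does not call HtmlUnit. The class ContentKindSketch, the enum Kind, and the rule of classifying content by the MIME type in the Content-Type header are all assumptions made for illustration; the description above does not say how the content type is decided.

```java
import java.util.Locale;

// Hypothetical illustration only; not part of org.htmlunit.util.EncodingSniffer.
class ContentKindSketch {

    enum Kind { HTML, XML, UNKNOWN }

    // Classifies a Content-Type header value by its MIME type (assumed rule).
    static Kind classify(String contentType) {
        if (contentType == null) {
            return Kind.UNKNOWN;
        }
        // Drop parameters such as "; charset=utf-8" and normalize case.
        int semicolon = contentType.indexOf(';');
        String mime = (semicolon >= 0 ? contentType.substring(0, semicolon) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        if (mime.equals("text/html")) {
            return Kind.HTML;
        }
        if (mime.equals("text/xml") || mime.equals("application/xml") || mime.endsWith("+xml")) {
            return Kind.XML;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify("text/html; charset=utf-8"));
        System.out.println(classify("application/atom+xml"));
        System.out.println(classify("application/octet-stream"));
    }
}
```

In this sketch each Kind would select one of the three strategies listed above: the HTML algorithm, the XML algorithm, or the fallback for unknown content.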
-
sniffHtmlEncoding
public static java.nio.charset.Charset sniffHtmlEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content) throws java.io.IOException
Sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.
Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.
Parameters:
headers - the HTTP response headers sent back with the HTML content to be sniffed
content - the HTML content to be sniffed
Returns:
the encoding sniffed from the specified HTML content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
java.io.IOException - if an IO error occurs
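As a rough picture of what sniffing from HTML content involves, the sketch below scans the start of a byte buffer for a charset declared in a meta tag. It is a deliberate simplification and not the HTML5 prescan or HtmlUnit's implementation: the class MetaCharsetSketch, the regular expression, and the 1024-byte window are assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; a far simpler scan than the HTML5 algorithm.
class MetaCharsetSketch {

    // Matches charset=... inside a <meta ...> tag, with or without quotes.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?\\s*([A-Za-z0-9._:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset label, or null if none is found in the window.
    static String findMetaCharset(byte[] content) {
        int limit = Math.min(content.length, 1024); // assumed scan window
        // ISO-8859-1 maps every byte to one char, so decoding never fails.
        String head = new String(content, 0, limit, StandardCharsets.ISO_8859_1);
        Matcher matcher = META_CHARSET.matcher(head);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] html = "<html><head><meta charset=\"utf-8\"></head></html>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findMetaCharset(html));
    }
}
```

The label returned here is only a string. A caller would still need to check that the platform supports it, which matches the note above about unsupported encodings being treated as not found.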
-
sniffXmlEncoding
public static java.nio.charset.Charset sniffXmlEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content) throws java.io.IOException
Sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.
Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.
Parameters:
headers - the HTTP response headers sent back with the XML content to be sniffed
content - the XML content to be sniffed
Returns:
the encoding sniffed from the specified XML content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
java.io.IOException - if an IO error occurs
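The custom XML algorithm is not spelled out here. One plausible ingredient is reading the encoding pseudo-attribute of the XML declaration, which the sketch below does. The class XmlDeclSketch, the regular expression, and the 200-byte window are assumptions for illustration and are not taken from HtmlUnit.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not HtmlUnit's XML sniffing algorithm.
class XmlDeclSketch {

    // Matches encoding="..." or encoding='...' inside a leading <?xml ...?> declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    // Returns the declared encoding name, or null if there is no declaration.
    static String findDeclaredEncoding(byte[] content) {
        int limit = Math.min(content.length, 200); // assumed scan window
        String head = new String(content, 0, limit, StandardCharsets.ISO_8859_1);
        Matcher matcher = ENCODING.matcher(head);
        return matcher.find() ? matcher.group(2) : null;
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><root/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(findDeclaredEncoding(xml));
    }
}
```

The pattern is anchored at the start of the content and cannot cross a closing angle bracket, so an attribute named encoding on a later element is not mistaken for the declaration.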
-
sniffUnknownContentTypeEncoding
public static java.nio.charset.Charset sniffUnknownContentTypeEncoding(java.util.List<NameValuePair> headers, java.io.InputStream content) throws java.io.IOException
Sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.
Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.
Parameters:
headers - the HTTP response headers sent back with the content to be sniffed
content - the content to be sniffed
Returns:
the encoding sniffed from the specified content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
java.io.IOException - if an IO error occurs
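Byte Order Mark detection, mentioned above, can be illustrated with a short standalone sketch. The class BomSketch and the choice to recognize only the UTF-8, UTF-16BE, and UTF-16LE marks are assumptions for illustration; which marks HtmlUnit checks, and in what order relative to the headers, is not stated in this page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of org.htmlunit.util.EncodingSniffer.
class BomSketch {

    // Returns the charset implied by a leading Byte Order Mark, or null if none is present.
    static Charset detectBom(byte[] b) {
        if (b.length >= 3
                && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detectBom(utf8WithBom));
    }
}
```

One limitation of this sketch: a UTF-32LE mark begins with the same two bytes as the UTF-16LE mark, so it would be reported as UTF-16LE.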
-
sniffEncodingFromHttpHeaders
public static java.nio.charset.Charset sniffEncodingFromHttpHeaders(java.util.List<NameValuePair> headers)
Attempts to sniff an encoding from the specified HTTP headers.
Parameters:
headers - the HTTP headers to examine
Returns:
the encoding sniffed from the specified HTTP headers, or null if the encoding could not be determined
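Sniffing from headers typically means reading the charset parameter of the Content-Type header. The sketch below shows that idea without HtmlUnit: it uses java.util.Map.Entry as a stand-in for NameValuePair, and the class HeaderCharsetSketch and its parsing rules are assumptions for illustration, not the library's behavior.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; Map.Entry stands in for HtmlUnit's NameValuePair.
class HeaderCharsetSketch {

    // Returns the charset label from the first Content-Type header that has one, or null.
    static String charsetFromHeaders(List<Map.Entry<String, String>> headers) {
        for (Map.Entry<String, String> header : headers) {
            String value = header.getValue();
            if (value == null || !"Content-Type".equalsIgnoreCase(header.getKey())) {
                continue;
            }
            // Parameters follow the MIME type, separated by semicolons.
            for (String part : value.split(";")) {
                String param = part.trim();
                if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String label = param.substring("charset=".length()).trim();
                    // Strip optional surrounding double quotes.
                    if (label.length() >= 2 && label.startsWith("\"") && label.endsWith("\"")) {
                        label = label.substring(1, label.length() - 1);
                    }
                    if (!label.isEmpty()) {
                        return label;
                    }
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> headers =
                List.of(Map.entry("Content-Type", "text/html; charset=UTF-8"));
        System.out.println(charsetFromHeaders(headers));
    }
}
```

Header names are compared case-insensitively because HTTP header field names are case-insensitive.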
-
toCharset
public static java.nio.charset.Charset toCharset(java.lang.String charsetName)
Returns Charset if the specified charset name is supported on this platform.
Parameters:
charsetName - the charset name to check
Returns:
Charset if the specified charset name is supported on this platform
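A platform support check of this kind can be built on the standard java.nio.charset API, as in the sketch below. The class ToCharsetSketch and the method toCharsetOrNull are hypothetical names, and returning null for an unsupported or illegal name is an assumption: the description above only states the supported case.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical illustration only; not the source of EncodingSniffer.toCharset.
class ToCharsetSketch {

    // Returns the Charset for a supported name; returns null otherwise (assumed behavior).
    static Charset toCharsetOrNull(String charsetName) {
        if (charsetName == null || charsetName.isEmpty()) {
            return null;
        }
        try {
            return Charset.isSupported(charsetName) ? Charset.forName(charsetName) : null;
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(toCharsetOrNull("UTF-8"));
        System.out.println(toCharsetOrNull("no-such-charset-xyz"));
    }
}
```

Checking Charset.isSupported first avoids an UnsupportedCharsetException from Charset.forName, so only illegal names need a catch block.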
-
translateEncodingLabel
public static java.lang.String translateEncodingLabel(java.nio.charset.Charset encodingLabel)
Translates the given encoding label into a normalized form according to Reference.
Parameters:
encodingLabel - the label to translate
Returns:
the normalized encoding name, or null if not found
-