For faster navigation, this Iframe is preloading the Wikiwand page for Content sniffing.

Content sniffing

This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: "Content sniffing" – news · newspapers · books · scholar · JSTOR (January 2024) (Learn how and when to remove this message)

Content sniffing, also known as media type sniffing or MIME sniffing, is the practice of inspecting the content of a byte stream to attempt to deduce the file format of the data within it. Content sniffing is generally used to compensate for a lack of accurate metadata that would otherwise be required to enable the file to be interpreted correctly. Content sniffing techniques tend to use a mixture of techniques that rely on the redundancy found in most file formats: looking for file signatures and magic numbers, and heuristics including searching for well-known representative substrings, the use of byte frequency and n-gram tables, and Bayesian inference.

Multipurpose Internet Mail Extensions (MIME) sniffing was, and still is, used by some web browsers, including notably Microsoft's Internet Explorer, in an attempt to help web sites which do not correctly signal the MIME type of web content display.[1] However, doing this opens up a serious security vulnerability,[2] in which, by confusing the MIME sniffing algorithm, the browser can be manipulated into interpreting data in a way that allows an attacker to carry out operations that are not expected by either the site operator or user, such as cross-site scripting.[3] Moreover, by making sites which do not correctly assign MIME types to content appear to work correctly in those browsers, it fails to encourage the correct labeling of material, which in turn makes content sniffing necessary for these sites to work, creating a vicious circle of incompatibility with web standards and security best practices.

A specification exists for media type sniffing in HTML5, which attempts to balance the requirements of security with the need for reverse compatibility with web content with missing or incorrect MIME-type data. It attempts to provide a precise specification that can be used across implementations to implement a single well-defined and deterministic set of behaviors.[4]

The UNIX file command can be viewed as a content sniffing application.

Charset sniffing

[edit]

Numerous web browsers use a more limited form of content sniffing to attempt to determine the character encoding of text files for which the MIME type is already known. This technique is known as charset sniffing or codepage sniffing and, for certain encodings, may be used to bypass security restrictions too. For instance, Internet Explorer 7 may be tricked to run JScript in circumvention of its policy by allowing the browser to guess that an HTML-file was encoded in UTF-7.[5] This bug is worsened by the feature of the UTF-7 encoding which permits multiple encodings of the same text and, specifically, alternative representations of ASCII characters.

Most encodings do not allow evasive presentations of ASCII characters, so charset sniffing is less dangerous in general because, due to the historical accident of the ASCII-centric nature of scripting and markup languages, characters outside the ASCII repertoire are more difficult to use to circumvent security boundaries, and misinterpretations of character sets tend to produce results no worse than the display of mojibake.

See also

[edit]

References

[edit]
  1. ^ "MIME Type Detection in Windows Internet Explorer". Microsoft. Retrieved 2012-07-14.
  2. ^ Barth, Adam. "Secure Content Sniffing for Web Browsers, or How to Stop Papers from Reviewing Themselves" (PDF).
  3. ^ Henry Sudhof (11 February 2009). "Risky sniffing: MIME sniffing in Internet Explorer enables cross-site scripting attacks". The H. Retrieved 2012-07-14.
  4. ^ Adam Barth, Ian Hickson. "Mime Sniffing". WHATWG. Retrieved 2012-07-14.
  5. ^ "Event 1058 - Codepage Sniffing". Internet Explorer. MSDN. Retrieved 2012-07-14.
[edit]
{{bottomLinkPreText}} {{bottomLinkText}}
Content sniffing
Listen to this article

This browser is not supported by Wikiwand :(
Wikiwand requires a browser with modern capabilities in order to provide you with the best reading experience.
Please download and use one of the following browsers:

This article was just edited, click to reload
This article has been deleted on Wikipedia (Why?)

Back to homepage

Please click Add in the dialog above
Please click Allow in the top-left corner,
then click Install Now in the dialog
Please click Open in the download dialog,
then click Install
Please click the "Downloads" icon in the Safari toolbar, open the first download in the list,
then click Install
{{::$root.activation.text}}

Install Wikiwand

Install on Chrome Install on Firefox
Don't forget to rate us

Tell your friends about Wikiwand!

Gmail Facebook Twitter Link

Enjoying Wikiwand?

Tell your friends and spread the love:
Share on Gmail Share on Facebook Share on Twitter Share on Buffer

Our magic isn't perfect

You can help our automatic cover photo selection by reporting an unsuitable photo.

This photo is visually disturbing This photo is not a good choice

Thank you for helping!


Your input will affect cover photo selection, along with input from other users.

X

Get ready for Wikiwand 2.0 ๐ŸŽ‰! the new version arrives on September 1st! Don't want to wait?