"Bush hid the facts" (sometimes also "this app can break") is the common name for a bug in the IsTextUnicode function of Windows NT 3.5 and its successors, including Windows XP. The bug causes a text file encoded in Windows-1252 or a similar encoding to be interpreted as UTF-16LE by applications that use the function (such as Notepad), resulting in mojibake. When "bush hid the facts" is typed into a new Notepad document, and the document is saved, closed, and reopened, the characters "畢桳栠摩琠敨映捡獴" appear instead.

Although "Bush hid the facts" is the sentence most commonly presented on the Internet to induce the error, the bug is not specific to that phrase. It can be triggered by many sentences whose letters and spaces follow a particular pattern, such as (4-space-3-space-3-space-5), (4-space-5-space-3-space-5), and (1-space-4-space-3-space-3), as well as by other combinations that can be parsed into valid (if nonsensical) Chinese characters in Unicode.

The bug occurs when such an ANSI string, with no other characters, is passed to the Win32 charset detection function IsTextUnicode. Because of the bug, IsTextUnicode returns TRUE, so applications that rely on it incorrectly interpret the text as UTF-16LE. For example, if you load a text file containing such a string into a text editor that uses IsTextUnicode, the text is displayed as nine (4-3-3-5), ten (4-5-3-5), or seven (1-4-3-3) Chinese characters (one for every two bytes of the original text), or as rectangles if the language pack has not been installed. To retrieve the original text using Notepad, open the "Open a file" dialog box, select the file, select ANSI in the "Encoding" list box, and click Open.
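The misreading itself can be reproduced outside Windows. The Python sketch below is only an illustration: it does not call IsTextUnicode or imitate its detection heuristic. It simply decodes the Windows-1252 bytes as UTF-16LE, which is what an application does once the function has returned TRUE. The helper names misread_as_utf16le and recover are made up for this example.

```python
# Illustrative sketch: reproduces the byte reinterpretation only,
# not the heuristic inside IsTextUnicode that decides to apply it.

def misread_as_utf16le(text: str) -> str:
    """Encode text as Windows-1252, then decode the same bytes as UTF-16LE."""
    raw = text.encode("cp1252")
    if len(raw) % 2:
        # UTF-16LE consumes bytes in pairs, so an odd length cannot be decoded whole.
        raise ValueError("UTF-16LE needs an even number of bytes")
    return raw.decode("utf-16-le")


def recover(garbled: str) -> str:
    """Undo the misreading: turn the characters back into bytes, read as Windows-1252."""
    return garbled.encode("utf-16-le").decode("cp1252")


for phrase in ("bush hid the facts", "this app can break"):
    garbled = misread_as_utf16le(phrase)
    # Every two bytes of the original become one character.
    print(len(phrase), "bytes ->", len(garbled), "characters:", garbled)
    print("recovered:", recover(garbled))
```

Both phrases are 18 bytes long, so each comes out as nine characters. The recover step corresponds to choosing ANSI in Notepad's "Encoding" list box: the bytes on disk never changed, only the way they were read.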