
Windows Notepad has never been able to handle any newline convention other than the old DOS/Windows/CP/M one (CR-LF). Now, after so many decades, Microsoft has finally decided to give it “universal newline” capability, so it can handle lines ending in LF-only (Unix/Linux) and CR-only (old MacOS) <http://www.theregister.co.uk/2018/05/08/windows_notepad_unix_macos_line_endings/>.
Gee, I wonder how many lines of code that took...
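For a rough sense of scale, here is a minimal Python sketch of "universal newline" splitting, assuming the text is available as raw bytes. The function name `split_universal` is made up for illustration and says nothing about how Notepad actually implements it.

```python
import re

def split_universal(data: bytes) -> list:
    """Split on CR-LF, lone CR, or lone LF.

    CR-LF is listed first in the pattern so that it is consumed as a
    single terminator rather than as a CR followed by an LF.
    """
    lines = re.split(rb"\r\n|\r|\n", data)
    # A terminator at the very end of the data leaves an empty last item; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines

# Mixed DOS, Unix and old-Mac line endings in one buffer.
print(split_universal(b"dos\r\nunix\nmac\rlast\n"))
```

Python's own text-mode `open()` does the same job by default (`newline=None`), translating all three conventions to `\n` on read.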
The code wasn't the problem...

The levels of management to go through to approve changes to a core component of Windows, then the same for applying for budget, approving budget (each time several rounds) and then recruiting the right people to do the job (advertised worldwide, multiple rounds of interviews, visa problems to sort out etc).

Once the changes were implemented, testing was performed on randomly drawn user groups, compared, revisited, confirming that the code changes were working as expected and then signing the changes off before they could go into production.

Probably cost USD 1,500,000. Conservative estimate. ;-)

Cheers, Peter

-- 
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/

On Wed, 9 May 2018 09:58:44 +1200, Peter Reutemann wrote:
The code wasn't the problem... The levels of management to go through to approve changes to a core component of Windows, then the same for applying for budget, approving budget (each time several rounds) ... [etc etc]
Well, guess what: flushed with that previous success, those daring folks at Microsoft have gone even further <https://arstechnica.com/gadgets/2018/12/latest-windows-insider-build-makes-a-major-upgrade-to-uh-notepad/>:

    The new and improved Notepad now has better Unicode support, defaulting to saving files as UTF-8 _without_ a Byte Order Mark ...

You may or may not know, but “Byte Order Mark” is Unicode character U+FEFF, while the code with the bytes swapped, U+FFFE, is a “noncharacter”: it is not assigned to anything, and will forever remain so. The usefulness of this pair dates back to the era when Unicode was only 16 bits, so what is now “UTF-16” encoding was equivalent to fixed-length “UCS-2” encoding.

You may also know about the “big-endian” versus “little-endian” issue between different processor architectures. So text encoded in UCS-2 or UTF-16 is supposed to begin with a Byte Order Mark, and any program reading that text can check that the first character is indeed U+FEFF. If it sees U+FFFE instead, then it knows that the encoding comes from a machine with the opposite endianness, and can automatically apply a corresponding byte-swap adjustment to the text.

Since UCS-2 is no longer sufficient to represent current versions of Unicode, and UTF-16 is a pain to deal with, UTF-8 is considered a much superior encoding. Furthermore, its definition is endianness-independent, so software running on different architectures always agrees about how the bytes are ordered.

However, Microsoft in their wisdom had decided that their version of UTF-8 text should still begin with a Byte Order Mark (UTF-8-encoded, of course). This is completely pointless, and ends up introducing a garbage character at the start when the text is read by non-Windows software.
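To make the mechanics concrete, here is a small Python sketch of BOM sniffing, under the assumption that only UTF-8 and UTF-16 matter (the UTF-32 BOMs are deliberately ignored, even though the little-endian one begins with the same two bytes as UTF-16-LE). The helper name `sniff_bom` is hypothetical; the `codecs.BOM_*` constants and the `utf-8-sig` codec are part of the standard library.

```python
import codecs

def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8", 3
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF: U+FEFF in big-endian order
        return "utf-16-be", 2
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE: the same character, bytes swapped
        return "utf-16-le", 2
    return None, 0

# UTF-16 text written on a little-endian machine, with a leading BOM.
raw = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
encoding, skip = sniff_bom(raw)
print(encoding, raw[skip:].decode(encoding))

# The "garbage character": a plain UTF-8 decode keeps the BOM as U+FEFF,
# whereas the BOM-aware "utf-8-sig" codec strips it.
with_bom = codecs.BOM_UTF8 + "hi".encode("utf-8")
print(repr(with_bom.decode("utf-8")))
print(repr(with_bom.decode("utf-8-sig")))
```

Software that does a plain UTF-8 decode, like the first of the last two calls, is what ends up with the stray U+FEFF at the start of the text.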
Participants (2): Lawrence D'Oliveiro, Peter Reutemann