Endian Wars: Linus Has Spoken

The Rust language continues to make progress toward use in the Linux kernel <https://www.theregister.com/2021/07/05/rust_for_linux_kernel_project/>. The article ends with a comment from Linus Torvalds about the old “endian wars”, which have been a feature of the computing landscape since byte addressability became common: should the bytes of a number be arranged in big-endian order (most significant end at the lowest address) or little-endian order (least significant end at the lowest address)? In the latest patches, Ojeda proposed using big-endian order for the lengths of long symbols in the kernel symbol table, needed because Rust symbols "can become quite long due to namespacing introduced by modules, types, traits, generics." Torvalds responded: "Why is this in big-endian order? Let's just try to kill big-endian data, it's disgusting and should just die already... networking has legacy reasons from the bad old days when byte order wars were still a thing, but those days are gone." Having used both kinds of computer architectures over the years, I agree. Little-endian ordering is more mathematically consistent.
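To check which layout a given machine uses, a few lines of C suffice. This is just a minimal sketch of mine (the test value is arbitrary, nothing kernel-specific); it prints the bytes of a 32-bit integer starting from the lowest address:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t x = 0x0A0B0C0D;   /* 0x0A is the most significant byte */
        const unsigned char *p = (const unsigned char *)&x;

        /* A little-endian machine prints 0d 0c 0b 0a (least significant
           end at the lowest address); a big-endian one prints 0a 0b 0c 0d. */
        for (size_t i = 0; i < sizeof x; i++)
            printf("address +%zu: %02x\n", i, p[i]);
        return 0;
    }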

On Tue, Jul 06, 2021 at 01:00:15PM +1200, Lawrence D'Oliveiro wrote:
Having used both kinds of computer architectures over the years, I agree. Little-endian ordering is more mathematically consistent.
What do you mean? They are both mathematically consistent. If one was not mathematically consistent it would not have even been used. Little-endian does have this advantage: if you place different-length words at the same memory address, the low bytes of the shorter word land at the same address as the low bytes of the longer word. This does not happen with big-endian. So, for example, if you extend a 16-bit word at an address to a 32-bit word at the same address, then on a big-endian system you have to shift the bytes that comprise the 16-bit word to different memory addresses to construct the 32-bit word. That is not necessary on a little-endian system. Cheers, Michael.
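PS: a minimal C sketch of the same-address point (my illustration; memcpy is used to sidestep aliasing rules, and the commented values cover both layouts):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        uint32_t wide = 0x12345678;
        uint16_t narrow;

        /* Read a 16-bit word starting at the same address as the 32-bit word. */
        memcpy(&narrow, &wide, sizeof narrow);

        /* Little-endian: narrow == 0x5678, the low half, already in place.
           Big-endian: narrow == 0x1234, the high half, so narrowing or
           widening in place means shuffling bytes between addresses. */
        printf("narrow = 0x%04x\n", (unsigned)narrow);
        return 0;
    }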

On Tue, 6 Jul 2021 14:05:49 +1200, Michael Cree wrote:
On Tue, Jul 06, 2021 at 01:00:15PM +1200, Lawrence D'Oliveiro wrote:
Having used both kinds of computer architectures over the years, I agree. Little-endian ordering is more mathematically consistent.
What do you mean? They are both mathematically consistent. If one was not mathematically consistent it would not have even been used.
Big-endian was mainly used because it was easier to get data dumps to match reading order, which might have been helpful for debugging.

Consider the numbers assigned to three different things in a multibyte object: the bytes within the object, the bits within the byte/object, and the weightings of the bits as digits in a binary integer (bit 0 represents 2**0, bit 1 represents 2**1, etc.). Only in little-endian do you have a straightforward relationship among all three: in big-endian layouts, it is unavoidable that at least one of the three ends up the opposite way to the others.

And different big-endian architectures make different decisions about the ordering, sometimes even within different generations of the same architecture. For example, in the original Motorola 68000 instruction set, the single-bit insertion/extraction instructions numbered the bits the same way as the digit weightings; but with the multibit field insertion/extraction instructions added in the 32-bit 68020 and later processors, the numbering was the opposite way! In the IBM POWER architecture, “bit 0” was the designation for the most significant bit of an integer, not the least significant bit.
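To spell out that straightforward relationship: on a little-endian machine, byte n of an integer holds exactly bits 8n through 8n+7, i.e. it carries weight 2**(8n). A quick C check (my own sketch):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void)
    {
        uint32_t x = 0xAABBCCDD;
        unsigned char b[sizeof x];

        memcpy(b, &x, sizeof x);

        /* On a little-endian machine every line matches: stored byte n
           equals (x >> (8*n)) & 0xff, so byte numbers, bit numbers and
           digit weightings all run the same way. On big-endian they don't. */
        for (unsigned n = 0; n < sizeof x; n++)
            printf("byte %u: stored %02x, computed %02x\n",
                   n, b[n], (unsigned)((x >> (8 * n)) & 0xff));
        return 0;
    }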
Little-endian does have this advantage: if you place different-length words at the same memory address, the low bytes of the shorter word land at the same address as the low bytes of the longer word. This does not happen with big-endian.
Even in big-endian architectures, registers tend to behave as though they are little-endian. For example, consider whatever the instruction-set-specific equivalent of this sequence might be:

    move-32-bits A to B
    move-8-bits B to C

where either B is a register and A and C are in main memory (so the first move is a load and the second one is a store), or the other way round. Does C end up with the most or least significant 8 bits of the 32-bit quantity from A?
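In C the experiment looks like this (a sketch of mine; the language's conversion rules mirror what the narrow register move does on real hardware):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t a = 0x11223344;    /* A */
        uint32_t b = a;             /* move-32-bits A to B */
        uint8_t  c = (uint8_t)b;    /* move-8-bits B to C  */

        /* Prints c = 0x44 on big- and little-endian machines alike: the
           narrow move takes the least significant 8 bits, which is the
           sense in which registers behave little-endian. */
        printf("c = 0x%02x\n", c);
        return 0;
    }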

On Tue, Jul 06, 2021 at 02:29:51PM +1200, Lawrence D'Oliveiro wrote:
On Tue, 6 Jul 2021 14:05:49 +1200, Michael Cree wrote:
On Tue, Jul 06, 2021 at 01:00:15PM +1200, Lawrence D'Oliveiro wrote:
Having used both kinds of computer architectures over the years, I agree. Little-endian ordering is more mathematically consistent.
What do you mean? They are both mathematically consistent. If one was not mathematically consistent it would not have even been used.
Big-endian was mainly used because it was easier to get data dumps to match reading order, which might have been helpful for debugging.
Consider the numbers assigned to three different things in a multibyte object: the bytes within the object, the bits within the byte/object, and the weightings of the bits as digits in a binary integer (bit 0 represents 2**0, bit 1 represents 2**1, etc.). Only in little-endian do you have a straightforward relationship among all three: in big-endian layouts, it is unavoidable that at least one of the three ends up the opposite way to the others.
Which is what I was explaining in the response I gave, and which you snipped from your reply. I agree that little-endian is much more sensible when extending or truncating integer numbers. My point is that the above, and the other things you mentioned further on in your email, do not amount to mathematical inconsistency. Indeed, I would even query the claim that anything can be "more" mathematically consistent, because it is either mathematically consistent or it is not. Cheers, Michael.

What makes me laugh about this is how the US still clings to the "middle-endian" mm/dd/yyyy date format, the cause of way too many software bugs; a quick illustration follows below the quote. ;)
On Tue, 6 Jul 2021 at 16:26, Lawrence D'Oliveiro <ldo(a)geek-central.gen.nz> wrote:
On Tue, 6 Jul 2021 16:11:08 +1200, Michael Cree wrote:
Indeed, I would even query the claim that anything can be "more" mathematically consistent, because it is either mathematically consistent or it is not.
Unless the issue is undecidable ...
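As promised above, a minimal sketch of the mm/dd/yyyy trap, using POSIX strptime (the date is this thread's own, and the commented readings assume the parses succeed):

    #define _XOPEN_SOURCE            /* for strptime on glibc */
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        const char *s = "07/06/2021";   /* ambiguous on its own */
        struct tm us = {0}, dmy = {0};

        strptime(s, "%m/%d/%Y", &us);   /* US reading: July 6th  */
        strptime(s, "%d/%m/%Y", &dmy);  /* D/M reading: June 7th */

        printf("US:  %04d-%02d-%02d\n",
               us.tm_year + 1900, us.tm_mon + 1, us.tm_mday);
        printf("D/M: %04d-%02d-%02d\n",
               dmy.tm_year + 1900, dmy.tm_mon + 1, dmy.tm_mday);
        return 0;
    }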

On Tue, Jul 06, 2021 at 04:26:29PM +1200, Lawrence D'Oliveiro wrote:
On Tue, 6 Jul 2021 16:11:08 +1200, Michael Cree wrote:
Indeed, I would even query the claim that anything can be "more" mathematically consistent, because it is either mathematically consistent or it is not.
Unless the issue is undecidable ...
Even if it is undecidable, that does not imply that it is neither consistent nor inconsistent. It just means we cannot prove which it is. I do find it fascinating that mathematics demonstrates such a limitation on knowledge! Cheers, Michael.
participants (3)
- David McNab
- Lawrence D'Oliveiro
- Michael Cree