
Glenn Enright wrote:
Can anyone help with the relation between the assembler instruction 'movsl' and alignment? All I've been able to find so far is a lot of general stuff in kernel patches about bulk memory moves.
movsl is the assembler instruction for "move string (long)". It copies 4 bytes from DS:ESI to ES:EDI (the "l" suffix is AT&T syntax for a 32-bit long; Intel syntax calls the same instruction MOVSD), and then increments ESI and EDI by the number of bytes copied (or decrements them if the direction flag is set). It's usually used with the "REP" prefix, which repeats the instruction, decrementing ECX each iteration, until ECX is zero. So you can say "REP MOVSL" to copy lots of memory in one instruction. Ahh, the wonderful bounties of a CISC world.

Now, Intel machines don't really care too much about alignment; ordinary loads and stores work at any address without issue. However, the various buses inside your PC usually don't work byte by byte, but in multiples of a byte; 4 to 16 bytes are common. When you read memory from an address, the bus reads the surrounding bytes and returns them too, and they get thrown away if they are unused.

So, say I fetch a byte at address 0x02 over a 32-bit (4-byte) bus. The bus will return the memory at addresses 0x00, 0x01, 0x02 and 0x03 at the same time, and everything except the byte at 0x02 gets thrown away.

Now, if I fetch 4 bytes over a 32-bit bus starting at address 0x02, I'll get two bus transactions, one with "0x00, 0x01, 0x02 and 0x03" and one with "0x04, 0x05, 0x06 and 0x07". Then it'll throw away 0x00, 0x01, 0x06 and 0x07. This is obviously a waste, so if you are reading quantities larger than one byte, you generally try to align them to a multiple of the size you are reading; otherwise you can waste up to half your bus bandwidth.

So, short answer: align memory. It's faster that way on Intel machines.
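To put numbers on the bus example above, here is a rough sketch in C. It only models the simplified picture described here: a bus that always transfers bus_width bytes from a bus_width-aligned address. The function names bus_transactions and wasted_bytes are made up for illustration, and real hardware (caches, write combining and so on) is more complicated than this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many bus transactions it takes to read `size`
 * bytes starting at `addr`, assuming the bus always moves `bus_width`
 * bytes at a time from a bus_width-aligned address. */
unsigned bus_transactions(uint32_t addr, uint32_t size, uint32_t bus_width)
{
    assert(size >= 1 && bus_width >= 1);

    uint32_t first = addr / bus_width;              /* bus word holding the first byte */
    uint32_t last  = (addr + size - 1) / bus_width; /* bus word holding the last byte */

    return last - first + 1;
}

/* Hypothetical helper: bytes that cross the bus but are thrown away. */
unsigned wasted_bytes(uint32_t addr, uint32_t size, uint32_t bus_width)
{
    return bus_transactions(addr, size, bus_width) * bus_width - size;
}
```

With a 4-byte bus, bus_transactions(0x02, 4, 4) gives 2 and wasted_bytes(0x02, 4, 4) gives 4, which is the misaligned case above. The aligned read, bus_transactions(0x00, 4, 4), gives 1 with nothing wasted.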