Intel chips have always been slow at certain things; they've just made
up for it with clock speed. A memorable example of this was the LOOP
opcode on Pentium chips. It was very slow on Intel machines, so
programs used it as a timing delay. On AMD chips it was extremely fast
(1 cycle IIRC), so those timing loops effectively became no-ops, causing
lots of programs to fail in amusing and/or spectacular fashion.
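
To put rough numbers on why a fixed-count delay loop breaks, here's a quick
sketch in Python. The cycle counts and clock speeds are made-up illustrative
figures, not measurements of any real chip, and loop_delay_us / delay_us are
just hypothetical helper names. The first function works out how long a
fixed number of iterations actually takes; the second is the kind of delay
that waits on a clock instead of counting instructions.

```python
import time


def loop_delay_us(iterations: int, cycles_per_iter: float, clock_mhz: float) -> float:
    """Wall-clock time, in microseconds, of a fixed-count busy loop."""
    # cycles / (cycles per microsecond) = microseconds
    return iterations * cycles_per_iter / clock_mhz


# Assumed figures for illustration only (not real chip data).
ITERATIONS = 100_000
slow_chip_us = loop_delay_us(ITERATIONS, cycles_per_iter=6, clock_mhz=200)
fast_chip_us = loop_delay_us(ITERATIONS, cycles_per_iter=1, clock_mhz=400)


def delay_us(target_us: float) -> None:
    """Busy-wait against a real clock, so the delay does not depend on
    how many cycles any particular instruction takes."""
    deadline = time.perf_counter() + target_us / 1_000_000
    while time.perf_counter() < deadline:
        pass


if __name__ == "__main__":
    print(f"same loop, slow chip: {slow_chip_us:.0f} us")
    print(f"same loop, fast chip: {fast_chip_us:.0f} us")
```

With those assumed figures the exact same loop count gives a delay twelve
times shorter on the second chip, which is the sort of gap that turns a
"long enough" wait into basically nothing.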
lol, there's good reason not to use things for purposes other than intended
sometimes :) Would've loved to see the expressions on the faces of the guys at
Intel as they came across stuff like that. Still, every choice has its
tradeoffs.