Hi All,
I find this fascinating, and I'll profile as soon as I can. I have two MacBook Pros, a 2018 2.9GHz Intel Core i9, and a 2021 Apple M1 Max. On large loads the latter is about 20% faster than the former. I know this because Tim found a bug in the become primitive with jitted methods that manifested on Linux ARMv8 (the DUAL_MAPPED_CODE_ZONE regime). Simulating the recompilation of all methods in the system that Tim's example did took about 5 hours on the x86_64 and only 4 on the M1 Max. But if I compare the JIT benchmarks on them the speeds swing wildly the other way:
These are the times to JIT all of CompiledCode's methods (excluding subclasses):

Apple M1:
    min       max    average
  2.417    79.257      9.674   usecs to JIT
  2        20          5.938   number of literals
  1       106         17.464   number of bytecodes

Apple 2.9GHz Core i9:
    min       max    average
  0.705    32.497      4.256   usecs to JIT
  2        20          5.938   number of literals
  1       106         17.464   number of bytecodes
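To put the two sets of figures side by side, here is a small sketch in Python that only does arithmetic on the numbers quoted above. Nothing in it is newly measured, and the variable and function names are illustrative, not anything from the VM.

```python
# Figures copied from the message above; nothing here is measured anew.
hours_x86_64 = 5.0  # simulated recompilation of all methods, 2018 Core i9
hours_m1_max = 4.0  # same simulation, 2021 M1 Max

# (min, max, average) usecs to JIT one method of CompiledCode
jit_usecs_m1 = (2.417, 79.257, 9.674)
jit_usecs_i9 = (0.705, 32.497, 4.256)


def time_saved_percent(slow, fast):
    """Percentage less time `fast` takes relative to `slow`."""
    return (slow - fast) / slow * 100.0


def slowdown_factor(slow, fast):
    """How many times longer `slow` takes than `fast`."""
    return slow / fast


large_load_saving = time_saved_percent(hours_x86_64, hours_m1_max)
jit_slowdowns = [slowdown_factor(m1, i9)
                 for m1, i9 in zip(jit_usecs_m1, jit_usecs_i9)]

print(f"Large load: M1 Max takes {large_load_saving:.0f}% less time")
for label, factor in zip(("min", "max", "average"), jit_slowdowns):
    print(f"JIT {label}: M1 takes {factor:.2f}x as long as the Core i9")
```

So the large load runs in about a fifth less time on the M1 Max, while JITting a method there takes a bit over twice as long on average.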
So there must be some really inefficient thing that the Apple ARMv8 JIT is doing to make it so slow. Interesting :-)
_,,,^..^,,,_ best, Eliot
Well, obviously it's doing the job once for each mapped zone :-)
On 2023-01-16, at 1:01 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Hypnotized as a child and couldn't be woken.
vm-dev@lists.squeakfoundation.org