Great! Now I understand why my predominantly Fortran code with some small C additions ends up with the Fortran part built for arm64 and the C part for x86_64. Without your hint I would never have figured it out.
Regarding
# I don't see how to find this programmatically :-(
XCODE_ROOT=
You can get this value with `xcrun --show-sdk-path`, optionally including `--sdk macosx`
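As a minimal sketch of that suggestion, the assignment could be filled in from `xcrun` instead of being left blank. The fallback to an empty value when `xcrun` is missing (for example, on a non-macOS host) is an assumption added here for illustration, not part of the original script.

```shell
# Ask xcrun for the macOS SDK path. If xcrun is not installed, or the lookup
# fails, leave XCODE_ROOT empty (an assumed fallback) rather than aborting.
if command -v xcrun >/dev/null 2>&1; then
    XCODE_ROOT="$(xcrun --sdk macosx --show-sdk-path 2>/dev/null || true)"
else
    XCODE_ROOT=""
fi
export XCODE_ROOT
echo "XCODE_ROOT=${XCODE_ROOT}"
```

Dropping `--sdk macosx` makes `xcrun` report the default SDK instead, which is usually the same path on a machine that only builds for macOS.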
A nice article. I really like the AArch64 design and would be interested to know how good the SIMD support is on the M1.
If you look in the next blog, you can see the level of support that is promised (Neon), so no SVE yet. The Anandtech article referenced in my comment there also discusses the microarchitecture (including the FP pipes).
My interest is more in data movement than FP, because most machines are over-provisioned with FP but not with memory bandwidth, and they have long latencies for inter-cache data movement (which affects things like fork/join barriers and locks).
Very true that there is never enough memory bandwidth. Atomics tend to have lower latency on Arm, which is a plus, possibly because there is less legacy commitment in the design. My interest in SIMD is mostly integer for parsing and data compression.