Setting up the Apple M1 for Native Code Development from the Command Line
Finding the right PATH
Introduction
This is the first of a few posts on my experiences using a Mac mini/M1 for software development. Since I am an oldie, I am doing this from the command line. If you are using a GUI, that may make things easier or (at least to me) more confusing!
This post covers the apparently simple task of having your environment set up in a way that targets the M1 sanely.
Later posts will cover some “gotchas” in the Arm and emulated x86_64 environments.
The M1 Environment
The Apple M1 is an Apple implementation of an Arm 64-bit (AArch64) architecture processor. To allow machines built with this processor to run code which was built for the x86_64 architecture, MacOS (in the “Big Sur” and later releases) supplies an x86_64 binary translation emulator (“Rosetta-2”). The presence of this emulator means that x86_64 native executables can run on this machine without any change. That is a good thing since it means we have a useful environment without changing or recompiling any code, but a bad one in terms of potential confusion.
To allow a single executable image (or runtime library) to run natively on either architecture, the OS (and associated tools) support “universal-binaries”; thus a single file can contain both the code required for the AArch64 architecture and also that required for the x86_64 one. In general, a “fat binary” like this could support many more architectures, but the case of interest here is just these two.
Apple utilities, such as the compilers that come with the Xcode command line tools, are distributed in this way, so a single executable image found in your $PATH may contain both versions of the tool.
Aside from that (which we’ll see below can be confusing!) the environment is the normal, MacOS “Big Sur” one.
What Works?
At first glance, everything. The installation of Aquamacs runs; bash, python, make and cmake are all there, so things look great.
But…
Things get confusing if you run a shell from inside emacs (as I do, since it provides a stable development environment on all of your machines, accessible without needing to forward a GUI, and my muscle-memory knows the editing keystrokes).
Let’s check the compiler we see in that environment :-
$ which clang
/Library/Developer/CommandLineTools/usr/bin/clang
$ clang -v
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin20.3.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Hmm, we’re getting the compiler from the XCode tools, but it’s compiling for x86_64, not the AArch64 that we wanted.
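If you want to make this check from a script rather than by eye, one option is to pull the architecture out of the Target: line. This is only a minimal sketch, assuming the clang -v output format shown above; target_arch is a helper name I have invented for illustration, not part of any tool.

```shell
# Hypothetical helper: read `clang -v` output on stdin and print the
# architecture part of the "Target:" triple (the text before the first '-').
target_arch() {
    sed -n 's/^Target:[[:space:]]*\([^-]*\)-.*/\1/p'
}

# Typical use (clang -v writes its report to stderr, hence the redirection):
#   clang -v 2>&1 | target_arch
```

In the emulated shell above this would print x86_64; in a native shell one would expect arm64.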
What Is Going On?
The arch command gives us a little hint. If we run it, it tells us the environment in which it is running :-
$ arch
i386
So our shell (executing inside emacs) is running in emulation. The “i386” is somewhat misleading; it does mean x86_64! [1]
If we do the same thing in a shell inside a terminal window started from Finder, we see this :-
$ arch
arm64
So, here our shell is running natively.
But, Why Does This Matter?
To make things work more easily for the x86_64 emulated environment, the standard behaviour of MacOS is to maintain the existing process’ architecture at an exec call if possible. Thus, if our shell is running in x86_64 emulation, when we start the compiler (which is a universal binary, remember), the embedded x86_64 executable will be executed. And that version of the compiler defaults to compiling for the x86_64 target.
So… because we used emacs (which is not yet native), we end up with an x86_64 compiler.
Totally obvious, right!?
But, before you blame emacs, consider that other tools may also not yet be available in AArch64 native form, and that even when they are, you need to set up your environment to ensure that you find the right one. Consider cmake, for instance; if you are using brew and haven’t installed the AArch64 native version and changed your paths, you’ll be getting the cmake from /usr/local/bin, which is an x86_64 image. It will therefore see that as the default target environment and likely configure the build it is setting up to use that architecture.
What Can We Do?
There are a number of things that we can do that help.
Ensure that our shell always executes as a native, AArch64 shell
We can achieve that by checking the architecture in which the shell is running in our shell startup script (.zshenv or .bashrc, or whatever is appropriate for your shell), and then exec-ing another shell inside an arch command to switch to the AArch64 environment.
Something like this from my .bashrc :-
# Switch to a native (arm64e) shell by default.
# machine prints arm64e when running natively on the M1.
if [ "$(machine)" != arm64e ]; then
    echo 'Execing arm64 shell'
    exec arch -arm64 bash
fi
Once that is in place, the shell executed inside emacs will be running in the AArch64 environment even though emacs isn’t. So now the compiler we get will be the one for the real machine.
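A slightly more defensive variant of the same hack is sketched below. It assumes an Apple Silicon machine where the machine and arch commands behave as shown in this post; the function names and the ARM64_REEXEC_TRIED guard variable are my own inventions for illustration. The guard is there so that, if the re-exec does not land us in a native shell for some reason, we try only once instead of looping.

```shell
# Hypothetical helper: decide whether a re-exec is wanted.
#   $1 = output of the machine command
#   $2 = value of the guard variable (empty if we have not tried yet)
# Returns 0 (true) only if we are not native and have not already tried.
needs_arm64_reexec() {
    case "$1" in
        ""|arm64|arm64e) return 1 ;;  # unknown (no machine command) or already native
    esac
    [ -z "$2" ]                       # only try once, to avoid an exec loop
}

# Intended to be called from .bashrc. ARM64_REEXEC_TRIED is an invented name.
reexec_as_arm64() {
    if needs_arm64_reexec "$(machine 2>/dev/null)" "${ARM64_REEXEC_TRIED:-}"; then
        export ARM64_REEXEC_TRIED=1
        exec arch -arm64 bash
    fi
}
```

Calling reexec_as_arm64 near the top of .bashrc would then replace the bare if block. Splitting the decision out into its own function just makes it easy to check the logic without actually exec-ing anything.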
Ensure that our PATH points to AArch64 images
This is important once we install other tools with brew. If we use it to install compilers (for instance if we want more cutting-edge LLVM compilers, or support for OpenMP, which is not fully enabled in the Apple compilers), then we may have three different versions of the clang command:
A version in the Apple command line tools installed somewhere like /Library/Developer/CommandLineTools/usr/bin/. This compiler is a universal binary which will choose its default target based on the properties of the process from which it was invoked.
A version in the x86_64 brew environment installed by default somewhere like /usr/local/Cellar/llvm/11.0.0_1/bin/. This compiler is an x86_64 binary targeting x86_64.
A version in the AArch64 brew environment installed by default somewhere like /opt/homebrew/Cellar/llvm/11.0.1/bin/. This compiler is an AArch64 executable targeting AArch64.
If you copied your existing environment from another, older, MacOS machine, then it will certainly not be pointing at the AArch64 brew environment. That older, x86_64, installation will continue to work, but you won’t be exploiting your new machine to its full extent.
Note too, that if the brew (or other installation systems like anaconda) directories are searched before the Xcode ones, you may be running x86_64 versions of tools which are available as universal binaries (so can run natively) in Xcode. For instance, Xcode provides python3 as a universal binary, while anaconda may not yet be doing that.
$ lipo -info /Library/Developer/CommandLineTools/usr/bin/python3
Architectures in the fat file: /Library/Developer/CommandLineTools/usr/bin/python3 are: x86_64 arm64
$ /Library/Developer/CommandLineTools/usr/bin/python3 --version
Python 3.8.2
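To see which of these installations each copy of a tool on your PATH probably comes from, something like the following can help. It is a rough sketch only: it guesses purely from the default install locations mentioned above (other software also installs into /usr/local), and both function names are hypothetical.

```shell
# Hypothetical helper: guess which installation a tool path belongs to,
# based only on the default locations described in this post.
classify_tool_path() {
    case "$1" in
        /opt/homebrew/*)                          echo "arm64 brew" ;;
        /usr/local/*)                             echo "x86_64 brew" ;;
        /Library/Developer/CommandLineTools/*)    echo "Apple command line tools" ;;
        /Applications/Xcode.app/*)                echo "Apple Xcode" ;;
        *)                                        echo "unknown" ;;
    esac
}

# List every copy of a command found on PATH, in search order, with a guess
# at its origin. (type -ap is bash-specific.)
list_tool_origins() {
    type -ap "$1" 2>/dev/null | while IFS= read -r p; do
        printf '%s\t%s\n' "$p" "$(classify_tool_path "$p")"
    done
}
```

For example, list_tool_origins clang shows the order in which the different compilers would be found. The guess is no substitute for running file or lipo -info on the result.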
Useful Commands
I’m not going to show the man pages for these (and my low Google-fu failed to find good MacOS man pages online, so I won’t give you any links). However, it’s worth knowing about these commands, so that you can then run man on them yourself locally.
When you’re trying to understand what went wrong, these can be useful!
arch
The arch command can be used both to see what the current default execution environment is, and to invoke a command for a specific architecture.
$ arch # Show the architecture
i386
$ arch -arch arm64 machine # Run the machine command in arm64
arm64e
$ arch -arch x86_64 machine # Run the machine command in x86_64
i486
You can also set a default machine preference for the arch command using the ARCHPREFERENCE environment variable. If arch is being used to invoke another command and no architecture is explicitly requested, then the one in $ARCHPREFERENCE will be used.
$ arch
arm64
$ ARCHPREFERENCE=x86_64 arch
arm64
$ ARCHPREFERENCE=x86_64 arch machine
i486
Note, though, that this is not changing any global default, merely affecting what the arch command does by default. So execution which isn’t mediated by arch is not affected.
machine
The machine command simply prints out the architecture on which it is running. It gives a slightly saner answer than arch does! (Though it still thinks x86_64 is i486, as we can see above.) Note, though, that although this returns arm64e on the M1, the target architecture that one should normally use is arm64.
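Since arch and machine each have their own idea of what the architectures are called, a script that wants a name it can pass on to a compiler may need to translate. Here is a small sketch of that mapping, using only the names seen in this post; normalize_arch is a hypothetical helper, and treating i386 as x86_64 is only reasonable in this specific context (output from arch or machine on an M1).

```shell
# Hypothetical helper: map the names printed by arch/machine on an M1
# to the architecture name one would normally target.
normalize_arch() {
    case "$1" in
        arm64|arm64e)      echo arm64 ;;
        i386|i486|x86_64)  echo x86_64 ;;
        *)                 echo "$1" ;;    # anything else is passed through unchanged
    esac
}

# Typical use:
#   normalize_arch "$(machine)"
```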
brew
Homebrew now has support for the AArch64 architecture and will install useful packages for software development. However, you will need to change your PATH to use them, since it does not install them in /usr/local by default (which is retained for the x86_64 binaries) but rather in /opt/homebrew. See the Homebrew 3.0.0 release notes.
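One way to make that PATH change in a startup script is sketched below. It assumes the default /opt/homebrew prefix mentioned above; path_prepend is a hypothetical helper (bash syntax) which puts a directory at the front of PATH and removes any other copies of it, so that re-sourcing the startup script does not keep growing PATH.

```shell
# Hypothetical helper: put a directory at the front of PATH, removing any
# existing occurrences so it is not listed twice.
path_prepend() {
    local dir="$1" rest=":$PATH:"
    # Repeat the substitution so that adjacent duplicates are also removed.
    while [[ "$rest" == *":$dir:"* ]]; do
        rest="${rest//:"$dir":/:}"
    done
    rest="${rest#:}"
    rest="${rest%:}"
    PATH="$dir${rest:+:$rest}"
}

# Typical use, assuming the default AArch64 Homebrew location:
#   path_prepend /opt/homebrew/bin
#   export PATH
```

Homebrew can also print suitable settings itself, so eval "$(/opt/homebrew/bin/brew shellenv)" is an alternative if you prefer to let it manage this.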
file
As you no doubt know already, file is the command one normally uses to see what the semantic properties of a file are. In the MacOS environment, where universal binaries exist, it can tell us about them in a slightly more verbose manner than lipo’s -info option.
$ file `which bash`
/bin/bash: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/bin/bash (for architecture x86_64): Mach-O 64-bit executable x86_64
/bin/bash (for architecture arm64e): Mach-O 64-bit executable arm64e
lipo
lipo is the command line tool which is used to manipulate universal binaries. It can show information about the contents, e.g.
$ lipo -info `which bash`
Architectures in the fat file: /bin/bash are: x86_64 arm64e
However, in addition, it can be used to extract the architecture-specific parts of the file, or to build a universal binary from existing, single-architecture binaries. Thus, if you want to build your own universal binary you will probably need lipo.
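As a rough illustration of both uses, here is a sketch which checks the lipo -info output for a given architecture, with the usual create and extract invocations shown as comments. It assumes the -info output format shown above; lipo_info_has_arch and the hello file names are hypothetical.

```shell
# Hypothetical helper: given one line of `lipo -info` output on stdin,
# succeed if the named architecture is listed in it.
lipo_info_has_arch() {
    local want="$1" line archs a
    IFS= read -r line
    [ -n "$line" ] || return 1
    archs="${line##*: }"            # keep the text after the last ": "
    for a in $archs; do             # split the list on whitespace (bash)
        [ "$a" = "$want" ] && return 0
    done
    return 1
}

# Typical use:
#   lipo -info /bin/bash | lipo_info_has_arch arm64e && echo "has arm64e"
#
# Building and splitting universal binaries (hello.* are made-up file names):
#   lipo -create hello.x86_64 hello.arm64 -output hello   # combine two images
#   lipo hello -thin arm64 -output hello.arm64            # extract one image
```

Note that the match is on the exact name, so asking for arm64 will not match an image that lipo reports as arm64e.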
xcode-select
The xcode-select command can be used to install the Xcode command line tools
$ xcode-select --install
or show where the XCode command line tools are installed.
$ xcode-select -p
/Applications/Xcode.app/Contents/Developer
However, the man page suggests that one should use xcrun to find tools inside a script.
xcrun
xcrun also provides information about the Xcode environment. It is the recommended way to find Xcode tools when configuring other things, as I do here in my script to configure LLVM from inside a build directory inside the LLVM root directory.
# Xcode, Ninja, Make as you prefer.
BUILD_SYSTEM=Ninja
# Quote the tr character classes so that the shell cannot expand them as globs.
BUILD_TAG=$(echo "$BUILD_SYSTEM" | tr '[:upper:]' '[:lower:]')
INSTALLDIR=${HOME}/software/clang-12.0.0/arm64
# I don't see how to find this programmatically :-(
XCODE_ROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk
cmake ../llvm \
-G${BUILD_SYSTEM} -B ${BUILD_TAG}_build \
-DCMAKE_OSX_ARCHITECTURES='arm64' \
-DCMAKE_C_COMPILER="$(xcrun --find clang)" \
-DCMAKE_CXX_COMPILER="$(xcrun --find clang++)" \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=$INSTALLDIR \
-DLLVM_LOCAL_RPATH=$INSTALLDIR/lib \
-DLLVM_ENABLE_WERROR=FALSE \
-DLLVM_TARGETS_TO_BUILD='AArch64' \
-DLLVM_DEFAULT_TARGET_TRIPLE='aarch64-apple-darwin20.3.0' \
-DDEFAULT_SYSROOT=${XCODE_ROOT} \
-DLLVM_ENABLE_PROJECTS='clang;openmp;polly;clang-tools-extra;libcxx;libcxxabi'
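The hard-coded SDK path in that script can most likely be replaced by asking xcrun, which has a --show-sdk-path option. The sketch below wraps that in a function which falls back to a supplied default if xcrun is missing or fails; sdk_root is a hypothetical name, and the fallback argument is simply whatever path you would otherwise have hard-coded.

```shell
# Hypothetical helper: print the MacOS SDK path, asking xcrun if it is
# available and falling back to the supplied default otherwise.
sdk_root() {
    local fallback="$1" sdk
    if command -v xcrun >/dev/null 2>&1 &&
       sdk="$(xcrun --sdk macosx --show-sdk-path 2>/dev/null)" &&
       [ -n "$sdk" ]; then
        echo "$sdk"
    else
        echo "$fallback"
    fi
}

# Typical use in the configure script above:
#   XCODE_ROOT="$(sdk_root /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk)"
```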
What Did We Just Learn?
This environment can be complicated and hard to grok.
There are tools which can help, some of which are MacOS specific, so you may not even know they exist if you’re used to a Linux environment.
There are some horrible hacks (such as re-execing a shell) which can help too.
You will need to modify your environment (and in particular the directories searched in your PATH) to make things work well.
You can get this all to work.
It has got much easier now that brew has AArch64 MacOS support.
What’s Coming Next
In the next blog I’ll cover some architectural and ABI properties of the M1/MacOS combination which may bite you if you don’t know about them!
[1] The arch man page is amusing; it references the arm64 target (whose architecture was announced in October 2011) while simultaneously claiming that the man page dates from July 8 2010!
Regarding
# I don't see how to find this programmatically :-(
XCODE_ROOT=
You can get this value with `xcrun --show-sdk-path`, optionally including `--sdk macosx`
Great! Now I understand why my predominantly Fortran code with some small C additions gets the Fortran part built for the arm64 architecture and the C part for x86_64. Without your hint I would never have figured it out.