Readying a MacBook Pro M2 Max for TensorFlow
I was looking for a development laptop that would let me prototype fairly large ML models locally. I will be moving between countries over the next few months, and I would like to avoid depending entirely on an Internet connection to reach a cloud service when I'm just testing a code change on a GPU. Once I settle, I'll probably build an ML rig, but that's a story for another post.
I ended up getting myself a MacBook Pro M2 Max. Apple silicon is very power-efficient, and, most importantly, its shared memory architecture gives the GPU access to the entire RAM. In my case, that's 96 GB, which should be enough for some decently-sized models.
Making TensorFlow work with Apple silicon can be straightforward... if you know how. Hopefully, this post will save someone the time I spent troubleshooting.
The problem
According to this Apple Developer guide, you need four things:
- Conda
- TensorFlow dependencies
- Base TensorFlow
- Metal plugin
Conda is the package management system used to install the TensorFlow dependencies. The Apple Developer guide suggests Miniconda as the Conda distribution. If you already have Conda via Miniconda, or are starting from a clean system and can simply install Miniconda, the steps in the guide may just work for you. If, like me, you have Conda from the Anaconda distribution for other reasons, you might hit the following error when training ResNet after following the guide's installation steps:
tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x2b5874380
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Applications/miniconda3/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/Applications/miniconda3/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:
Detected at node 'StatefulPartitionedCall_212' defined at (most recent call last):
File "<stdin>", line 1, in <module>
File "/Applications/miniconda3/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/Applications/miniconda3/lib/python3.10/site-packages/keras/engine/training.py", line 1650, in fit
tmp_logs = self.train_function(iterator)
[...] <omitted for clarity>
File "/Applications/miniconda3/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_212'
could not find registered platform with id: 0x2b5874380
[[{{node StatefulPartitionedCall_212}}]] [Op:__inference_train_function_23355]
The error seems to point at a missing specialization of the StatefulPartitionedCall node for our GPU: the failure is raised in xla_ops.cc, reached from the optimizer's _update_step_xla call, and reports that no registered platform could be found. Where does this problem come from? Package versions, of course! The command lines in the Apple Developer guide install the latest versions of the packages and, at the time of writing, those latest versions don't play well with each other.
The solution
What package versions should we install, then? The accepted reply to this question gives us a working set of versions for the tensorflow-macos and tensorflow-metal packages: 2.9.0 and 0.5.0, respectively. I found both tensorflow-deps 2.9.0 and 2.10.0 to work well.
Package | Latest version (at time of writing) | Working version |
---|---|---|
tensorflow-macos | 2.11.0 | 2.9.0 |
tensorflow-metal | 0.7.0 | 0.5.0 |
tensorflow-deps | 2.10.0 | 2.10.0 or 2.9.0 |
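To see at a glance whether an environment matches the table, the pip-installed packages can be compared against the working versions. The following is only a minimal sketch: the helper names `installed_version` and `find_mismatches` are made up for illustration, and tensorflow-deps is left out on the assumption that, being a conda package, it is not visible through Python package metadata.

```python
from importlib import metadata

# Working versions taken from the table above. tensorflow-deps is installed
# through conda and is assumed not to appear in Python package metadata,
# so it is not checked here.
WORKING_VERSIONS = {
    "tensorflow-macos": {"2.9.0"},
    "tensorflow-metal": {"0.5.0"},
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(installed, working=WORKING_VERSIONS):
    """Map each package whose installed version is not a working one to that version."""
    return {
        name: installed.get(name)
        for name, versions in working.items()
        if installed.get(name) not in versions
    }


if __name__ == "__main__":
    installed = {name: installed_version(name) for name in WORKING_VERSIONS}
    for name, version in find_mismatches(installed).items():
        expected = sorted(WORKING_VERSIONS[name])
        print(f"{name}: found {version}, expected one of {expected}")
```

An empty result means both pip packages are at a working version; a `None` value means the package is not installed at all.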
Uninstall the previous versions
These are the command lines to undo the broken installation:
pip uninstall tensorflow-metal
pip uninstall tensorflow-macos
conda uninstall tensorflow-deps
Install the working versions
Here are the command lines to install the working versions:
conda install https://conda.anaconda.org/apple/osx-arm64/tensorflow-deps-2.10.0-0.tar.bz2
pip install tensorflow-macos==2.9.0
pip install tensorflow-metal==0.5.0
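After reinstalling, a quick sanity check is to ask TensorFlow which GPUs it can see. This is a sketch, not a full test: `visible_gpus` is a hypothetical helper name, and a listed GPU only shows that the Metal plugin loaded. The error above appeared during training, so a short training run is still the real confirmation.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
elif gpus:
    print("GPUs visible to TensorFlow:", gpus)
else:
    print("TensorFlow imported, but no GPU is visible; check tensorflow-metal")
```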
One might think that conda install -c apple tensorflow-deps=2.10.0 would be equivalent to the first command line, but I found that conda ignored the requested version and stuck to the last installed one (2.9.0, in my case). The workaround is to find the package URL with conda search -c apple tensorflow-deps=2.10.0 --info and pass it directly to conda install <package_url>, as above. If you know a better solution to this, please share it in the comments!
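The URL lookup in the workaround can also be scripted. The sketch below assumes that conda search --info prints one "url : ..." line per matching package; the sample text is an abridged illustration of that layout, not captured output, and the function names are hypothetical.

```python
import re
import subprocess


def extract_urls(info_text):
    """Collect the values of the `url :` lines from `conda search --info` output."""
    return re.findall(r"^url\s*:\s*(\S+)", info_text, flags=re.MULTILINE)


def find_package_urls(spec, channel="apple"):
    """Run `conda search -c <channel> <spec> --info` and return the URLs it lists."""
    result = subprocess.run(
        ["conda", "search", "-c", channel, spec, "--info"],
        capture_output=True,
        text=True,
        check=True,
    )
    return extract_urls(result.stdout)


# Abridged, illustrative sample of the assumed output layout.
SAMPLE = """\
tensorflow-deps 2.10.0 0
------------------------
file name   : tensorflow-deps-2.10.0-0.tar.bz2
name        : tensorflow-deps
version     : 2.10.0
url         : https://conda.anaconda.org/apple/osx-arm64/tensorflow-deps-2.10.0-0.tar.bz2
"""

print(extract_urls(SAMPLE))
```

With conda on the PATH, find_package_urls("tensorflow-deps=2.10.0") should return the URL (or URLs) to hand to conda install.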