author Mark Peschel 2023-08-18 11:21:52 -0400
committer GitHub 2023-08-18 11:21:52 -0400
commit 61390636986f3a38ccb7a98e3f4573da77554c4f
tree 3d453d47a88182d2d6cd4b446c39295b2e4577d2
parent 6a3bf90d8e6a9e2ee25bbaf622f559ba2f4960e8
README 2.12 -> 2.13. Clarity, typos, and safer instructions.
Replace 2.12 with 2.13 and update links appropriately. Add link to Google's tensorflow/tensorflow. Move "breadcrumbs" section to "how to fix this pkgbuild" section since that's when people will need it. git is effectively a build dependency. Suggest --syncdeps for n00bs. Remove suggestion to update "_known_good_commit" since that might break our patches. Move (borked) 2.14 instructions to end and add warnings, libxcrypt-compat dependency. Replace dd commands with tee cuz I guess it's cleaner. Misc. typos and clarity fixes.
-rw-r--r-- README.md 83
1 files changed, 52 insertions, 31 deletions
diff --git a/README.md b/README.md
index 5511cff7693e..2c96d606670c 100644
--- a/README.md
+++ b/README.md
@@ -1,40 +1,20 @@
# tensorflow-amd-git
-This repository is the PKGBUILD, patches, and test scripts for building the `tensorflow-amd-git` and `python-tensorflow-amd-git` packages on Arch Linux.
-Unlike the `tensorflow-rocm` series of packages, which pull from the official tensorflow/tensorflow repo,
+This repository contains the PKGBUILD, patches, and test scripts for building the `tensorflow-amd-git` and `python-tensorflow-amd-git` [AUR packages](https://aur.archlinux.org/packages/tensorflow-amd-git) for Arch Linux.
+Unlike the `tensorflow-rocm` series of packages, which pull from [Google's tensorflow/tensorflow repo](https://github.com/tensorflow/tensorflow/),
this package pulls directly from [AMD's upstream fork.](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/)
-Specifically, as of 2023-07-04 it draws on the "r2.12-rocm-enhanced" branch on the commit of [the last successful build.](http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/)
-If that link is dead, you may be able to find an equivalent by following these breadcrumbs:
-* [tensorflow/tensorflow](https://github.com/tensorflow/tensorflow/) ->
-* [Community Supported TensorFlow Builds](https://github.com/tensorflow/build#community-supported-tensorflow-builds) ->
-* [Linux AMD ROCm GPU Stable : TF 2.x Build Status Release 2.12](http://ml-ci.amd.com:21096/job/tensorflow/job/nightly-rocmfork-develop-upstream/job/nightly-build-whl/lastSuccessfulBuild/)
+Specifically, as of 2023-08-18 it draws on the "r2.13-rocm-enhanced" branch on the commit of [the last successful build.](http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r213-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/)
## Build instructions
```sh
-sudo pacman -S base-devel
+sudo pacman -S base-devel git
git clone https://aur.archlinux.org/tensorflow-amd-git.git
cd tensorflow-amd-git
-# At this point, you may want to open PKGBUILD and udpate _known_good_commit.
-makepkg
+makepkg --syncdeps
sudo pacman -U tensorflow*.zst python-tensorflow*.zst
```
-## Tensorflow 2.14
-You can build TensorFlow 2.14 from the unstable "develop-upstream" branch with two changes: update the "last good commit" hash, and change the branch of the git source link.
-You can find a good commit hash from [AMD's Jenkins CI server.](http://ml-ci.amd.com:21096/job/tensorflow/job/nightly-rocmfork-develop-upstream/job/nightly-build-whl/lastSuccessfulBuild/)
-Then update the PKBUILD like:
-```sh
-...
-#_known_good_commit=de8086e14ae3152906e1137c212d2f7bb8ea463a
- _known_good_commit=1d35245a829159ef76b3a403d704a78dcd672bbf
-...
-#source=('tensorflow-upstream-rocm::git+https://github.com/ROCmSoftwarePlatform/tensorflow-upstream#branch=r2.12-rocm-enhanced'
- source=('tensorflow-upstream-rocm::git+https://github.com/ROCmSoftwarePlatform/tensorflow-upstream#branch=develop-upstream'
-...
-```
-I have not tested this.
-
## Building for other machines
-This package uses `-march=native` to enable CPU specific optimizations. If you are building for other people to use do:
+This package uses `-march=native` to enable CPU-specific optimizations. If you are building for other people, do:
```sh
sed "s/-march=native/-march=x86-64/" -i PKGBUILD
```
@@ -45,12 +25,12 @@ If you are building for a graphics card not installed on your machine, such as i
you must create the file `/opt/rocm/bin/target.lst` and add each gfx architecture you want to build for.
Something like:
```sh
-echo -e "gfx900\ngfx904\ngfx906\ngfx908\ngfx90a\ngfx1030" | sudo dd of=/opt/rocm/bin/target.lst
+echo -e "gfx900\ngfx904\ngfx906\ngfx908\ngfx90a\ngfx1030" | sudo tee /opt/rocm/bin/target.lst
```
As of 2023-07-05, I have confirmed that RX 580 (gfx803) is NOT supported.
I have confirmed that you can use RX 6750 XT (gfx1031) by setting `target.lst` before building:
```sh
-echo -e "gfx1030" | sudo dd of=/opt/rocm/bin/target.lst
+echo -e "gfx1030" | sudo tee /opt/rocm/bin/target.lst
```
and by setting this environment variable before running:
```sh
@@ -59,10 +39,51 @@ export HSA_OVERRIDE_GFX_VERSION=10.3.0
You can find your gfx by running `/opt/rocm/bin/rocminfo | grep gfx` after installing `rocminfo`.
## Fixing this PKGBUILD when it inevitably breaks
-To learn what the build environment is supposed to look like, see the official Dockerfile for building tensorflow: [ROCmSoftwarePlatform/tensorflow-upstream in tensorflow/tools/ci_build/Dockerfile.rocm](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/tools/ci_build/Dockerfile.rocm). That's where I found out about `target.lst` among other things; hopefully it helps.
+To learn what the build environment is supposed to look like, see the official Dockerfile for building tensorflow: [tensorflow/tools/ci_build/Dockerfile.rocm in ROCmSoftwarePlatform/tensorflow-upstream](https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/tools/ci_build/Dockerfile.rocm). That's where I found out about `target.lst` among other things; hopefully it helps.
+
+If you are completely unable to build the repository, you may have luck installing AMD's pre-built Python wheels. pip seems to think tensorflow-rocm is only supported up to Python 3.10, so you will have to download the wheel [directly from AMD's website](http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r213-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/) and install it: `pip install tensorflow_rocm-2.13*cp311-cp311-manylinux2014_x86_64.whl`. When I tried it, it did seem to run on GPU and not CPU.
+
+If AMD's website no longer hosts the python wheels, you may find an equivalent by following these breadcrumbs:
+* [tensorflow/tensorflow](https://github.com/tensorflow/tensorflow/) ->
+* [Community Supported TensorFlow Builds](https://github.com/tensorflow/build#community-supported-tensorflow-builds) ->
+* [Linux AMD ROCm GPU Stable : TF 2.x Build Status Release 2.12](http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/)
+
+Lastly, it appears the official way to use TensorFlow on ROCm is via a Docker image. The image takes a long time to download, so your system clock may drift out of sync with the server, which can cause seemingly random permission errors. To fix those errors, run `sudo systemctl start systemd-timesyncd`. Otherwise, follow [the official instructions here](https://hub.docker.com/r/rocm/tensorflow) or [my longer tutorial here.](https://github.com/mpeschel10/test-tensorflow-rocm)
-If you are unable to build this, you may have luck with installing the python wheels. pip seems to think tensorflow-rocm is [only supported up to python 3.10](https://pypi.org/project/tensorflow-rocm/), so you will have to download the wheel [directly from AMD's website](http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/) and install it: `pip install tensorflow_rocm-2.12*cp311-cp311-manylinux2014_x86_64.whl`
+## Tensorflow 2.14
+This PKGBUILD is not able to build Tensorflow 2.14 as a drop-in replacement. I suggest three changes to get you started; after that, you're on your own.
+
+ 1. Update the "last good commit" hash.
+You can find a good commit hash from [AMD's Jenkins CI server.](http://ml-ci.amd.com:21096/job/tensorflow/job/nightly-rocmfork-develop-upstream/job/nightly-build-whl/lastSuccessfulBuild/)
+Then update the PKGBUILD like:
+
+```sh
+...
+#_known_good_commit=c19cfffe476ec3338fb24ed3ce2baabfc558076e
+ _known_good_commit=81b90075b3309e3c538915d212f0149daf9cd2a6
+...
+```
+
+ 2. Change the branch of the git source link.
+```sh
+...
+#source=('tensorflow-upstream-rocm::git+https://github.com/ROCmSoftwarePlatform/tensorflow-upstream#branch=r2.13-rocm-enhanced'
+ source=('tensorflow-upstream-rocm::git+https://github.com/ROCmSoftwarePlatform/tensorflow-upstream#branch=develop-upstream'
+...
+```
+
+ 3. Add the `libxcrypt-compat` dependency.
+```sh
+...
+#makedepends=('python-numpy' 'git' 'python-wheel' \
+ makedepends=('python-numpy' 'git' 'python-wheel' 'libxcrypt-compat'\
+...
+```
+Then
+```sh
+sudo pacman -S libxcrypt-compat
+```
-Lastly, it appears the officially intended way to use TensorFlow on ROCm is by a docker image. It takes forever to download, so your system clock may desynchronize with the server, so you may get random permission errors while downloading. If you do, `sudo systemctl start systemd-timesyncd`. Otherwise, follow [the official instructions here](https://hub.docker.com/r/rocm/tensorflow) or [my longer tutorial here.](https://github.com/mpeschel10/test-tensorflow-rocm)
+Good luck.
### Pull requests welcome.
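
The three manual PKGBUILD edits listed under "Tensorflow 2.14" could be scripted roughly as follows. This is only a sketch: `retarget_pkgbuild` is a hypothetical helper name, the `sed` patterns assume the variable layout shown in the README snippets above, and the demo runs against a throwaway sample file rather than the real PKGBUILD. The commit hash is the example from the README; look up a current one on AMD's CI server before using it.

```sh
#!/bin/sh
# Hypothetical helper (not part of the package): apply the three
# starting-point edits for a develop-upstream build to a PKGBUILD.
retarget_pkgbuild() {
    file=$1     # path to the PKGBUILD to edit in place
    commit=$2   # hash of the last successful nightly build

    # 1. Update the known-good commit; 2. switch the source branch.
    sed -i \
        -e "s/^_known_good_commit=.*/_known_good_commit=${commit}/" \
        -e "s/#branch=r2\.13-rocm-enhanced/#branch=develop-upstream/" \
        "$file"

    # 3. Add libxcrypt-compat to makedepends, unless it is already listed.
    if ! grep -q "libxcrypt-compat" "$file"; then
        sed -i "s/'python-wheel'/'python-wheel' 'libxcrypt-compat'/" "$file"
    fi
}

# Demo on a sample file that mimics the relevant PKGBUILD lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
_known_good_commit=c19cfffe476ec3338fb24ed3ce2baabfc558076e
source=('tensorflow-upstream-rocm::git+https://github.com/ROCmSoftwarePlatform/tensorflow-upstream#branch=r2.13-rocm-enhanced'
makedepends=('python-numpy' 'git' 'python-wheel' \
EOF

retarget_pkgbuild "$sample" 81b90075b3309e3c538915d212f0149daf9cd2a6
cat "$sample"
```

As the README warns, these edits only get a 2.14 build started; the existing patches may still fail to apply against `develop-upstream`.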