remove unneeded architectures from targets.lst
test what happens with out-of-date libicu-32 thingy; update github readme
also tell people about the gfx list in targets.lst, and that it builds for every architecture listed there whether or not you have that hardware
clean chroot build
it occurs to me, about the optimization flags, we can just use -march=native.
build with -march=x86-64 -O2 and -march=native -O2 and benchmark the results.
party. I am so glad this is over.
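For the two-build benchmark item above, a minimal timing sketch in Python: run the same workload under each build a few times and compare the median wall-clock time. The function names are made up for illustration, and nothing here is specific to tensorflow; it only times whatever command it is given.

```python
import statistics
import subprocess
import time


def time_command(cmd, repeats=5):
    """Run cmd `repeats` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True makes a crashing workload abort the benchmark instead of
        # being recorded as a suspiciously fast run.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(builds, repeats=5):
    """Time each labelled command and return {label: median_seconds}."""
    return {label: time_command(cmd, repeats) for label, cmd in builds.items()}
```

Possible usage, assuming one virtualenv per build and a shared workload script (both hypothetical): compare({"x86-64": ["venv-generic/bin/python", "bench_workload.py"], "native": ["venv-native/bin/python", "bench_workload.py"]}). The median is used so that a single slow run does not skew the comparison.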
# Maybe
https://wiki.archlinux.org/title/Arch_packaging_standards -> reproducible
make test.py actually fail if something goes wrong
set up "auto-build" system. One script to run, log, then test. etc.
systemd timer, stuff like that
check latest builds and the link from the official tensorflow repo pointing to the official rocm build
more complete testing of tensorflow. What other functions are there? can I do gpt-2?
there are official tests... somehow...
documentation
what environment variables can you use to suppress verbose logging?
export TF_CPP_MAX_VLOG_LEVEL=-1 # optional; see https://github.com/RadeonOpenCompute/ROCm/issues/1594
what about the other thing? NUMA returned -1 or something
document other ways to install
publicity
can we patch the whole project to bazel 6+ lol
recruit someone with a different gfx architecture to test
actually, just post to the aur and see if there's even any interest. The Docker container works well enough.
I have an amd cpu and do not know how to confirm if haswell optimization is working or not.
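For the test.py and auto-build items above, a rough sketch of "one script to run, log, then test": each stage is a command, its output goes to a log file, and the first failing stage stops the run with a nonzero exit code. The stage list, the log path and the function names are assumptions for illustration, not anything that exists in this repo yet.

```python
import logging
import subprocess
import sys


def run_stage(name, cmd, log):
    """Run one stage, copy its output into the log, return True on success."""
    log.info("starting %s: %s", name, " ".join(cmd))
    try:
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
    except OSError as exc:
        # The command could not be started at all (for example, not installed).
        log.error("%s could not be started: %s", name, exc)
        return False
    for line in proc.stdout.splitlines():
        log.info("[%s] %s", name, line)
    if proc.returncode != 0:
        log.error("%s failed with exit code %d", name, proc.returncode)
        return False
    return True


def run_pipeline(stages, log):
    """Run stages in order, stop at the first failure, return an exit code."""
    for name, cmd in stages:
        if not run_stage(name, cmd, log):
            return 1
    log.info("all stages passed")
    return 0


def main(log_path="autobuild.log"):
    # Hypothetical stage list: makepkg for the build, then the repo's test.py.
    logging.basicConfig(
        filename=log_path,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    stages = [
        ("build", ["makepkg"]),
        ("test", [sys.executable, "test.py"]),
    ]
    return run_pipeline(stages, logging.getLogger("autobuild"))
```

Ending the script with sys.exit(main()) passes the result on as the process exit code, which is what a systemd timer's service unit would see. This only helps if test.py itself exits nonzero on failure, for example by letting exceptions propagate instead of catching and printing them.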
# Notes
Working builds:
Download and run the rocm/tensorflow docker image
https://github.com/mpeschel10/test-tensorflow-rocm
Download and install the wheel
# For python older than 3.11, pip has builds from 3.7 to 3.10
# find rocm last release via https://github.com/tensorflow/build#community-supported-tensorflow-builds
# to http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild
# to http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/artifact/packages-3.11/tensorflow_rocm-2.12.0.550-cp311-cp311-manylinux2014_x86_64.whl
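# keep the full wheel filename; pip rejects a wheel renamed to something like tensorflow_rocm.whl
curl -o ~/Downloads/tensorflow_rocm-2.12.0.550-cp311-cp311-manylinux2014_x86_64.whl http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/artifact/packages-3.11/tensorflow_rocm-2.12.0.550-cp311-cp311-manylinux2014_x86_64.whl
pip install ~/Downloads/tensorflow_rocm-2.12.0.550-cp311-cp311-manylinux2014_x86_64.whl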
Compile it:
Last functional build on jenkins was http://ml-ci.amd.com:21096/job/tensorflow/job/release-rocmfork-r212-rocm-enhanced/job/release-build-whl/lastSuccessfulBuild/
Possibly from https://github.com/ROCmSoftwarePlatform/tensorflow-upstream, branch r2.12-rocm-enhanced
revision 18ddd5aa0329993f581bdb433b999b85c15f69e3
copy most of the stuff from the docker build file
makepkg
it works!!!
test with gfx803
does not work
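For the wheel route above, the cp311 in the filename has to match the running interpreter, otherwise pip refuses the wheel. A small sketch that checks this before downloading; it assumes CPython and a simple, uncompressed python tag, and the function names are made up for illustration.

```python
import sys


def wheel_python_tag(filename):
    """Return the python tag (e.g. 'cp311') from a wheel filename or URL."""
    stem = filename.rsplit("/", 1)[-1]
    if not stem.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = stem[: -len(".whl")].split("-")
    # Layout is name-version[-build]-python-abi-platform, so the python tag
    # is always third from the end.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    return parts[-3]


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for an interpreter version, e.g. 'cp311'."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches(filename, version_info=sys.version_info):
    """True if the wheel's python tag equals the interpreter's CPython tag."""
    return wheel_python_tag(filename) == interpreter_tag(version_info)
```

Calling wheel_matches() with the wheel name from the Jenkins artifact URL says whether the current python can install it. If it cannot, the packages-3.11 wheel is the wrong one and the pip builds for 3.7 to 3.10 are the fallback.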