Wondering if anyone is seeing issues with the Arch update to llvm-libs 13.0.1-3?
I'm chasing ghosts now. :(
Git Clone URL: | https://aur.archlinux.org/opencl-amd.git |
---|---|
Package Base: | opencl-amd |
Description: | ROCr OpenCL stack, supports Vega 10 and later products - Legacy OpenCL stack (Proprietary), supports legacy products older than Vega 10 - This package is intended to work along with the free amdgpu stack. |
Upstream URL: | http://www.amd.com |
Keywords: | amd amdgpu computing gpgpu opencl radeon |
Licenses: | custom:AMD |
Conflicts: | rocm-opencl-runtime |
Provides: | hip-runtime-amd, hsa-rocr, hsa-rocr-dev, hsakmt-roct-dev, opencl-driver, rocm-core, rocm-device-libs, rocm-hip-runtime, rocm-language-runtime, rocm-ocl-icd, rocm-opencl, rocm-opencl-dev, rocm-opencl-runtime, rocminfo |
Submitter: | grmat |
Maintainer: | sperg512 (luciddream) |
Last Packager: | luciddream |
Votes: | 118 |
Popularity: | 1.25 |
First Submitted: | 2016-12-01 03:45 (UTC) |
Last Updated: | 2022-05-11 17:52 (UTC) |
@Deuchnord @luciddream I suggest always using the same format. So the previous version would have been 22.10.1.50100. That way, if the "release" (third number*) counter is actually used, it does not newly appear, it just increases. From the fact that it is now 2, you can tell that there was a 1 before but it was omitted.
As a workaround, using yay, it was sufficient for me to delete the 22.10.50100 package from the package cache, then run yay -S opencl-amd and select the package for a fresh build. That way, the old PKGBUILD is discarded and the new version is downloaded.
* major.minor.release.package, but I might be wrong because that does not explain the -x ending.
@Deuchnord thanks for reporting that. I'm not using a helper to install it so it's easy to miss these things. I will figure out something for the next version!
My AUR helper (yay) often complains about this package's versions, because it uses strange version numbers:
opencl-amd: local (22.10.50100-1) is newer than AUR (22.10.2.50102-1)
The only workaround I found for now is to rename the package's folder in the cache and uninstall then reinstall:
mv ~/.cache/yay/opencl-amd ~/.cache/yay/opencl-amd-bak && yay -Rs opencl-amd && yay -S opencl-amd
I understand that this package sticks to the vendor release numbering, but it would still help a lot if something could be done to mitigate this issue, since yay's error is correct (22.10.50100 > 22.10.2.x).
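To illustrate why the helper treats the older package as newer, here is a minimal sketch of a segment-by-segment comparison. It is a simplification and not pacman's actual vercmp algorithm; the function name compare_versions is hypothetical, and it assumes purely numeric, dot-separated versions with the pkgrel suffix already removed.

```sh
# Simplified version comparison (NOT pacman's real vercmp).
# Prints 1 if $1 is newer, -1 if $2 is newer, 0 if they are equal.
compare_versions() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}   # first remaining segment of each version
        y=${b%%.*}
        [ "${x:-0}" -gt "${y:-0}" ] && { printf '%s\n' 1; return; }
        [ "${x:-0}" -lt "${y:-0}" ] && { printf '%s\n' -1; return; }
        # drop the first segment; nothing is left once there is no dot
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    printf '%s\n' 0
}

compare_versions 22.10.50100 22.10.2.50102     # prints 1: third segment 50100 > 2
compare_versions 22.10.1.50100 22.10.2.50102   # prints -1: third segment 1 < 2
```

With a consistent four-segment format, the third segment is compared as 1 against 2 and the upgrade is detected as expected.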
@limsandy it's a minor release so probably not much. But AMD released it before releasing any source code / documentation so I'm not sure what they have changed, and I forgot to compare the directories for any changes. It works fine so I made a new release too.
If I had to guess, this release is only to support new distribution versions like RHEL 9.0, which was also announced the same day.
@luciddream
What's new in the latest version? :P
@Ashark I don't see a reason to exclude it, it's part of the AMDGPU stack.
@luciddream @sperg512 What about excluding the orca part from this package? A separate package has appeared for it, opencl-legacy-amdgpu-pro, and it is mentioned in the list here.
@melvyn2
There is no way to do that unless AMD includes the change in the AMDGPU stack. One advantage of opencl-amd is that it uses the packages from AMD, so people can be somewhat certain that it's safe to use this. I think the best alternative now is to use a previous version of opencl-amd that works with your GPU.
As an alternative to opencl-amd, people can use binaries from rocm-opencl-runtime if they are provided and they feel comfortable using them. The current ones are outdated and personally I don't consider them safe when I can't verify their build process. I think we can all hope that opencl-amd will not be needed in the future and a better package will be provided by AMD for Arch Linux, but currently that's all we have.
p.s. I have a new AMD 5900X CPU for my PC. I need new RAM now, then I think I can start testing rocm-arch as well and be able to compare with opencl-amd when necessary.
The https://aur.archlinux.org/packages/rocm-opencl-runtime package now includes a patch to enable the new (ROCm) OpenCL driver to work on GFX8/Polaris 10 GPUs. I realize that this package pulls the binaries from AMD, but I'd like for this package to enable ROCm CL support for polaris GPUs too, if possible.
wait so this works on 4000 series APUs now??? edit: it technically does, but "Total global mem: 512 MB". I wish I could increase this, but my laptop is rather limited and doesn't let me increase it in the BIOS, so there must be another way, right?
Nope, it compiled fine for me the second time. Try again. 👍
I get the same problem as @limsandy in Manjaro Linux
==> Validating source files with sha256sums...
ncurses-6.3.tar.gz ... Passed
ncurses-6.3.tar.gz.sig ... Skipped
==> Verifying source file signatures with gpg...
ncurses-6.3.tar.gz ... cat: write error: Broken pipe
FAILED
==> ERROR: One or more PGP signatures could not be verified!
Finished with result: exit-code
Main processes terminated with: code=exited/status=1
Service runtime: 14.754s
CPU time consumed: 1.946s
Error: Failed to build ncurses5-compat-libs
I tried Hashcat with HIP support which was added today and for some reason benchmark took 5 minutes to complete, and GPU fans were silent while it was running. So I guess that's similar to what @limsandy is saying. But the benchmark numbers are fine so I'm not sure why it took so long. The only thing I noticed in the first run is that gnome tracker indexing was using the CPU (about 8%) and Hashcat was also using the CPU (about 7%).
After a couple of hours I ran the Hashcat OpenCL benchmark and it took 2 minutes to complete. Then I ran the Hashcat HIP benchmark a second time and it also took 2 minutes to complete, while the fans were spinning a lot and GPU usage was 99%.
The benchmark results are the same in all 3 runs, for some reason. At least I now have software I can use to run tests with HIP and ROCm 5.1 :)
Yeah, I don't understand why opencl performance under Manjaro is slower than in Windows 10. Even slower than Ubuntu 20.04. I had been able to change the Radeon profile to "profile_peak" which will make the GPU clock stay at max, but will throttle once a certain temperature is reached.
It doesn't make any sense, but I benchmarked VDF times both in Manjaro and Ubuntu, and Ubuntu sets the record fastest VDF time and consistently lower VDF times.
I updated the packages to 22.10.1 and 5.1.1 - I don't see any release notes or any significant changes (blender hip still crashes, AMF still not working, opencl performance is still a bit lower than 5.0)
So I'm starting to understand how this works.... Linux will try to read every file with the extension .rules in the folder /etc/udev/rules.d/
You can rename it with whatever you want, just put the extension .rules afterwards. Mine is obviously renamed to radeon-vega-7.rules
Cool stuff @limsandy, I will give it a try later and see if it affects the scores on my PC.
Oh I'll be damned.... I put this line in /etc/udev/rules.d/30-radeon-pm.rules
KERNEL=="card0", SUBSYSTEM=="drm", DRIVERS=="amdgpu", ATTR{device/power_dpm_force_performance_level}="high"
Everything in single line, that's a space before the ATTR.
Now power_dpm_force_performance_level is set to high after reboot. You still have to make sure that power_dpm_state is set to performance. Notice that DRIVERS is now set to "amdgpu"?
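As a rough sketch of the rule above, the snippet below generates the same single-line udev rule for a given card name and performance level. The helper name make_dpm_rule is hypothetical; the install and reload steps are shown only as comments because they need root, and the card name may differ on other systems.

```sh
# Hypothetical helper: print a udev rule that forces the given
# performance level for the given DRM card (e.g. card0).
make_dpm_rule() {
    printf 'KERNEL=="%s", SUBSYSTEM=="drm", DRIVERS=="amdgpu", ATTR{device/power_dpm_force_performance_level}="%s"\n' "$1" "$2"
}

make_dpm_rule card0 high

# To install it (as root), something along these lines:
#   make_dpm_rule card0 high > /etc/udev/rules.d/30-radeon-pm.rules
#   udevadm control --reload-rules && udevadm trigger
```

Generating the line with printf keeps the whole rule on one line, which is what the rule in the comment above relies on.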
Nope, still doesn't work. I think it's got to do with this file name: /etc/udev/rules.d/30-radeon-pm.rules
Like the udev doesn't read this file unless it's got the right file name.
I'll work something out later, but as a closing remark: AMD still has to work on the Radeon drivers not clocking up high/fast enough when a workload is placed in the queue. Even after setting the performance level to 'high', I see the Vddgfx dropping to 0.574V, which means it's allowed to downclock. On the positive side, my APU stays cool at 35-36C, which is very near my ambient temp.
@luciddream,
Thank you for that link. I tried it but it didn't work. This file didn't even exist: /etc/udev/rules.d/30-radeon-pm.rules
So I manually created the file and pasted this: KERNEL=="dri/card0", SUBSYSTEM=="drm", DRIVERS=="radeon", ATTR{device/power_dpm_force_performance_level}="high"
It turns out my KERNEL name was wrong. On my system, it's just "card0". I've made the change accordingly, and I'm restarting my computer now....
I've never done this but I guess https://wiki.archlinux.org/title/ATI#Persistent_configuration should work? In any case post your findings, it might be helpful for other people.
Okay, I think I was celebrating too early. As I said, I was using OpenCL to do VDF verification for running node/farmer for a cryptocurrency program. It can use the CPU (slower) or GPU (with OpenCL) to keep in sync with the current blockchain height.
Under Win 10, with the same computer I'm consistently getting ~0.2 seconds. Under Manjaro, using the CPU, I get 2.7 - 3.0 seconds. With OpenCL, I'm getting 0.3xxx seconds, which is okay if it had been consistent. But sometimes I'm still getting 14 or 18 seconds.
ROCm OpenCL did not work for my APU. Setting the power level to 'high' really helps, but then it always reverts back to 'auto' after reboot. And sudo/root user is needed to change this setting. Can anyone tell me how to write a script that changes this setting at reboot with root permission?
sudo nano /sys/class/drm/card0/device/power_dpm_force_performance_level
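Since this file is a sysfs attribute, it is normally written to rather than opened in an editor. A minimal sketch of such a script follows; the function name set_perf_level is hypothetical, only three of the levels the driver accepts are allowed here, and the optional second argument exists just so the function can be tried against an ordinary file.

```sh
# Hypothetical helper: write a performance level to the amdgpu sysfs file.
# $1 = level (auto, low or high), $2 = optional path override for testing.
set_perf_level() {
    level=$1
    file=${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}
    case $level in
        auto|low|high) ;;
        *) echo "unsupported level: $level" >&2; return 1 ;;
    esac
    printf '%s\n' "$level" > "$file"
}

# Usage as root:
#   set_perf_level high
# Usage from a normal user without the helper:
#   echo high | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
```

The value does not survive a reboot, so a udev rule or a boot-time service is still needed to reapply it.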
@limsandy
Good to hear, so ROCm OpenCL (or maybe Orca) works for your APU? Maybe you can share what you did to fix it so we can add it to Arch Wiki or something.
> VDF verification times range from 0.3 all the way to 18 seconds.
What is VDF verification? I'm always looking for more tools to use and compare the performance after every release.
@luciddream,
With the latest version 22.10, clinfo doesn't return segmentation fault anymore. So it's a good start.... I was even able to use OpenCL in the program that I wanted to use.... But the performance is sometimes a hit and a miss. VDF verification times range from 0.3 all the way to 18 seconds.
I thought this was due to the Radeon power profiles, so I've been messing with them.... Setting performance to 'high' as instructed at https://wiki.archlinux.org/title/AMDGPU#Power_profiles
I FIXED IT! Thank you so much for updating this, luciddream!
22.10 works for me without any problems (5700xt). Geekbench gives the same results as always. I still need to do more extensive testing. Thanks!
With the new version I get about 7000 lower score in Geekbench compute, but I also did a BIOS update recently so maybe that's why.
I have made a new release for both opencl-amd and opencl-amd-dev. I don't feel very confident about them, so please send any feedback.
I want to make some extra changes but I can't really justify taking so much time for a release, so I hit the push "button". The process is straightforward, so before the next minor release I will try to have a script or program ready to make updating very fast, compared to what it is now (completely manual).
I'm preparing the new release and I noticed there is already a EULA for any proprietary component we use, since probably forever. So I guess we are already violating that. Maybe it's better to add it in the description of the package.
No issues with amdgpu-llvm, except it takes up a lot of space. Even though rocm-llvm is a "depends" of the hip-runtime-amd .deb, it's probably best that it's left in opencl-amd-dev. No idea about EULAs and the proprietary opencl-legacy-amdgpu-pro-icd, sorry. Thanks for the hard work!
In the 5.1 release notes there is a notice that the use of closed source rocm parts requires an agreement with their EULA. I wonder how other packages are dealing with this, if there is an example I can check before I make the release (if the closed source parts are necessary)
> 5.1 / 22.10 has just appeared at https://repo.radeon.com/
Nice, I will try to take a look later in the evening.
> Fingers crossed the Blender HIP implementation doesn't require the rocm-llvm provided in the opencl-amd-dev.
Why do you say that? Because of the disk size required? Or did you have any other issues with it?
Here we go again... 5.1 / 22.10 has just appeared at https://repo.radeon.com/ This version should eventually work with Blender nightly 3.2 compiles. Fingers crossed the Blender HIP implementation doesn't require the rocm-llvm provided in the opencl-amd-dev.
> Or somebody does not create a binary repo/pkgbuild for that.
Even if they do (ROCm is already available on the arch4edu repository), there are still some things that will keep opencl-amd alive, I believe. First it would need to have a good level of trust (who builds the package, how does it build, and does the source code match with the binary), second it would need to be updated fast, and from what I can see, it's a gigantic effort to keep it updated. opencl-amd provides updates fast because it just copies files, but of course that comes with a few disadvantages.
> The only thing, this should be indicated to users. Can you then please mark it in the description and in ArchWiki? Currently it says that it is ROCr, but better maybe something like "ROCr OpenCL and legacy/orca OpenCL repackaged from AMD's ubuntu releases".
Sure, I will update the description on the next release with the descriptions from the amdgpu-install documentation.
> opencl-amd follows the pattern of the amdgpu-install deb file
> It provides the whole Compute package for AMD GPUs
I think you mean running the amdgpu installer with opencl=rocr,legacy.
> If you think the legacy libraries should be its own package, you are free to create one.
Of course.
> Until AMDGPU / ROCm binaries end up in the official repositories, there will always be a need to provide a binary of that compute package.
Or somebody does not create a binary repo/pkgbuild for that.
> I don't think there is a reason to continue debating it further.
I am not blaming you. I understand your position. I just wanted to know what it is, because there were lots of comments here, and I did not follow them carefully.
The only thing, this should be indicated to users. Can you then please mark it in the description and in ArchWiki? Currently it says that it is ROCr, but better maybe something like "ROCr OpenCL and legacy/orca OpenCL repackaged from AMD's ubuntu releases".
@Ashark
Those are some nice inside details Jeremy got for us, and thanks for contacting him, but I'm afraid it does not affect opencl-amd. As fun and entertaining as this discussion has been, and we both gained some extra knowledge, a reminder that opencl-amd follows the pattern of the amdgpu-install deb file, which is what I described in my previous comments. I don't intend to change that, unless AMD changes that pattern.
> And with arch's rules we should replace a hyphen with underscore.
In order to be more compliant with the Arch rules, and since the build id is not really needed (I explained the reason in the previous comments), I think it's better that I remove the build id on the next AMDGPU / ROCm release to avoid any further confusion. So it will be 21.51.50100 going forward.
> Why do you repack the rocm itself? As it is open source, there is no point in that. Jeremy also does not recommend that.
Is there a reason to explain again why this package exists? It provides the whole Compute package for AMD GPUs, and I (and I think other people too) find it useful. If you think the legacy libraries should be its own package, you are free to create one. Until AMDGPU / ROCm binaries end up in the official repositories, there will always be a need to provide a binary of that compute package. The advantage of this package is that people can trust it, because it comes from AMD directly (we are only copying their files).
p.s. This is getting repetitive and we clearly have very different opinions on how to deal with the packages, so I don't think there is a reason to continue debating it further. All the users of the opencl-amd and opencl-amd-dev packages can express their opinions on how and what it packages, and I've modified it in the past to include their suggestions, but at the end of the day it's my responsibility (as long as I maintain / co-maintain the packages) to provide my best effort solution for them.
@luciddream, Jeremy Newton explained the versioning, see previous comment. He does not recommend to use 50002 in version, also we should use amd's identifier to be precise. And with arch's rules we should replace a hyphen with underscore. He suggests 21.50.2_1384495-1 for that.
But here another question arises. Why do you repack the rocm itself? As it is open source, there is no point in that. Jeremy also does not recommend that. There are already rocm-opencl-runtime and rocm-hip-runtime packages. This package should only package the legacy (a.k.a. orca) opencl I think.
About versioning scheme used by amd (thanks to Jeremy's explanation):
Example of package name: amdgpu-pro_21.50.2-1384495_amd64.deb
21.50.2 - This is the release number.
21 - This is the year it started development.
5 - Fifth release branch created in 2021 from AMD's development trunks (resets to 1 each year). Not dependent on months.
0 - Added to allow inserting more releases if required (see below).
2 - This is the revision (fix) for the release. The release originally has no revision (21.50). Then a revision of it may be released (21.50.1).
In ROCm packages:
50002 - This shows that it's tied to the 5.0.2 release of ROCm (5.00.02); it has very little value for amdgpu-pro binaries.
In amdgpu-pro packages:
1384495 - This is just a unique identifier. AMD sticks it in the release field of the packages they release (it is debian_revision, identical in meaning to pkgrel in PKGBUILDs).
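The scheme described above can be sketched in a few lines of shell. Both helper names (deb_to_pkgver, rocm_version) are hypothetical, and the sketch assumes file names of the form name_version-build_arch.deb and five-digit ROCm numbers as in the examples in this thread.

```sh
# Turn a .deb file name into a pkgver-style string, replacing the
# hyphen with an underscore as the Arch packaging rules ask.
deb_to_pkgver() {
    v=${1%.deb}   # strip the extension
    v=${v#*_}     # drop the package name
    v=${v%_*}     # drop the architecture
    printf '%s\n' "$v" | tr '-' '_'
}

# Decode the ROCm number (major * 10000 + minor * 100 + patch).
rocm_version() {
    printf '%d.%d.%d\n' "$(( $1 / 10000 ))" "$(( $1 / 100 % 100 ))" "$(( $1 % 100 ))"
}

deb_to_pkgver amdgpu-pro_21.50.2-1384495_amd64.deb   # prints 21.50.2_1384495
rocm_version 50002                                   # prints 5.0.2
```

Appending the pkgrel to the first result gives the 21.50.2_1384495-1 form suggested in the comment above.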
Q: Why/when after 20.40 would AMD release 20.45 or 20.40.1 instead of 20.50?
A: When 20.50 was already branched, but it was delayed due to issues, so 20.45 was branched from an earlier point.
Q: Then when is the logic of a revision, like 20.40.1, applied?
A: It is similar to how mesa has a "21.3" branch, and tags mesa-21.3.0 through mesa-21.3.8 representing different git commits on that branch. AMD has an internal 21.50 branch, and 21.50/21.50.1/21.50.2 represent different points on that branch.
@luciddream Thanks, I asked Jeremy by email. I will write what is his opinion.
@Ashark I agree with keeping the build number for now, but to be honest I don't think it is needed. I was looking for conspiracies, but with a clearer mind now, the versioning for AMD is much simpler than I tried to make it. They just concatenate the two release versions: AMDGPU (21.50.2) and ROCm (50002). So this is actually the release version of the driver as a whole. As a filename, it doesn't make much sense, because they have never released the same AMDGPU release with a different ROCm release, and I doubt they plan to. I guess it's easier for them to just make a new release for both AMDGPU and ROCm stacks, when they want to update either of them. The version numbers seem to align on every release since the first one (21.40). So the next version for ROCm 5.1 will have AMDGPU 21.51, and the installation file will be amdgpu-install_21.51.50100-1_all.deb
How the repositories are structured is documented here but it's very simple and vague. What each number represents is documented here
I've also found Jeremy's email at the AMDGPU Stack documentation. Maybe we can ask him for more information on how they plan to continue making releases, but I don't think they plan to change it, since it already works as it is. (AMDGPU dot ROCm version)
I think it is required to register there to be able to see his email. Probably yes, too much effort for just version number. You know, let's then keep the versioning as it is now. As you said, keeping 50002 is meaningful for rocm. And I also wish that build number 1384495 to be presented, so I can check if this package matches my packages (amdgpu-pro-libgl). I have just replaced needed parts of string in my checker.
@Ashark I guess the closest you could get to an AMD employee that has also packaging knowledge would be Jeremy Newton
Then maybe the Debian mail list for comments from other packagers. But maybe this is too much effort for just a version number.
Do you know where is their mailing list? Another source of communication with them I think may be the email they used for packages (in control information).
@Ashark the pages you linked have download links to deb files that include the 50002 number. Since the original deb is a package that includes all files for AMDGPU(Pro) and ROCm, I think it's better to keep one version string for all of them, and that's the version string from the amdgpu-install file, which is what people use to install it on officially supported distros. In the end, what AUR helpers or makepkg do is create a package that uses this versioning scheme to make a similar package to the Ubuntu / CentOS one.
My assumption for why they are doing this is that while the human readable driver release is 21.50.2, that gives them the freedom to silently release an update when something goes wrong, without the need to create new pages for it. But maybe it's a wrong assumption. I'm open to discussing it further and removing both numbers if there is feedback from an AMD employee (maybe we can post a question on their mailing list), but I think 21.50.2.50002-1 will be following their current reasoning for package versioning.
> from the AMD drivers page
Can you please link it? Because I only see the pages I mentioned. For example, the rx590 product page, the 21.50.2 release page.
> 50002 is not the build number, it's ROCm version number
I understand that. I saw it in the PKGBUILD. But where can I see it as a user? Can you edit the upstream URL so it is more precise?
> I assume that AMD developers have a reason to link ROCm version with the AMDGPU version
Maybe.
> about 90% of the package is ROCm related files
Oh, I see now. Then yes, the 50002 should stay in the version number. But then another question. If this package is actually rocm, then where did that 21.50.2 come from? Shouldn't this package version be just 50002_72-1?
@Ashark The deb I linked is the one from the AMD drivers page. 50002 is not the build number, it's ROCm version number so I assume that AMD developers have a reason to link ROCm version with the AMDGPU version. For opencl-amd it makes even more sense, because about 90% of the package is ROCm related files, and not amdgpu files.
So I disagree with you, I think the correct version should be 21.50.2.50002-1 like the upstream version AMD is using. Worst case should be 21.50.2.50002_1384495-1 but I think it would be fine to remove the minor version.
p.s Unless an AMD employee clarifies the versioning scheme I think all we can make is assumptions on why it is like this.
@luciddream I do not get you, where did you see that "21.50.2.50002" besides the one of debs you linked? In the gpu page I see they mention releases just as "21.50.2", without build number. And in the actual packages they use "21.50.2-1384496", see the Packages file I linked in previous comment or see any control file for any [there are few exceptions, like amf] deb package from repo. So the only correct way I see is to use scheme like I wrote before: 21.50.2_1384495-1.
Hi @Ashark I took a look and the upstream version (both for CentOS and Ubuntu) is 21.50.2.50002. So I guess I can remove the minor version .1384495 on the next update, but it will still look almost the same. I'm not sure if that helps you.
edit: to make things "worse", there are two amdgpu-install files, one that includes the minor version and one that doesn't, and I'm too sleepy to check their differences now. So I'm not sure that even removing the minor version will be OK.
@redshoe I didn't ignore you but I didn't have time to check yet. I think that you mean the PAL drivers though. This package is already using the legacy AMDGPU-PRO drivers, but I don't have an older GPU to check what works and what not.
Can you please fix versioning scheme? Currently it is like 21.50.2.50002.1384495-1, but it should be like 21.50.2_1384495-1. I want to compare opencl-amd and amdgpu-pro-libgl versions in davinci-resolve-checker, and it would be easier if version schemes just matched.
Regarding the underscore instead of dot in last block: see https://wiki.archlinux.org/title/PKGBUILD#pkgver. It says if the author of the software uses hyphen (-), replace it with an underscore (_). And AMD versions their packages with hyphen, like Version: 21.50.2-1384496. You can see it in Packages file: http://repo.radeon.com/amdgpu/21.50.2/ubuntu/dists/bionic/proprietary/binary-amd64/Packages.
@luciddream Thanks. Another question. Do the newer AMDPRO drivers no longer have OpenCL, as in no more OpenCL support for GPUs from the AMDPRO-GPU driver stack?
@redshoe I don't think so, libtinfo5 is required by ROCm. I don't think you can "trick" it like that.
@limsandy I think the next ROCm release will be in April so you can try again then. If AMD doesn't have that information, no one else will, unfortunately :)
It would be nice if more people can verify what @ghostbuster is saying: does the path get expanded for you in /etc/profile.d/opencl-amd.sh? Because I don't see any issues on my PC.
@luciddream is it possible to use libtinfo.so.6 instead of libtinfo.so.5? And maybe create a symbolic link? If so, can we avoid installing ncurses5-compat-libs, and just install ncurses 6.3-2?
Note to self: opencl-amd ver 20.40.xxx doesn't need ncurses-lib
So I installed opencl-amd ver 20.40.xxx, I can confirm that Vega iGPU works with Manjaro. I've got openCL ver 2.1 now. Any idea when the new version will work with my Vega 7 iGPU? :D
https://drive.google.com/file/d/1lPzlzX3dkZT82br7rMRdS45MutD89tGV/view
@ghostbuster Are you using the latest opencl-amd version? I noticed that the PATH got expanded with the previous version so I changed it, and the current one worked fine on my PC. But if you are seeing it expanded then it means it's not working as I thought. I will check it out later tonight when I'm at my PC.
/etc/profile.d/opencl-amd.sh is created with the expanded PATH at package build time, by adding additional quotation this could be avoided.
diff --git a/PKGBUILD b/PKGBUILD
index bf83107..34910e4 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -183,5 +183,5 @@ package() {
echo /opt/rocm-5.0.2/hip/lib >> "$pkgdir/etc/ld.so.conf.d/opencl-amd.conf"
mkdir -p ${pkgdir}/etc/profile.d
- echo export PATH="\${PATH}:/opt/rocm-5.0.2/bin:/opt/rocm-5.0.2/hip/bin" > "$pkgdir/etc/profile.d/opencl-amd.sh"
+ echo 'export PATH="${PATH}:/opt/rocm-5.0.2/bin:/opt/rocm-5.0.2/hip/bin"' > "$pkgdir/etc/profile.d/opencl-amd.sh"
}
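The difference between the two quoting styles can be reproduced outside the PKGBUILD. The sketch below is only an illustration: it writes to a temporary file instead of $pkgdir, and the variable names are made up for the example.

```sh
out=$(mktemp)

# Unescaped ${PATH} inside double quotes is expanded when the line is
# written, so the PATH of the build environment ends up in the file.
printf '%s\n' "export PATH=\"${PATH}:/opt/rocm-5.0.2/bin\"" > "$out"
expanded=$(cat "$out")

# Single quotes keep ${PATH} literal, so it is expanded later,
# when the profile script is sourced at login.
printf '%s\n' 'export PATH="${PATH}:/opt/rocm-5.0.2/bin"' > "$out"
literal=$(cat "$out")

rm -f "$out"
printf '%s\n%s\n' "$expanded" "$literal"
```

The first printed line contains whatever PATH was set when the snippet ran, while the second still contains the literal ${PATH} placeholder.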
I don't get it.....
I uninstalled opencl-amd, opencl-amd-dev and ncurses5-compat-libs.
Then out of curiosity, I did clinfo again.... And here is what it returned:
https://drive.google.com/file/d/1QQ9t-3tOTTMS19JpNOyFwOqJIXiziLpU/
I am particularly concerned with this line: fatal error: cannot open file '/usr/share/clc/gfx909-amdgcn-mesa-mesa3d.bc': No such file or directory Preferred work group size multiple (kernel) <getWGsizes:1504: create kernel : error -46>
I guess I have opencl-mesa 21.3.7-1 installed.
Will try opencl-amd 20.40 now.
@limsandy
I don't see anything missing from the strace, so I don't think there is something to do on the package level. I guess you can try 20.40 for now.
Never assume anyone would have strace installed, because it's not installed by default in Manjaro. :P
https://drive.google.com/file/d/1WruIQJxNqVgUmd74r4LFd2qis87CatDS/
@limsandy, you can try and see if there are any logs that show what's wrong.
Try to run strace clinfo 2> clinfo.txt, try the debug flags. If it's not working and you can't find a solution, you can always install the 20.40 package as we discussed in the previous comments.
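Since such a trace can run to hundreds of lines, a small filter helps to spot files that could not be found. This is only a sketch: the helper name missing_files is hypothetical, and it assumes plain strace output without the pid prefixes that the -f option adds.

```sh
# List the unique paths of open()/openat() calls that failed with ENOENT.
# $1 = file containing the strace output
missing_files() {
    grep -E '^(open|openat)\(.*= -1 ENOENT' "$1" |
        grep -oE '"[^"]+"' | tr -d '"' | sort -u
}

# Usage:
#   strace clinfo 2> clinfo.txt
#   missing_files clinfo.txt
```

Many of the reported paths are harmless probes by the dynamic loader, so the list is a starting point rather than a diagnosis.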
With opencl-amd uninstalled, I ran clinfo and it returned "Number of platforms 0".
After installing the latest version of opencl-amd, I ran clinfo again and it returned "Segmentation fault (core dumped)".
So I tried installing opencl-amd-dev and it returned the same error message. Do I need to mess with AMD GPU drivers to make it work?
I've pushed a new version for ROCm 5.0.2 because it contains a bug fix. I made some small changes to the PKGBUILD as well so please report if it's not working for you.
@limsandy You can download the 20.40 PKGBUILD, then install the package as explained in the AUR guide.
edit: oops, I had linked 20.50 by accident. fixed now
Can you advise me on how to properly downgrade to version 20.40 and if that doesn't work, I also want to try version 20.45? :P
ncurses5-compat-libs 6.3-1 has been previously installed.
I just rebuilt the files and ran clinfo again. Segmentation fault (core dumped) again.
I believe I am running into the same problem as srahman5317.
@limsandy
ncurses5-compat-libs is a dependency of opencl-amd - you need to install it first before running clinfo.
It's very unlikely that strace returns nothing, it should create a strace.txt file with hundreds of lines.
==> Validating source files with sha256sums...
ncurses-6.3.tar.gz ... Passed
ncurses-6.3.tar.gz.sig ... Skipped
==> Verifying source file signatures with gpg...
ncurses-6.3.tar.gz ... FAILED (unknown public key CC2AF4472167BE03)
==> ERROR: One or more PGP signatures could not be verified!
Failed to build ncurses5-compat-libs
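An "unknown public key" failure like the one above usually means the signing key is not in the local GnuPG keyring yet. As a hedged sketch, the hypothetical helper below pulls the key ID out of the makepkg output so it can be passed to gpg --recv-keys; whether to trust that key is a decision for the user, not the script.

```sh
# Read makepkg output on stdin and print the ID of the unknown key, if any.
unknown_key_id() {
    sed -n 's/.*unknown public key \([0-9A-Fa-f]*\).*/\1/p'
}

# Usage (assumes the build output was saved to build.log):
#   key=$(unknown_key_id < build.log)
#   gpg --recv-keys "$key"
```

After the key has been imported, rebuilding ncurses5-compat-libs should get past the signature check.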
Some warning messages popped while some dependencies were gonna build. Then somehow it was building....
I had updated my Manjaro the latest today, used kernel 5.17rc5.
After building is done, I tried running clinfo and "Segmentation fault (core dumped)". strace clinfo 2> strace.txt and errors.txt returned nothing.
@limsandy
This is the latest stable release (5.0.1 doesn't count since it has experimental binaries), I will wait for the next AMD release (which is probably March) before I can release a new package.
If you need OpenCL and ROCm is not working for you, you can try an older release that works, probably 20.40
@luciddream,
Not yet. To be honest, I am a little scared to try this on my own because so many users have reported problems (and those who managed to make it work, did not find the solution straight-forward).
I am trying to run a crypto farmer program where it can use OpenCL to help it sync up to the current blockchain height. And the Vega iGPU inside a Renoir is actually very powerful to do it.
I'm not in a rush to use OpenCL so I have a few more weeks. BTW, do you plan on releasing a newer stable version of opencl-amd anytime soon? ;) I know AMD is releasing a bunch of drivers in kernel 5.17 so I might wait until then. ;)
@limsandy I don't think there is a reason to make any assumptions about how the drivers work / build / etc. Did you try opencl-amd and it's not working for you? What are you trying to run, and do the logs show any errors? For example, try strace clinfo 2> strace.txt
You can also use the DEBUG flags, maybe they provide more information about any errors.
@srahman5317 sorry about your laptop!
Hello luciddream,
I am a new Manjaro user and I really would like to get my AMD 4650 PRO (Renoir) APU to work with some programs utilizing OpenCL. Now I've just read srahman5317's trouble getting his Renoir laptop to work with OpenCL and getting it to work might be tricky.
Now don't get me wrong, I know someone who has successfully gotten his Renoir APU to work with OpenCL, but he said the process is quite painful. And here I quote him:
"I was somehow able to get OpenCL running on my AMD 4800U iGPU under Linux... it was very tricky and even then it only seemed to work when a monitor was attached and running a GUI. It would not work headless.
AMD's proprietary Linux drivers suck. They're locally compiled kernel modules. So if there's an automatic kernel update from Ubuntu that gets downloaded and if the AMD drivers won't compile with the newer kernel, you're screwed.
What tends to happen is the kernel gets updated and the AMD drivers lag, so if you try to install a fresh Ubuntu machine you have to manually download and install an older kernel, then remove the newer one (and the newer kernel headers) to get the drivers to build."
Having this information, is there anything you could do to help us confidently get OpenCL working on Manjaro (or any Arch Linux distribution)? Thank you so much in advance, luciddream.
@luciddream don't mean to ignore you but my laptop literally broke so I can't provide more details. Thank you for your help so far.
@srahman5317 What's the rocminfo output? Maybe it gives more info about what's wrong.
@luciddream Thank you for your response. I haven't found a way to make ROCm work with Renoir APUs but I will keep looking. Here are the errors from strace when version 21.50 is installed and I run clinfo (text file, linked through Mega): https://mega.nz/file/cctUSQrb#PqodPi0VLhDDJ7hjeAx6Xlh8Bdr1jt5YZ__XUlzhtiY
I found something. For some reason librocalution.so has been compiled with missing library paths, from what I understand. Manually setting LD_LIBRARY_PATH=/opt/rocm/lib:/opt/rocm/hip/lib will help some HIP software run.
For example, PyTorch will run again when I set the path, and Julia AMDGPU will run fine after I set it. Most software won't show logs for these missing libraries. I will investigate it tomorrow since it's super late for me.
I'm posting this here too until opencl-amd-dev gains some traction.
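The LD_LIBRARY_PATH workaround above could be sketched like this for a shell profile. The helper name add_lib_dir is made up for illustration; the two directories are the ones mentioned in the comment, and the sketch assumes they exist on your system.

```shell
#!/bin/sh
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH unless it
# is already listed, so sourcing this twice does not duplicate entries.
add_lib_dir() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Paths taken from the workaround described above.
add_lib_dir /opt/rocm/hip/lib
add_lib_dir /opt/rocm/lib
```

This only affects programs started from that shell, so it is a stopgap until the library paths are fixed in the package itself.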
@srahman5317
That's unfortunate, but I would at least try to check the logs before giving up (e.g. strace clinfo 2> errors.txt). Maybe there is something we are missing. I see that some people are using Renoir APUs with ROCm somehow.
Hope this doesn't come across the wrong way - I just want to put the information out there. On a Renoir APU (4700U for me), version 21.50 gives me a segmentation fault when I run clinfo. That being said, the previous version was also giving me errors with clinfo, but not a segmentation fault. For now I am back to version 20.40, which worked with APUs. Not much to be done here I believe, but I hope AMD cleans up its drivers in the future into something that works for all its products.
That's cool @hpohl, so they actually did fix something for this version :)
Happy to report my RX 6700 XT is working now :)
@apaz opencl-amd is supposed to only include runtimes for OpenCL and HIP. To be honest, the lack of documentation on my part made me make the same mistakes this time as for 4.5.2. So a lot of packages in opencl-amd might not be necessary for just running OpenCL and HIP stuff.
opencl-amd-dev should have everything else, which by name should be ROCm LLVM, OpenCL SDK, HIP SDK, Math and Machine Learning libraries. So you need both opencl-amd and opencl-amd-dev to run everything ROCm related, and even then, I think some packages are missing from the Ubuntu release (like HIPfort, hipCUB, rocPRIM). What is also missing is peer-to-peer communication for devices, because that only works with rocm-dkms
Also, it's not clear what dependencies are necessary and for what reason, so a lot of the stuff I put in opencl-amd is just a guess. For example, clinfo will use dependencies like hsa-amd-aqlprofile but it is not necessary for it to run. Or, hip-runtime-amd says it depends on rocm-llvm, but it might only need it when you try to compile stuff, not run it. I don't have the knowledge to distinguish between the two, or it would require too much time on my part.
In the end, what I care about is not creating a huge package for those that only want to run OpenCL stuff. I think it can be smaller, but I hope it's not creating issues as it is right now (about 360MB). As it is right now, I managed to upgrade both packages in about 3-5 hours total completely manually so that's also what's important for me, it can even get faster if I automate stuff like hash copy pasting (I'm already building a rust application to do that).
p.s you will still have no luck with 5700XT I think, from what I've read we should hope for an early release in March that will include more support for Navi 1.
(RX 5700 XT) Tried updating opencl-amd-dev (which I already had). Everything works fine: clinfo, glxinfo and geekbench. Also rocminfo works and does not report any errors. Does this mean that ROCm is fully installed? In the next days I'll try pyTorch and some other tests to check ROCm, any suggestions? One question: I have both the opencl-amd and opencl-amd-dev packages, is this normal? Can I remove the first one? Thanks for all the work you do!
Unfortunately I had to push 3 updates, all because I missed something every time. Hopefully not many people updated in between :)
opencl-amd and opencl-amd-dev should now be working with ROCm 5.0. It seems AUR is still bugged and I can't pin comments, so I will make a new pinned comment when it's fixed.
I've also added ncurses5-compat-libs back to the dependencies. Any feedback welcome! (especially for opencl-amd-dev, which now includes ML libraries as well and is over 10GB!!)
@apaz I reported it on the AUR bug tracker and it is now fixed and will be deployed tomorrow with the 6.0.11 AUR update.
@luciddream I also lost the mail notification from AUR.
R9 390X: OpenCL was last working on 21.20, opencl-amd=21.20 (e092ee7eb5e8)
To install this version, just clone the repository and reset it to that commit:
git clone https://aur.archlinux.org/opencl-amd.git
cd opencl-amd
git reset --hard e092ee7eb5e8cf496256c70177bd6c07f2a059c0
makepkg -si
You can try other versions too; just run git log and copy the hash.
@Koppajin I don't think it will change anything, if your GPU worked with previous version it will probably work with this one.
By the way, I have the next version of opencl-amd and opencl-amd-dev almost ready. But it's too late for me and I need to sleep. I will make some extra tests and deploy tomorrow evening.
Geekbench seems to work fine although a bit slower than previous versions.
If this updates, is it going to stop working with my Polaris card? :x
@trougnouf I think there is something wrong with the email notification of AUR.. I never saw your message. I will take a look while updating to new ROCm.
AMD 21.50 drivers / ROCM 5.0 has been released. I will try to make a release as soon as possible (probably on weekend).
I am missing miopen and likely its dependencies to make python-pytorch-rocm. I updated your PKGBUILD by removing the link that owns /opt/rocm/ and by adding some provides elements, so that it plays nice with individual components. I still can't get rocblas (one of the dependencies) to build (something about _Z15gemvt_sn_kernelILb0ELi256ELi4Ei19rocblas_complex_numIdEPKS1_S3_EviiT4_lPKT5_lT2_lS7_lilPT3_) but I think I'm making progress:
# Maintainer: Carson Rueter <roachh at proton mail dot com>
# Co-Maintainer: George Sofianos
# Release notes https://rocmdocs.amd.com/en/latest/Current_Release_Notes/Current-Release-Notes.html
major='21.40.2'
minor='1350682'
rocm_major='40502'
rocm_minor='164'
amdgpu_repo='https://repo.radeon.com/amdgpu/21.40.2/ubuntu'
rocm_repo='https://repo.radeon.com/rocm/apt/4.5.2'
opencl_lib='opt/rocm-4.5.2/opencl/lib'
rocm_lib='opt/rocm-4.5.2/lib'
hip_lib='opt/rocm-4.5.2/hip/lib/'
amdgpu="opt/amdgpu/lib/x86_64-linux-gnu"
amdgpu_pro="opt/amdgpu-pro/lib/x86_64-linux-gnu/"
pkgname=opencl-amd
pkgdesc="OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
pkgver=${major}.${minor}
pkgrel=3
arch=('x86_64')
url='http://www.amd.com'
license=('custom:AMD')
makedepends=('wget')
depends=('libdrm' 'ocl-icd' 'gcc-libs' 'numactl')
conflicts=('rocm-opencl-runtime')
provides=('opencl-driver' 'rocm' 'rocm-opencl-runtime' 'rocm-cmake' 'hip' 'rocm-llvm')
optdepends=('clinfo' 'opencl-amd-dev' 'opencl-amd-ncurses5' 'ncurses5-compat-libs')
source=(
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-core/rocm-core_4.5.2.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/c/comgr/comgr_2.1.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hip-dev/hip-dev_4.4.21432.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hip-doc/hip-doc_4.4.21432.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hsakmt-roct-dev/hsakmt-roct-dev_20210902.12.3277.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hsa-rocr/hsa-rocr_1.4.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hsa-rocr-dev/hsa-rocr-dev_1.4.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocminfo/rocminfo_1.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hip-runtime-amd/hip-runtime-amd_4.4.21432.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hip-samples/hip-samples_4.4.21432.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/h/hsa-amd-aqlprofile/hsa-amd-aqlprofile_1.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/amdgpu/21.40.2/ubuntu/pool/main/libd/libdrm-amdgpu/libdrm-amdgpu-amdgpu1_2.4.107.40502-1350682_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-device-libs/rocm-device-libs_1.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/o/openmp-extras/openmp-extras_13.45.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-opencl/rocm-opencl_2.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-opencl-dev/rocm-opencl-dev_2.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-clang-ocl/rocm-clang-ocl_0.5.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-cmake/rocm-cmake_0.6.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-dbgapi/rocm-dbgapi_0.56.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-debug-agent/rocm-debug-agent_2.0.1.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-gdb/rocm-gdb_11.1.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-smi-lib/rocm-smi-lib_4.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-utils/rocm-utils_4.5.2.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocprofiler-dev/rocprofiler-dev_1.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/roctracer-dev/roctracer-dev_1.0.0.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-dev/rocm-dev_4.5.2.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-language-runtime/rocm-language-runtime_4.5.2.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-hip-runtime/rocm-hip-runtime_4.5.2.40502-164_amd64.deb"
"https://repo.radeon.com/rocm/apt/4.5.2/pool/main/r/rocm-opencl-runtime/rocm-opencl-runtime_4.5.2.40502-164_amd64.deb"
"https://repo.radeon.com/amdgpu/21.40.2/ubuntu/pool/proprietary/o/opencl-legacy-amdgpu-pro/opencl-legacy-amdgpu-pro-icd_21.40.2-1350682_amd64.deb"
#"https://repo.radeon.com/amdgpu/21.40.2/ubuntu/pool/proprietary/c/clinfo-amdgpu-pro/clinfo-amdgpu-pro_21.40.2-1350682_amd64.deb"
#"https://repo.radeon.com/amdgpu/21.40.2/ubuntu/pool/proprietary/o/ocl-icd-amdgpu-pro/ocl-icd-libopencl1-amdgpu-pro_21.40.2-1350682_amd64.deb"
#"https://repo.radeon.com/amdgpu/21.40.2/ubuntu/pool/proprietary/a/amf-amdgpu-pro/amf-amdgpu-pro_1.4.23-1350682_amd64.deb"
)
sha256sums=(
"37ebac02ab6d27f93f7770a152e8258e34250788ba50df9fac7b954ef51ae4c1"
'78fcfbfd1ece7e71a5c23f2fd6b48c117468fa86fe1b7ee5425fd1de788e9dd3'
"bc0c26fed977a41ca55d5a69fe71aa709206b66e9d68cece4df9b8c17b3fd60c"
"a1efd85f504e5198ea65bd45563da4e50f35ec0d4bfb72f70722122ccec91422"
'c24b1816144d227c3d6252e925a73c39645fb64fe1f08100608bfcd5ba86dc8d'
'3a9c92bb3a286fbabdaf6104f1ca3bd01ebc818b7efbbf2c7bc2fda3dff0ef30'
"5204cdba0a50367d319724f9d32c969aa40d9dcaa33d9a083431533054617afc"
"1c329ae34fdcdc887b1fc1bf6bcf87065deca86288339d4340508c9e7b6d0ed5"
'e288edcb472b46453ddbcf3615be5c553f8aea2a81bfc79f2517e16cb86f2226'
"03525483301535c9640b4f8a31597744a7402ed740d1b5055193159e8544bfbd"
"a709023fb4f73756340ef27d74185dfb72ec640ad4b2b69b8f39b09fe1fb5db6"
'7fa1eb5ee2b8f9b50fd9aac29330c6f741d313d1d95d3218e80b5236e24bc508'
"90d5baf98306ec9d860e653129a16eb3d72ceff0e46a4f23f9b384175ca9af9d"
"9f32534fdc586c82c596420024c0d435967a004974f60a24beae4e69a51d91e7"
'82b86940d8a93ba91430c48dbfa057ec28db94fcbbc12d1fe58bf11950163d4d'
"9422c149707696a205cbf8ee5a8a2ef97af849cf0c42b36e9427b441b6cbe204"
"ea099f4101de2b3323ba102a39c40355e3ccb8455371569efdd04db91bdadddb"
"b7dd2ff40f67057887c7687cf909f773dc831c803bafa039f4d1fb4c3d86dd10"
"fb3e914326ca2b7e6720ecde90e9e3c46df4d2980ae475b1d4aea2058f50747e"
"5f5da4feedc7b5fa46b9f36f6e681f65729efb705322278a6f1f95f68cf8efe7"
"525b2d8149f97041357016215547502e76bc6e7c253d0d3570058f2fb8761361"
"4e3bc61560aec96d17fcda0ba556f18cf9fae8fdcf6d634a0db66de29df63fe1"
"a6d39213ff9dd99561d5077a3953ee3c1813d209bdf9b1a811cf2612e8894914"
"2075bca8d8a0bddf0df3b8693fe6fe541120864088249dfe43acad353f9d1f2e"
"a7d5638e3d8ba370700594443455990244535d47db21204c427fdfa3edb66470"
"663a256ad0a842469c49db224cfd41393c80b3063960385195464544ac33504e"
"e790e05be47c0e61ce335e731a80adf6658a7f5261349c0fde80b75ac8841dcd"
"1dea44e8ac402fcf81e9465e87c1bc842ad3def1641added1ba7f3be97e77ffc"
"24e81b9080a5c39da422d345a8df3fd2df7bbeb819be61d967c93113b307ede1"
'e3df0cc14bcc7ab529694cd0d61258b9fb6cad93319d7a12dbd8b6d87fd94c02'
#'505a1a8da73869dd1f04a18fce3fe3297b43eb22ca5d71148a3e502d28d7f8e9'
#'f58d4d0a43e02c5a0043845b7ad2659f728d61541ec45aae8fb0989150480d9e'
#'0a9e65112e5dbf4a6b7b81d7078381a931657bb803f66ce005ebd4946e423f7f'
)
# Extract a .deb whose data archive is xz-compressed
exz() {
ar x "$1"
tar xJf data.tar.xz
}
# Extract a .deb whose data archive is gzip-compressed
egz() {
ar x "$1"
tar xzf data.tar.gz
}
package() {
egz "${srcdir}/rocm-core_4.5.2.40502-164_amd64.deb"
egz "${srcdir}/comgr_2.1.0.40502-164_amd64.deb"
egz "${srcdir}/hip-dev_4.4.21432.40502-164_amd64.deb"
egz "${srcdir}/hip-doc_4.4.21432.40502-164_amd64.deb"
egz "${srcdir}/hsakmt-roct-dev_20210902.12.3277.40502-164_amd64.deb"
egz "${srcdir}/hsa-rocr_1.4.0.40502-164_amd64.deb"
egz "${srcdir}/hsa-rocr-dev_1.4.0.40502-164_amd64.deb"
egz "${srcdir}/rocminfo_1.0.0.40502-164_amd64.deb"
egz "${srcdir}/hip-runtime-amd_4.4.21432.40502-164_amd64.deb"
egz "${srcdir}/hip-samples_4.4.21432.40502-164_amd64.deb"
egz "${srcdir}/hsa-amd-aqlprofile_1.0.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-device-libs_1.0.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-opencl_2.0.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-opencl-dev_2.0.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-clang-ocl_0.5.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-cmake_0.6.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-dbgapi_0.56.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-debug-agent_2.0.1.40502-164_amd64.deb"
egz "${srcdir}/rocm-smi-lib_4.0.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-utils_4.5.2.40502-164_amd64.deb"
egz "${srcdir}/rocprofiler-dev_1.0.0.40502-164_amd64.deb"
egz "${srcdir}/roctracer-dev_1.0.0.40502-164_amd64.deb"
egz "${srcdir}/rocm-dev_4.5.2.40502-164_amd64.deb"
egz "${srcdir}/rocm-language-runtime_4.5.2.40502-164_amd64.deb"
egz "${srcdir}/rocm-hip-runtime_4.5.2.40502-164_amd64.deb"
egz "${srcdir}/rocm-opencl-runtime_4.5.2.40502-164_amd64.deb"
exz "${srcdir}/libdrm-amdgpu-amdgpu1_2.4.107.40502-1350682_amd64.deb"
exz "${srcdir}/openmp-extras_13.45.0.40502-164_amd64.deb"
exz "${srcdir}/rocm-gdb_11.1.40502-164_amd64.deb"
exz "${srcdir}/opencl-legacy-amdgpu-pro-icd_21.40.2-1350682_amd64.deb"
cd ${srcdir}/${amdgpu_pro}
sed -i "s|libdrm_amdgpu|libdrm_amdgpo|g" libamdocl-orca64.so
cd ${srcdir}/${amdgpu}
rm "libdrm_amdgpu.so.1"
mv "libdrm_amdgpu.so.1.0.0" "libdrm_amdgpo.so.1.0.0"
ln -s "libdrm_amdgpo.so.1.0.0" "libdrm_amdgpo.so.1"
# legacy
mkdir -p ${pkgdir}/usr/lib
mv "${srcdir}/${amdgpu_pro}/libamdocl-orca64.so" "${pkgdir}/usr/lib/"
mv "${srcdir}/${amdgpu}/libdrm_amdgpo.so.1.0.0" "${pkgdir}/usr/lib/"
mv "${srcdir}/${amdgpu}/libdrm_amdgpo.so.1" "${pkgdir}/usr/lib/"
mv "${srcdir}/opt/" "${pkgdir}/"
ln -s "/opt/rocm-4.5.2/hip/bin/.hipVersion" "$pkgdir/opt/rocm-4.5.2/bin/.hipVersion"
mkdir -p "${pkgdir}/opt/amdgpu/share/libdrm"
cd "${pkgdir}/opt/amdgpu/share/libdrm"
ln -s /usr/share/libdrm/amdgpu.ids amdgpu.ids
mkdir -p ${pkgdir}/etc/OpenCL/vendors
echo libamdocl64.so > "${pkgdir}/etc/OpenCL/vendors/amdocl64.icd"
echo libamdocl-orca64.so > "${pkgdir}/etc/OpenCL/vendors/amdocl-orca64.icd"
mkdir -p ${pkgdir}/etc/ld.so.conf.d
echo /opt/rocm-4.5.2/opencl/lib > "$pkgdir/etc/ld.so.conf.d/opencl-amd.conf"
echo /opt/rocm-4.5.2/lib >> "$pkgdir/etc/ld.so.conf.d/opencl-amd.conf"
mkdir "$pkgdir/opt/rocm"
cd "$pkgdir/opt/rocm-4.5.2"
for fn in *; do ln -s "/opt/rocm-4.5.2/${fn}" "$pkgdir/opt/rocm/"; done
cd "$pkgdir/opt/rocm-4.5.2/hip/bin" # link to hipcc necessary for rocblas
for fn in *; do ln -s "/opt/rocm-4.5.2/hip/bin/${fn}" "$pkgdir/opt/rocm-4.5.2/bin/"; done
}
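As a side note on the exz/egz pair in the PKGBUILD above: each .deb has to be matched by hand to the right function. A minimal sketch of a single helper that looks at which data archive the .deb actually contains is below. The names data_tar_flag and extract_deb are hypothetical and not part of the posted PKGBUILD, and only the two compression formats used above are handled.

```shell
#!/bin/sh
# Hypothetical helper: map a data archive name to the tar option that
# selects its compression. Only xz and gzip are covered.
data_tar_flag() {
    case "$1" in
        *.tar.xz) printf '%s\n' "-J" ;;
        *.tar.gz) printf '%s\n' "-z" ;;
        *) echo "unsupported archive: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: unpack one .deb into the current directory,
# whichever of data.tar.xz or data.tar.gz it contains. The archive is
# removed afterwards so a stale copy cannot be picked up by the next call.
extract_deb() {
    ar x "$1" || return 1
    for data in data.tar.xz data.tar.gz; do
        if [ -e "$data" ]; then
            flag=$(data_tar_flag "$data") || return 1
            tar "$flag" -xf "$data" && rm -f "$data"
            return
        fi
    done
    echo "no data archive found in $1" >&2
    return 1
}
```

With something like this, package() could call extract_deb on every .deb in a loop instead of listing each file under egz or exz. I have not tested it against the real packages, so treat it as an idea only.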
I don't have an opinion on that, I'm just copying files here :) If AMD doesn't know then we can't know either. You can always try on next releases, the package is relatively small.
@luciddream Makes sense. Thank you for all your help. I will stay on 20.40 for now. Would you recommend that I continue to do that in the future, or is there going to be a package that has just the amdgpu-pro OpenCL userspace instead of (amdgpu-pro + ROCm)?
No remnants, I just saw your post in the forums :)
I just looked around and saw that people have problems with ROCM and APUs, and they got it fixed by reverting to 20.40.
@luciddream I used to have opencl-mesa installed but not since I've made any posts here. Do you see any remnants of opencl-mesa in the strace? It should be uninstalled. I uninstalled both opencl's and then installed just opencl-amd ...
OK. So I downgraded to version 20.40. Everything functions perfectly. So ROCm causes the issue since those elements were introduced in version 21.
@srahman5317
I think the problem is that your GPU can't work with ROCm. You can try an older version, 20.40, of opencl-amd if you just need OpenCL.
Also, I see you had installed both opencl-mesa and opencl-amd. Have you removed opencl-mesa at the moment? Maybe it creates issues.
If it works for you I will make a new pinned comment that explains that.
@luciddream Thank you for looking into the case. I looked through the output of journalctl -b0 -k and the error seems to be:
Jan 14 12:14:10 hp-machine kernel: amdgpu: qcm fence wait loop timeout expired
Jan 14 12:14:10 hp-machine kernel: amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
Jan 14 12:14:10 hp-machine kernel: amdgpu: Failed to evict process queues
Jan 14 12:14:10 hp-machine kernel: amdgpu: Failed to quiesce KFD
This (as far as I can tell) causes the GPU to reset and the DE crashes. I looked into the error and it seems like it's an old one with ROCm. There didn't seem to be any consistent solution though. Pointers as to where to report this would be welcome.
Nah, I think it won't help with your case. I'm not sure that clinfo crashes, to be honest. Maybe check journalctl -b0 -k to find out what's crashing.
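The kernel log for a whole boot can be long, so here is a minimal sketch for narrowing the journalctl -b0 -k output down to amdgpu failures. The function name amdgpu_errors is made up, and the keywords are only a guess based on the kind of messages people have posted in this thread; other failures may be worded differently.

```shell
#!/bin/sh
# Hypothetical helper: keep only amdgpu kernel messages that look like
# failures. Reads `journalctl -b0 -k` output on stdin.
amdgpu_errors() {
    grep -i 'amdgpu' | grep -iE 'fail|timeout|unrecoverable'
}
```

Usage would be something like journalctl -b0 -k | amdgpu_errors, run right after reproducing the crash.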
@luciddream thank you so much for your earlier response (Ryzen 4700U - OpenCL tasks and benchmarks seem to be working but clinfo causes a crash). I extracted clinfo from the deb package and tried that. Ran into the same issue. I'm linking the strace output here (don't know how else to attach it - it's a shared file through Mega): https://mega.nz/file/8BNVnIZI#LRYfD_zpOnWMcNDZzZrHdrFuRjK4GRSw206iK3bBBfk
Also just FYI, when I run clinfo, the resulting crash seems to be graphical. I can usually switch to a different tty and upon restart, the compositor doesn't function correctly. I need to re-enable OpenGL before it works again. I hope this helps.
@t3k
Because I created a similar package that is not outdated and doesn't require the GPG validation from users, so it makes everybody's life easier, but it got deleted by a trusted user as a duplicate. (You can see it in red in the dependency list.) If there is not a better solution (like maybe adding libtinfo5 as a dependency and makepkg figuring it out automatically) I could add it in the future as a dependency, but at the moment I'm still salty about it.
@luciddream, thank you for your help, it works now !
But I don't understand, if this package is required for opencl to work, why isn't it included as a dependency ? This is error prone, a lot of people must make the same error as me !
@t3k do you have the libtinfo.so.5 library installed? It is required by this package. It is also provided by the ncurses5-compat-libs package.
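One way to check for the library mentioned above is to look it up in the dynamic linker cache. A minimal sketch, assuming the usual ldconfig -p output where each line starts with the soname; the function name has_soname is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given soname is listed in the
# `ldconfig -p` output supplied on stdin. The soname is the first field
# of each cache entry line.
has_soname() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

Usage would be something like: ldconfig -p | has_soname libtinfo.so.5 && echo "found" || echo "missing". If it reports missing, installing ncurses5-compat-libs should provide it.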
Hi @luciddream, thanks for your work ! I'm trying to make my 2 Vega cards (56 and 64) using opencl with your package, but /opt/rocm-4.5.2/opencl/clinfo does not list them (number of devices 0 despite platform being detected)
Still, /opt/rocm-4.5.2/bin/rocminfo detects correctly the 2 Vegas
If I install rocm-opencl-sdk, clinfo detects the 2 Vegas and compilation goes through, but I get the C++ stl_vector.h problem (https://giters.com/ethereum-mining/ethminer/issues/2391) every time the miner tries to access the card with clinfo
I've seen on this link (https://community.amd.com/t5/drivers-software/opencl-pal-legacy-platforms-under-ubuntu/td-p/43645) that it should be installed with the option --opencl=pal for opencl to work on Vega, was it the case for this package ?
Thanks for any help you can provide !
@srahman5317
Can you try to download https://repo.radeon.com/amdgpu/21.40.2/ubuntu/pool/proprietary/c/clinfo-amdgpu-pro/clinfo-amdgpu-pro_21.40.2-1350682_amd64.deb, extract the package, and see if that clinfo works for you?
You can also try strace /usr/bin/clinfo 2> strace.txt and attach the output; maybe we can find out what is wrong.
Don't know if this is super relevant, but the clinfo command crashes on Ryzen 4000. I get the same behavior with the direct ROCm drivers. However, tasks / benchmarks that make use of OpenCL seem to be working. They don't work without this package, so I appreciate it.
Thank you @luciddream :) I tried building python-pytorch-rocm but it has (miopen rocm rocm-libs rccl) listed as incompatible dependencies. I tried building after removing them in the PKGBUILD and ended up with an error. I don't know if it's related to a missing component from these dependencies, the error is not verbose. Below is the last line before build failure:
[4983/5943] /usr/bin/c++ -DADD_BREAKPAD_SIGNAL_HANDLER -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DFXDIV_USE_INLINE_ASSEMBLY=0 -DGFLAGS_IS_A_DLL=0 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNNP_CONVOLUTION_ONLY=0 -DNNP_INFERENCE_ONLY=0 -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_C10D_GLOO -DUSE_C10D_MPI -DUSE_DISTRIBUTED -DUSE_EXTERNAL_MZCRC -DUSE_RPC -DUSE_TENSORPIPE -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/aten/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/aten/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/cmake/../third_party/benchmark/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/caffe2/contrib/aten -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/onnx -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/third_party/onnx -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/foxi -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/third_party/foxi -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/torch/csrc/api -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/torch/csrc/api/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/caffe2/aten/src/TH -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/caffe2/aten/src/TH -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/caffe2/aten/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/caffe2/../third_party -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/caffe2/../third_party/breakpad/src 
-I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/caffe2/../aten/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/caffe2/../aten/src/ATen -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/torch/csrc -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/miniz-2.0.8 -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/kineto/libkineto/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/kineto/libkineto/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/torch/csrc/distributed -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/aten/src/TH -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/aten/../third_party/catch/single_include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/aten/src/ATen/.. -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/caffe2/aten/src/ATen -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/caffe2/core/nomnigraph/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/FXdiv/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/c10/.. 
-I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/third_party/ideep/mkl-dnn/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/ideep/mkl-dnn/src/../include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/pthreadpool/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/cpuinfo/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/QNNPACK/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/aten/src/ATen/native/quantized/cpu/qnnpack/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/aten/src/ATen/native/quantized/cpu/qnnpack/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/cpuinfo/deps/clog/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/NNPACK/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/fbgemm/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/fbgemm -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/fbgemm/third_party/asmjit/src -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/FP16/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/tensorpipe -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/third_party/tensorpipe -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/tensorpipe/third_party/libnop/include -I/orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/fmt/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/third_party/gloo -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/cmake/../third_party/gloo -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/cmake/../third_party/googletest/googlemock/include 
-isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/cmake/../third_party/googletest/googletest/include -isystem /opt/intel/mkl/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/gemmlowp -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/neon2sse -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/XNNPACK/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party -isystem /usr/include/opencv4 -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/cmake/../third_party/eigen -isystem /usr/include/python3.10 -isystem /usr/lib/python3.10/site-packages/numpy/core/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/cmake/../third_party/pybind11/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/ideep/mkl-dnn/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/third_party/ideep/include -isystem /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/build/include -march=native -O3 -pipe -fstack-protector-strong -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno 
-fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION -O3 -DNDEBUG -DNDEBUG -fPIC -march=haswell -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wno-write-strings -Wno-unknown-pragmas -Wno-missing-braces -Wno-maybe-uninitialized -fvisibility=hidden -O2 -fopenmp -DCAFFE2_BUILD_MAIN_LIB -pthread -DASMJIT_STATIC -std=gnu++14 -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_2.cpp.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_2.cpp.o.d -o caffe2/CMakeFiles/torch_cpu.dir/__/torch/csrc/autograd/generated/TraceType_2.cpp.o -c /orb/tmp/Downloads/python-pytorch-rocm/src/pytorch-1.10.0-rocm/torch/csrc/autograd/generated/TraceType_2.cpp
ninja: build stopped: subcommand failed.
@trougnouf With opencl-amd and opencl-amd-dev you should be covered (in theory) to do that.
You can find rocm-smi and more tools installed at /opt/rocm/bin - and some in /opt/rocm/hip/bin. I've not included them in PATH yet, but they will be included in the next release. You should be able to run pytorch-rocm if your GPU supports it. If it does work for you, please comment here so we know this package is actually useful :)
@luciddream I have no idea. I think they've been lingering in my system since the early days of ROCm-based OpenCL.
rocm-smi was useful to check the load and eventually I would like to use pytorch-rocm, but I don't know whether these packages have any relevance.
@trougnouf opencl-amd is in conflict with these packages, although this is not defined in the PKGBUILD yet.
Why are you trying to install opencl-amd if you already have these installed? Maybe there is a use case we need to consider that is not covered by the current package.
edit: nvm, I didn't see your edit :D
I currently get the following error:
error: failed to commit transaction (conflicting files)
opencl-amd: /opt/rocm exists in filesystem
It looks like this package makes /opt/rocm/ a link to the specific version, but other packages are using /opt/rocm/:
[trougnouf@d]: ~>$ pacman -Qo /opt/rocm
/opt/rocm/ is owned by hsakmt-roct 4.5.2-1
/opt/rocm/ is owned by miopengemm 4.5.2-1
/opt/rocm/ is owned by rocm-cmake 4.5.2-1
/opt/rocm/ is owned by rocm-smi-lib64 4.5.2-1
edit: I removed these packages, I guess they are not necessary.
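For anyone hitting the same conflict, a small sketch that turns the pacman -Qo output above into a plain list of package names, so it is easier to see what has to be removed first. The function name owners_from_qo is hypothetical; the line format is the one shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `pacman -Qo <path>` output on stdin and print
# the name of each owning package, one per line.
owners_from_qo() {
    # Lines look like: "/opt/rocm/ is owned by hsakmt-roct 4.5.2-1"
    sed -n 's/^.* is owned by \([^ ]*\) [^ ]*$/\1/p'
}

# Typical use on an Arch system (commented out so the sketch runs anywhere):
#   LC_ALL=C pacman -Qo /opt/rocm | owners_from_qo
```

LC_ALL=C is used in the example because the pattern assumes the English wording of the message.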
It seems that the AUR people decided to delete my opencl-amd-ncurses5 package. I don't have any plans to add ncurses5-compat-libs as a dependency again because I have no plans of using it, so I guess it will stay as optional (as long as I'm co-maintaining the package).
You are welcome @apaz. I know that this package is basically just copying files from one place to another but it needs a lot more effort than it (probably) looks. I hope at least it's useful for people.
For PyTorch, what I did was install python39, then download PyTorch with ROCm backend support from: https://pytorch.org/get-started/locally/ - The command to do that is something like: python3.9 -m pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/rocm4.2/torch_nightly.html
Then you can try to run the python sample
But since we have the same GPU I assume you will get the same error that our GPU is not supported :)
I also tried to download the rocm tensorflow docker image, but it's 7GB and I got bored at about 3GB and cancelled it. Maybe someone else with more patience can give it a try.
I tried to install opencl-amd-dev and everything is OK: "Clinfo" reports no errors; "geekbench --compute" gives the values it has always given; "DaVinci Resolve" works.
I have no programming knowledge; I tried to install PyTorch and then run the script found in: https://en.wikipedia.org/wiki/PyTorch It seems to work and I have no error messages (maybe it doesn't involve the GPU; can you provide me with a script to test it?) I use Arch KDE with AMD RX 5700XT (which is not officially supported by ROCm). Thanks for the great work you are doing.
@luciddream Thank you!
@DianaNites I just pushed another release that makes both opencl-amd-ncurses5 and ncurses5-compat-libs optional so you can choose what you want to install. Ideally we should mark that opencl-amd depends on libtinfo but I don't know if it's possible. Someone more experienced with AUR can verify that.
The change adding opencl-amd-ncurses5 makes it impossible to install this with other packages that conflictingly depend on ncurses5-compat-libs, such as rpcs3-bin in my case
What am I supposed to do?
I tried to use pytorch and I realized that /opt/rocm-4.5.2/lib is also missing from the library paths. I will probably add it in a future release.
After I added that, I got the error that my GPU is not supported, which is true :p So maybe someone else has better luck with testing that.
Hi all, current release is for driver version 22.10.2 and ROCM 5.1.2. opencl-amd package includes only OpenCL / HIP runtime.
You also need to use opencl-amd-dev package for ROCm LLVM compiler, OpenCL and HIP SDK.
Please relog / reboot after installing so your PATH gets updated
I've made some very good progress with the full package. The total download size is currently 770MB compared to 152MB for opencl-amd - because of the llvm package mostly. The installation size is much bigger (about 3.2GB).
I realized that we can't use the same pkgbase - for this exact reason - but it's not an issue because many other packages use different pkgname / pkgbase for their multiple dependencies. I will try to finalize it over the weekend, depending on the time available.
I'm still not sure how to test it though, if anyone has an easy example I can try to verify that ROCM works fully please make a comment :)
edit: never mind, there are tons of examples inside ROCM, I've tested some HIP examples and they work fine.
@luciddream I've never really worked with several packages in one pkgbase, but that's essentially how it works. You can check out pkgbases like amdgpu-pro-installer for examples.
Your idea of keeping bare OCL/HIP stuff in this package, and the rest in the others, sounds great! Because I've been so busy with exams and all that junk, I haven't been able to keep up with all the talk about new packages, but sometime within the next few days I can check all the emails out and provide some help and advice :)
32bit should also be easy to add. From what I remember all the .deb names are the exact same but just with i386 (or i686 or something), instead of amd64. So that would be great to go ahead with as well!
p.s Because there will likely be a lot of duplicate code (just look at all the duplicated ar/tar commands), it'd be beneficial to create a bunch of functions for any code that's constantly repeated like extracting the debs. Contact me while you're making it, and once you've finished it, and I'll suggest/add some functions for all that!
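To make the idea of a shared function concrete, here is a rough sketch of what an extraction helper could look like. The name extract_deb and its two-argument interface are assumptions for illustration; it only wraps the same ar and tar calls the PKGBUILD already repeats, and it assumes the payload is named data.tar.xz as in the current debs.

```shell
#!/bin/sh
# Hypothetical helper: unpack the data archive of a .deb into a directory.
# Usage: extract_deb </absolute/path/to/file.deb> <destination-dir>
# The .deb path should be absolute because the function changes directory.
extract_deb() {
    deb=$1
    dest=$2
    mkdir -p "$dest" || return 1
    (
        cd "$dest" || exit 1
        # A .deb is an ar archive; pull out only the payload member.
        ar x "$deb" data.tar.xz || exit 1
        tar xJf data.tar.xz || exit 1
        rm -f data.tar.xz
    )
}
```

Inside package() this would be called with something like "${srcdir}/$tarname/<name>.deb" and "${srcdir}/opencl", replacing each mkdir / cd / ar / tar group.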
@sperg512 good catch, I was too tired to think about that when I made the release. We can add it on the next version, although I was trying to understand why it doesn't find it from the original directory and fix the issue at its source.
By the way, about the pkgbase, I see that we can use the same pkgbase (opencl-amd) and add more packages like nvidia-utils or clion is doing. So I can prepare opencl-amd-full and opencl-amd-libtinfo as soon as possible (hopefully in the next couple of days) and add them to our current pkgbase if I'm not mistaken. Since you are more experienced with AUR than I am, please verify if that's how it works :)
Then when we see that opencl-amd-full is working fine, we can start removing extra things but keep the parts that are necessary for OpenCL / HIP functionality and add them to the original opencl-amd package, so it's also functional but also lightweight. That's my thought process for the next step, but I'm open to suggestions :)
p.s someone suggested adding a 32bit version as well, I will go back at the emails and try to add it too, if it's not a lot of crazy hard work.
@apaz This is just for Radeon Instinct MI25 not for the entire line of Vega 10 based chips.
That looks to have been removed in the refactor. In the old PKGBUILDs:
mkdir -p "${pkgdir}/opt/amdgpu/share/libdrm"
cd "${pkgdir}/opt/amdgpu/share/libdrm"
ln -s /usr/share/libdrm/amdgpu.ids amdgpu.ids
I'm not 100% sure if this is needed, but I think it helped fix something. Might be beneficial to add back?
Hello, since the 21.40.2 update I see several instances of error messages when using the CL driver with polaris (RX480) and vega (Vega64)
/opt/amdgpu/share/libdrm/amdgpu.ids: No such file or directory
Printed several times.
It does not seem to affect the driver itself but it was not the case in previous version.
@redshoe Officially, support for Vega 10 has been deprecated: https://rocmdocs.amd.com/en/latest/Current_Release_Notes/Current-Release-Notes.html#amd-instinct-mi25-end-of-life More details: https://github.com/RadeonOpenCompute/ROCm#Hardware-and-Software-Support Navi 10-20 are not even mentioned... Anyway these are the official communications; there's always room to intervene, especially for OpenCL. I'm really hoping to make Blender work with HIP, since it doesn't support OpenCL anymore.
@apaz So, they are dropping Vega10 GPUs (Vega 64 & 56, WX9100 & WX8200) for OpenCL? This is a bad news....
Hi, it seems that amdgpu-pro is now integrated into ROCm, becoming its graphics stack. The installer is the same, namely "amdgpu-install". For now the situation is problematic because Polaris (gfx7) is not officially supported; support for Vega10 (gfx8) will come to an end and support for Navi (gfx10) is not yet in the plans. Official support is only for Vega 20 (gfx9) and CDNA. Also officially they say that gfx7 and gfx10 partially work; but I think they mean only OpenCL and not HIP or anything else. I found these instructions to make amdgpu-install work also in unsupported distros; however with my 5700XT it doesn't work (but I don't have great skills in this field...): https://gist.github.com/FCLC/8c1f4d28d65a2e6d40b82f82c8fe4e08
@luciddream Thanks for the update, good luck with the HIP integration, from what I read its messy.
@L_S This is just a quick update for OpenCL because there was a new release since early December. Nothing serious should have changed from the previous one. I've started working on another version but it will take some time.
@luciddream Thanks for the update! As this package still only provides opencl-driver I assume the hip-runtime-amd component is not integrated yet? I had a go with the Blender version before they disabled HIP on Linux but it does not detect a HIP capable GPU (if you compile hip-runtime-amd from source it detects a HIP GPU but then fails when rendering).
I flagged the package as out of date because I've noticed there is another release for half a month already :) I will stop looking at the amd.com website and only look at the rocm project / instructions from now on. Will see what the update is about and if it's working I will make a release as soon as possible (maybe tomorrow).
@redshoe it's required by both (since opencl is part of rocm now). If it's not installed you can't use clinfo or any other opencl software.
Currently this is the most secure way to provide this package but I don't like it either. What we can do is create our own package to provide libtinfo library without the need for the PGP key but that may make some people angry. Everything is a compromise in the software world I guess :D
p.s I have some free time today so I will start experimenting with some changes I have in mind.
@luciddream Is ncurses5-compat-libs required for rocm or for opencl? Do you think it would be okay to not install ncurses5-compat-libs if it is not required by opencl?
I have got Radeon Pro W6600 now (RDNA2), so I can make some tests if needed.
Strange, but BMD seems to blacklist that gpu in DR, while RX580 is working fine.
@redshoe Right--I meant Blender. Looks like they're still working on support pre-RDNA cards, so I assume they'll support Vega.
@sperg512 As far as I know Vega chips are supported by ROCM, so it would also support HIP.
I'm not sure if my GCN (vega 10) GPU will be supported by HIP, so I'm not sure how much help I will be; but I've seen something about HIP working on CPU as well so I could be an extra test for that.
Because Blender devs said they will support older than RDNA, if anyone with an older GPU (e.g Polaris or Vega), speak up please!
Sure, we need someone with RDNA2 GPU to help with testing. I see that they also plan to support RDNA so maybe it will be easier for me to test as well.
@luciddream @sperg512 Please note Blender devs currently have HIP disabled in 3.1 as they wait for AMD to release an update so HIP works, expected some time in the next months. So that we are ready for that HIP update, I think it's still worthwhile getting the current HIP working. I 100% agree this isn't straightforward.
Thanks for any work you can achieve. I'm no good with AUR development but have a RX6800 if you need something tested.
I have my last midterm tomorrow, so after that if needed I can help with adding HIP.
@L_S Hi, the plan is to identify which source packages are needed and create the necessary target packages. I would like to do that on a fresh pkgbase (but keep the opencl-amd name) in order to keep them better organized.
This process is not so straightforward so, we can start by creating the necessary PKGBUILD files and when we find what is the best possible way to migrate opencl-amd to a new pkgbase, we can then commit the files.
I noticed that Blender 3.1.0 Alpha has been released today.. so it might be a good time to start working on it. It's easier for me to spend time on weekends for this package so I will try to have something ready as soon as possible. One issue is that I own a 5700XT so I might not be able to test some stuff.
@luciddream Blender 2.9 works great with this package! However openCL has been dropped and the upcoming Blender 3.1 is planning to support HIP on Linux.
Your comments below suggest rolling support for HIP into this AUR. 1. Is there any time frame on this? 2. Could having a package called something like hip-amd make it easier to find? Thanks for the hard work!
Using an RX5700:
21.40 => clCreateCommandQueue(): CL_OUT_OF_HOST_MEMORY
I got to revert the latest update from this AUR git and makepkg the one for 21.30
@redshoe I haven't tried it, but I suppose yes you can. opencl-amd has installed these libraries for years, it's not something that I changed recently. And they are not too big either so it's no big deal.
@luciddream So, if we are using newer cards (probably something like Vega?) we don't need this libdrm part at all? Can I just comment out the libdrm part?
@Slavius You are right, I kinda missed it when cleaning up the PKGBUILD (I was actually trying to make it work with my GPU first). So I guess we need to bring the sed command back for the latest PKGBUILD. (and remove couple of files that are not needed from the current one)
In general, I haven't pushed a new version yet because I'm thinking about making this package more about rocm in the future and less about opencl. We should have a new pkgbase, a light opencl-amd package with opencl / hip support - probably like it is now - and an opencl-amd-full package that will include everything the ubuntu version has.
I actually figured the libdrm changes out. Since mesa comes with libdrm_amdgpu.so as well, these would clash, so the PKGBUILD renames the amdgpu-pro provided libdrm_amdgpu.so --> libdrm_amdgpo.so and then uses sed to replace the name inside libamdocl-orca64.so as well.
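One detail that makes this rename safe to do with sed on a binary: both names have the same length, so nothing inside the file moves. Below is a small sketch of that idea with a size check added. The helper name patch_soname is made up, and it assumes GNU sed (for -i), which is what Arch ships.

```shell
#!/bin/sh
# Both names are 13 bytes long, so patching the string in place does not
# shift any offsets inside the binary.
old_name=libdrm_amdgpu
new_name=libdrm_amdgpo

# Hypothetical helper: patch one file and fail if its size changed.
patch_soname() {
    file=$1
    before=$(wc -c < "$file")
    sed -i "s|${old_name}|${new_name}|g" "$file" || return 1
    after=$(wc -c < "$file")
    [ "$before" -eq "$after" ]
}
```

A replacement name of a different length would pass through sed without complaint but would likely leave the library unloadable, which is why the size check is there.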
@schnedan ROCM needs libtinfo.so library which is part of ncurses. We have to include the file somehow, using the package ncurses5-compat-libs from AUR is one way to do it. The PGP key is required by that package, more info at the package page.
@redshoe I believe this file is required for the legacy devices (orca) - but I don't have an older GPU to verify. You shouldn't get an error of the file not found since it is being created by the package.
luciddream// I am personally modifying PKGBUILD for professional driver stack, and I could not figure out why the part below is required (just as Slavius mentioned earlier).
cd ${amdgpu}
rm "libdrm_amdgpu.so.1"
mv "libdrm_amdgpu.so.1.0.0" "libdrm_amdgpo.so.1.0.0"
ln -s "libdrm_amdgpo.so.1.0.0" "libdrm_amdgpo.so.1"
It looks like you are shifting the files around to create a symbolic link, but I keep getting the following error.
cannot stat '/home/orangke/archive/opencl-amd-drivers/opencl-amd-21.20-ent2/src/libdrm/opt/amdgpu/lib/x86_64-linux-gnu/libdrm_amdgpo.so.1': No such file or directory
Could you elaborate on this? Thanks.
can you elaborate more details to the PGP validation?
I get an unknown public key error: 702353E0F7E48EDB
What can I do about it and how can I verify this key is trustworthy?
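Not an authoritative answer, but one sanity check is to confirm that the key ID from the error is one the ncurses5-compat-libs PKGBUILD itself lists in validpgpkeys, and only then import it with gpg --recv-keys. The sketch below does the first part by reading the file as text, without executing it. The function name key_listed_in_pkgbuild is hypothetical, and a match only shows that the packager expects this key, so it is still worth comparing the fingerprint with what upstream publishes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a key ID appears inside the
# validpgpkeys=( ... ) assignment of a PKGBUILD (single- or multi-line).
# Usage: key_listed_in_pkgbuild <PKGBUILD> <key-id>
key_listed_in_pkgbuild() {
    pkgbuild=$1
    key_id=$2
    # Print only the lines that belong to the validpgpkeys array,
    # then look for the key ID case-insensitively.
    awk '/^validpgpkeys=\(/ {inside=1} inside {print} inside && /\)/ {inside=0}' "$pkgbuild" |
        grep -i -q -- "$key_id"
}

# Example, run in the directory of the cloned ncurses5-compat-libs package:
#   key_listed_in_pkgbuild ./PKGBUILD 702353E0F7E48EDB && gpg --recv-keys 702353E0F7E48EDB
```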
@hpohl
I don't think it's related, I think it has something to do with your system. Can you disable the iGPU of your Intel processor and see if it changes anything?
@luciddream strace: https://www.dropbox.com/s/pbkf0io41gb4583/strace.txt?dl=0
None of those solutions seems to work :(
Is this related?: https://github.com/RadeonOpenCompute/ROCm/issues/1180
Anyone able to get this working on 21.30 or 21.40.1 for Hawaii (R9 290, R9 390) cards? I cannot for the life of me get opencl to work on the R9 390 on the latest drivers, but it works fine with my second Tonga r9 380 card. I've settled on Ubuntu 18.04 with the 21.20 driver for now. I've tried nearly everything I found on the internet and numerous distros.
@HurricanePootis I think the answer is obvious, because they are precompiled binaries and you don't have to compile them yourself. Plus you can create any package you want, doesn't mean people have to use it if they don't like it or prefer something else.
Why keep opencl-amd if it uses ROCm, whenever the ROCm runtimes are already in the AUR?
Also try strace -f clinfo 2> strace.txt
I see it can provide extra info about missing files / directories.
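Since the strace log gets long quickly, here is a sketch for pulling out just the paths that could not be found. The function name missing_paths is made up; it assumes the usual strace line format where the path is the first quoted string and failed lookups end in ENOENT.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct paths that failed with ENOENT in a
# strace log, e.g. one produced by `strace -f clinfo 2> strace.txt`.
missing_paths() {
    grep 'ENOENT' "$1" |
        sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p' |
        sort -u
}

# Example:
#   missing_paths strace.txt
```

Many ENOENT lines are harmless (the loader probes several directories for each library), so the list is a starting point and not a list of bugs.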
@hpohl I don't see anything wrong with the log.. (i'm not an expert on reading it to be honest), and can't really find the issue when it's working for me.
One assumption is that it has something to do with your Intel CPU... maybe try installing intel-compute-runtime and see if that fixes anything?
I also noticed from your log that clinfo is using more libraries than this package has at the moment, specifically libhsa-amd-aqlprofile64.so .. I will probably add this in a second version of this package.
@hpohl
Try this command:
sudo mount -o remount,exec /dev
References are Phoronix:
Which in turn refers to:
@apaz yes, this looks to be working for OpenCL as the previous versions, but, I think it's much easier now to add ROCM support for Pytorch and Tensorflow, and see that HIP is working. However I can't test it on my PC because 5700XT is not supported I believe. I've found and added a few libraries that should be able to do that (rocrtrace / rocrand / rocblas), but I can't verify they are working and I don't want to add more size to the package (rocblas is like 600+ MB of libraries). If someone with a supported GPU can test it I can attach another PKGBUILD for that purpose.
@esistgut yes, this is a precompiled binary from the Ubuntu repository. You have to compile the other one yourself.
@hpohl Try LD_DEBUG=all clinfo 2> lddebug.txt or strace clinfo so we can see what is missing.
Thanks! Works fine with Radeon R7 M265. clinfo shows no errors, hashcat benchmark runs as usual.
I see this packages uses a lot of ROCm stuff now. Is this still different than rocm-opencl-runtime?
A little update from my RX 6700 XT, which is still not detected by clinfo. I'll stick to 20.45 for the time being.
Thanks for the update. Everything works for me with an RX 5700 XT (but I must say that it's been a long time since I've had any problems): "clinfo" reports no errors; "Geekbench --compute" is fine; "Blender" uses OpenCL as well as "DaVinci Resolve". I'll give you some info in case you might find it useful. Installation with an AUR Helper does not work because of the pgp key. Installing with "makepkg -si" works without problems. I had to install "ncurses5-compat-libs 6.2-1", which added itself to "ncurses 6.3-1" that I already had in the system. Trying a "pacman -Qi" I got an error that I don't understand, but it seems unimportant:
$ LC_ALL=C pacman -Qi ncurses5-compat-libs 6.2-1
Name : ncurses5-compat-libs
Version : 6.2-1
Description : System V Release 4.0 curses emulation library, ABI 5
Architecture : x86_64
URL : http://invisible-island.net/ncurses/ncurses.html
Licenses : MIT
Groups : None
Provides : libtinfo5
Depends On : glibc gcc-libs sh
Optional Deps : None
Required By : opencl-amd
Optional For : None
Conflicts With : libtinfo5
Replaces : None
Installed Size : 624.92 KiB
Packager : Unknown Packager
Build Date : Sat Nov 13 09:21:49 2021
Install Date : Sat Nov 13 09:22:37 2021
Install Reason : Installed as a dependency for another package
Install Script : No
Validated By : None
error: package '6.2-1' was not found
EDIT: I'm really stupid! I didn't see that I wrote a 6.2-1 too many!!! Sorry
I've pushed version 21.40.1 based on the Ubuntu repository. It's a very different release than the previous ones so this may require more changes in the future.
This package now needs ncurses5-compat-libs which is a pain to install because of the PGP validation necessary. Any suggestions on how to make people's lives easier are welcome. I guess we can create a new package opencl-amd-ncurses5 which will copy that PKGBUILD and will not require PGP validation.
I only verified that OpenCL works on my PC. There are also tons of tools for the Rocm platform that people might need to add to this package (like HIP / MiOpen / etc), so feel free to suggest what would be nice to have.
It would be nice if someone with an older GPU can see if /usr/lib/libdrm_amdgpo.so* is still needed. I've included the files in the package because I assume they still are.
Please comment if something is not working for you so I can investigate :)
Have a good weekend!
Hi all there is a new version of the drivers. I've been working on it and I will release a new package as soon as I have it working on my PC and GPU (5700xt) - Hopefully tonight -
@Slavius I think he just changed them for copying and linking the files.
I was studying the PKGBUILD and I noticed 2 strange things: 1) there is a symbolic link created from "libamd_comgr.so.2.0.0" -> "libamd_comgr.so", however there's no such file packed into this installer. There's "libamd_comgr.so.2.1.0" from comgr-amdgpu-pro_2.1.0-1290604_amd64.deb. Is this a typo? 2) The "libdrm_amdgpu.so.1.0.0" with its symlink is being renamed to "libdrm_amdgpo.so.1.0.0" -> "libdrm_amdgpo.so.1". Why the change from "u" to "o" in the name?
OpenCL won't work on RX460 with latest release here:
https://community.amd.com/t5/drivers-software/linux-opencl-21-30-broken-with-rx460/td-p/491173
Just to note that the R9 390 stops working with the update to 21.30. Downgrading to 21.20 fixes this. I guess AMD may have dropped support for the 390 in their driver.
I'm getting the same issue as @IMBJR
Switching to the mesa OpenCL driver or cpu rendering works as expected.
Images: https://imgur.com/a/z6JoqI5
I don't see anything wrong in the shared libraries:
ldd -v /usr/bin/clinfo
linux-vdso.so.1 (0x00007fff631f6000)
libOpenCL.so.1 => /usr/lib/libOpenCL.so.1 (0x00007fc7ab84b000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fc7ab844000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007fc7ab678000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fc7ab8d3000)
Version information:
/usr/bin/clinfo:
libdl.so.2 (GLIBC_2.2.5) => /usr/lib/libdl.so.2
libOpenCL.so.1 (OPENCL_1.0) => /usr/lib/libOpenCL.so.1
libc.so.6 (GLIBC_2.3) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.7) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.14) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /usr/lib/libc.so.6
/usr/lib/libOpenCL.so.1:
libdl.so.2 (GLIBC_2.2.5) => /usr/lib/libdl.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /usr/lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.3.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.33) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
/usr/lib/libdl.so.2:
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_PRIVATE) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
/usr/lib/libc.so.6:
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
@tyler19820201: I can confirm all packages here build well. But every version I have tried after 20.40.1147286-1 makes clinfo hang indefinitely at the end on my Ryzen 5 4500U. Not sure if it would work stably though, so use that version for now. There were rather huge changes at 20.45.1164792 and later when comparing the file list. See also @pikasalt comment 2021-07-26.
Maybe someone has an idea what could be the reason and if this will get fixed in the future. Right now I'm thinking about a new Ryzen 5700G based system where I need basic stable image support for darktable.
FYI I've opened a topic on Arch Forum asking what the process would be to transfer this to a new pkgbase. I'm more worried about that process than maintaining the 32bit parts. But maybe you already know how to do that.
The most important thing is that a lot of people depend on this package for their systems so I don't want to unnecessarily create troubles for them. So if we change the pkgbase we should be extra careful not to break it.
Personally I don't really know if that's necessary. Many users might not need the 32-bit libraries. Probably more than who needs it. Ideally, I think we could simply merge the two into the same PKGBASE, downloading the exact same tarball, and then install from there? I think that would work, correct me if I'm wrong. If it doesn't, it might just be possible to have lib32-opencl-amd install both, but that'd be a lot of work as well...
Unfortunately, like luciddream, if they were merged I probably wouldn't be able to maintain the package very well. Especially since I not only switched to Gentoo some time ago (and thus have to update stuff on my Arch server), but also because the school and soccer years just started.
However I think a while back someone sent me a PKGBUILD that was a lot better than the current one. I think I'll grab that one and see if it's possible to adapt it to include 32-bit libs
@luciddream yes,lib32-opencl-amd is not a complete package like opencl-amd and you only have to deal with two deb files.
If you read my patch below, you will see it uses generally the same commands as opencl-amd to extract files from orca and libdrm, just substituting "/usr/lib" with "/usr/lib32" and "64" with "32", and providing $shared_32 and $shared2_32.
To test if 32bit opencl works, use clinfo from clinfo-amdgpu-pro_21.30-1290604_i386.deb.
@maz-1 actually let me test it a bit locally tomorrow maybe it's easier than I thought it is. But if anyone wants to give their opinion feel free to do so.
@maz-1 I did think about it a bit and I still don't see a reason to merge these packages. Personally I spend at least 1 hour reading all changes and comparing files etc when I update this package, and I will not be able to do that when I can't verify if the files are working properly. Plus it will double the work needed for each update.
If there is popular demand for it though maybe it's OK for someone to merge it but I don't think I will be able to keep maintaining the package in that state.
(p.s I'm not sperg512, I just co-maintain this package the last year or so).
@sperg512 I think separate lib32 packages are for packages built from source. lib32-opencl-amd shares the same zip file with opencl-amd so it should be OK to build both packages from one package base. Besides, lib32-opencl-amd requires opencl-amd, and installing it will download the same zip file twice if lib32-opencl-amd is a separate package. You can check out nvidia-vulkan and amdgpu-pro-installer in AUR.
@maz-1 why not create another package lib32-opencl-amd? It looks to me most AUR packages use that strategy for 32bit software. Having it in one package will make it harder to maintain and to verify updates. But that's just my opinion.
@sperg512 I modified PKGBUILD to build lib32-opencl-amd, can you check if these changes can be merged? https://gist.github.com/maz-1/d08c16f84c8a0237c38141bd49a8d55c
I have tested with 32bit clinfo
my changes:
diff --git a/PKGBUILD b/PKGBUILD
index bdb81d7..ca51492 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -4,6 +4,7 @@
# Contributor: ipha <ipha00 at gmail dot com>
# Contributor: johnnybash <georgpfahler at wachenzell dot org>
# Contributor: grmat <grmat at sub dot red>
+# Contributor: maz-1 <ohmygod19993 at gmail dot com>
prefix='amdgpu-pro-'
postfix='-ubuntu-20.04'
@@ -12,8 +13,16 @@ minor='1290604'
amdver='2.4.106'
shared="opt/amdgpu-pro/lib/x86_64-linux-gnu"
shared2="opt/amdgpu/lib/x86_64-linux-gnu"
+shared_32="opt/amdgpu-pro/lib/i386-linux-gnu"
+shared2_32="opt/amdgpu/lib/i386-linux-gnu"
tarname="${prefix}${major}-${minor}${postfix}"
+pkgbase=opencl-amd-installer
+pkgname=(
+opencl-amd
+lib32-opencl-amd
+)
+
pkgname=opencl-amd
pkgdesc="OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
pkgver=${major}.${minor}
@@ -22,17 +31,58 @@ arch=('x86_64')
url='http://www.amd.com'
license=('custom:AMD')
makedepends=('wget')
-depends=('libdrm' 'ocl-icd' 'gcc-libs' 'numactl')
-conflicts=('rocm-opencl-runtime')
-provides=('opencl-driver')
-optdepends=('clinfo')
DLAGENTS='https::/usr/bin/wget --referer https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-21-30 -N %u'
source=("https://drivers.amd.com/drivers/linux/$tarname.tar.xz")
sha256sums=('5840aac63a3658b3f790c59e57226062e7e4bc74f3c066a3e7bc9e3065e24382')
-package() {
+package_lib32-opencl-amd() {
+ pkgdesc="OpenCL 32bit userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
+ license=('custom:AMD')
+ provides=('lib32-opencl-driver')
+ depends=('lib32-libdrm' 'lib32-ocl-icd' 'lib32-gcc-libs' 'opencl-amd')
+
+ mkdir -p "${srcdir}/opencl32"
+ cd "${srcdir}/opencl32"
+
+ # orca
+ ar x "${srcdir}/$tarname/opencl-orca-amdgpu-pro-icd_${major}-${minor}_i386.deb"
+ tar xJf data.tar.xz
+
+ cd ${shared_32}
+ sed -i "s|libdrm_amdgpu|libdrm_amdgpo|g" libamdocl-orca32.so
+
+ mkdir -p "${srcdir}/libdrm32"
+ cd "${srcdir}/libdrm32"
+ ar x "${srcdir}/$tarname/libdrm-amdgpu-amdgpu1_${amdver}-${minor}_i386.deb"
+ tar xJf data.tar.xz
+ cd ${shared2_32}
+ rm "libdrm_amdgpu.so.1"
+ mv "libdrm_amdgpu.so.1.0.0" "libdrm_amdgpo.so.1.0.0"
+ ln -s "libdrm_amdgpo.so.1.0.0" "libdrm_amdgpo.so.1"
+
+ mv "${srcdir}/opencl32/etc" "${pkgdir}/"
+ mkdir -p ${pkgdir}/usr/lib32
+
+ # orca
+ mv "${srcdir}/opencl32/${shared_32}/libamdocl-orca32.so" "${pkgdir}/usr/lib32/"
+ mv "${srcdir}/libdrm32/${shared2_32}/libdrm_amdgpo.so.1.0.0" "${pkgdir}/usr/lib32/"
+ mv "${srcdir}/libdrm32/${shared2_32}/libdrm_amdgpo.so.1" "${pkgdir}/usr/lib32/"
+
+ rm -r "${srcdir}/opencl32"
+ rm -r "${srcdir}/libdrm32"
+
+}
+
+package_opencl-amd() {
+ pkgdesc="OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
+ license=('custom:AMD')
+ depends=('libdrm' 'ocl-icd' 'gcc-libs' 'numactl')
+ optdepends=('clinfo')
+ provides=('opencl-driver')
+ conflicts=('rocm-opencl-runtime')
+
mkdir -p "${srcdir}/opencl"
cd "${srcdir}/opencl"
@tyler19820201 I have similar model APU (Ryzen 7 4750U) and I was successful with the package version of 20.40.
@tyler19820201 can you copy paste the command you are using and the error you are getting?
Could it be that your download got corrupted or your RAM is not stable and corrupted the package? I would try to re-download the package if possible.
I have a new laptop with Ryzen 7 4700U with Radeon graphics ATI 05:00.0 Renoir integrated GPU. I want install this package but it giving me error that the package did not pass the authentication test. I have mesa and xf86-video-ati already installed. Any idea how to install it?
@luciddream Running the ldd command as shown gives:
ldd: /usr/bin/clinfo: No such file or directory
@IMBJR I assume it's because the OpenCL 1.2 library has changed. I was planning to ask yesterday when I updated the package because it's the only big change in the drivers. Can you see if something is missing in the output of ldd -v /usr/bin/clinfo? If nothing is missing I will pin another comment for Polaris users to stay on the 21.20 package.
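For a quick way to spot what is missing, a sketch that reduces ldd output to the unresolved entries only. The function name unresolved_libs is hypothetical; it relies on ldd marking unresolved libraries with "=> not found".

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the names of
# libraries the loader could not resolve, one per line.
unresolved_libs() {
    # Unresolved entries look like: "    libfoo.so.1 => not found"
    sed -n 's/^[[:space:]]*\([^[:space:]]*\) => not found.*$/\1/p' | sort -u
}

# Example:
#   ldd /usr/bin/clinfo | unresolved_libs
```

An empty result only means the direct link-time dependencies resolve; libraries loaded at runtime through the ICD loader would not show up here.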
In Blender: I'm getting just a black object when I try to render the default scene with GPU Compute turned on.
Downgrading to version 21.20.1271047-1 allows the render to be produced normally.
I have an RX 480 GPU, which is a Polaris one. I notice there's a pinned comment about that, but for an older package - and as I say above, the cube renders correctly for the previous version.
Edit: I'm also getting random crashes when I attempt to put Blender into GPU Compute mode.
@HurricanePootis and luciddream - Thanks for the patch. Installs great and Davinci works 100%. No errors and my old projects export without issues. DarkTable still needs to be the GIT version but that's OK. Thanks again!
I have created a patch for 21.30 :]
diff --git a/.SRCINFO b/.SRCINFO
index d72fa11..6f88d9f 100644
--- a/.SRCINFO
+++ b/.SRCINFO
@@ -1,6 +1,6 @@
pkgbase = opencl-amd
pkgdesc = OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack.
- pkgver = 21.20.1271047
+ pkgver = 21.30.1290604
pkgrel = 1
url = http://www.amd.com
arch = x86_64
@@ -13,8 +13,7 @@ pkgbase = opencl-amd
optdepends = clinfo
provides = opencl-driver
conflicts = rocm-opencl-runtime
- source = https://drivers.amd.com/drivers/linux/amdgpu-pro-21.20-1271047-ubuntu-20.04.tar.xz
- sha256sums = 8ea051de8c9c6814eb45ce18d102e639bb6edb5786e948b50c5105e3e21978f9
+ source = https://drivers.amd.com/drivers/linux/amdgpu-pro-21.30-1290604-ubuntu-20.04.tar.xz
+ sha256sums = 5840aac63a3658b3f790c59e57226062e7e4bc74f3c066a3e7bc9e3065e24382
pkgname = opencl-amd
-
diff --git a/PKGBUILD b/PKGBUILD
index e435f31..6b9027e 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -4,12 +4,13 @@
# Contributor: ipha <ipha00 at gmail dot com>
# Contributor: johnnybash <georgpfahler at wachenzell dot org>
# Contributor: grmat <grmat at sub dot red>
+# Contributor: HurricanePootis <hurricanepootis@protonmail.com>
prefix='amdgpu-pro-'
postfix='-ubuntu-20.04'
-major='21.20'
-minor='1271047'
-amdver='2.4.100'
+major='21.30'
+minor='1290604'
+amdver='2.4.106'
shared="opt/amdgpu-pro/lib/x86_64-linux-gnu"
shared2="opt/amdgpu/lib/x86_64-linux-gnu"
tarname="${prefix}${major}-${minor}${postfix}"
@@ -30,7 +31,7 @@ optdepends=('clinfo')
DLAGENTS='https::/usr/bin/wget --referer https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-21-20 -N %u'
source=("https://drivers.amd.com/drivers/linux/$tarname.tar.xz")
-sha256sums=('8ea051de8c9c6814eb45ce18d102e639bb6edb5786e948b50c5105e3e21978f9')
+sha256sums=('5840aac63a3658b3f790c59e57226062e7e4bc74f3c066a3e7bc9e3065e24382')
package() {
mkdir -p "${srcdir}/opencl"
@@ -73,7 +74,7 @@ package() {
# roc*
mv "${srcdir}/opencl/${shared}/libamdocl64.so" "${pkgdir}/usr/lib/"
mv "${srcdir}/opencl/${shared}/libamd_comgr.so.2.1.0" "${pkgdir}/usr/lib"
- mv "${srcdir}/opencl/${shared}/libamdhip64.so.4.1.21233-" "${pkgdir}/usr/lib"
+ mv "${srcdir}/opencl/${shared}/libamdhip64.so.4.2.21303-" "${pkgdir}/usr/lib"
mv "${srcdir}/opencl/${shared}/libamdhip64.so" "${pkgdir}/usr/lib"
mv "${srcdir}/opencl/${shared}/libamdhip64.so.4" "${pkgdir}/usr/lib"
mv "${srcdir}/opencl/${shared}/libhsa-runtime64.so.1.3.0" "${pkgdir}/usr/lib"
@@ -89,7 +90,6 @@ package() {
# orca
mv "${srcdir}/opencl/${shared}/libamdocl-orca64.so" "${pkgdir}/usr/lib/"
- mv "${srcdir}/opencl/${shared}/libamdocl12cl64.so" "${pkgdir}/usr/lib/"
mv "${srcdir}/libdrm/${shared2}/libdrm_amdgpo.so.1.0.0" "${pkgdir}/usr/lib/"
mv "${srcdir}/libdrm/${shared2}/libdrm_amdgpo.so.1" "${pkgdir}/usr/lib/"
You can apply the patch to the files by copying and pasting the text into a file, let's call it 21.30.patch, in the main directory of the PKGBUILD. Then you can run the command
patch -p1 < 21.30.patch
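To see why bumping just major and minor is enough to change the download, here is how the variables from the patch compose into the tarball URL. This is a sketch built from the values shown in the diff; the variable source_url is only illustrative (the PKGBUILD uses the source array), and deriving pkgver as major.minor is an assumption based on the pkgver shown in .SRCINFO:

```shell
#!/bin/bash
# Values taken from the 21.30 patch.
prefix='amdgpu-pro-'
postfix='-ubuntu-20.04'
major='21.30'
minor='1290604'

# Same composition as the tarname line in the PKGBUILD.
tarname="${prefix}${major}-${minor}${postfix}"

# Illustrative variable; the PKGBUILD puts this string into source=().
source_url="https://drivers.amd.com/drivers/linux/${tarname}.tar.xz"

# Assumption: pkgver is major.minor, matching the .SRCINFO entry.
pkgver="${major}.${minor}"

echo "$source_url"
echo "$pkgver"
```

With GNU patch you can also run patch -p1 --dry-run < 21.30.patch first to confirm that the patch applies cleanly before it touches any files.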
I just noticed there is another release (21.30). I will try to update the package later tonight.
@apaz I used to use this and just comment / uncomment when I wanted to change. I suspect it's an issue with 17.2.2 Build 4. I never used to call progl direct until more recently. I'm about to do a fresh install of Manjaro for testing. Either the issue is with 17.2.2 or latest opencl. PS I never used to get ANY error with DRS. My thinking of blaming opencl is the fact DT broke too.
#export LD_LIBRARY_PATH="${HOME}/pro/drivers:${LD_LIBRARY_PATH}"
#export LIBGL_DRIVERS_PATH="${HOME}/pro/drivers/dri"
#export dri_driver="amdgpu"
#export QT_DEVICE_PIXEL_RATIO=2
#export QT_SCALE_FACTOR=1
progl /opt/resolve/bin/resolve
@matbonn
I get that error too, but then everything works for me, including Fusion. Try this script (runDavinci.sh) that I put in the same folder (/home/USER/pro) where I put progl and the amdgpu-pro drivers and libraries. It was made by an Arch user who posted it on the Blackmagic forum. The content of the script is (you have to customize the paths):
#!/bin/bash
# Point the loader and Mesa at the unpacked amdgpu-pro libraries, then start Resolve.
progl() {
    export LD_LIBRARY_PATH="${HOME}/pro/drivers:${LD_LIBRARY_PATH}"
    export LIBGL_DRIVERS_PATH="${HOME}/pro/drivers/dri"
    #export dri_driver="amdgpu"
} && progl && /opt/resolve/bin/resolve
PS: I have 17.2.1, not the latest 17.2.2
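One detail of the script worth knowing: if LD_LIBRARY_PATH is empty beforehand, the export produces a value ending in a colon, and the dynamic loader treats that empty entry as the current directory. A hedged variant that only adds the colon when there is something to append; the helper names prepend_path and progl_env are made up here, and the paths are the same ones used in the script:

```shell
#!/bin/bash
# prepend_path DIR CURRENT
# Prints DIR alone, or DIR:CURRENT when CURRENT is non-empty.
prepend_path() {
    local dir=$1 current=$2
    printf '%s\n' "${dir}${current:+:${current}}"
}

# progl_env: same exports as the original progl(), without the trailing colon.
progl_env() {
    export LD_LIBRARY_PATH="$(prepend_path "${HOME}/pro/drivers" "${LD_LIBRARY_PATH:-}")"
    export LIBGL_DRIVERS_PATH="${HOME}/pro/drivers/dri"
}
```

It would be used the same way as the original, for example progl_env && /opt/resolve/bin/resolve. Whether this changes anything for the GPU error discussed here is untested.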
@apaz I use progl. I don't get any errors opening DRS. I can even open empty projects with no errors, add clips, etc., but as soon as I add a Fusion effect or open one of my older projects, after a couple of seconds I get "The GPU failed to perform image processing because of an error", Error Code: -1.
The rolling log shows:
0x7efe637ff640 | DVIP | ERROR | 2021-08-09 17:14:49,617 | Failed to build OpenCL program: - Error: CL_BUILD_PROGRAM_FAILURE - Options: " -w -cl-mad-enable -cl-fast-relaxed-math -Dz323df50901b485739bf3a3b9a84c73b0 -Dz6e436e44fad709e7c0aa0046bd091019 -Dzc229ce7b384e9cbe83e58608fba7c36d" - Build log: lld: error: undefined hidden symbol: z1536d0ff067facba3ae1f450c0a9a893
referenced by /tmp/comgr-2a94eb/input/linked.bc.o:(z54ee33ec600f7cd087f01ab97fdc688b)
referenced by /tmp/comgr-2a94eb/input/linked.bc.o:(z54ee33ec600f7cd087f01ab97fdc688b)
Error: Creating the executable from LLVM IRs failed.
Similar to what Dark Table used to show. I can click OK and continue and it will scrub etc but you can't "DELIVER" the project as it says GPU Error code -1. All thumbnails etc load fine.
@matbonn Do you start DVR (not Studio for me) from progl script or from menu/icon? If I start from menu I have your problem with GPU; if I start from script I have no problem and GPU works. I also have RX5700XT
@trougnouf - Yes I tried DT-git - thanks works fine. However I would like to know the fix as it's unlikely BlackMagic will release an update just for a new version of AMD's opencl? At this point DRS is unusable so I am REALLY hoping 21.30 will work regardless of using DT GIT and current version of DRS.
Did you try darktable-git? No problem here with a Radeon 56
823.033548 [opencl_summary_statistics] device 'gfx900:xnack-' (0): 30767 out of 30767 events were successful and 0 events lost
823.067198 [opencl_summary_statistics] device 'NVIDIA GeForce GTX 1070' (1): 5319 out of 5319 events were successful and 0 events lost
21.20 doesn't work with Darktable or DaVinci Resolve Studio 17.2.2. DRS gives GPU error code -1. 20.45 worked fine with DRS. clinfo works, but both DT and DRS report that they are unable to compile an OpenCL program.
x7f95a7911640 | DVIP | ERROR | 2021-08-08 17:44:13,347 | Failed to build OpenCL program: - Error: CL_BUILD_PROGRAM_FAILURE - Options: " -w -cl-mad-enable -cl-fast-relaxed-math -Dz323df50901b485739bf3a3b9a84c73b0 -Dz6e436e44fad709e7c0aa0046bd091019 -Dzc229ce7b384e9cbe83e58608fba7c36d" - Build log: lld: error: undefined hidden symbol: z1536d0ff067facba3ae1f450c0a9a893
referenced by /tmp/comgr-395724/input/linked.bc.o:(z54ee33ec600f7cd087f01ab97fdc688b)
referenced by /tmp/comgr-395724/input/linked.bc.o:(z54ee33ec600f7cd087f01ab97fdc688b)
Error: Creating the executable from LLVM IRs failed.
0x7f95e8bff640 | GPU.SingleBoardMgr | ERROR | 2021-08-08 17:44:13,353 | DVIP exception caught: DVIP Exception: Kernel build failure
5700XT
I see that AMD has released 21.30. Maybe that will solve the issues?
This package seems to cause a segfault whenever opencl is used on my device using the Ryzen 4700u mobile cpu. Running clinfo causes the display to freeze, but is sometimes able to recover.
Edit: After reading previous comments, a workaround is to downgrade to 20.40. Running versions 20.45 and above still results in a segfault.
Output of clinfo:
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.0 AMD-APP (3261.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx902:xnack-
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0
Driver Version 3261.0 (HSA1.1,LC)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) Renoir
Device PCI-e ID (AMD) 0x1636
Device Topology (AMD) PCI-E, 0000:04:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 27
SIMD per compute unit (AMD) 4
SIMD width (AMD) 16
SIMD instruction width (AMD) 1
Max clock frequency 1600MHz
Graphics IP (AMD) 9.0
Device Partition (core)
Max number of sub-devices 27
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple (kernel) 64
Wavefront width (AMD) 64
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 536870912 (512MiB)
Global free memory (AMD) 524288 (512MiB) 524288 (512MiB)
Global memory channels (AMD) 4
Global memory banks per channel (AMD) 4
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 456340272 (435.2MiB)
Unified memory for Host and Device No
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing Yes
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 456340272 (435.2MiB)
Preferred total size of global vars 536870912 (512MiB)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 5686
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 8192 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 16384x16384x8192 pixels
Max number of read image args 128
Max number of write image args 8
Max number of read/write image args 64
Max number of pipe args 16
Max active pipe reservations 16
Max pipe packet size 456340272 (435.2MiB)
Local memory type Local
Local memory size 65536 (64KiB)
Local memory size per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 456340272 (435.2MiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 262144 (256KiB)
Max size 8388608 (8MiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Number of P2P devices (AMD) 0
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 0ns (Wed Dec 31 18:00:00 1969)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No
Number of async queues (AMD) 8
Max real-time compute queues (AMD) 8
Max real-time compute units (AMD) 27
printf() buffer size 4194304 (4MiB)
Built-in kernels (n/a)
Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD]
clCreateContext(NULL, ...) [default] Success [AMD]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx902:xnack-
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx902:xnack-
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx902:xnack-
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.3.0
ICD loader Profile OpenCL 3.0
Output of dmesg:
[ 7013.544283] qcm fence wait loop timeout expired
[ 7013.544285] The cp might be in an unrecoverable state due to an unsuccessful queues preemption
[ 7013.544286] amdgpu: Failed to evict process queues
[ 7013.544291] amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
[ 7013.544307] amdgpu: Failed to quiesce KFD
[ 7013.747758] [drm] free PSP TMR buffer
[ 7013.779143] amdgpu 0000:04:00.0: amdgpu: MODE2 reset
[ 7013.779716] amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 7013.779868] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
[ 7013.779898] [drm] VRAM is lost due to GPU reset!
[ 7013.780096] [drm] PSP is resuming...
[ 7013.800294] [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
[ 7013.997929] amdgpu 0000:04:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 7014.017960] amdgpu 0000:04:00.0: amdgpu: RAP: optional rap ta ucode is not available
[ 7014.017968] amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
[ 7014.019123] amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
[ 7014.263805] [drm] kiq ring mec 2 pipe 1 q 0
[ 7014.265292] [drm] DMUB hardware initialized: version=0x01010014
[ 7014.450860] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 7014.451177] [drm] JPEG decode initialized successfully.
[ 7014.451188] amdgpu 0000:04:00.0: amdgpu: ring gfx uses VM inv eng 0 on hub 0
[ 7014.451190] amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 7014.451191] amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 7014.451192] amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 7014.451194] amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 7014.451195] amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 7014.451196] amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 7014.451197] amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 7014.451198] amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 7014.451199] amdgpu 0000:04:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 7014.451200] amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses VM inv eng 0 on hub 1
[ 7014.451201] amdgpu 0000:04:00.0: amdgpu: ring vcn_dec uses VM inv eng 1 on hub 1
[ 7014.451202] amdgpu 0000:04:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 4 on hub 1
[ 7014.451203] amdgpu 0000:04:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 5 on hub 1
[ 7014.451204] amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec uses VM inv eng 6 on hub 1
[ 7014.454252] amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow start
[ 7014.454256] amdgpu 0000:04:00.0: amdgpu: recover vram bo from shadow done
[ 7014.454640] [drm] Skip scheduling IBs!
[ 7014.457181] amdgpu 0000:04:00.0: amdgpu: GPU reset(2) succeeded!
[ 7014.457237] [drm] Skip scheduling IBs!
[ 7014.457254] [drm] Skip scheduling IBs!
[ 7014.457254] [drm] Skip scheduling IBs!
[ 7014.457254] [drm] Skip scheduling IBs!
[ 7014.474762] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7014.490486] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7014.507307] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7014.573092] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7015.181995] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7015.434406] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.342239] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.371947] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.403968] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.435421] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.468161] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.501938] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.535602] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7016.557830] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7019.528917] rfkill: input handler enabled
[ 7019.548263] fbcon: Taking over console
[ 7019.579017] Console: switching to colour frame buffer device 240x67
[ 7021.503880] amdgpu_cs_ioctl: 13 callbacks suppressed
[ 7021.504029] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.504970] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.505733] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.506519] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.661422] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.662443] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.662983] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.663611] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.674081] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7021.674609] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7031.489401] amdgpu_cs_ioctl: 10 callbacks suppressed
[ 7031.489695] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7031.490352] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7035.052375] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7035.053048] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7037.551411] audit: type=1334 audit(1627320900.286:216): prog-id=41 op=LOAD
[ 7037.656537] audit: type=1130 audit(1627320900.390:217): pid=1 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 7040.255413] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
[ 7040.256095] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
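A long dmesg dump like this is easier to compare between driver versions when it is reduced to a couple of counters. A rough sketch that counts GPU resets and rejected command submissions in kernel log text; the function name gpu_reset_summary and the counter names are invented for this example, and the two patterns are simply the message strings from the log above:

```shell
#!/bin/bash
# gpu_reset_summary: read kernel log text on stdin and print how many
# GPU resets started and how many command submissions were rejected.
gpu_reset_summary() {
    awk '
        /amdgpu: GPU reset begin!/          { resets++ }
        /Failed to initialize parser -125!/ { rejected++ }
        END { printf "gpu_resets=%d rejected_submissions=%d\n", resets, rejected }
    '
}
```

Typical use would be dmesg | gpu_reset_summary (dmesg may need root on some systems). On Linux, -125 is the negated errno ECANCELED, which would be consistent with submissions being refused after the reset, although that reading is an interpretation and not something confirmed in this thread.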
@RJGarch: This is a darktable issue. And it is fixed in git master already. Current darktable version 3.6.0 only works up to opencl-amd version 20.50.1234664-5
opencl-amd does not work together with darktable.
clinfo:
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.0 AMD-APP (3261.0)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx1010:xnack-
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0
Driver Version 3261.0 (HSA1.1,LC)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) Navi 10 [Radeon Pro W5700]
Device PCI-e ID (AMD) 0x7312
Device Topology (AMD) PCI-E, 0000:0a:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 18
SIMD per compute unit (AMD) 4
SIMD width (AMD) 32
SIMD instruction width (AMD) 1
Max clock frequency 1930MHz
Graphics IP (AMD) 10.1
Device Partition (core)
Max number of sub-devices 18
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple (kernel) 32
Wavefront width (AMD) 32
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 8573157376 (7.984GiB)
Global free memory (AMD) 8372224 (7.984GiB) 8372224 (7.984GiB)
Global memory channels (AMD) 8
Global memory banks per channel (AMD) 4
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 7287183768 (6.787GiB)
Unified memory for Host and Device No
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing Yes
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 7287183768 (6.787GiB)
Preferred total size of global vars 8573157376 (7.984GiB)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 29458
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 8192 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 16384x16384x8192 pixels
Max number of read image args 128
Max number of write image args 8
Max number of read/write image args 64
Max number of pipe args 16
Max active pipe reservations 16
Max pipe packet size 2992216472 (2.787GiB)
Local memory type Local
Local memory size 65536 (64KiB)
Local memory size per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 7287183768 (6.787GiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 262144 (256KiB)
Max size 8388608 (8MiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Number of P2P devices (AMD) 0
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 0ns (Thu Jan 1 01:00:00 1970)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) No
Number of async queues (AMD) 8
Max real-time compute queues (AMD) 8
Max real-time compute units (AMD) 18
printf() buffer size 4194304 (4MiB)
Built-in kernels (n/a)
Device Extensions cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [AMD]
clCreateContext(NULL, ...) [default] Success [AMD]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1010:xnack-
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1010:xnack-
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (1)
Platform Name AMD Accelerated Parallel Processing
Device Name gfx1010:xnack-
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.3.0
ICD loader Profile OpenCL 3.0
darktable -d opencl
0.276847 [opencl_init] opencl related configuration options:
0.276879 [opencl_init]
0.276882 [opencl_init] opencl: 1
0.276884 [opencl_init] opencl_scheduling_profile: 'default'
0.276886 [opencl_init] opencl_library: ''
0.276889 [opencl_init] opencl_memory_requirement: 768
0.276891 [opencl_init] opencl_memory_headroom: 400
0.276894 [opencl_init] opencl_device_priority: '/!0,///!0,*'
0.276897 [opencl_init] opencl_mandatory_timeout: 200
0.276899 [opencl_init] opencl_size_roundup: 16
0.276901 [opencl_init] opencl_async_pixelpipe: 0
0.276903 [opencl_init] opencl_synch_cache: active module
0.276905 [opencl_init] opencl_number_event_handles: 25
0.276908 [opencl_init] opencl_micro_nap: 1000
0.276910 [opencl_init] opencl_use_pinned_memory: 0
0.276912 [opencl_init] opencl_use_cpu_devices: 0
0.276914 [opencl_init] opencl_avoid_atomics: 0
0.276915 [opencl_init]
0.277155 [opencl_init] found opencl runtime library 'libOpenCL'
0.277173 [opencl_init] opencl library 'libOpenCL' found on your system and loaded
0.350606 [opencl_init] found 1 platform
0.350641 [opencl_init] found 1 device
0.350660 [opencl_init] device 0 `gfx1010:xnack-' supports image sizes of 16384 x 16384
0.350663 [opencl_init] device 0 `gfx1010:xnack-' allows GPU memory allocations of up to 6949MB
[opencl_init] device 0: gfx1010:xnack-
GLOBAL_MEM_SIZE: 8176MB
MAX_WORK_GROUP_SIZE: 256
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 1024 ]
DRIVER_VERSION: 3261.0 (HSA1.1,LC)
DEVICE_VERSION: OpenCL 2.0
0.638738 [opencl_init] options for OpenCL compiler: -w -cl-fast-relaxed-math -DAMD=1 -I"/usr/share/darktable/kernels"
0.638916 [opencl_init] compiling program `demosaic_ppg.cl' ..
0.639137 [opencl_load_program] loaded cached binary program from file '/home/raphael/.cache/darktable/cached_kernels_for_gfx1010xnack_32610HSA11LC/demosaic_ppg.cl.bin' MD5: '7bd0bb8e42db27fbd7b2247b9c7243a2'
0.639141 [opencl_load_program] successfully loaded program from '/usr/share/darktable/kernels/demosaic_ppg.cl' MD5: '7bd0bb8e42db27fbd7b2247b9c7243a2'
0.640301 [opencl_build_program] successfully built program
0.640308 [opencl_build_program] BUILD STATUS: 0
0.640311 BUILD LOG:
0.640312
0.640318 [opencl_init] compiling program `atrous.cl' ..
0.640369 [opencl_load_program] loaded cached binary program from file '/home/raphael/.cache/darktable/cached_kernels_for_gfx1010xnack_32610HSA11LC/atrous.cl.bin' MD5: '03010eae88840256f08caf4148cc99de'
0.640372 [opencl_load_program] successfully loaded program from '/usr/share/darktable/kernels/atrous.cl' MD5: '03010eae88840256f08caf4148cc99de'
0.640778 [opencl_build_program] successfully built program
0.640785 [opencl_build_program] BUILD STATUS: 0
0.640788 BUILD LOG:
0.640789
0.640794 [opencl_init] compiling program `basic.cl' ..
0.641489 [opencl_load_program] loaded cached binary program from file '/home/raphael/.cache/darktable/cached_kernels_for_gfx1010xnack_32610HSA11LC/basic.cl.bin' MD5: '06bed41b26faa5d9c010dfa4232863ae'
0.641493 [opencl_load_program] successfully loaded program from '/usr/share/darktable/kernels/basic.cl' MD5: '06bed41b26faa5d9c010dfa4232863ae'
0.646168 [opencl_build_program] successfully built program
0.646177 [opencl_build_program] BUILD STATUS: 0
0.646180 BUILD LOG:
0.646181
0.646188 [opencl_init] compiling program `blendop.cl' ..
0.646321 [opencl_fopen_stat] could not open file `/home/raphael/.cache/darktable/cached_kernels_for_gfx1010xnack_32610HSA11LC/blendop.cl.bin'!
0.646326 [opencl_load_program] could not load cached binary program, trying to compile source
0.646348 [opencl_load_program] successfully loaded program from '/usr/share/darktable/kernels/blendop.cl' MD5: '12fda7537d9ccceccb6fe5b2b6372438'
1.715850 [opencl_build_program] could not build program: -11
1.715877 [opencl_build_program] BUILD STATUS: -2
1.715880 BUILD LOG:
1.715882 lld: error: undefined hidden symbol: rgb_to_JzCzhz
referenced by /tmp/comgr-f40f3f/input/linked.bc.o:(blendop_display_channel)
referenced by /tmp/comgr-f40f3f/input/linked.bc.o:(blendop_display_channel)
referenced by /tmp/comgr-f40f3f/input/linked.bc.o:(blendop_display_channel)
referenced 9 more times
lld: error: undefined hidden symbol: get_rgb_matrix_luminance
referenced by /tmp/comgr-f40f3f/input/linked.bc.o:(blendop_display_channel)
referenced by /tmp/comgr-f40f3f/input/linked.bc.o:(blendop_display_channel)
referenced by /tmp/comgr-f40f3f/input/linked.bc.o:(blendop_display_channel)
referenced 1 more times
Error: Creating the executable from LLVM IRs failed.
1.715892 [opencl_init] failed to compile program `blendop.cl'!
1.715901 [opencl_init] no suitable devices found.
1.715903 [opencl_init] FINALLY: opencl is NOT AVAILABLE on this system.
1.715904 [opencl_init] initial status of opencl enabled flag is OFF.
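When reporting this kind of failure upstream, the useful part of the log is the list of symbols the linker could not resolve. A small sketch that pulls those names out of any log containing lld's "undefined hidden symbol" messages; the function name undefined_symbols is made up for this example, and it assumes a grep that supports -o (GNU and BSD grep both do):

```shell
#!/bin/bash
# undefined_symbols: read a build log on stdin and print each symbol that
# lld reported as an "undefined hidden symbol", one per line, de-duplicated.
undefined_symbols() {
    grep -o 'undefined hidden symbol: [^ ]*' | awk '{ print $NF }' | sort -u
}
```

For the darktable case it could be run as darktable -d opencl 2>&1 | undefined_symbols. The same filter works on the DaVinci Resolve rolling log, since it contains the same lld message.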
@mabod yeap, I also tested master branch and it works OK (I have no idea how to use darktable either) and log is fine:
[opencl_summary_statistics] device 'gfx1010:xnack-' (0): 2092 out of 2092 events were successful and 0 events lost
With 21.20, Geekbench runs to the end of the benchmark but reports an error:
[0701/090650:ERROR:src/geekbench/workload/compute_workload.cpp(111)] workload 321 failed validation
The final score is 29248, while with previous drivers it was never under 80000. The drop is probably caused by the failed workload. Blender and DaVinci Resolve work without problems. Clinfo reports the data correctly (small, unimportant note: where it previously reported gfx1010:xnack+, it now reports gfx1010:xnack-). I have the 5700XT.
There is a new PR #9359 for darktable which fixes the issue with opencl-amd 21.20.1271047-1. I just tested it.
@mabod Thanks I will test it too in the evening when I get access to my PC.
darktable-git contains fixes which make it work with opencl-amd 21.10.1247438-1 but it fails again with opencl-amd 21.20.1271047-1
You might want to include darktable in your test scenario. Just execute "darktable -d opencl"
I've updated the package to 21.20 - Geekbench is still failing. But I will make more tests with other software.
edit: Radeon ProRender looks to be working (default settings) - [Link]
edit: TeamRedMiner works too (I have no idea how to use it properly)
GPU 0 [67C, fan 45%] ethash: 34.28Mh/s, avg 33.91Mh/s, pool 0.000 h/s a:0 r:0 hw:0
I don't use Arch as my main anymore so I can only update when i ssh into my server. and i haven't properly setup my funny update script for this package on my server yet, so once i'll do that I'll push an update (assuming luciddream doesnt do it first)
Also I will probably not be able to test these as much anymore, unless I adapted this PKGBUILD to an ebuild
Hi all, real life got in the way, I'm preparing a package right now and will update in a while.
Thanks for the new version notice, unfortunately AMD only sends me emails for the Windows drivers. I will try to do a release later tonight (EU) unless sperg does it first.
I have a question regarding compatibility, will this work with an AMD GPU that is unsupported by ROCm?
And is compatibility limited to GPUs that are officially compatible with AMDGPU-PRO, or any GCN GPU should work?
Newest opencl-amd 21.10.... does not work with darktable. darktable is not able to compile filmic_chroma_v2.
seems odd to me that certain bugs only affect certain distributions since they're all just Linux with a different package manager basically. Might be something different with Fedora, I don't know. Either way, this doesn't seem to be affecting Arch, i think luciddream has a 5700XT (unless the schizophrenia is hitting again) and wasn't "affected", and it seems to be working fine (read: broken but doesn't obliterate the system) for most people. Thanks for the warning and taking the time to let us know.
On Fedora systems this is happening on some GPUs not all. 5700xt is for sure and radeon carrizo as reported by few users. So I just wanted to reach out in case this bug also affected Arch Linux. Thanks for the response :)
I can confirm mesa 21.1.1-1 works fine with latest opencl-amd package. I guess it's not something that affects Arch Linux but thanks for the warning anyway.
@shuriken: I don't have any issues with this, but I'm running an older version of this package (20.40.1147286-1) because the ones after don't fully work with my system.
I have a RX 5600 XT.
URGENT ISSUE because of mesa 21.1.1 release.
Hi maintainers for opencl-amd, I maintain an unofficial Fedora guide for openCL, identical to this repo. We currently have an issue.
This is my guide https://www.reddit.com/r/Fedora/comments/m2il41/guide_installing_opencl_alongside_mesa_drivers/
I have posted a fix here for downgrading mesa from a TTY (on Arch the packages may of course differ): https://www.reddit.com/r/Fedora/comments/njdu38/importantfor_people_who_using_opencl_from_radeon/
The mesa 21.1.1 update for Arch Linux got pushed just as I was typing this comment (https://archlinux.org/packages/extra/x86_64/mesa/). It may cause this issue on Arch Linux too, which would make many systems unbootable and require a downgrade from a TTY. Please open a bug for https://archlinux.org/packages/extra/x86_64/mesa/; I don't have permission to do that, so I tried to reach you here as quickly as I could :(
@luciddream - thanks, you're most probably right, sorry for pointing at the wrong culprit. I can confirm that said fix doesn't work on linux though :) I'll just run 20.50 until it does.
@droidbot1711
It doesn't look to be a bug with the drivers, there has been a new release to address it (0.8.2) - someone says it doesn't work on Linux though.
I'm guessing this is a bug with the upstream AMD drivers, but just wanted to give a heads up that the latest release breaks mining on my build. Somehow teamredminer only detects 20 CUs instead of 40 and fails to initialize the card (even when manually setting the correct CU number). Cloning this repo and building the HEAD~2 commit fixes the issue. Card is Sapphire Nitro+ 5700XT.
@iKevin shouldn't be necessary. Since they're so out-of-date, I don't even know if it's any different than the ones in extra. Later I'll probably see if I can diff the one in the tarball and an equally old version of the "official" headers. No promises for when or if I'll do it, like I said I'm good at lying.
The latest one works fine here with RX570 card, clinfo returns all the right parameters as well as Darktable.
Hi @sperg512. I should be more clear. By "off to the races" I meant I can get going. I haven't tried to compile yet. Also, as you pointed out, I noticed the opencl-headers in the regular repo and they are out of sync with this package.
Perhaps there could be a separate AUR package with the c/c++ headers called opencl-amd-dev that depends on this package.
I guess it compiles because it doesn't really need it. The headers are already included in the project.
I just compared the opencl-headers package with the one from the amdgpu-pro package. It seems that the amdgpu-pro CL files are from 2019 and have no support for OpenCL 3.0.
So I'm not sure if there is any benefit in including them with opencl-amd
Apparently I'm really good at lying because I forgot to test last night, anyways here's my results:
geekbench acts really weird, with >=2 windows open, it freezes most stuff and my entire display just completely spergs out. With 1 window open though, it just exits (status 255)
Blender and LibreOffice with OCL on just segfault
Detected by clinfo just fine
@hpohl I couldn't really figure anything of significance out from that. I think I'll send that to the devs to see if they can.
@iKevin @luciddream OCL headers are in the opencl-headers package, with C++ stuff available in opencl-clhpp. I have no clue how you got that to compile (maybe you have one of those installed?), but either way maybe we could add those as optdeps or something.
@apaz That's normal. Seems to be doing that for pretty much everyone's.
Thanks @luciddream. You solved the problem in a different way. I was trying to build it and cmake was complaining that it couldn't find opencl due to the includes files not being found. I am off to the races!
@iKevin I made some tests, although keep in mind I've never worked with C or C++. I'm not sure if the header files are necessary.
All I did was add ${CMAKE_DL_LIBS} to target_link_libraries, ran ccmake ., set OpenCL_LIBRARY=/usr/lib/libamdocl64.so, set OpenCL_INCLUDE_DIR to something random, generated the makefile, and ran make.
Then it looks that it is identifying the GPU correctly. I have no idea how to run this though.
luciddream@arch ~/p/VerthashMiner (main)> ./VerthashMiner -l
[2021-04-21 20:43:09] INFO Found 1 OpenCL devices.
Device list:
==================
OpenCL devices:
Index: 0. Name: gfx1010:xnack+
Platform index: 0
Platform name: Advanced Micro Devices, Inc.
pcieId: 0d:00:0
Hi @luciddream. My use case would be to compile a local copy of verthash-miner.
Everything is fine for me with the new drivers (RX 5700XT). The only change is that clinfo reports "Device Name gfx1010:xnack+" while previously it reported only "gfx1010". I don't know what that means.
@iKevin we actually don't install the header files at the moment. But there are some header files inside the archive, and we can make tests to see if they are useful for OpenCL / HIP compilation. Do you have a specific case in mind? I was thinking using something like Pytorch for ROCM as a test project, but that's for HIP.
Hi
I have a very basic question. Where do the header files get installed for building opencl packages?
I'm happy to see 21.10 works normally as 20.40 with my RX6800.
@hpohl try LD_DEBUG=all clinfo and put the result in a file
I'll do some testing in a little bit, I just finished working out and now I've got dinner. Once that's done, I'll try a few things like blender, libreoffice, geekbench, and a basic OCL program.
I don't think this update should really change much for OCL, seems to just be a Vulkan update. Regardless, do let us know if this doesn't work--especially Polaris so I can let the devs know if that problem's still there
Hey, I noticed the comments very late at night so I couldn't do the research myself. I've copied the changes from @nullik - apart from 2 lines. Please comment if it's not installing for you.
From a performance / working status, I don't think it is any different to the previous version. Geekbench is still failing for me.
Thanks @nullik! Still not detecting my RX 6700 XT though:
Platform Name AMD Accelerated Parallel Processing
Number of devices 0
Updated PKGBUILD with 21.10 version driver https://pastebin.com/BRyiLrUi
@luciddream No problem, I'll stick to 20.40. Happy to test when a new version comes out :)
@hpohl If 20.40 works for you but 20.50 doesn't, my guess is that ROCM might not be supporting 6700XT yet. But I would wait for more people to comment and verify that. AMD promised zero day support for ROCM and RDNA2, and that was my main motivation when I updated the package, but I'm not sure they have been able to keep that promise.
Blender does not list my RX 6700 XT as an OpenCL device.
@ATrigger and @quimkaos can you read the pinned comment? Need some info from Polaris card users so yea
@hpohl geekbench5, Blender (with OCL turned on), LibreOffice (with OCL turned on)
@sperg512 Do you have an example of a game or benchmark that uses OpenCL? Preferably free :)
I've just installed 20.40 and it works:
$ clinfo
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx1031
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0 AMD-APP (3180.7)
Driver Version 3180.7 (PAL,LC)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) Unknown AMD GPU
Device PCI-e ID (AMD) 0x73df
Device Topology (AMD) PCI-E, 0000:0b:00.0
Device Profile FULL_PROFILE
...
$ ldd -v /usr/bin/clinfo
linux-vdso.so.1 (0x00007ffdb07aa000)
libOpenCL.so.1 => /opt/cuda/lib64/libOpenCL.so.1 (0x00007f3c5cfdf000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f3c5cfd8000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f3c5ce0b000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f3c5cdea000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f3c5d21e000)
Version information:
/usr/bin/clinfo:
libdl.so.2 (GLIBC_2.2.5) => /usr/lib/libdl.so.2
libOpenCL.so.1 (OPENCL_1.0) => /opt/cuda/lib64/libOpenCL.so.1
libc.so.6 (GLIBC_2.3) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.7) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.14) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /usr/lib/libc.so.6
/opt/cuda/lib64/libOpenCL.so.1:
libdl.so.2 (GLIBC_2.2.5) => /usr/lib/libdl.so.2
libpthread.so.0 (GLIBC_2.2.5) => /usr/lib/libpthread.so.0
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
/usr/lib/libdl.so.2:
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_PRIVATE) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
/usr/lib/libc.so.6:
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
/usr/lib/libpthread.so.0:
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.32) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /usr/lib/libc.so.6
Also when I upgraded from 20.40 to 20.50:
Packages (1) opencl-amd-20.50.1234664-5
Total Installed Size: 243.56 MiB
Net Upgrade Size: -73.44 MiB
And now with clinfo 20.50 (1234664):
$ ldd -v /usr/bin/clinfo
linux-vdso.so.1 (0x00007fff379ae000)
libOpenCL.so.1 => /opt/cuda/lib64/libOpenCL.so.1 (0x00007f62fb4ab000)
libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f62fb4a4000)
libc.so.6 => /usr/lib/libc.so.6 (0x00007f62fb2d7000)
libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f62fb2b6000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f62fb6ea000)
Version information:
/usr/bin/clinfo:
libdl.so.2 (GLIBC_2.2.5) => /usr/lib/libdl.so.2
libOpenCL.so.1 (OPENCL_1.0) => /opt/cuda/lib64/libOpenCL.so.1
libc.so.6 (GLIBC_2.3) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.7) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.14) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.3.4) => /usr/lib/libc.so.6
/opt/cuda/lib64/libOpenCL.so.1:
libdl.so.2 (GLIBC_2.2.5) => /usr/lib/libdl.so.2
libpthread.so.0 (GLIBC_2.2.5) => /usr/lib/libpthread.so.0
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
/usr/lib/libdl.so.2:
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_PRIVATE) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
/usr/lib/libc.so.6:
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_2.3) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
/usr/lib/libpthread.so.0:
ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /usr/lib64/ld-linux-x86-64.so.2
ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /usr/lib64/ld-linux-x86-64.so.2
libc.so.6 (GLIBC_2.14) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.32) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.3.2) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.4) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_2.2.5) => /usr/lib/libc.so.6
libc.so.6 (GLIBC_PRIVATE) => /usr/lib/libc.so.6
@hpohl run ldd -v /usr/bin/clinfo and paste the output, maybe some library is missing for 6700 XT.
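When reading a long ldd -v paste like the ones above, it can help to reduce it to "which file did each library name resolve to". The following is a minimal Python sketch of that; the function names are made up for illustration, and it only understands the "name => path" lines that ldd prints. One thing it makes easy to spot in the pastes above is that libOpenCL.so.1 resolves to /opt/cuda/lib64/libOpenCL.so.1; whether that has anything to do with the missing device is not established here.

```python
def parse_ldd(output):
    """Map each library name in `ldd` / `ldd -v` output to the path it resolved to."""
    resolved = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue  # section headers and linux-vdso lines carry no mapping
        name, _, rest = line.partition("=>")
        # "libc.so.6 (GLIBC_2.3)" -> "libc.so.6" (version tags from -v output)
        name = name.strip().split(" (")[0]
        # "/usr/lib/libc.so.6 (0x00007f...)" -> "/usr/lib/libc.so.6"
        path = rest.strip().split(" (")[0]
        resolved.setdefault(name, path)  # keep the first mapping seen
    return resolved


def not_found(resolved):
    """Return the library names that ldd reported as 'not found'."""
    return sorted(name for name, path in resolved.items() if path == "not found")


if __name__ == "__main__":
    import subprocess

    out = subprocess.run(["ldd", "-v", "/usr/bin/clinfo"],
                         capture_output=True, text=True, check=False).stdout
    libs = parse_ldd(out)
    print("libOpenCL.so.1 ->", libs.get("libOpenCL.so.1"))
    print("missing:", not_found(libs))
```

An empty "missing" list only says that every direct dependency was found; libraries loaded later at runtime do not show up in ldd at all.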
@hpohl that's interesting. Do apps/games that use OpenCL crash without DRI_PRIME=1? If so, then get one to crash and attach a crash log of that game and the most recent (or any relevant) dmesg lines.
My RX 6700 XT is not detected by clinfo (with or without --offline):
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.0 AMD-APP (3188.4)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
Platform Name AMD Accelerated Parallel Processing
Number of devices 0
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) AMD Accelerated Parallel Processing
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) No devices found in platform [AMD Accelerated Parallel Processing?]
clCreateContext(NULL, ...) [default] No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No devices found in platform
ICD loader properties
ICD loader Name OpenCL ICD Loader
ICD loader Vendor OCL Icd free software
ICD loader Version 2.2.14
ICD loader Profile OpenCL 3.0
Also there is a new build (1234664) available, but it did not change anything. Neither did downgrading to 20.45.
Running games with DRI_PRIME=1 works just fine.
0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 (rev c1)
Could not find anything relevant in dmesg.
Tell me if you need any more info :)
@luciddream weird, could it be your git config? Long while ago I had issues with that, but I'll push that tomorrow morning
Anyways, anyone with a Polaris card that doesn't work on this version, send outputs of the following commands, preferably in files:
clinfo
clinfo --offline
dmesg, include relevant lines only if possible
Please also, if possible, include a crash log of any application using OCL that crashes on this version (i.e Junk Shop blender scene, Geekbench, or even just a basic OCL program). Any extra information would be appreciated. You can include versions of things like the kernel if you want
I need this so I can give the driver devs as much info as possible to fix this bug. In the future, when any major bugs like this occur, I will likely refer back to this comment.
also if there's any more information that should be necessary for everyone to send please tell me! Thanks
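For anyone who wants to gather those outputs in one go, here is a minimal Python sketch. It is only an illustration: the collect helper, the output directory name and the file names are made up here, and the command list simply mirrors the three commands asked for above. dmesg may need root on some systems, in which case its file will just contain the error message.

```python
import subprocess
from pathlib import Path

# The three commands requested in the comment above.
COMMANDS = {
    "clinfo": ["clinfo"],
    "clinfo_offline": ["clinfo", "--offline"],
    "dmesg": ["dmesg"],
}


def collect(commands, outdir):
    """Run each command and save stdout followed by stderr to <outdir>/<name>.txt."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, argv in commands.items():
        target = outdir / f"{name}.txt"
        try:
            result = subprocess.run(argv, capture_output=True, text=True,
                                    errors="replace", check=False)
            text = result.stdout + result.stderr
        except OSError as exc:
            # e.g. the program is not installed
            text = f"could not run {argv[0]}: {exc}\n"
        target.write_text(text)
        written.append(target)
    return written


if __name__ == "__main__":
    for path in collect(COMMANDS, "opencl-report"):
        print("wrote", path)
```

The full dmesg file can be large, so trimming it to the relevant lines before uploading is still worth doing by hand.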
I still can't push packages here, AUR doesn't recognize my SSH key for some reason. I've updated the PKGBUILD to use the latest version, maybe it will help some people.
Link for Download - it's just a version change + file hash.
Geekbench5 fails to run with this version for my 5700XT.
20.45 produces really low OpenCL compute power with Vega56, does anybody have some info about the 20.50 version? I even tried to check out the older 20.40 version, but in that case only the superuser was able to find the OpenCL devices with clinfo.
@merlock RX 400s and 500s are Polaris. you can see it says POLARIS11 after the name.
and if it's working for you then that's great, no need to downgrade
@sperg512 - is there some kind of Polaris versioning?
OpenGL: renderer: Radeon RX 560 Series (POLARIS11 DRM 3.40.0 5.11.8-arch1-1 LLVM 11.1.0) v: 4.6 Mesa 20.3.4
Current version is working fine for me (F@H is plugging right along).
ok so as @ATrigger and @quimkaos pointed out, 20.50 doesn't work on Polaris GPUs. If you have one, downgrade to 20.45 or install opencl-amd-polaris. could also be Mesa being fat but for now just downgrade or install the other package ok thanks
OpenCL 20.50 is borked on RX580. Don't upgrade
I was using this package with a Polaris card (RX 580 8GB) and this stopped working with the update from 20.45.1164792-4 to 20.50.1232447-4. I moved to the AUR package opencl-amd-polaris 19.50.967956-1. it's a lower version but it's working. So if anyone is facing the same problem, you can downgrade to 20.45 or use the AUR opencl-amd-polaris. thanks for the package luciddream!
We don't need to spam AUR with releases. It's just a fix for Ubuntu, it's already working in Arch Linux. imo we should wait for the next major release from AMD.
@kode54 flag it out of date when that happens, I'm more likely to check my email and actually respond when it's an out of date notification than a comment. Will do tomorrow as I'm in bed rn
@kode54 is there a guide somewhere on how I can use the new build before this package is updated? Or is it a bit of a case of try and see?
New build, 20.45.1188099, sha256sum a4040db7822cde36c0783912428e1b4897ecdacb9b3d21d716357dae6e4fc6b7.
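If you download the new tarball yourself before the package is updated, you can at least check it against the checksum posted above. A minimal Python sketch follows; the helper names are made up for illustration and the tarball path is whatever you saved the download as (it is not given in this thread).

```python
import hashlib
import sys

# Checksum posted above for build 20.45.1188099.
EXPECTED = "a4040db7822cde36c0783912428e1b4897ecdacb9b3d21d716357dae6e4fc6b7"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED):
    """True if the file's SHA-256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__" and len(sys.argv) > 1:
    # usage: python check_sum.py <path-to-downloaded-tarball>
    print("OK" if matches(sys.argv[1]) else "MISMATCH")
```

This gives the same answer as running sha256sum on the file and comparing by eye; it just removes the eyeballing.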
@sperg512 That is true, with Nvidia you can't use recent kernels, nor a realtime kernel either; you must use only the distro kernel and the nvidia packages from your distro.
I used to have an Nvidia GTX 550 Ti for almost 9 years (with Ubuntu, Debian, Manjaro), and I never had a problem IF and only IF I used distro packages.
I now have an AMD RX 5500 XT and I can install any kernel, a nice boot image, 3D, Vulkan, etc. out of the box. But if we need some proprietary stuff.... well here we are :)
@Kode54
I don't recommend switching cards just yet. My experience with the 5700xt is it took a year for the drivers to be stable enough for desktop use. Now it seems that they are also working on openCL and deep learning, but I don't think there will be something stable and complete before another year. In any case I don't recommend the RDNA series because it will be neglected in favor of RDNA2 and Vega. Better to try to get Polaris running and wait for the change.
@recompiler
Opencl-amd and Blender stable work for me without problems!
@kode54 Keep in mind you need to be using the git version of Blender if you're testing OpenCL rendering. The stable version has yet to be patched.
aaaaand here starts the inevitable argument about Nvidia vs. AMD
All of my friends who use Nvidia have had a load of problems on recent kernels, from X not starting to it straight up not working (remember the kernel 5.9 stuff? lol), if they're apparently not that bad then I guess my friends are just dumb
don't go the Nvidia route though, because Nvidia doesn't know how to make a functioning Linux driver (even less than AMD lol)
Absolute nonsense, what's more ironic is that you're claiming that in a thread about actual, literal AMD driver issues.
@kode54 you can grab a 5500XT for around $200-250, or if you're willing to pay more you can get a 6800 for about $570, those are my recommendations
don't go the Nvidia route though, because Nvidia doesn't know how to make a functioning Linux driver (even less than AMD lol)
Asus ROG Strix RX 480 O8G, does not work with any known OpenCL driver for every purpose.
19.50 sort of works with heavy Blender renders, but nothing else useful. Latest just crashes Blender.
Mesa Clover driver crashes the entire desktop session, producing garbage screens with nothing but the garbage'd cursor and two alternating pages of garbage screen contents. Eventually crashes back to GDM, then manages to log back in again.
Maybe it's time for me to buy a new GPU and replace this one, but with what? Nothing else is guaranteed to work, either. Except maybe Nvidia, but then I have to discard my macOS installation.
@Ashark ok will do when home
@232.7celsius good to hear! Thanks for letting us know that it does work and actually might just be an upstream issue, I was scared for a while that it was a packaging issue
@luciddream, Oops, I'm sorry.
The gpu is PowerColor AMD Radeon RX 5500 XT OC 8GB GDDR6.
Main application is blender renders and foldingathome client. Both working finally
Best!
@451farenheit it's good to hear some good news - since many people have issues with this version. From what I understand AMD 5700(xt?) will soon have full rocm support so I hope most problems will be eliminated with the next version. (or maybe the package as well). Make sure to also include what GPU you own when commenting (an issue or a success)
Hi there!
Just posting to congratulate the person taking care of this package. I installed it months ago with no success. Something was broken either in my PC or in the package. I never found out which (booting the PC with opencl-amd installed would always yield a black screen; uninstalling it solved the issue).
Today I tried to install it again to check my luck. Everything works flawlessly.
Thumbs up! And thanks a lot btw.
Best
indeed for mining 20.45 is always segfaulting and 20.40 is working, maybe a separate package name for version 20.40 would be useful
@sperg512 I am maintaining amdgpu-pro-installer, and I removed opencl part from it (to avoid duplicate work). So you can remove opencl-amdgpu-pro-* from conflicts and also from provides. Also, amdgpocl can be removed. It was the name of this package very very long time ago. You can see comments history. So seems only opencl-driver is needed in provides.
@luciddream Yep, I've tried installing and uninstalling a load of different libraries, none of which seemed to help. LibreOffice once again crashes with OpenCL enabled (like before), but GeekBench works perfectly. I could test Blender but there's a known page fault so I won't. When I feel like coding again (probably by tomorrow) I can see if openCL even works at all with some small sample programs.
@sperg512 Maybe a long shot but do you have ncurses installed ? It contains a library that might be needed.
If that also doesn't work, try to strace -f clinfo 2>strace.txt
- it might reveal more things about the issue.
@luciddream yup, I have no idea what's happened. Absolutely every variable I can think of is exactly the same--BIOS settings/version, GPU, PCI bus layout, etc etc. It might be because of the recent GCC/glibc updates. But if it's working for you yeah I've got no clue. I haven't tested any other OpenCL applications though, so when I get on pc again I'll try a few
@gsus so one thing that is different is gcc-libs. Arch Linux gcc-libs is version 10.x while on Manjaro is 9.x - It may be caused by that.
@sperg512 didn't the package use to work for you? It still works fine on my PC.
@luciddream Sorry about that I forgot to mention. I use Manjaro 20.2.1
The difference in size is down to the base: 1024 (MiB) vs 1000 (MB).
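The arithmetic can be checked quickly. This is just a small Python sketch (the function name is made up for illustration) converting the 227.69 MiB reported on one system into MB, to compare against the 238,8 MB reported on the other:

```python
def mib_to_mb(mib):
    """Convert mebibytes (1024**2 bytes) to megabytes (1000**2 bytes)."""
    return mib * 1024**2 / 1000**2


if __name__ == "__main__":
    # 227.69 MiB is the installed size quoted in this thread
    print(round(mib_to_mb(227.69), 1))  # 238.8
```

So both systems are reporting the same installed size, only in different units.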
Speaking of distros, I just booted into my Arch install and yeah, segfaults just like my Artix install. lddebug here: https://sperg.funny.cl/cdn/arch_lddebug_2.txt
Same message in dmesg log, same segfault at the exact same offset, during what I think is linking/binding ios_base to libamdocl64.so. --offline doesn't help, either.
The thing is that it's not easy to support every distro out there. And if someone has issues with a different distro they need to explicitly mention it. As far as I know it's working fine in Arch Linux.
@gsus are you using Arch Linux or a different distro? I notice your installation size is
Tamaño instalado : 238,8 MB
while mine says
Installed Size : 227.69 MiB
edit: Which is probably the same size after rounding, so probably nothing to worry about.
For some reason, with this, clinfo just segfaults on my Artix install. Luckily, I was able to get the LD_DEBUG output (https://sperg.funny.cl/cdn/lddebug.txt), but it still segfaults even then. Seems like it's loading all the correct libraries. ICYW, here's the lddebug.txt from a chroot into my Arch install: https://sperg.funny.cl/cdn/arch_lddebug.txt (could boot into it directly and get it but cba rn)
The error fish gives specifically is:
fish: “LD_DEBUG=all clinfo 2> lddebug.…” terminated by signal SIGSEGV (Address boundary error)
So yeah, I don't know what's going on. However, one thing I'm particularly worried might be the problem is that nothing in the rocm-device-libs archive gets installed. Though, I tried installing it to the corresponding directory but that didn't seem to work. Which means it's probably something else, or something with my Artix install. But I've got all the necessary stuff installed, numactl, ocl-icd, etc...
Also @gsus those are warnings not errors. I also get those for all of the libamd* libraries, on both my Artix and Arch installs, so you don't need to set the execution bit, in fact, you usually shouldn't do that for a library, and it wouldn't really do much since it's basically just a bunch of symbols and function definitions.
edit: The last line of the lddebug.txt says this: binding file /usr/lib/libamdocl64.so [0] to /usr/lib/libstdc++.so.6 [0]: normal symbol '_ZNSt8ios_baseD2Ev'
I can't tell if this specifically is crashing it, but it might be. Looks like it's trying to link/bind ios_base or something.
edit2: dmesg log says this: [ 5148.574644] clinfo[24744]: segfault at 8 ip 00007f33d50462b4 sp 00007fffa86b0f90 error 4 in libamdocl64.so[7f33d4faa000+db000]
So the error is in libamdocl64.so. Maaaayyybeee a missing library? I'd at least expect that to say "failed to open shared object" or something.
Directly after this message, it also says: [ 5321.488236] Code: 3b 44 24 08 73 08 89 5c 24 0c 89 44 24 08 8d 43 01 48 8b 55 00 48 89 c3 48 3b 04 24 72 a8 8b 44 24 0c 48 8d 04 40 48 8d 14 c2 <48> 8b 42 08 48 8d 0d 81 f5 07 00 4c 8b 0a 49 89 87 f0 05 00 00 48
which I don't understand in the slightest. Clearly it's some hex, maybe the <48> is like a bad instruction or byte or something, since it's the only one in <>? But there ARE other 48's there, so I've got no clue...
edit3: That last message is some hex in the libamdocl64 library. Specifically, it starts at offset 0x0B128A, and the <48> thing is at offset 0x0B12B4. That's the only place it occurs. So yeah, I think it's the area where it's segfaulting. Later, I might try to grab the assembly and see what's there? Maybe that'll help? Who knows
edit4: Here's the assembly file: http://sperg.funny.cl/cdn/libamdocl64.asm
Also, here's a snippet from that for 0x0B128A to 0x0B12B4:
b128a: 3b 44 24 08 cmp 0x8(%rsp),%eax
b128e: 73 08 jae b1298 <clCreateContextFromType@@OPENCL_1.0+0x3ac28>
b1290: 89 5c 24 0c mov %ebx,0xc(%rsp)
b1294: 89 44 24 08 mov %eax,0x8(%rsp)
b1298: 8d 43 01 lea 0x1(%rbx),%eax
b129b: 48 8b 55 00 mov 0x0(%rbp),%rdx
b129f: 48 89 c3 mov %rax,%rbx
b12a2: 48 3b 04 24 cmp (%rsp),%rax
b12a6: 72 a8 jb b1250 <clCreateContextFromType@@OPENCL_1.0+0x3abe0>
b12a8: 8b 44 24 0c mov 0xc(%rsp),%eax
b12ac: 48 8d 04 40 lea (%rax,%rax,2),%rax
b12b0: 48 8d 14 c2 lea (%rdx,%rax,8),%rdx
b12b4: 48 8b 42 08 mov 0x8(%rdx),%rax
meaning that if my "theory" is correct, mov 0x8(%rdx),%rax is segfaulting it. Fish mentioned address boundaries, so it might be trying to read a value from an unallocated RAM address or something, or one it can't access?
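For anyone trying to read a similar dmesg pair, the numbers can be pulled apart with a few lines of Python. This is only a sketch built on the two lines quoted above; the helper names are made up for illustration. It assumes the usual layout of the kernel's messages on x86: "segfault at <address> ip <instruction pointer> ... in <lib>[<mapping start>+<size>]", a Code: dump in which the byte in <> is the one at the instruction pointer, and a page-fault error code whose low three bits mean "page present", "write" and "user mode".

```python
import re

SEGFAULT = ("[ 5148.574644] clinfo[24744]: segfault at 8 ip 00007f33d50462b4 "
            "sp 00007fffa86b0f90 error 4 in libamdocl64.so[7f33d4faa000+db000]")
CODE = ("[ 5321.488236] Code: 3b 44 24 08 73 08 89 5c 24 0c 89 44 24 08 8d 43 01 "
        "48 8b 55 00 48 89 c3 48 3b 04 24 72 a8 8b 44 24 0c 48 8d 04 40 48 8d 14 c2 "
        "<48> 8b 42 08 48 8d 0d 81 f5 07 00 4c 8b 0a 49 89 87 f0 05 00 00 48")

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract library, faulting address and offset of ip inside the mapping."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        "offset_in_mapping": int(m["ip"], 16) - int(m["base"], 16),
        "error_code": int(m["err"]),
    }


def marked_byte_index(code_line):
    """Index (0-based) of the <xx> byte within the Code: dump."""
    tokens = code_line.split("Code:", 1)[1].split()
    for index, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            return index
    raise ValueError("no marked byte in dump")


def describe_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),
        "write": bool(code & 2),
        "user_mode": bool(code & 4),
    }


if __name__ == "__main__":
    info = parse_segfault(SEGFAULT)
    print(info["library"], hex(info["offset_in_mapping"]))
    print(marked_byte_index(CODE), describe_error(info["error_code"]))
```

With the lines above this gives an offset of 0x9c2b4 into the mapping and puts the bracketed byte at index 42 of the dump, which agrees with 0x0B12B4 - 42 = 0x0B128A. Error 4 decodes as a user-mode read of a page that isn't mapped, and a faulting address of 8 together with mov 0x8(%rdx),%rax would fit %rdx being 0, i.e. a null pointer, though that last step is a guess from the numbers. The 0x15000 gap between 0x9c2b4 and 0xb12b4 would fit the mapped region not starting at the beginning of the library, which is also an assumption.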
the version of the package is 20.45.1164792-3
the errors of the ldd -v /usr/lib/libamd*
ldd: atención: no tiene permiso de ejecución para /usr/lib/libamd_comgr.so ldd: atención: no tiene permiso de ejecución para /usr/lib/libamd_comgr.so.1 ldd: atención: no tiene permiso de ejecución para /usr/lib/libamd_comgr.so.1.7.0 [...] (in english is "you don't have execute permission to")
Maybe I need to set the execution bit to the libraries?
here is the output.
@gsus what version of the package do you use? I don't see it trying to load libamd_comgr.so.1 at all.
edit: also try a ldd -v /usr/lib/libamd* and see if it gives errors.
Hi @luciddream here is the lddebug.txt
Hi @gsus, can you run LD_DEBUG=all clinfo 2> lddebug.txt and upload it somewhere so we can see if something is missing from the package?
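The resulting lddebug.txt gets big, so here is a minimal Python sketch for pulling out the last symbol binding it recorded. The function name is made up for illustration, and the pattern is based only on the one line quoted in this thread ("binding file ... to ...: normal symbol '...'"). Keep in mind that the last binding before a crash only shows what the loader did last; it doesn't prove the binding itself is what failed.

```python
import re

BINDING_RE = re.compile(
    r"binding file (?P<src>\S+) \[\d+\] to (?P<dst>\S+) \[\d+\]: "
    r"normal symbol [`'](?P<sym>[^`']+)'"
)


def last_binding(text):
    """Return (source file, target file, symbol) of the final binding line, or None."""
    found = None
    for line in text.splitlines():
        m = BINDING_RE.search(line)
        if m:
            found = (m["src"], m["dst"], m["sym"])
    return found


if __name__ == "__main__":
    with open("lddebug.txt", errors="replace") as fh:
        print(last_binding(fh.read()))
```

For reference, the symbol _ZNSt8ios_baseD2Ev quoted in this thread demangles to std::ios_base::~ios_base(), a destructor from libstdc++.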
@gsus Downgrade this package to version 20.40, and stay there forever.
For my RX5500 XT clinfo shows No devices found in platform [AMD Accelerated Parallel Processing?]. opencl-amdgpu-pro-pal used to work, but that package doesn't exist anymore.
Makes sense. The issue is that PhoenixMiner (and I assume Claymore too) now fail to work, immediately complaining that they couldn't build the OpenCL program, e.g. "Failed to build program: clBuildProgram (-11)" and such. Works fine with 20.40/PAL.
There's nothing odd about it, the new version uses ROCM (HSA - Heterogeneous System Architecture) stack - while the old driver was using PAL (Portable Abstraction Layer)
What issues do you have with the new version? If it doesn't work you can still download the old PKGBUILD from the sticky comment and use that one.
edit: Also check this comment from bridgmanAMD - he is more qualified to answer all questions about this version.
This broke things for me, oddly enough it's showing a lower OpenCL version than the previous version did (2.0 vs 2.1). The driver version doesn't list PAL anymore either which I thought it did before. This one shows - 3188.4 (HSA1.1,LC) where as the 20.40 PAL driver shows 3180.7 (PAL,HSAIL).
Something changed AMDs side, or a change of build process here?
The last version of this package to work on Polaris GPUs is 19.50-5. ROCm officially dropped support for Polaris anyway.
@Recompiler ROCm packages are included in the tarball download. Also, the documentation says that PAL has been replaced by ROCm or something like that, so... idk
@sperg512 Where on AMDs webpage for amdgpupro does it say it includes ROCm?
@sperg512 @Recompiler ok, so slightly contradicting answers, but in the end it does not matter too much. We will see in the long run which implementation becomes more popular. For the record, and as I have mentioned in the comments at rocm-opencl-runtime, one can add the arch4edu repo and get pre-built binaries of ROCm, without the need to compile. This would allow for faster comparison.
@coxackie This package (opencl-amd) only has the amdgpupro package from AMDs website. It's not connected to ROCm.
ROCm is officially on Github located here https://github.com/RadeonOpenCompute/ROCm
The difference between the two is complicated but you can just say that the amdgpupro stack is the proprietary one by AMD while ROCm is the open source one. They're being worked on separately. Kinda like proprietary Java vs OpenJDK.
@coxackie This is providing ROCm, which is basically the newer version of PAL. So yes
I am a bit confused with this ROCm talk. Let me ask: is this package providing the PAL version, and rocm-opencl-runtime the ROCm one? Or is this one also providing ROCm, but proprietary, as opposed to open-source?
My bad. I don't know why I said this packages amdgpupro with rocm (had to double check myself). What I meant is that opencl from the amdgpupro driver is pretty much worthless imo. I really do think rocm is going to succeed it in every way.
So the equivalent of my setup on Gentoo for y'all would be to install mesa and rocm -> https://aur.archlinux.org/packages/rocm-opencl-runtime/ (or compile it manually like I did). Try that without the amdgpupro stuff and see how it goes. Get the other guy that had issues with amdgpupro to try it too.
Yup, that's what this PKGDESC says. I don't really know anything else that would work better, because like I said, this is a graphics driver on an unsupported Linux distribution. Shit's gonna go wrong.
Maybe if I wrote some strongly worded emails to AMD then they could actually distribute an Arch package. Anyone else volunteer?
Correct me if I'm wrong but I think I know why I had problems with this aur package. You're combining two different versions of opencl implementations, the proprietary amdgpupro stack and the newer open source rocm stack. I've recently been testing rocm with mesa on Gentoo and it seems to be pretty decent. It's a lot better than amdgpupro.
@jiweigert Your CPU is supported just as any other, but ROCm only supports Vega 10 and up. Since some of the issues we had were fixed (clinfo segfaulting), some more things should work for you. Who knows though, this IS a graphics driver, this IS a new one released for new hardware, and this IS Linux, let alone one not even officially supported by the driver.
Any update on the state of this package in terms of supporting Picasso CPU / Vega 8 APUs?
Actually, I fixed it. In the file ~/.config/libreoffice/*/user/registrymodifications.xcu, you'll find an entry that says "UseOpenCL". I set it to false and it worked.
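For anyone who would rather script that change than edit the file by hand, here is a minimal Python sketch that flips the UseOpenCL value in the text of a registrymodifications.xcu file. The `<prop>`/`<value>` layout in the sample is an assumption about how LibreOffice stores the setting, and `disable_opencl` is a made-up helper name. Close LibreOffice and back up the file before trying anything like this.

```python
import re

# Assumed layout of the entry:
#   <prop oor:name="UseOpenCL" ...><value>true</value></prop>
_USE_OPENCL = re.compile(
    r'(<prop\s+oor:name="UseOpenCL"[^>]*>\s*<value>)true(</value>)'
)


def disable_opencl(xcu_text):
    """Return the registry text with UseOpenCL switched from true to false."""
    return _USE_OPENCL.sub(r"\1false\2", xcu_text)


# Hypothetical one-line excerpt standing in for the real file contents.
sample = (
    '<item oor:path="/org.openoffice.Office.Common/Misc">'
    '<prop oor:name="UseOpenCL" oor:op="fuse"><value>true</value></prop></item>'
)
patched = disable_opencl(sample)
print(patched)
```

The function only works on text, so reading the file and writing the result back is left to the caller. If the entry is laid out differently on your system, the text comes back unchanged.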
@coxackie Nope, just straight up crashes with an "application error" when I open anything. Even just LibreOffice. Tried updating it, still didn't work. Guess I've got to stick to OpenOffice or something for now...
Unrelated to the package, but I also had issues with LibreOffice (it's crashing on copy-paste on Wayland), so I removed it and I'm using the wps-office flatpak now, which is much faster.
@sperg512 you need OpenCL for an English test? Presumably you mean you need LibreOffice. Can you not open the basic "Libreoffice" frontend and disable OpenCL there? (Alt-F12 for Options, go to OpenCL, disable).
Little extra note, for me LibreOffice doesn't launch with OpenCL on. Wish I hadn't tested it now, because I need it for my English exam which I'm working on currently...
ALSO now clinfo doesn't wanna work. Welp...
Just wanted to update everyone on my findings. I switched to Gentoo over the past week and tried out proprietary OpenCL for my Vega 10 and, well, it crashed Blender as soon as I opened the properties menu, and clinfo still segfaulted.
Step 2 was obviously to test opencl in windows to see if it ever worked to begin with. To my surprise it worked, but it was really slow. Switching back to the cpu greatly improved the speed by over 10x. I don't know why it's like that but maybe the gpu just wasn't optimized enough for it, it is a low-end igpu after all.
Interestingly even cpu rendering in blender on windows proved to be 25% faster than cpu rendering was in linux.
My final thoughts: OpenCL can work if you're willing to jump through a bunch of hoops, but expect low-end iGPUs to be outperformed by the CPU. Also, for serious rendering/encoding just get an Nvidia GPU; CUDA is way better and a whole lot easier to get working. I plan to send my large renders to my server to use my CUDA cores to render with. But at least I have an okayish setup on my laptop for light renders by using OpenCL with my CPU.
I hope anyone else with a Ryzen 7 3700U and a Vega 10 iGPU can find this helpful. If you have questions feel free to contact me at vectorflaredesigns@gmail.com
@macgeneral oh, I thought it meant for the entire thing. will remove from depends
please remove linux from the depends for those who run alternate kernel packages.
I had to change linux>=5.9 to linux-zen>=5.9 ...
@sperg512: regarding the kernel>=5.9 requirement: https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-45
If I interpret it correctly, you can either use the AMD provided version (supports CentOS 7.9 / Kernel 3.10.0) or you can use the upstream version (oss amdgpu kernel module) which should be equivalent with Kernel >= 5.9 (in combination with Mesa >= 20.2 and llvm >= 11.0).
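A rough way to check those minimums on a given machine is sketched below in Python. The minimum versions are the ones quoted above; the "installed" values are only examples, and both helper names are made up for this sketch. The comparison is numeric on purpose, so that 5.10 is treated as newer than 5.9.

```python
import re


def numeric_prefix(version):
    """Turn '5.9.9-zen1-1-zen' into (5, 9, 9), ignoring any suffix."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no leading version number in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version, minimum):
    """Compare component by component, padding the shorter side with zeros."""
    have, need = numeric_prefix(version), numeric_prefix(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# Minimums quoted from the release notes discussed above.
minimums = {"kernel": "5.9", "mesa": "20.2", "llvm": "11.0"}
# Example values only; substitute the versions from your own system.
installed = {"kernel": "5.9.9-zen1-1-zen", "mesa": "20.2.2", "llvm": "11.0.0"}

for name, need in minimums.items():
    verdict = "ok" if meets_minimum(installed[name], need) else "too old"
    print(name, installed[name], verdict)
```

The kernel string can come from `platform.release()`. The Mesa and LLVM versions would have to be taken from the package manager.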
I found this note on the repo for ROCm: "Note: The integrated GPUs of Ryzen are not officially supported targets for ROCm." Not sure if it's related or not.
Also, a Redditor suggested I try the 20.40 version of opencl-amd so I did and it works better but I still have issues with blender when I try to render using opencl on the gpu, it says "kernel split error failed to load kernel_path_init."
clinfo https://pastebin.com/9jKJVrDd blender https://pastebin.com/jfZiXSfz
I can confirm that OpenCL is up and running now (for example, hashcat works fine). There is one more error appearing in clinfo - it is
Number of P2P devices (AMD) 0
P2P devices (AMD) <printDeviceInfo:147: get number of CL_DEVICE_P2P_DEVICES_AMD : error -30>
Not sure if @luciddream can see anything else missing in the debug dump I gave earlier. Anyway, this error does not seem to have an effect, at least for now.
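For reference, -30 is CL_INVALID_VALUE in the OpenCL headers. That would fit a driver that simply does not recognise the P2P query, though that reading is a guess. A small Python sketch for decoding such fragments follows; the table holds only a handful of codes from the headers and `decode_clinfo_error` is a made-up helper name.

```python
import re

# A few status codes from the OpenCL headers (CL/cl.h and CL/cl_ext.h).
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -30: "CL_INVALID_VALUE",
    -33: "CL_INVALID_DEVICE",
    -1001: "CL_PLATFORM_NOT_FOUND_KHR",
}


def decode_clinfo_error(line):
    """Return the symbolic name for an 'error -N' fragment, or None if absent."""
    match = re.search(r"error (-?\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    return CL_ERRORS.get(code, f"unknown ({code})")


line = "<printDeviceInfo:147: get number of CL_DEVICE_P2P_DEVICES_AMD : error -30>"
print(decode_clinfo_error(line))
```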
Oh yeah and I went to blender, then Edit -> Preferences -> System -> Cycles Render Devices -> OpenCL. That crashed it with a segfault. Oddly enough, same with the Junk Shop scene - it also segfaults, confirming that was just a driver issue.
@luciddream yeah there's a Wiki page for it but cba to find it. I'll also add linux>=5.9 to depends because like I said the release notes say it requires Kernel 5.9 or newer, and of course numactl.
Great team effort everyone ! :) It will take me a while to find out how to commit the package, so maybe @sperg512 can do it before I figure it out.
@luciddream what do you know? after installing numactl, all of a sudden clinfo happily shows my card is detected. good work! please add to dependencies and update pkgrel, for others.
The first thing I noticed from the file is that libnuma.so.1 is probably missing.
This is included in the numactl package. I have that installed on my PC because of libvirtd/qemu.
@coxackie can you give it a try and install it? I will continue looking for more stuff in the file.
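A lighter-weight check than reading the full LD_DEBUG dump is to run ldd on one of the package's shared libraries and look for "not found" entries. Below is a minimal Python sketch that filters such output. The sample text is made up for illustration and `missing_libraries` is a hypothetical helper name. Libraries that the driver opens at runtime with dlopen would not show up this way.

```python
import re


def missing_libraries(ldd_output):
    """Return the sonames that ldd reported as 'not found', in order."""
    missing = []
    for line in ldd_output.splitlines():
        match = re.match(r"\s*(\S+)\s+=>\s+not found", line)
        if match and match.group(1) not in missing:
            missing.append(match.group(1))
    return missing


# Made-up ldd output for illustration; the real list depends on the system.
sample = """\
\tlinux-vdso.so.1 (0x00007ffd0b1f2000)
\tlibnuma.so.1 => not found
\tlibc.so.6 => /usr/lib/libc.so.6 (0x00007f1c2a000000)
"""
print(missing_libraries(sample))
```

To use it on a real system, pass in the stdout of something like `subprocess.run(["ldd", "/usr/lib/libhsa-runtime64.so.1"], capture_output=True, text=True)`. Which library to inspect is a guess here.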
@luciddream here. Not sure if the Manjaro comment was for me, but I am using pure Arch.
By the way if you are using Manjaro please include that in the comment. I can see that amdgpu-pro drivers are using libdrm_amdgpu.so.1 library which is part of libdrm. This library is not updated in Manjaro as far as I can tell.
@coxackie sorry to make you run it again but I can't use this file to compare because it's not text even if I copy paste it for some reason.
please run LD_DEBUG=all clinfo 2> lddebug.txt and upload again
By the way, here's some stuff I got:
clinfo --offline output: https://pastebin.com/VbY0wFVL
LD_DEBUG=all clinfo output: https://u.pcloud.link/publink/show?code=XZFISRXZikjfQs5X5CYl0O31U3dfuQkOTwj7
rocm-smi -a output: https://pastebin.com/jJ6TFfL6
Hashcat seems to work, and so does Geekbench. You can see the Geekbench results here if needed
@luciddream please see here. Tell us if you can get something out of the mess.
At this point you may as well just lend me a PC that ISN'T working with this, and then I could debug it there lol
Also, I unarchived and decompressed all of the deb files and their data, and couldn't find anything else related to OpenCL but maybe someone else can help with that
@coxackie can you try to run
LD_DEBUG=all clinfo
and post it to pcloud? I think this might show us what library is missing.
One thing I've noticed is that I get the 0 number of devices @coxackie is getting if I remove libhsa-runtime64.so.1.2.0 or libhsakmt.so.1.0.6 from the package.
So maybe there is a library from the archive we are currently missing. I will try to figure it out later today but it's not easy for us to fix it because it's already working for us. Probably someone that has the issue needs to step up and find the problem.
Thanks working again :)
@luciddream Except for a couple of errors (Unable to display current fan speed, Unable to display PowerPlay table) it appears that rocm-smi -a has OK output. Full log here.
EDIT: OpenCL certainly not working with LibreOffice (Calc); neither with hashcat (clGetDeviceIDs(): CL_DEVICE_NOT_FOUND)
@coxackie can you try to install rocm-smi and see if rocm-smi -a is working for you? Maybe we can use that to find why it's not working for some people.
Also please try to run OpenCL software like Darktable / Geekbench / Hashcat and see if it's working, even if clinfo is not.
Also it would be nice if more people for whom opencl-amd works fine commented here. I still think this package should stay with upstream so 6800xt users can benefit from using it.
Use this PKGBUILD. Then just makepkg -si.
Might need to remove the 20.45 package first, as I don't remember pacman liking downgrades.
@sperg512: Thanks, but how can I get back to 20.40 in Manjaro?
A command line or advice on which option to change in the PKGBUILD would be nice :-)
@jiweigert Revert to 20.40 as we stated in the pins, there's clearly been a lot of issues for this driver even if people like me, @luciddream, and @apaz don't have them. OpenCL might still work, but we do know some people's cards aren't being detected and that blender causes segfaults for most of us.
I thought this wasn't a driver issue at first but now that I see so many people having issues with it I think it might actually be. I think someone's gonna contact one of the ROCm developers on Phoronix to seek some support.
Hi,
with the change to the latest version of opencl-amd (20.45), the built-in Vega 8 GPU (gfx902) of the Ryzen 5 3500U is not supported anymore.
No device is found on the system anymore.
This is truly a regression, as the driver worked well until now.
What kind of options do I have now?
Kind regards
@luciddream With ls /usr/lib/libamd* I get
/usr/lib/libamd_comgr.so /usr/lib/libamdhip64.so.1 /usr/lib/libamdocl-orca64.so
/usr/lib/libamd_comgr.so.1 /usr/lib/libamdhip64.so.1.5.19245 /usr/lib/libamd.so
/usr/lib/libamd_comgr.so.1.7.0 /usr/lib/libamdocl12cl64.so /usr/lib/libamd.so.2
/usr/lib/libamdhip64.so /usr/lib/libamdocl64.so /usr/lib/libamd.so.2.4.6
and ls /usr/lib/libtinfo* gives /usr/lib/libtinfo.so /usr/lib/libtinfo.so.6.
Anything unusual?
@coxackie can you try to list some libraries on your system: la /usr/lib/libamd* and la /usr/lib/libtinfo*?
Maybe you need to make a symlink to libtinfo.so for some reason.
cd /usr/lib/
sudo ln -s libtinfo.so libtinfo.so.5.9
@luciddream - no idea where the extension comes from. But, as you can see from the output, opencl-mesa is not installed now (otherwise there would be 2 platforms). I did install opencl-amd with opencl-mesa uninstalled, but no difference.
EDIT: also, no ROCm here.
@coxackie I'm not sure where you get that extension from. Can you try to remove opencl-mesa and then reinstall the opencl-amd package - if you haven't tried that in that order? My platform does not have the offline extension.
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.0 AMD-APP (3188.4)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback
Platform Extensions function suffix AMD
@luciddream Thanks - I tried AMD's clinfo, but there was no illuminating info. I then checked the output of clinfo --offline (arch clinfo package - showing all potential supported devices), and I got this. Looking through the platforms, it appears that Navi is not supported? (Presumably I should see gfx1010 somewhere there? But nowhere to be found). This looks like the same issue that opencl-mesa has.
Of course, the AMD webpage does mention support. I have all requirements - llvm 11, kernel 5.9.9, mesa 20.2.2.
Could anyone else check the output of their clinfo --offline?
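To make comparing those outputs less tedious, here is a small Python sketch that pulls every gfx target name out of a clinfo --offline dump, so two people can diff their lists. The sample excerpt is shortened and made up; only the gfxNNN naming is taken from the discussion, and `offline_targets` is a hypothetical helper name.

```python
import re


def offline_targets(clinfo_text):
    """Collect every gfx target name (gfx900, gfx1010, ...) found in the text."""
    return sorted(set(re.findall(r"\bgfx[0-9a-f]+\b", clinfo_text)))


# Shortened, made-up excerpt; real `clinfo --offline` output is much longer.
sample = """\
  Device Name                                     gfx900
  Device Name                                     gfx902
  Device Name                                     gfx906
"""
targets = offline_targets(sample)
print(targets)
print("gfx1010 listed:", "gfx1010" in targets)
```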
opencl-amd follows the AMDGPU-Pro upstream releases. If the latest version is not working for your system, you can download and install the 20.40 version. The link to the 20.40 PKGBUILD is here
From AMDGPU-Pro 20.45 release notes - Blender 2.90.1 users may experience page faults (details in dmesg log).
hey guys, we must be in different timezones because I'm usually sleeping when the batch of these emails arrive :D
Blender is not a good example to test the package, since it's a known bug of the driver. Of course it might work for some people but we are just packaging the drivers, we can't fix them. Maybe we can add a warning to the pinned comment @sperg512
@sperg512 about clinfo, both have similar output from what I can see, and both are open source. Arch upstream - AMD clinfo - I'm neutral on what to choose, but Arch one seems more active.
@apaz you can download it from here, then add it from Blender as a zip addon. I'm not sure if it worked with the previous AMDGPU-pro version, that's why I downloaded it. You will also have to configure it because the original configuration is fast but not so good. The manual is here
@coxackie can you also try the clinfo from AMD and see if it gives any meaningful errors? I've uploaded for you here
so, yeah. For 5700:
On one hand, opencl-mesa does not work; clinfo gives:
fatal error: cannot open file '/usr/lib/clc/gfx1010-amdgcn-mesa-mesa3d.bc': No such file or directory
On the other hand, the latest version 20.45 of opencl-amd also does not discover the card. I tried quite a lot, but I will just stick with 20.40 for the foreseeable future, as it seems to be the only one working. I am quite puzzled as to how 5700XT appears to have no problems. Is there some dependency I am missing?
Using a Radeon R9 Nano here, trying to load the scene crashes Blender immediately. I can make Blender crash by going into options and clicking on the OpenCL tab inside System; fortunately, this produces a log and more information.
https://paste.debian.net/1173564/
I'll have more time later in the day so I hope I can dig deeper into the meaning of the crash
@luciddream Excuse the OT, but I wanted to ask you how you made RadeonPro Render work in Blender. Is there an installation guide? (I'm referring to the images you posted in the commentary 2020-11-19 14:16)
@Recompiler You can try to contact Bridgman on the Phoronix forum; he is one of the developers of ROCm and sometimes intervenes on the forum. The thread is as follows: https://www.phoronix.com/scan.php?page=article&item=amd-rx6800-opencl&num=1
An example of an answer from him: "Question: then for having opencl driver the 5000 amd wait a year, nice and to use this we need the closed driver? They already resolve the hangs in navi cards or I will need to wait another year?
Answer: We did the OpenCL-over-ROCR work for 5000 and 6000 series in parallel, although in the last couple of weeks before launch we prioritized the 6000 issues since the 5000 series already had OpenCL support via the PAL paths. AFAIK we managed to resolve nearly all of the 5000 series issues as well - I'm not sure if we are going to recommend cutting over to ROCR back end for Navi1x in the 20.45 release or wait until the subsequent release."
I think I did this right.
Downloaded from https://cloud.blender.org/p/gallery/5dd6d7044441651fa3decb56.
Opened in blender. First run I think used the CPU, and it rendered fine. Figured out how to set it to use the GPU/OpenCL...not so fine.
No coredump, but after seeing no advance on the timer, I escaped out of blender, and this was in the journal:
@merlock You pretty much just pacman -S blender, download the scene, then open it. Should automatically open in Blender. If it doesn't, try running blender <junk shop scene filename>.scene and find any relevant output (segfaults, etc)
@sperg512 I'll work on installing blender and figure out how to render that scene (I have the artistic ability of a cinder block, so have never needed/used software like blender). I use this for folding@home, and have had no problems as far as that's concerned. But I'll be happy to attempt to help the cause. :)
@merlock People seemed to have trouble with Polaris GPUs; are you able to render the Junk Shop scene in Blender? (should be a few pages back)
I doubt this will have any added value, but here's my clinfo for an RX560.
Thanks man. I kid you not, I've been jacking with this freaking problem all day and you are the only person that's tried helping me out.
I'm going to try to dig into the system logs to see if I can find anything at all that might be related to it. I really want to get it fixed so I can render my models with OpenCL.
We don't actually know whether it's a driver issue or not, but I highly doubt it is because I've only heard that the ROC* stuff works perfectly, plus it works for me and the 5700XT owners. I know someone who ordered a 6800XT sometime ago, I'll ask him if he can try this out.
Any chance I could get in touch with a dev that works close to the drivers? Are there any open github repos for the driver? I honestly have no idea where or how to troubleshoot this.
Just tried on the standard linux kernel and it's the same story. I haven't noticed any difference.
That's because it's on 20.40 probably. But yeah, this makes absolutely no sense, because everything works perfectly for me, same CPU and all. COULD be that you're using a custom kernel, if you have the regular one installed try that out. other than that, I've got nothing, maybe @luciddream can chime in?
Surprisingly I get an output from clinfo with opencl-amdgpu-pro-pal installed.
I don't know why opencl-amd is causing clinfo to seg fault.
I wound up uninstalling opencl-amd temporarily to get clinfo working again.
AMD Radeon(TM) Vega 10 Graphics (RAVEN, DRM 3.39.0, 5.9.9-zen1-1-zen, LLVM 11.0.0)
CLInfo isn't even working? This might be an entirely different problem altogether. Do you have kernel 5.9 or newer? Try reinstalling opencl-mesa, llvm-libs, and ocl-icd. Maybe blender and clinfo too. That's all I can think of
Okay sorry for taking a while to respond I had issues.
clinfo was working but now it's seg faulting:
[code]Segmentation fault (core dumped)[/code]
LibreOffice was working but I had to set these environment variables a while back to get it to work: SAL_DISABLE_OPENCL=1 and SAL_DISABLEGL=1, I just tried without those variables and it gives me this message:
(soffice:10009): Gtk-WARNING **: 19:07:06.512: Theme parsing error: gtk.css:2:33: Failed to import: Error opening file /home/recompiler/.config/gtk-3.0/window_decorations.css: No such file or directory Application Error
Resetting my env doesn't seem to fix it which is really weird.
And Blender is doing this every time I try to enable OpenCL:
Read prefs: /home/recompiler/.config/blender/2.92/config/userpref.blend
[ALSOFT] (EE) Failed to set real-time priority for thread: Operation not permitted (1)
[ALSOFT] (EE) Failed to set real-time priority for thread: Operation not permitted (1)
LLVM triggered Diagnostic Handler: Illegal instruction detected: VOP instruction violates constant bus restriction
renamable $vgpr4 = V_CNDMASK_B32_e32 32768, killed $vgpr5, implicit killed $vcc, implicit $exec
LLVM failed to compile shader
radeonsi: can't compile a main shader part
LLVM triggered Diagnostic Handler: Illegal instruction detected: VOP instruction violates constant bus restriction
renamable $vgpr4 = V_CNDMASK_B32_e32 32768, killed $vgpr4, implicit killed $vcc, implicit $exec
LLVM failed to compile shader
radeonsi: can't compile a main shader part
Read blend: /home/recompiler/blender/donut-tut.blend
Writing: /tmp/donut-tut.crash.txt
Segmentation fault (core dumped)
Crash log:
blender(BLI_system_backtrace+0x34) [0x56180f286074]
blender(+0xd9bded) [0x56180ce79ded]
/usr/lib/libc.so.6(+0x3d6a0) [0x7f0b866006a0]
/usr/lib/libamdocl64.so(+0xb12b4) [0x7f0af217a2b4]
/usr/lib/libamdocl64.so(+0xb1858) [0x7f0af217a858]
/usr/lib/libamdocl64.so(+0xb357e) [0x7f0af217c57e]
/usr/lib/libamdocl64.so(+0xb41dc) [0x7f0af217d1dc]
/usr/lib/libamdocl64.so(+0x7a923) [0x7f0af2143923]
/usr/lib/libamdocl64.so(+0x83fce) [0x7f0af214cfce]
/usr/lib/libamdocl64.so(+0x75395) [0x7f0af213e395]
/usr/lib/libpthread.so.0(+0x1118f) [0x7f0b9105418f]
/usr/lib/libamdocl64.so(clIcdGetPlatformIDsKHR+0xad) [0x7f0af213e4bd]
/usr/lib/libOpenCL.so.1(+0x6131) [0x7f0b452b0131]
/usr/lib/libOpenCL.so.1(clGetPlatformIDs+0xf5) [0x7f0b452b1a25]
blender(_ZN3ccl18device_opencl_infoERNS_6vectorINS_10DeviceInfoENS_16GuardedAllocatorIS1_EEEE+0x49) [0x56180e260209]
blender(_ZN3ccl6Device17available_devicesEj+0x2b9) [0x56180e23f569]
blender(+0x20d2f59) [0x56180e1b0f59]
/usr/lib/libpython3.8.so.1.0(PyCFunction_Call+0x19a) [0x7f0b88ba88fa]
/usr/lib/libpython3.8.so.1.0(_PyObject_MakeTpCall+0x464) [0x7f0b88b9b4d4]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x50e8) [0x7f0b88b96da8]
/usr/lib/libpython3.8.so.1.0(+0x13e046) [0x7f0b88bb2046]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4ca7) [0x7f0b88b96967]
/usr/lib/libpython3.8.so.1.0(+0x13e046) [0x7f0b88bb2046]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4ca7) [0x7f0b88b96967]
/usr/lib/libpython3.8.so.1.0(+0x13e046) [0x7f0b88bb2046]
/usr/lib/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x4ca7) [0x7f0b88b96967]
/usr/lib/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x108) [0x7f0b88ba2838]
/usr/lib/libpython3.8.so.1.0(PyObject_Call+0x212) [0x7f0b88bb5ab2]
blender(+0x160be1e) [0x56180d6e9e1e]
blender(+0x15ad670) [0x56180d68b670]
blender(+0x162e0f5) [0x56180d70c0f5]
blender(ED_region_panels_layout_ex+0x506) [0x56180d70dcc6]
blender(+0x2b2015a) [0x56180ebfe15a]
blender(ED_region_do_layout+0x60) [0x56180d70cd50]
blender(wm_draw_update+0x41f) [0x56180d1f577f]
blender(WM_main+0x34) [0x56180d1f3734]
blender(main+0x367) [0x56180ce4b8c7]
/usr/lib/libc.so.6(__libc_start_main+0xf2) [0x7f0b865eb152]
blender(_start+0x2e) [0x56180ce761ae]
File "/usr/share/blender/2.92/scripts/addons/cycles/properties.py", line 1589 in get_devices_for_type
File "/usr/share/blender/2.92/scripts/addons/cycles/properties.py", line 1666 in draw_impl
File "/usr/share/blender/2.92/scripts/startup/bl_ui/space_userpref.py", line 594 in draw_centered
File "/usr/share/blender/2.92/scripts/startup/bl_ui/space_userpref.py", line 182 in draw
I tried putting them in code blocks but BB code is really crappy and doesn't do code blocks for some reason :/
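A backtrace like the one above is easier to read once the frames are grouped by the binary or library they belong to. Here is a minimal Python sketch that does this, assuming each frame has the `path(symbol+offset) [address]` shape seen in the crash log. `frames` and `frames_per_module` are made-up helper names, and the sample is just the first four frames of that log.

```python
import os
import re
from collections import Counter

# One frame looks like: /usr/lib/libamdocl64.so(+0xb12b4) [0x7f0af217a2b4]
FRAME = re.compile(r"(\S+?)\(([^)]*)\)\s+\[(0x[0-9a-f]+)\]")


def frames(backtrace):
    """Yield (module, symbol_or_offset) pairs in the order they appear."""
    for path, symbol, _address in FRAME.findall(backtrace):
        yield os.path.basename(path), symbol


def frames_per_module(backtrace):
    """Count how many frames each binary or library contributes."""
    return Counter(module for module, _ in frames(backtrace))


# First four frames of the crash log above.
sample = (
    "blender(BLI_system_backtrace+0x34) [0x56180f286074] "
    "blender(+0xd9bded) [0x56180ce79ded] "
    "/usr/lib/libc.so.6(+0x3d6a0) [0x7f0b866006a0] "
    "/usr/lib/libamdocl64.so(+0xb12b4) [0x7f0af217a2b4]"
)
print(frames_per_module(sample).most_common())
```

In this log the first frames after the signal handler are in libamdocl64.so, reached through clGetPlatformIDs. That suggests the crash happens inside the OpenCL driver while the platform is being enumerated, but this is only a reading of the trace and not a confirmed diagnosis.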
That's extremely odd - I have the exact same setup (tho with a Vivobook instead) and the junk shop scene renders perfectly without issue (detailed earlier)
When you run clinfo there should be a few lines mentioning "Device Name". Copy one of those and send it. For reference, mine is:
AMD Radeon(TM) Vega 10 Graphics (RAVEN, DRM 3.39.0, 5.9.9-arch1-1, LLVM 11.0.0)
Also, check for anything relevant when running blender from the command line, if any. Try running LibreOffice too, as I think it uses OpenCL.
COULD be some OEM shit as I've heard certain Lenovo laptops don't like working very well with Linux, plus it could just be a difference in the manufacturing dates or something
@sperg512
I got your message from the other aur package thread.
Almost right after I wrote that I switched to this package, and unfortunately I'm still getting the same problem: Blender is segfaulting any time I set my Vega 10 to be used with OpenCL in the renderer.
I can provide any information you need to try to narrow the issue down.
Specs: Lenovo Flex 14-API 81SS000BUS, Ryzen 7 3700U (Zen+), Radeon RX Vega 10 (Picasso, GCN 5), Arch
EDIT Also wanted to say that in my case the Vega 10 is an iGPU; it's on the same die as the CPU.
Well this is weird. I unarchived and then decompressed all the deb files and checked the libraries it produced. None of the remaining ones not already included here are related to OpenCL or ROC*.
However, @coxackie I might have an idea. The release notes say it requires kernel 5.9 - do you have that?
Also, I noticed there's a clinfo-providing deb file in there. Would it be a good idea to provide and conflict clinfo? Thought it would be helpful since it's good for debugging CL issues.
If everyone's okay with it, I'll add it, but either way I'm also gonna add linux>=5.9 to the requires list because that's what I believe to be the source of @coxackie's problem.
In addition, rocm-device-libs doesn't seem to get its files copied over, I'll fix that too. (though I don't know where to actually put the files--doing /opt/amdgpu for now)
What I was thinking is that it maybe has its own version of LLVM specific for Radeons but I don't know. I'll be home in about 2-2.5 hours, so like I said I'll check the data for those deb files, to see if it has its own libraries
@sperg512 @luciddream as I mentioned, I will be happy to test combinations to see what could potentially work. I do have the llvm and clang arch packages installed - in fact, they are required by opencl-mesa.
Most of these files are provided by other packages though. Specifically llvm-libs and clang - even after I deleted clang, clinfo continues to work.
There are 2 .deb files that were not extracted: llvm-amdgpu-pro-rocm_11.0-1164792_amd64.deb and libllvm-amdgpu-pro-rocm_11.0-1164792_amd64.deb (not counting the dev ones). Maybe those have necessary files as they are LLVM things. In a bit I'll make a PKGBUILD that uses those as well and see if it works.
Like @sperg512 said there are some extra files that I haven't used in my PKGBUILD. This is the complete list of the files in the lib directory on Ubuntu after a headless OpenCL installation.
AMDGPU:
libdrm_amdgpu.so.1 libdrm.so.2 libhsakmt.so.1 libkms.so.1 libdrm_amdgpu.so.1.0.0 libdrm.so.2.4.0 libhsakmt.so.1.0.6 libkms.so.1.0.0
AMDGPU-PRO:
libamd_comgr.so.1 libamdhip64.so.1 libhsa-runtime64.so.1 libOpenCL.so.1.2 libamd_comgr.so.1.7.0 libamdhip64.so.1.5.19245 libhsa-runtime64.so.1.2.0 libamdhip64.so libamdocl64.so libOpenCL.so.1
edit: tbh I don't see something missing. Maybe libOpenCL but this is provided by ocl-icd and it was working with 20.40
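To compare an installed system against that listing, something like the following Python sketch could report which of the expected files are absent from a library directory. The file names are a subset of the list above, `missing_files` is a made-up helper name, and the demonstration runs against a scratch directory rather than the real /usr/lib.

```python
from pathlib import Path
import tempfile

# File names taken from the listing above; extend the list as needed.
EXPECTED = [
    "libamdocl64.so",
    "libamd_comgr.so.1",
    "libhsa-runtime64.so.1",
    "libhsakmt.so.1",
]


def missing_files(lib_dir, names=EXPECTED):
    """Return the expected library names that are absent from lib_dir."""
    root = Path(lib_dir)
    return [name for name in names if not (root / name).exists()]


# Demonstration against a scratch directory instead of the real /usr/lib.
with tempfile.TemporaryDirectory() as scratch:
    for name in ("libamdocl64.so", "libhsakmt.so.1"):
        (Path(scratch) / name).touch()
    demo_missing = missing_files(scratch)

print(demo_missing)
```

On a real install the call would be `missing_files("/usr/lib")`. A dangling symlink counts as missing here, because `Path.exists()` follows links.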
@sperg512 if there is any way I can help by testing, let me know...
@apaz well, the 5700 (no XT) seems to be unhappy with this change. Is there anything else that you guys have installed that may be helpful? I also have opencl-mesa and all dependencies there (but it has other problems); but the card is not detected independently of whether I have opencl-mesa installed.
@coxackie that's exactly what I feared. I think the ROCr stuff is just overall lighter than PAL, but there were some other files that I might add if I can find any that might be necessary
@coxackie A lot has changed: they switched to the ROCr backend, which is part of ROCm. Until now amdgpu-pro and ROCm were incompatible. Strange though, with 5700XT it works without problems both for me and luciddream.
RX 5700 here. Latest update does not detect card (Number of devices 0), no OpenCL. Had to revert back to previous version. What happened - seems like the package lost a lot of weight upon update. Maybe something was missed?
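For anyone posting results, a quick way to summarise a clinfo dump is to extract the "Number of devices" and "Device Name" lines. Below is a minimal Python sketch. It assumes the two-column layout of the Arch clinfo package seen in the excerpts in this thread, and both helper names are made up.

```python
import re


def device_counts(clinfo_text):
    """Return every 'Number of devices' value, one per platform, in order."""
    pattern = r"^\s*Number of devices\s+(\d+)\s*$"
    return [int(count) for count in re.findall(pattern, clinfo_text, re.M)]


def device_names(clinfo_text):
    """Return the values of all 'Device Name' lines."""
    return re.findall(r"^\s*Device Name\s{2,}(.+?)\s*$", clinfo_text, re.M)


# Assembled from fragments quoted in this thread; not a complete clinfo dump.
broken = """\
Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0
"""
working = """\
Number of devices                                 1
  Device Name                                     AMD Radeon(TM) Vega 10 Graphics (RAVEN, DRM 3.39.0, 5.9.9-arch1-1, LLVM 11.0.0)
"""

for label, text in (("broken", broken), ("working", working)):
    print(label, device_counts(text), device_names(text))
```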
Blender works for me too (5700xt), but if it's in the release notes, it must be there for a reason. I made some rendering tests here: https://imgur.com/a/YzyQElw
I tried to find rocm stuff to test but I couldn't. I think Tensorflow-rocm for example requires the HIP compiler, which the runtime does not include. When I try to run it I get: ImportError: libhipsparse.so.0: cannot open shared object file: No such file or directory
Maybe I will try Hashcat which has different code for rocm and amdgpu, and compare it with previous results.
Also @agapito make sure you've got LLVM and its libraries installed, though you should (as it's a dependency of vulkan-radeon, clang and mesa which are basically required to do any 3d rendering). Maybe you also need opencl-mesa. If you have both of those/you install them and it still doesn't work, then I might need to also add the LLVM libs deb file.
Interesting. For me, it opens and renders fine, though it is a bit laggy. That might be because it's using up all my Vega's VRAM and it probably needs to borrow some strength from the weaker dGPU. It also uses anywhere from 1.8-2.1 GB of RAM according to htop, and my Vega uses that same amount of RAM as its VRAM (although it's not visible to programs). I might test some ROCm stuff later if possible. No errors in CLinfo, etc. either so I think it's safe to say this works.
No problem, by the way, luciddream.
Hello, thanks sperg512 for the co-maintain. I'll take another look later in the day but from the AMD release notes, Blender is a known issue with the driver.
Blender 2.90.1 users may experience page faults (details in dmesg log).
My experience is that it's better to keep the new driver than revert the package. Users that need blender can still use the previous PKGBUILD to install an older driver.
Installed and everything is OK: clinfo reports no errors; Junk shop scene in Blender is OK. I have AMD RX 5700XT. Thanks!
@kode54 I downloaded that scene for testing purposes, reinstalled the 20.40 driver (I had a backup on my HDD) and the GPU tile never finishes its work. My card is 4GB only so that's probably the reason, but Blender doesn't crash and the GPU works fine on less complex scenes like fishy cat or racing car. With 20.45, every time I try to load a "gpu project" Blender crashes.
@agapito Interesting experience, considering I never could get anything newer than 19.50 to work with the complex Blender scene I used to use to verify everything is in order. Specifically, I'd be using the Junk Shop scene, which was the splash screen for 2.81, available from Blender's site full of demo files for all their splash screens. It rolls in at roughly a 400MB download, and probably needs an 8GB or larger card and almost as much system memory free as well.
I also know that the open source ROCm OpenCL driver never worked for me with Blender. Crash every time.
Blender crashes using a Polaris card. It was fine using 20.40 driver.
[ALSOFT] (EE) Failed to set real-time priority for thread: Operation not permitted (1)
/run/user/1000/gvfs/ non-existent directory
LLVM triggered Diagnostic Handler: Illegal instruction detected: VOP instruction violates constant bus restriction
renamable $vgpr4 = V_CNDMASK_B32_e32 32768, killed $vgpr5, implicit killed $vcc, implicit $exec
LLVM failed to compile shader
radeonsi: can't compile a main shader part
LLVM triggered Diagnostic Handler: Illegal instruction detected: VOP instruction violates constant bus restriction
renamable $vgpr2 = V_CNDMASK_B32_e32 32768, killed $vgpr2, implicit killed $vcc, implicit $exec
LLVM failed to compile shader
radeonsi: can't compile a main shader part
It seems to install properly, I'll test some OpenCL stuff tomorrow night. Thank you very much, appreciate it!
It works fine for me but I'm super tired and can't check if I've done anything wrong. I've uploaded the PKGBUILD here for now.
Judging by the OpenCL tests on Phoronix, it seems that ROCm now works on the RX 5700 XT as well. Performance is lower than RDNA2 but no problems are reported: https://www.phoronix.com/scan.php?page=article&item=amd-rx6800-opencl&num=1
Well Rocm isn't supposed to work with 5700 XT yet. But some people say it's working for them. I will take a better look in a couple of hours unless you manage to make it work first.
I've compiled a list with the new files and the deleted ones, if it helps anyone
NEW:
amdgpu-pro-rocr-opencl_20.45-1164792_amd64.deb
comgr-amdgpu-pro_1.7.0-1164792_amd64.deb
comgr-amdgpu-pro-dev_1.7.0-1164792_amd64.deb
hip-rocr-amdgpu-pro_20.45-1164792_amd64.deb
hsakmt-roct-amdgpu_1.0.9-1164792_amd64.deb
hsakmt-roct-amdgpu-dev_1.0.9-1164792_amd64.deb
hsa-runtime-rocr-amdgpu_1.2.0-1164792_amd64.deb
hsa-runtime-rocr-amdgpu-dev_1.2.0-1164792_amd64.deb
libllvm-amdgpu-pro-rocm_11.0-1164792_amd64.deb
llvm-amdgpu-pro-rocm_11.0-1164792_amd64.deb
llvm-amdgpu-pro-rocm-dev_11.0-1164792_amd64.deb
opencl-rocr-amdgpu-pro_20.45-1164792_amd64.deb
opencl-rocr-amdgpu-pro-dev_20.45-1164792_amd64.deb
rocm-device-libs-amdgpu-pro_1.0.0-1164792_amd64.deb
DELETED:
opencl-amdgpu-pro_20.40-1147286_amd64.deb
opencl-amdgpu-pro-comgr_20.40-1147286_amd64.deb
opencl-amdgpu-pro-dev_20.40-1147286_amd64.deb
opencl-amdgpu-pro-icd_20.40-1147286_amd64.deb
hip-amdgpu-pro_20.40-1147286_amd64.deb
I think the ROC* binaries are for Vega 10 and up, so I think we could both verify them: mine is the oldest GPU they support and yours is one of the newest, so that should probably be enough testing. Other than that, if they didn't work properly with RDNA2 there'd probably be a bunch of posts on Reddit about it. So far I have yet to see any, and in fact I've only seen people saying that the ROCm support is nearly perfect.
I assumed you meant like testing if they work but if you meant something else then tell me
Another issue is that if the ROCm binaries are now included in the archive, maybe the opencl-amd package should also take that into consideration, and not only work with the old files. I guess only someone with a 6800XT could test it properly, but maybe there is a way to verify the ROCm binaries with my 5700XT.
Alright, I believe I found the correct files and changed them accordingly, but oddly, when it installs, the package is almost 100 MiB smaller than before. It could be that they just made the drivers take up considerably less space, but I'm not sure. I'll check and compare file sizes to the 20.40 version and see if they actually add up properly. I'll also post my PKGBUILD so you can test it before I push the new one to master.
I'm experimenting with the new drivers on my PC, I will post the PKGBUILD somewhere if I manage to make it work - although my time is limited tonight.
The download is now completely different, and just changing the version numbers no longer works. It seems many files had their names changed to include things like "rocr", without any "pal". The driver installation page also seems not to have been updated. I don't know how to install this properly (I don't have a Debian system to see what files it uses), so does anyone know what I actually need to do? I think it could be that the "rocr" drivers are the same as pal, but I've got no clue.
Link is https://drivers.amd.com/drivers/linux/amdgpu-pro-20.45-1164792-ubuntu-20.04.tar.xz
Correction because my IQ is negative: it's only reporting the Vega as an APU, not a GPU, because of the dedicated Picasso GPUs that I believe were in the most recent Vega chipsets. It's seeing the Vega as part of the CPU, not as its own GPU, unlike with Intel's integrated graphics. I wonder why it doesn't report the dGPU tho, because Picasso for me uses the amdgpu drivers... I don't know. AMD drivers are weird.
Edit as of 20.45: it now properly reports it, in my case as AMD Radeon(TM) Vega 10 Graphics, confirming this was indeed an issue with AMD's blobs. OpenCL shit still works perfectly too.
OpenCL code works perfectly and I didn't actually have any issues with it. So I think it's just the driver isn't properly reporting the Vega as a GPU.
I mean, I've written lots of code on my phone, mostly python and some bash, but I'm not even gonna try writing C++ or C lol. But yeah, once I'm home (in 4 or 5 hours from now), I'll test it out and post my results here.
I will say, that I won't be able to write any super advanced opencl, but it should be good enough to test...
Hey..... not writing any code on your phone???? What's wrong with you? LOL
Termlinux app and ed as perfect editor and on you go!
Thanks for your time. I appreciate that!
I don't. I'm not too sure what's going on. And blender seems to eat RAM up normally, even on my other PCs which use old Intel integrated GPUs.
I may look into it a bit later when I'm home. Not gonna write any OpenCL code on my phone lmao
I feared that...
And as a blob from AMD, it's theirs to fix, I guess...
Do you have a working configuration (OpenCL-related) with the rocm driver and the built-in APU?
If so, can you give some hints about it?
Ah, I see what you mean. I'm not sure how to fix, it looks like the same thing's happening to me. It might be a problem with the driver itself.
I think I should have been more specific: amd-opencl works fine so far, except that some Blender projects use far too much RAM. I'm talking about the Device Board Name field of clinfo, which identifies the "card". It would be great to see something like "builtin Vega 8 APU/GPU" instead.
[~]$ clinfo
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3143.9)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Platform Host timer resolution 1ns
Platform Extensions function suffix AMD
Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx902
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0 AMD-APP (3143.9)
Driver Version 3143.9 (PAL,HSAIL)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) Unknown AMD GPU
Device Topology (AMD) PCI-E, 03:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 8
Or are you referring to my OpenCL Rocm-issues?
@jiweigert To me that sounds like a hardware problem... Make sure it works correctly for intensive tasks (such as Vulkan rendering) and make sure you've got the right drivers, including AMDVLK(/vulkan-radeon) and AMDGPU. If that doesn't work then you've probably got a hardware issue, because this works perfectly with my 3700U and its corresponding Vega APU.
Hi,
Is there any way that my built-in Vega 8 APU can be correctly identified? My Vega 8 APU (gfx902) in a Ryzen 5 3500U is recognized as "Unknown AMD GPU".
Is there a way to patch new hardware IDs or a different detection method into the OpenCL source?
I also tried the rocm OpenCL driver, which correctly identifies the APU, but unfortunately its OpenCL stack is somehow broken and crashes my system, so I'm back on this package.
If you need any additional information about hardware IDs etc., I'm happy to provide info from my system.
Kind regards.
Jörn-Ingo Weigert
I don't think even the initial release of the Radeon Software for Linux supported anything older than GCN2. You'll need to downgrade your OS to Ubuntu 14.04 and install the Crimson software, or just live with the Mesa OpenCL driver.
Then again, 20.30 claims to support GCN4, but that's a near total failure. So I'm still using 19.50 myself. Anything newer, I just blindly release to appease the whims of anyone who would report the package as outdated, without actually having the means to test whether it's fully functional on anything recent.
Does 20.30 still support GCN1 or do I need to downgrade to 19.10?
@ArthurBorsboom works for me. I am using yay btw.
==> Making package: opencl-amd 20.30.1109583-1 (Tue 11 Aug 2020 11:32:56)
==> Retrieving sources...
  -> Found amdgpu-pro-20.30-1109583-ubuntu-20.04.tar.xz
==> Validating source files with sha256sums...
    amdgpu-pro-20.30-1109583-ubuntu-20.04.tar.xz ... FAILED
==> ERROR: One or more files did not pass the validity check!
error downloading sources: opencl-amd
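For anyone hitting the FAILED line above: makepkg compares each source file against the sha256sums array, and the same check can be done by hand. A minimal sketch, assuming sha256sum from coreutils is installed; the function name verify_sha256 is made up for illustration and is not part of makepkg.

```shell
# Hypothetical helper: compare a file's SHA-256 against an expected value,
# roughly what makepkg does for each entry in sha256sums.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<hash>  <filename>"; keep only the hash
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "Passed"
    else
        echo "FAILED (got $actual)"
        return 1
    fi
}
```

Called as verify_sha256 amdgpu-pro-20.30-1109583-ubuntu-20.04.tar.xz followed by the value from the PKGBUILD. If it fails on a file that was "Found" rather than downloaded, one possible cause (not confirmed here) is a stale or partial archive left in the build directory.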
It's working again, I just checked.
No, they haven't. I have no idea what you're talking about.
FYI, you're supposed to let makepkg download it, because it needs to supply a forged referrer to download it.
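For reference, a rough sketch of what that looks like outside makepkg. The wget flags and the referer URL are copied from the DLAGENTS line in the PKGBUILD; the function names are hypothetical, and whether AMD's server still accepts this referer is an assumption.

```shell
# Referer URL as used in the PKGBUILD's DLAGENTS line
REFERER='https://support.amd.com/en-us/kb-articles/Pages/AMDGPU-PRO-Driver-for-Linux-Release-Notes.aspx'

# Hypothetical helper: print the wget command line for a given URL (dry run)
referer_cmd() {
    printf 'wget --referer %s -N %s\n' "$REFERER" "$1"
}

# Hypothetical helper: actually download (needs wget and network access)
fetch_with_referer() {
    wget --referer "$REFERER" -N "$1"
}
```

referer_cmd only prints the command, so it is safe to run first to check the URL before calling fetch_with_referer.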
AMD has changed the download link so it is unable to download and compile from the link in the AUR.
@mirh thank you very much for your comment
GCN1 still works just fine. If anything, many programs require GPU_FORCE_64BIT_PTR=1 to work properly.
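In case it helps, a small sketch of setting that variable for a single program instead of globally. The wrapper name run_with_64bit_ptr is made up; only the variable name and value come from the comment above.

```shell
# Hypothetical wrapper: run one command with GPU_FORCE_64BIT_PTR=1 set
# in its environment only, leaving the rest of the session untouched.
run_with_64bit_ptr() {
    GPU_FORCE_64BIT_PTR=1 "$@"
}
```

For example, run_with_64bit_ptr clinfo or run_with_64bit_ptr blender.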
Fixed the issue with OpenCL not working on my system even with opencl-mesa. Darktable and LibreOffice work with OpenCL enabled.
@kode54 Yes it is, but DaVinci Resolve didn't work with opencl-mesa
and I edited your PKGBUILD to build version 19.20 :)
but it produces errors on darktable cltest
@Ahmedtas Something that old may well be supported by Mesa’s OpenCL driver.
@kode54 I don't know if this is something obvious but I have found that AMD have removed HD 7000,8000 from their compatibility list since 19.20 https://www.amd.com/en/support/kb/release-notes/rn-rad-lin-19-20-unified https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-20-20
Sorry for being rude, everybody. This should be fixed now. I didn’t notice the obvious 30MB reduction in package size, or lack of working PAL driver, since I don’t have the respective hardware.
@kode54 thanks for the answer! I'll try to find out what i've installed already and then check again. Thing is I can download this package, work, then uninstall it and then reboot works ok again. Maybe the ugliest workaround ever.
Best
@kode54 In the PKGBUILD, below the line "mv "${srcdir}/opencl/${shared}/libamd_comgr.so" "${pkgdir}/usr/lib/"", I added this line:
mv "${srcdir}/opencl/${shared}/libamd_comgr.so.1.6.0" "${pkgdir}/usr/lib/"
With that change it works under Manjaro.
@luciddream Thanks for the hint!
@kode54 I edited my other comment. I think the reason it's not working is that libamd_comgr.so in opencl-amdgpu-pro-comgr_2020-*.deb has been renamed to libamd_comgr.so.1.6.0 - and your PKGBUILD is copying the symlink only. Copying the right file and creating a symlink for it in the package should fix it.
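A rough sketch of that fix, assuming the file names mentioned in this thread. The function name install_comgr and its two arguments are made up for illustration; in a PKGBUILD the arguments would be something like "${srcdir}/opencl/${shared}" and "${pkgdir}/usr/lib".

```shell
# Hypothetical helper: copy the real versioned library, then recreate the
# unversioned name as a relative symlink pointing at it.
# $1 = directory with the extracted library, $2 = destination lib directory
install_comgr() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir"
    cp "$src_dir/libamd_comgr.so.1.6.0" "$dest_dir/"
    ln -sf "libamd_comgr.so.1.6.0" "$dest_dir/libamd_comgr.so"
}
```

The symlink target is relative, so it keeps working after the package contents are moved from the build directory to /usr/lib.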
Hi, I don't use Manjaro, or package for Manjaro. I'm going to place the blame squarely on Manjaro for this one.
I would also like to add that this update broke openCL detection in blender and clinfo for my 5700xt. And I too am on manjaro. 20.10 works fine tho.
Doesn't work with 5700 XT. CLinfo could not see the card. (Number of devices 0). Blender does not find the card either. My System: Manjaro
I'm back to version 20.10, clinfo and blender find my card again.
@kode54 Thanks for the quick update. I'm on Windows now so I can't test it, but I will in a while. One thing I noticed when I built it is that libamd_comgr.so had been renamed to libamd_comgr.so.1.6.0 - I had to create a symbolic link to make it work with my GPU, or else clinfo would not see it.
Blender also worked for me, but I only tried the bmw scene. What scene do you have issues with?
Update: It works on my Polaris GPU. But only for Luxmark. (from the official binaries) It doesn't work for that ridiculous Blender scene I used to get working with 19.50. Now it will appear to reach the render stage, then the Blender render window will become unresponsive, and the timer will freeze. top shows my CPU is doing something with it, and memory is being shuffled around. radeontop shows my GPU is working hard. Again, Blender unresponsive.
Cool, this thing now just maxes out my system memory and doesn't appear to do anything. I guess I'll just blindly push an update that won't work for anyone using a Polaris GPU, since this is literally the only option that works for Navi owners.
I won't be fielding any bug reports. Those should go entirely to the upstream bug tracker.
I'll be sticking to the rocm-opencl-runtime from now on for my own personal use, as while that uses a huge amount of system memory, at least it actually seems to be functional on Polaris hardware.
If anyone wants to buy me a Navi GPU to test on, I'll be happy to take your money.
I've manually built 20.20 and now Geekbench is running again with similar numbers to 19.50. Luxmark v4 works as well. (5700 XT)
@451farenheit It sounds as if you actually installed the full amdgpu-pro-installer package set instead, which this package only claims compatibility with, not a total replacement for.
That package has a libGL which will prevent your system from reaching a login manager or desktop in most cases.
It is also entirely possible that you have an OpenCL-using service that loads on boot and doesn't allow your machine to reach a desktop. In which case, there's probably no hope for a working OpenCL on your system until the upstream AMDGPU Pro package is updated to address the issue.
Please see the Freedesktop.org GitLab issue tracker for AMDGPU Pro, there may be something relevant there.
Hi there!
Excuse me if this is not the right place to post this. My system freezes after the "/dev/sdax XXXXX/xxxx files, xxxx/xxxx blocks" line if this package is installed. Everything works fine without it, but I need this package (or so I think) to use my GPU in Blender. System: cpu=15-5675c gpu=RX 5500XT 8GB ram Arch up to date
I have seen many posts about freezes during boot, but none relate to this package, or I haven't found any. While frozen I can switch to a TTY, so I think it's something driver-related that breaks the graphical environment, but I have no idea what could be incompatible or conflicting.
Any help is very much appreciated.
Edit: could opencl-mesa conflict with opencl-amd?
@luciddream: sadly, I cannot recommend you try rocm-opencl-runtime, as it still does not support Navi. Best I can tell you is to report the issue on the freedesktop.org gitlab, under the drm section for amdgpu.
Hey, I updated today and tried Geekbench but it crashes my PC (5700 xt). I'm also using latest mesa 20.0.6-2. Luxmark worked fine though.
@kode54 I not only ran that, but I also ran the entire Phoronix OpenCL Test Suite. Everything works fine. It seems it might be related to lack of free RAM in your system indeed (as you mentioned in your most recent comment), since I had less than 1 GB of RAM allocated to anything during my tests, leaving about 15 GB of RAM to tests, including the Blender render.
Package updated, and I have a feeling that it has the same problem that rocm-opencl-runtime just had on my current setup: Insufficient system RAM for the huge Blender project I attempted to render. With the rocm-opencl-runtime, that scene uses over 5GB of RAM to render, and that's not counting GPU RAM. It rendered in just over 7 minutes, much slower than when I had my full 16GB dedicated to a bare metal installation. I'm going to say this will probably improve with my future plan to upgrade to 32GB of RAM, where I will have a whole 24GB dedicated to my desktop VMs instead of the current 10GB.
I will update the package, if only because I am attempting to switch to rocm-opencl-runtime instead.
@kescherAUR Please also try the latest Blender, and for a test, try rendering the scene from this demo file:
https://cloud.blender.org/p/gallery/5dd6d7044441651fa3decb56
My Gnome session locks up the moment it reaches the part about compiling the kernels and uploading them, before it manages to display any rendering work. I have to switch to a TTY and killall blender, after which the Gnome session starts to respond again.
Package built with patch in last comment of mine applied works fine on my machine (using an RX 480). No lockups or anything.
@kode54: I'm not sure how it is, but if the official drivers are rarely updated, then it would be smart to check the ROCm version first, because it's in constant development over git (talking about the fix for pre-Vega cards).
While this discussion is ongoing, if anyone wants to update their opencl drivers with this package, here's a git diff.
diff --git a/PKGBUILD b/PKGBUILD
index ee14d4b..09cc0f2 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -5,8 +5,8 @@
pkgname=opencl-amd
pkgdesc="OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
-pkgver=19.50.967956
-pkgrel=5
+pkgver=20.10.1048554
+pkgrel=1
arch=('x86_64')
url='http://www.amd.com'
license=('custom:AMD')
@@ -19,13 +19,13 @@ DLAGENTS='https::/usr/bin/wget --referer https://support.amd.com/en-us/kb-articl
prefix='amdgpu-pro-'
postfix='-ubuntu-18.04'
-major='19.50'
-minor='967956'
-amdver='2.4.99'
+major='20.10'
+minor='1048554'
+amdver='2.4.100'
shared="opt/amdgpu-pro/lib/x86_64-linux-gnu"
-source=("https://drivers.amd.com/drivers/linux/${major}/${prefix}${major}-${minor}${postfix}.tar.xz")
-sha256sums=('d8bb480c72b4225ad864c60335d33254ce7d442590e8dd9c05659cc868b7be2f')
+source=("https://drivers.amd.com/drivers/linux/${prefix}${major}-${minor}${postfix}.tar.xz")
+sha256sums=('7cbd666f9dd3e25a7bd8332a2693cabae2c9b05afe00d286ef7120f38d0335f4')
pkgver() {
echo "${major}.${minor}"
@pix3l good that they’re reading my comments
Regarding the provided, this package provides the same files, but installs them differently. This package installs them to /usr/lib, while the amdgpu-pro-installer package installs them to /opt/amdgpu-pro/lib/* and sets system loader path configuration files to search there before the common system locations.
@kode54: https://aur.archlinux.org/cgit/aur.git/commit/?h=hsa-rocr&id=b68e99b242b64917bc17673532a981e56dc2a219 so they are probably reading your comments. As for splitting the package, that's up to the package maintainer. It's done to save space on the HDD, because a user usually needs only one of them.
@BS86: Have you changed anything other than the version/checksums? I've got a VII, but the machine is disassembled and the motherboard needs solder work, so I can't test it myself for now (but I'm following the discussion and trying to prepare the system for all useful tasks ;-)
BTW, do I understand correctly that by installing opencl-amdgpu-pro-comgr=19.50.967956, opencl-amdgpu-pro-orca=19.50.967956, opencl-amdgpu-pro-pal=19.50.967956 I will get identical binaries installed? If so, what's the purpose/benefit of this package? Does it mean that the only real alternative to install is rocm-opencl-runtime?
@BS86 As have I, and I don't have a Vega or newer GPU to test the pal driver. I only have an RX 480, and the orca driver is broken there. Should this package be split into two packages, like amdgpu-pro-installer does?
Last I checked, the only major reason for the upgrade was to support the 5600 series Navi cards.
@kode54 I have modified your PKGBUILD to use the 20.10.1048554 release, and at least with my Vega64, I see no issues. Desktop boots fine and foldingathome from the AUR also detects the OpenCL card and runs on the GPU.
I’ve tested rocm-opencl-runtime, and the only takeaway I’ve found is that it’s completely non functional. It will emit an error about the host having run out of memory, every time. It’s also apparently being built with debugging enabled, hence it also emits warnings and errors to the console while it attempts to work.
I wonder how it relates to: https://aur.archlinux.org/packages/rocm-opencl-runtime/
I've got one machine with a Radeon VII, and the others are Intel-based laptops. I try to keep one image on all machines, so up to now I've used opencl-amd, because after installing opencl-amdgpu-pro- I got crashes on the Intel machine. Now with rocm-opencl-runtime I understand I've got a third alternative, but I'm not sure how they compare (I know that amdgpu-pro- are official and opencl-amd provides only the OpenCL part of amdgpu-pro), so I'm more interested in a comparison of rocm-opencl-runtime vs opencl-amd: what benefits it gives and what problems it causes (especially when it's installed but used on Intel cards).
Package verified, 20.10 still causes GPU faults on RX 480.
Reported upstream: https://gitlab.freedesktop.org/drm/amd/-/issues/1101
I will not release a 20.10 package, because it is an unstable beta, and the only opencl driver which works on my machine, Orca, crashes outright if used from the 20.10 package.
E: I’ll investigate the 20.10 release that hit yesterday, but I’m not expecting a miracle.
@trougnouf Thanks, I’ll add that package to the conflicts list, since these two packages really shouldn’t be used at the same time anyway. (Also good luck even getting that one running, I find that it just results in OpenCL device out of memory errors)
This conflicts with rocm-opencl-runtime (both provide /etc/OpenCL/vendors/amdocl64.icd)
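To see which ICD files are currently installed and which library each one points at, something like this sketch may help. It assumes the usual ICD layout (one library name or path per .icd file); the function name list_icds is made up.

```shell
# Hypothetical helper: list every .icd file in an OpenCL vendors directory
# together with the library named on its first line.
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for f in "$dir"/*.icd; do
        # Skip the unexpanded glob when the directory has no .icd files
        [ -e "$f" ] || continue
        printf '%s -> %s\n' "$(basename "$f")" "$(head -n 1 "$f")"
    done
}
```

On Arch, pacman -Qo /etc/OpenCL/vendors/amdocl64.icd then shows which installed package owns the file.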
I rescinded my deletion request. I have resumed my role in updating this package. My first order of business was declaring the conflicts and substitutes for the relevant sub packages of amdgpu-pro-installer, which does include a libgl package, and from my own testing, that particular sub package breaks my system. Also, the parts of that package may conflict with this package, especially if they are sourced from different versions of the AMDGPU Pro package than this one is.
This package works fine with mesa, and the other packages seem a bit outdated? please don't delete this.
I don't see any libGL files get created. What libgl packages are you referencing @kode54? Thanks for the help so far!
Technically, this package is redundant compared to the relevant amdgpu-pro-installer submodules, and their packages have separate options for either orca or pal OpenCL libraries, as one is for pre-Vega devices, and the other is for Vega and newer.
Again, this package shouldn't be a problem when installed alone, but I'd really recommend using their packages, and for now, avoiding the libgl submodules, as they seem to break desktop sessions, including preventing the display manager from starting again after installing. Easily fixed from a TTY by removing the libgl packages and restarting the display manager.
Works great for me (both opencl and X are working). Maybe just marking 'amdgpu-pro-installer' as conflicting package would be nice. GPU is 5600XT. Let me know if I can help with some debug.
As this is a bug beyond my control as a simple packager, I’ll just submit a request for deletion.
I have the same problem. It works fine but after a reboot X fails to start.
Log file: https://paste.rs/BcU
I'm finding the same thing. X fails to start for me also.
This works great for folding@home until I want to reboot. Then X fails to start for me. I use amdgpu otherwise. Once I uninstall this package, X will start. Any way to prevent this from happening?
@ipha Thanks for that.
The pal driver for vega and newer needs an extra lib from opencl-amdgpu-pro-comgr.
Here's an updated pkgbuild: https://gist.github.com/ipha/5ad44023b1fc943bf83a46d256d9a371
Criminy cripesakes. I'll just fix that. I thought I missed something. Does that warrant a new patch version, if the old version doesn't even download properly?
Downloading the driver is broken for me.
The correct URL is: https://drivers.amd.com/drivers/linux/19.50/amdgpu-pro-19.50-967956-ubuntu-18.04.tar.xz
But source URL is: source=("https://drivers.amd.com/drivers/linux//${prefix}${major}-${minor}${postfix}.tar.xz")
Note the two consecutive slashes (//): the path should contain 19.50 between them but it doesn't. This package needs to be fixed. It is missing "${major}" between the two slashes.
The working version is: source=("https://drivers.amd.com/drivers/linux/${major}/${prefix}${major}-${minor}${postfix}.tar.xz")
Please fix it. Thank you for maintaining the package.
Edit: It is now fixed. Thanks.
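To make the difference concrete, here is how the two source lines expand. The variable values are taken from the 19.50 PKGBUILD in this thread; the names broken and fixed are just for illustration.

```shell
# Values as set in the 19.50 PKGBUILD
prefix='amdgpu-pro-'
postfix='-ubuntu-18.04'
major='19.50'
minor='967956'

# The source line without ${major} in the path leaves an empty path segment
broken="https://drivers.amd.com/drivers/linux//${prefix}${major}-${minor}${postfix}.tar.xz"
# The working source line puts the major version between the slashes
fixed="https://drivers.amd.com/drivers/linux/${major}/${prefix}${major}-${minor}${postfix}.tar.xz"

printf 'broken: %s\nfixed:  %s\n' "$broken" "$fixed"
```

Note that the 20.10 diff posted elsewhere in this thread drops the ${major} directory again, so the right form apparently depends on the release.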
Ah. The release notes for the 19.50 drivers do list RX 5500 and 5700 cards, maybe they don't support the 5600 yet? Or the mobile variants, for that matter.
I'll push an updated PKGBUILD momentarily.
I think it's my Navi 10 card that's still not well supported. From the "clinfo" I posted you can see that it does not recognize any openCL device. The only message I can get is:
"No compatible path tracking GPUs found. Cycles are rendered on the CPU"
I think I'll have to wait for Ubuntu LTS to be released and then the amdgpu-pro driver adaptation by AMD and then in AUR, before getting compatibility with my card. Thank you for the link.
It works with Blender here. Are you sure you have a whitelisted GPU? I have an RX 480 8GB, specifically the Asus ROG Strix O8G. Blender whitelists GPUs and OpenCL drivers that they know will work properly, and blacklist all the rest. You need to start it with an environment variable to make it ignore the blacklist, at the peril of causing it to possibly crash, or even crash your GPU.
Refinement of the PKGBUILD patch here. I decided to make the AMD GPU driver version a variable, so it can easily be edited when changing the major/minor version numbers as well.
https://gist.github.com/kode54/3038ddc986ea20042daf998c69dd979e
(I'm a newbie) I tried to modify the PKGBUILD locally to use the recent OpenCL 19.50 from amdgpu-pro. Clinfo reports:
If you want to try it, the modified pkgbuild is at:
Anyway, Blender still doesn't recognize the opencl. So for me it's useless.
@Nisc3d He is using the proprietary binaries from AMD (read carefully and you see he build and installed amdgpu-pro-installer, and not this package)
I am also unable to get OpenCL working with Navi (5700 XT) by only installing this package. @cruncher1 Can you share exactly what you did?
I'm disowning this package, as I currently don't have the time to maintain it. I'm sure someone else could adopt it and do a better job than I recently did
Updated to 19.30.934563
# Maintainer: grmat <grmat@sub.red>
pkgname=opencl-amd
pkgdesc="OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
pkgver=19.30.934563
pkgrel=1
arch=('x86_64')
url='http://www.amd.com'
license=('custom:AMD')
makedepends=('wget')
depends=('libdrm' 'ocl-icd' 'gcc-libs')
conflicts=('amdgpocl')
provides=('opencl-driver')
DLAGENTS='https::/usr/bin/wget --referer https://support.amd.com/en-us/kb-articles/Pages/AMDGPU-PRO-Driver-for-Linux-Release-Notes.aspx -N %u'
prefix='amdgpu-pro-'
postfix='-ubuntu-18.04'
major='19.30'
minor='934563'
shared="opt/amdgpu-pro/lib/x86_64-linux-gnu"
source=("https://drivers.amd.com/drivers/linux/${prefix}${major}-${minor}${postfix}.tar.xz")
sha256sums=('b97f0e31a9ca01971b1855e8e191fa825d538f7941331d3b15bc46474dde50f6')
pkgver() {
  echo "${major}.${minor}"
}

package() {
  # Unpack the two OpenCL .deb archives
  mkdir -p "${srcdir}/opencl"
  cd "${srcdir}/opencl"
  ar x "${srcdir}/${prefix}${major}-${minor}${postfix}/opencl-amdgpu-pro-icd_${major}-${minor}_amd64.deb"
  tar xJf data.tar.xz
  ar x "${srcdir}/${prefix}${major}-${minor}${postfix}/opencl-orca-amdgpu-pro-icd_${major}-${minor}_amd64.deb"
  tar xJf data.tar.xz
  cd "${shared}"
  # Make the Orca driver reference the renamed libdrm copy
  sed -i "s|libdrm_amdgpu|libdrm_amdgpo|g" libamdocl-orca64.so

  # Unpack AMD's libdrm and rename it to libdrm_amdgpo
  mkdir -p "${srcdir}/libdrm"
  cd "${srcdir}/libdrm"
  ar x "${srcdir}/${prefix}${major}-${minor}${postfix}/libdrm-amdgpu-amdgpu1_2.4.98-${minor}_amd64.deb"
  tar xJf data.tar.xz
  cd "${shared/amdgpu-pro/amdgpu}"
  rm "libdrm_amdgpu.so.1"
  mv "libdrm_amdgpu.so.1.0.0" "libdrm_amdgpo.so.1.0.0"
  ln -s "libdrm_amdgpo.so.1.0.0" "libdrm_amdgpo.so.1"

  # Move the ICD config and the libraries into the package directory
  mv "${srcdir}/opencl/etc" "${pkgdir}/"
  mkdir -p "${pkgdir}/usr/lib"
  mv "${srcdir}/opencl/${shared}/libamdocl64.so" "${pkgdir}/usr/lib/"
  mv "${srcdir}/opencl/${shared}/libamdocl-orca64.so" "${pkgdir}/usr/lib/"
  mv "${srcdir}/opencl/${shared}/libamdocl12cl64.so" "${pkgdir}/usr/lib/"
  mv "${srcdir}/libdrm/${shared/amdgpu-pro/amdgpu}/libdrm_amdgpo.so.1.0.0" "${pkgdir}/usr/lib/"
  mv "${srcdir}/libdrm/${shared/amdgpu-pro/amdgpu}/libdrm_amdgpo.so.1" "${pkgdir}/usr/lib/"

  # Link the system amdgpu.ids into /opt/amdgpu/share/libdrm
  mkdir -p "${pkgdir}/opt/amdgpu/share/libdrm"
  cd "${pkgdir}/opt/amdgpu/share/libdrm"
  ln -s /usr/share/libdrm/amdgpu.ids amdgpu.ids

  # Remove the extraction directories
  rm -r "${srcdir}/opencl"
  rm -r "${srcdir}/libdrm"
}
I have created ubuntu version of this in bash: https://gist.github.com/tuxutku/79daa2edca131c1525a136b650cdbe0a
There is a software conflict between foldingathome and the opencl-amd program. When it is installed, the program no longer works.
I haven't had a chance to go through this latest driver. OpenCL is working on Navi in Arch after building the packages from amdgpu-pro-installer and only installing the OpenCL packages.
@cruncher1 Interesting, I don't know much about the internals at all, but I'm having a problem writing powerplay tables to my Vega. The changes show up in the pp tables (pp_od_clk_voltage), but not in amdgpu_pm_info. Not sure what I'm doing wrong.
@vanities This package builds, but, at least for me, Navi GPU (5700XT) is not recognized as an OpenCL device. Works fine for Radeon VII, Vega, and Fury. Tested with Arch kernel and mainline kernel. I will debug the package, and see what else was changed, other than the libdrm folder, in the Debian package. OpenCL working on 5700XT in Ubuntu LTS using the AMD supplied driver.
Is there any reason we can't use 19.30? I have a patch ready for the PKGBUILD if you want to use it @grmat
Now 19.30-855429 is available for all other gpus and for other distributions. https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux
@G3Xg2xV6A 19.30-838629 was not removed, it was released with this url: https://www.amd.com/ru/support/kb/release-notes/rn-amdgpu-unified-navi-linux
I've been having segmentation faults as well with 19.30. Seems the 19.30 package was removed from the amd website.
Following changes for 19.20 at the right places in the PKGBUILD, maybe works. Didn't work for me. Built the package, but segfault.
pkgver='19.20.812932'
major='19.20'
minor='812932'
sha256sums=('c5376760ce15454c5ef5cef86571f3806114403d91b8a210629d2e927c98d852')
And a small suggestion: add "-p" to all mkdir commands. This prevents the "directory exists" failure when rebuilding the package.
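A quick way to see the difference, using a throwaway temp directory (the directory names here are arbitrary):

```shell
# Simulate a rebuild: the directory already exists on the second run.
work=$(mktemp -d)
mkdir "$work/opencl"

# Plain mkdir fails when the directory is already there
if mkdir "$work/opencl" 2>/dev/null; then plain=ok; else plain=failed; fi

# mkdir -p succeeds on existing directories and also creates missing parents
if mkdir -p "$work/opencl" "$work/a/b/c"; then with_p=ok; else with_p=failed; fi

echo "plain mkdir on rerun: $plain, mkdir -p on rerun: $with_p"
rm -r "$work"
```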
@grmat, OK, thanks for the explanation. But I think you could remove it now, as it is a bit confusing, and since it was done several years ago I doubt anyone is still affected. The readme doesn't contain the information you gave (that you did it so both versions can coexist), so I suggest adding it there and also removing the mention of amdgpocl.
@Ashark: amdgpocl was the name I've been using for this package before this opencl-amd, which suits the arch package names better. The closed driver includes another version of the libdrm. The free one doesn't contain all the functionality and is more frequently updated. The renaming is a workaround to be able to use the free libdrm with the free stack alongside this one with opencl-amd. See readme for more info
@francoism90: if clover (mesa) works for you, I'd prefer that. E.g. It doesn't for cycles (blender). The two don't conflict as they both use icd
@grmat can you please explain why this package conflicts with amdgpocl? I cannot see any package providing that. And why do you rename libdrm_amdgpu to libdrm_amdgpo? Edit: I explored the git history of this package and saw that it was previously named amdgpocl. In that case it should have used the replaces array instead of conflicts. I think it could be removed now. And the amdgpO naming has been there since the initial commit; I still do not understand why.
Well, I haven't benchmarked, but it did offer actually-working-with-blender-benchmark, which I've since learned does very bad things, being pre-2.79 Blender.
Also, this OpenCL runtime causes DaVinci Resolve to crash on startup, I haven't tested if installing opencl-mesa fixes that yet, since I got so pissed at it crashing once again that I uninstalled it and deleted the download package.
Any benchmark available against opencl-mesa? It doesn't seem to conflict either, does this package offer any features not provided with the open-source one?
Here is an updated PKGBUILD, with the current version included in the revision history for comparison:
https://gist.github.com/kode54/51ae68590ac4b2cd5c4ce85c0d71a3a3
@Olympus593 How do I append a null kernel to individual programs? I'm not sure what this means exactly.
@grmat same copy to host problem on SI and CI cards. The only known workaround is by appending a null kernel but it can only work for individual programs
I've updated to 18.50, but I don't own a CI card anymore, so please let me know if there are problems with older hardware (even better if you already know workarounds that could be included).
BTW, if you feel like the package updates are too slow and want to jump in as co-maintainer, just contact me.
Libdrm workaround is not needed anymore. I am using blender-git package and opencl-amd 18.50. I can render using CPU + GPU.
@Ashark Are you still using the libdrm workaround with this package? I upgraded this package using your PKGBUILD, but both SVP4 and Blender were crashing after the upgrade. So I tried launching without the libdrm workaround (LD_PRELOAD=/usr/lib/libdrm_amdgpo.so.1.0.0) and both apps started to work. You might want to try this if it was working before.
@Ashark AMD had not yet confirmed if OpenCL is fixed on SI cards. They used to have a list of supported APIs on their release notes but since 18.40, it does not show up anymore
It is confirmed. SI and CIK cards are having copy to host issues. This is most likely a problem with AMD as only OpenCL is suffering from problems when using experimental SI and CIK support.
Update: AMD's OpenCL implementation is crippled on GCN 1 cards as of v18.40. Guess GCN 1 owners will have to wait for AMD to fix the issue or revert back to Catalyst.
Here's my patch to get yourself version 18.40
Let me know if anyone finds any issues upgrading
To solve the blender/libdrm related issues you can use:
LD_PRELOAD=/usr/lib/libdrm_amdgpo.so.1.0.0
I found a better solution here: https://bugs.archlinux.org/task/60061
Upgrade your system, compile libdrm and just save libdrm_amdgpu.so.1.0.0 library and preload it to Blender. It works fine here.
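In case it helps, here is a minimal sketch of a wrapper for the preload workaround. `run_with_preload` is a hypothetical helper name, not something this package ships. All it does is refuse to start the program when the library file is missing, so a typo in the path doesn't silently launch Blender without the preload.

```shell
# Hypothetical helper: run a command with one extra library preloaded.
# $1 is the library path, the remaining arguments are the command to run.
run_with_preload() {
    lib=$1
    shift
    if [ ! -r "$lib" ]; then
        echo "preload library not found: $lib" >&2
        return 1
    fi
    # Keep any LD_PRELOAD entries that are already set.
    env LD_PRELOAD="$lib${LD_PRELOAD:+:$LD_PRELOAD}" "$@"
}

# Example (library path as used in this thread):
#   run_with_preload /usr/lib/libdrm_amdgpo.so.1.0.0 blender
```

The same function should work with a self-compiled libdrm_amdgpu.so.1.0.0 saved somewhere else; only the first argument changes.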
Does anyone know about the Blender issue
amdgpu_device_initialize: amdgpu_query_info(ACCEL_WORKING) failed (-9)
when using OpenCL and Cycles? Currently I'm running a downgraded libdrm (version 2.4.93-1) to work around it.
If this needs more work or investigation I'll happily help debug this, I think the problem is in AMD's upstream driver that opencl_amd uses. Please correct me if I'm wrong...
@linnaea only AMD can fix the issue but most likely they will never fully develop SI and CIK support.
@Olympus593 I just gave up and ended up using pci passthrough with VM running an older distro. I'm toying with cltorch here.
That crash is probably caused by the same problem causing OpenCL CTS to segfault. CTS segfaults randomly during async copy to host, and if it doesn't segfault, only the first several hundred bytes of the data can pass CTS verification so it's still worthless.
Feels like a timing issue to me, but there's not much anyone can do about that.
@linnaea I am able to run DaVinci Resolve with only one problem. When using power windows and GPU accelerated fusion, it crashes. Mostly because it does not even meet minimum standards when it comes to usability.
For the time being SI support is pretty much unusable for any purpose, it fails OpenCL Conformance Tests pretty spectacularly, failing 67 out of the 77 tests before segfaulting the test program during async copy test.
It even fails 9 out of the 11 basic fp/int math tests, and all of the type conversion tests.
And that's on CentOS 7, listed as supported by AMD.
Updated to 18.40 and latest linux and linux-firmware package and still very buggy SI support. Or maybe I have everything misconfigured.
@urbenlegend Does OpenCL work with the updated package if you hold back libdrm to ver 2.4.93?
EDIT:
Just tested the new PKGBUILD. The updated package works as long as libdrm is at ver 2.4.93. As long as AMD supplies libdrm at ver 2.4.92, the last backward-compatible version is upstream's libdrm 2.4.93, as hinted in the following line:
ar x "${srcdir}/${prefix}${major}-${minor}-${targetos}/libdrm-amdgpu-amdgpu1_2.4.92-${minor}_amd64.deb"
Just tried modifying PKGBUILD to get the latest 18.40 release. Blender still crashes with this driver and the latest libdrm.
Here's the modified PKGBUILD if anyone is interested:
# Maintainer: grmat <grmat@sub.red>
pkgname=opencl-amd
pkgdesc="OpenCL userspace driver as provided in the amdgpu-pro driver stack. This package is intended to work along with the free amdgpu stack."
pkgver='18.40.676022'
pkgrel=1
arch=('x86_64')
url='http://www.amd.com'
license=('custom:AMD')
makedepends=('wget')
depends=('libdrm' 'ocl-icd')
conflicts=('amdgpocl')
provides=('opencl-driver')
DLAGENTS='https::/usr/bin/wget --referer https://support.amd.com/en-us/kb-articles/Pages/AMDGPU-PRO-Driver-for-Linux-Release-Notes.aspx -N %u'
prefix='amdgpu-pro-'
major='18.40'
minor='676022'
shared="opt/amdgpu-pro/lib/x86_64-linux-gnu"
targetos='ubuntu-18.04'
source=("https://drivers.amd.com/drivers/linux/${prefix}${major}-${minor}-${targetos}.tar.xz")
sha256sums=('4f71f0a70d68a2d1714902855d1a5e8ccc454b1065182f904cf0f93862cac97c')
pkgver() {
  echo "${major}.${minor}"
}
package() {
  # Unpack both OpenCL ICD debs into one tree
  mkdir "${srcdir}/opencl"
  cd "${srcdir}/opencl"
  ar x "${srcdir}/${prefix}${major}-${minor}-${targetos}/opencl-amdgpu-pro-icd_${major}-${minor}_amd64.deb"
  tar xJf data.tar.xz
  ar x "${srcdir}/${prefix}${major}-${minor}-${targetos}/opencl-orca-amdgpu-pro-icd_${major}-${minor}_amd64.deb"
  tar xJf data.tar.xz
  # Make the ORCA driver load the renamed libdrm (equal-length replacement in the binary)
  cd ${shared}
  sed -i "s|libdrm_amdgpu|libdrm_amdgpo|g" libamdocl-orca64.so
  # Unpack AMD's libdrm build and rename the library to match
  mkdir "${srcdir}/libdrm"
  cd "${srcdir}/libdrm"
  ar x "${srcdir}/${prefix}${major}-${minor}-${targetos}/libdrm-amdgpu-amdgpu1_2.4.92-${minor}_amd64.deb"
  tar xJf data.tar.xz
  cd ${shared/amdgpu-pro/amdgpu}
  rm "libdrm_amdgpu.so.1"
  mv "libdrm_amdgpu.so.1.0.0" "libdrm_amdgpo.so.1.0.0"
  ln -s "libdrm_amdgpo.so.1.0.0" "libdrm_amdgpo.so.1"
  # Install the ICD files and the libraries
  mv "${srcdir}/opencl/etc" "${pkgdir}/"
  mkdir -p ${pkgdir}/usr/lib
  mv "${srcdir}/opencl/${shared}/libamdocl64.so" "${pkgdir}/usr/lib/"
  mv "${srcdir}/opencl/${shared}/libamdocl-orca64.so" "${pkgdir}/usr/lib/"
  mv "${srcdir}/opencl/${shared}/libamdocl12cl64.so" "${pkgdir}/usr/lib/"
  mv "${srcdir}/libdrm/${shared/amdgpu-pro/amdgpu}/libdrm_amdgpo.so.1.0.0" "${pkgdir}/usr/lib/"
  mv "${srcdir}/libdrm/${shared/amdgpu-pro/amdgpu}/libdrm_amdgpo.so.1" "${pkgdir}/usr/lib/"
  # Link the system amdgpu.ids into the path under /opt/amdgpu
  mkdir -p "${pkgdir}/opt/amdgpu/share/libdrm"
  cd "${pkgdir}/opt/amdgpu/share/libdrm"
  ln -s /usr/share/libdrm/amdgpu.ids amdgpu.ids
  # Clean up the extraction directories
  rm -r "${srcdir}/opencl"
  rm -r "${srcdir}/libdrm"
}
update: Wow, it worked! at least up to par with the other setup. But it also fixed the "kernels in a row" bug! Blender OTOH still doesn't even start, segfaulting at trying to find out the device:
AL lib: (EE) UpdateDeviceParams: Failed to set 44100hz, got 48000hz instead
amdgpu_device_initialize: amdgpu_query_info(ACCEL_WORKING) failed (-9) Writing: /tmp/blender.crash.txt
Segmentation fault (core dumped)
-- Here's the backtrace summary, a series of function calls leading to libdrm_amdgpo.so (with an O?)
blender(BLI_system_backtrace+0x34) [0x55e1e8b59b84] blender(+0xb7b8b2) [0x55e1e80e58b2] /usr/lib/libc.so.6(+0x37e00) [0x7fce6db09e00] /usr/lib/libdrm_amdgpo.so.1(amdgpu_get_marketing_name+0xc) [0x7fce485acbdf] /usr/lib/libamdocl-orca64.so(+0x8d871e) [0x7fce4908b71e] [trimmed]
-- I was getting that -9 return code on other OpenCL stuff too (-36 was another, related to not finding functions), it "can't find the device", notably from clBLAS (NB, no 't' at the end, clblast seems to be more forgiving with our OpenCL 1.1 hardware), but even that seems to be working now, although some results from the example code are a bit weird, like "-nan" instead of a float or too many 0.0000, which suggests rounding problems. (perhaps other constants to set in .bashrc?)
As for Blender, the only suggestion I might have is to disable cycles on compilation (there was a release note back when they finally split the kernels that openCL would only work on 2.78c[1], perhaps still true to SI, only advanced stuff like Baking would not be available, but you WOULD be stuck on that version) and use BlendLuxCore instead, which is just 'out of the oven' with a reboot, re-coded from scratch. All you have to do is download a zip addon. (Arch should have a no-cycles version? is there a cli parameter perhaps?)
Another thing I noticed was that CUDA was mentioned on the installation but I don't think this should be interfering with openCL.
[1] https://en.blender.org/index.php/OpenCL
Edit: added reference
@ArnaudNux I get that message if the Radeon driver is loaded instead of AMDGPU. Have you passed the Kernel Parameter Flags, with your bootloader? (check out the article, linked on previous posts)
@Olympus593 Same problem here, same SI chipset. On a different distro, where I managed to run openCL kernels, clblast example code (cache.c) produces a VERY similar symptom, whereas issuing another kernel to the hardware BEFORE CLEARING (flushing?) the cache breaks the program. The only difference I see is that distro is using the mesa-ocl interface, instead of this "amdocl-orca64" (I tried changing to the amdocl64.icd here in Arch, but didn't work at all, we SHOULD be able to use different "vendor icds" though).
I'll try the recommended bashrc exports to see if Arch gets to at least that same place, past that CL_OUT_OF_HOST_MEMORY.
libdrm and lib32-libdrm to 2.4.93-1
[root@AMD fah]# ./FAHClient --configure
22:40:42:INFO(1):Read GPUs.txt
amdgpu_device_initialize: DRM version is 2.50.0 but this driver is only compatible with 3.x.x.
No protocol specified
@Nightbane112 It sort of works but very unstable. Obviously, every time I need an app to use OpenCL I must launch it from a terminal to make OpenCL "sort of work"
In Blender, it takes 3 tries before it starts to render, but it is ridiculously slow (the CPU is several times faster). If you try to hit render again, Blender will crash.
On Resolve thumbnail previews are completely useless, playback and render does work, however when using power windows resolve crashes.
Still very buggy. I think it is because of the experimental SI support. I wish I didn't have to launch every app via terminal.
@Olympus593 From the looks of it, I don't think it's a driver problem. A bit of searching online led me to this issue on GitHub (https://github.com/hughperkins/tf-coriander/issues/74)
How about appending this to your bashrc file to see if it changes anything?
#!/bin/bash
export GPU_FORCE_64BIT_PTR=1
export GPU_USE_SYNC_OBJECTS=1
export GPU_MAX_ALLOC_PERCENT=100
export GPU_SINGLE_ALLOC_PERCENT=100
export GPU_MAX_HEAP_SIZE=100
@Nightbane112 I am using the amdgpu kernel driver and I have already enabled experimental support (I already double checked to make sure that the driver does load) but every time I try to use OpenCL (for example Blender Cycles Render) it ends with Error -6 CL_OUT_OF_HOST_MEMORY. Same error with DaVinci Resolve.
@Olympus593 Are you using the amdgpu kernel driver? You can try using the experimental support for your GPU series (https://wiki.archlinux.org/index.php/AMDGPU#Enable_Southern_Islands_.28SI.29_and_Sea_Islands_.28CIK.29_support)
From the official amdgpu-pro driver page (https://www.amd.com/en/support/kb/release-notes/rn-prorad-lin-18-30), I could see your GPU is supported but you might need to cross-check on the AMD forums.
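A rough sketch for checking whether the experimental support is actually requested on the kernel command line. It assumes the parameter names from the wiki section linked above (`radeon.si_support=0 amdgpu.si_support=1`, and the `cik` equivalents); `amdgpu_enabled_for` is a made-up function name. It will not notice options set through modprobe.d instead of the bootloader.

```shell
# Report whether a kernel command line hands SI or CIK cards to amdgpu.
# $1: family, "si" or "cik"
# $2: command line text (defaults to the running kernel's /proc/cmdline)
amdgpu_enabled_for() {
    family=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # radeon must give the family up ...
    case " $cmdline " in
        *" radeon.${family}_support=0 "*) ;;
        *) return 1 ;;
    esac
    # ... and amdgpu must claim it.
    case " $cmdline " in
        *" amdgpu.${family}_support=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   amdgpu_enabled_for si && echo "SI cards are handed to amdgpu"
```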
Do I need to downgrade Xorg or do some configuration? Mine always ends up with Error -6 in any app trying to use OpenCL. I have a GCN 1 card with support provided by the AMDGPU driver.
PSA: I removed my previous comment to avoid confusion, now that this package works with the current version (opencl-amd 18.3.641594).
@stoostranger Haha, seems like this package finally works, although it still needs the libdrm downgrade shenanigans. Thanks!
I tried downgrading the package without downgrading libdrm and lib32-libdrm, but Blender still segfaulted. Then I downgraded libdrm to 2.4.93-1 and installed opencl-amd 18.3.641594 again, and now the Blender Cycles renderer works perfectly. Hope this helps.
As some of you noticed 18.30.641594-1 does not seem to work with blender. Manual downgrading worked for me. Edit PKGBUILD to match the following values:
major='18.20'
minor='606296'
source=("https://www2.ati.com/drivers/linux/ubuntu/18.04/amdgpu-pro-18.20-606296.tar.xz")
sha256sums=('2a0716993e8efb1fadcb92d82e9328e344bdbc78769f5ff95298b82f49ff76f9')
Also change amdgpu1_2.4.92 to amdgpu1_2.4.91 and, as @Nightbane112 mentioned, downgrade libdrm and lib32-libdrm to 2.4.93-1.
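For the libdrm part, a small sketch that tells you whether the installed version is newer than the 2.4.93-1 mentioned in this thread. `version_newer_than` is a hypothetical helper and it leans on GNU `sort -V`, which is an assumption about your coreutils.

```shell
# True when version $1 is strictly newer than version $2.
# Relies on GNU sort -V; on Arch, pacman's vercmp would be the native choice.
version_newer_than() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example, with the installed version read from pacman:
#   installed=$(pacman -Q libdrm | cut -d ' ' -f 2)
#   if version_newer_than "$installed" 2.4.93-1; then
#       echo "libdrm $installed is newer than 2.4.93-1, downgrade needed"
#   fi
```

If the old package is still in your pacman cache, `pacman -U` on that file does the actual downgrade; the exact file name depends on what your cache holds.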
I have the same error in Blender (portable, Manjaro). The only temporary solution I found is to run the "blender-softwaregl" file in the Blender folder; it uses software rendering and is very slow, depending on your system. Then I can render with OpenCL. I hope it gets fixed soon.
@IMBJR I am getting roughly the same issue as you when this package is installed.
# Blender 2.79 (sub 0), Commit date: 2018-05-26 21:51, Hash 32432d91bbe
# backtrace
blender(BLI_system_backtrace+0x34) [0x55d95ecf13f4]
blender(+0xb7b562) [0x55d95e27d562]
/usr/lib/libc.so.6(+0x37e00) [0x7fd541b02e00]
/usr/lib/libdrm_amdgpo.so.1(amdgpu_get_marketing_name+0xc) [0x7fd4f8e5dbdf]
/usr/lib/libamdocl-orca64.so(+0x8d871e) [0x7fd4f993c71e]
/usr/lib/libamdocl-orca64.so(+0x8d8cbf) [0x7fd4f993ccbf]
/usr/lib/libamdocl-orca64.so(+0x8dbc71) [0x7fd4f993fc71]
/usr/lib/libamdocl-orca64.so(+0x8f5488) [0x7fd4f9959488]
/usr/lib/libamdocl-orca64.so(+0xc4bc7d) [0x7fd4f9cafc7d]
/usr/lib/libamdocl-orca64.so(+0x8cb799) [0x7fd4f992f799]
/usr/lib/libamdocl-orca64.so(+0x8cb80f) [0x7fd4f992f80f]
/usr/lib/libamdocl-orca64.so(+0x8cc597) [0x7fd4f9930597]
/usr/lib/libamdocl-orca64.so(+0xcf772e) [0x7fd4f9d5b72e]
/usr/lib/libamdocl-orca64.so(+0xcf8c9a) [0x7fd4f9d5cc9a]
/usr/lib/libamdocl-orca64.so(+0xcf8eb6) [0x7fd4f9d5ceb6]
/usr/lib/libamdocl-orca64.so(+0x8a9787) [0x7fd4f990d787]
/usr/lib/libamdocl-orca64.so(clIcdGetPlatformIDsKHR+0x8a) [0x7fd4f98f068a]
/usr/lib/libOpenCL.so(+0x5d1e) [0x7fd5010add1e]
/usr/lib/libOpenCL.so(clGetPlatformIDs+0x115) [0x7fd5010afc15]
blender(_ZN3ccl10OpenCLInfo17get_num_platformsEPjPi+0x1c) [0x55d95f2412dc]
blender(_ZN3ccl10OpenCLInfo13get_platformsEPNS_6vectorIP15_cl_platform_idNS_16GuardedAllocatorIS3_EEEEPi+0x39) [0x55d95f242389]
blender(_ZN3ccl10OpenCLInfo18get_usable_devicesEPNS_6vectorINS_20OpenCLPlatformDeviceENS_16GuardedAllocatorIS2_EEEEb+0x19b) [0x55d95f246fab]
blender(_ZN3ccl18device_opencl_infoERNS_6vectorINS_10DeviceInfoENS_16GuardedAllocatorIS1_EEEE+0x56) [0x55d95f233cd6]
blender(_ZN3ccl6Device17available_devicesEv+0xe2) [0x55d95f208852]
blender(+0x1a0216a) [0x55d95f10416a]
/usr/lib/libpython3.7m.so.1.0(_PyMethodDef_RawFastCallKeywords+0x148) [0x7fd548a42a88]
/usr/lib/libpython3.7m.so.1.0(_PyCFunction_FastCallKeywords+0x21) [0x7fd548a42d21]
/usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x522c) [0x7fd548aba89c]
/usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x11b) [0x7fd5489fbf3b]
blender(+0xfac373) [0x55d95e6ae373]
blender(RNA_property_enum_items_ex+0x61) [0x55d95eb5ea71]
blender(RNA_property_enum_items+0x14) [0x55d95eb5eb14]
blender(RNA_property_enum_value+0x33) [0x55d95eb5eeb3]
blender(+0xf9bf9a) [0x55d95e69df9a]
blender(+0xf9cc24) [0x55d95e69ec24]
/usr/lib/libpython3.7m.so.1.0(PyObject_SetAttr+0x88) [0x7fd548aaecf8]
/usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0xd7d) [0x7fd548ab63ed]
/usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x11b) [0x7fd5489fbf3b]
blender(bpy_app_generic_callback+0xd5) [0x55d95e6a99d5]
blender(BLI_callback_exec+0x2d) [0x55d95ecb360d]
blender(WM_init+0x28a) [0x55d95e28dfda]
blender(main+0x3d1) [0x55d95e264841]
/usr/lib/libc.so.6(__libc_start_main+0xf3) [0x7fd541aef223]
blender(_start+0x2e) [0x55d95e27a01e]
@IMBJR
Have you tried downgrading libdrm & lib32-libdrm from ver. 2.4.94 to ver. 2.4.93? AFAIK, that's the only workaround I found that works on my system.
EDIT: Seems like the workaround only worked for the previous version of this package (amdgpu-pro-18.30-633530). The current version (18.30-641594) needs to be downgraded to the previous version in order for opencl apps to work again.
Version 18.30.641594-1 is making Blender segfault:
Sep 05 18:05:35 pc audit[28523]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 pid=28523 comm="blender" exe="/usr/bin/blender" sig=11 res=1
Sep 05 18:05:35 pc kernel: audit: type=1701 audit(1536167135.181:15): auid=1000 uid=1000 gid=1000 ses=2 pid=28523 comm="blender" exe="/usr/bin/blender" sig=11 res=1
Sep 05 18:05:35 pc systemd[1]: Started Process Core Dump (PID 28578/UID 0).
Sep 05 18:05:36 pc systemd-coredump[28579]: Process 28523 (blender) of user 1000 dumped core.
Stack trace of thread 28523:
#0 0x00007f1eea241bdf amdgpu_get_marketing_name (libdrm_amdgpo.so.1)
#1 0x00007f1eead2071e n/a (libamdocl-orca64.so)
#2 0x00007f1eead20cbf n/a (libamdocl-orca64.so)
#3 0x00007f1eead23c71 n/a (libamdocl-orca64.so)
#4 0x00007f1eead3d488 n/a (libamdocl-orca64.so)
#5 0x00007f1eeb093c7d n/a (libamdocl-orca64.so)
#6 0x00007f1eead13799 n/a (libamdocl-orca64.so)
#7 0x00007f1eead1380f n/a (libamdocl-orca64.so)
#8 0x00007f1eead14597 n/a (libamdocl-orca64.so)
#9 0x00007f1eeb13f72e n/a (libamdocl-orca64.so)
#10 0x00007f1eeb140c9a n/a (libamdocl-orca64.so)
#11 0x00007f1eeb140eb6 n/a (libamdocl-orca64.so)
#12 0x00007f1eeacf1787 n/a (libamdocl-orca64.so)
#13 0x00007f1eeacd468a clIcdGetPlatformIDsKHR (libamdocl-orca64.so)
#14 0x00007f1ef3c23d1e n/a (libOpenCL.so)
#15 0x00007f1ef3c25c15 clGetPlatformIDs (libOpenCL.so)
#16 0x00005607619272fc _ZN3ccl10OpenCLInfo17get_num_platformsEPjPi (blender)
#17 0x63c4f0885a279521 n/a (n/a)
Lopo, I had a similar problem with invalid sha256sums. The problem is how the hostname www2.ati.com is resolved. If I use the DNS provided by my ISP, it resolves to 92.123.37.188 and I don't get a valid tarball.
But if I use DNS server 1.1.1.1, for example, it resolves to 23.4.251.103 and I get the correct tarball.
It looks like their mirrors (CDN) are not properly synced. Hope it helps.
2018-06-24 17:09:12 (912 KB/s) - ‘amdgpu-pro-18.20-606296.tar.xz’ saved [167182056/167182056]
==> Validating source files with sha256sums...
amdgpu-pro-18.20-606296.tar.xz ... FAILED
==> ERROR: One or more files did not pass the validity check!
updated, thanks.
I can't fix your downloads and that has been discussed multiple times. Please read older comments and make sure you have the correct referer set. AMD won't allow the download without it.
Again, if you claim your download was successful, just ignore the checksums.
I also have the problem that the tar.xz file from the website is empty.
==> Validating source files with sha256sums...
amdgpu-pro-18.20-579836.tar.xz ... FAILED
==> ERROR: One or more files did not pass the validity check!
Got this issue when trying clinfo on an R9 390x. But with RX 560 it's fine. amdgpu_device_initialize: DRM version is 2.50.0 but this driver is only compatible with 3.x.x.
Not sure how to up that DRM, anybody?
PAL-based driver also works fine here with RX 560, great work as always. :)
Thanks for your test and comment. Vega users will be glad to hear that. I've just updated the package here as well to include the PAL-based driver.
I tested your PAL+ORCA branch, and it works well with my Vega 64!
I did some experimenting on kernel 4.16 and found out that the PAL OpenCL drivers work fine without the extra step of installing libdrm.
I made a fork of your GitHub project and created a minimal PAL-only branch in it as a proof of concept.
@eggz: thanks for your comment. Every recent Linux graphics driver uses DRM, that also includes the upstream amdgpu driver.
I have pushed two newer versions to GitHub. One is the updated 18.20 PKGBUILD (in branch 18.20) and the other includes the newer PAL-based OpenCL driver (in branch PAL+ORCA), which I have never tested because I don't have recent hardware. Are you willing to try this one out?
I installed V18.10 on 2 systems so far and they both end up with following error: error: OpenCL version 1.1 does not support the 'static' storage class specifier
Both systems use the opensource in-kernel amd module, not the DRM garbage. Installed systems had a VEGA (this is probably normal that its not working there without DRM) but also a polaris RX580 GPU.
In my eyes, this driver isn't working. reverting to V17 on the polaris GPU solves the problem.
Cool! Thank you for the explanation! I will use the GitHub one!
Thank you, again!
For workstation requirements, I'd recommend to use the actual amdgpu-pro driver stack on a supported distribution.
However, if you'd like to do this, it should work with amdgpu-enabled FirePro/Radeon Pro cards. You are free to use the 552542 PKGBUILD from GitHub (download the PKGBUILD file and run makepkg -si where it's located).
No, not xf86-video-amdgpu. The actual GPU driver is inside the kernel, hence already installed if you use Linux. The stack would look as follows:
mesa|xf86-video-amdgpu|opencl-amd
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
libdrm
^^^^^^
DRM
where the drm driver (amdgpu in this case) is, as already mentioned, inside the kernel. libdrm will be installed as a dependency if you install this package. mesa would provide OpenGL, Vulkan, video acceleration and is not required to run OpenCL. Neither is xf86-video-amdgpu.
Ahh... was this intended for Radeon™ R7 300, R9 295X2, R9 280X, R9 280, R9 270X, R9 270, R7 265, HD 8700 - HD 8900, and HD 7900 Series? Then 511655 is the latest version.
Is it possible for me to use this for FirePro products (FirePro's latest driver is 552542) as well?
Regarding the driver, I am just trying to install the GPU OpenCL runtime and only use OpenCL with the GPU -> would this work? What do you mean by 'I already have the GPU driver installed'? Do you mean xf86-video-amdgpu?
I have the PKGBUILD updated and ready on my github (https://github.com/grmat/opencl-amd/) but still haven't seen the updated version on the AMD page, they still offer me the 511655. Can you point me to which link or form input you're using? In the meantime you can use the PKGBUILD from github, but I'll keep the assumption it has been withdrawn for now.
The direct download works if (and only if) you set a valid HTTP referer, it has already been discussed below.
I'm using the Ubuntu variant to create this package.
You already have the GPU driver installed. It's in the kernel. What you don't need is a graphics-related driver or runtime, as for OpenGL (provided by radeonsi in mesa) or DDX (xf86-video-amdgpu).
Yes, AMD released new version of the driver. I'll flag the package.
Correct me if I am wrong. URL download never worked for me. So I had to download it manually from the website via following link : https://support.amd.com/en-us/download
Also there are many versions for different distros from above link, but I don't know which one to use (maybe Ubuntu ones since pkgbuild shows .deb file?).
BTW, does this package work independently for OpenCL purpose only? as in I don't have to install the GPU driver? Or do I need to install the driver as well in order for me to make OpenCL work?
@redshoe: sure, didn't notice there was a new version. In that case, you can just flag the package with the link on top of this page and I'll look for the new version.
However, while the download works, I don't find the 552542 in their download section, can you point me to that? Because it looks to me like they might have withdrawn this build.
@grmat: Okay. But is it possible to use the latest version (17.50-552542) of the AMDGPU-PRO downloads from AMD? Not the 17.50-511655.
@faddat: Honestly, I don't know. I'm not a so called "trusted user" (https://wiki.archlinux.org/index.php/Trusted_Users) and I haven't sought to be one yet. Also, it's my only package in the AUR. Unless some other TU would like to adopt the package, it's not likely to happen soon.
@redshoe: If you are already so sure that the file is alright, you can skip the checksum test (--skipchecksums). You don't have to modify the PKGBUILD first.
Any chance we'll see this moving to community any time soon?
I use this package from inside a chroot, and have to dance around quite a bit (create a user to build it, use pacman -U to install the package) to include this in my build script. It's pretty crucial for anyone using Arch and AMD to mine.
Thanks
If you can't get it working try this.
Download the driver directly from the AMD website (Ubuntu or Debian variant) and place it next to the PKGBUILD.
You can check its sha256sum by ...
$ sha256sum amdgpu-pro-17.50-(version-that-you-downloaded).tar.xz
and it will give you the sha256sum, and then copy/paste it into PKGBUILD.
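The manual check above could be sketched like this. `check_download` is a hypothetical helper, not part of makepkg; it also catches the empty-file case several people reported, which would otherwise just show up as a checksum failure.

```shell
# Compare a downloaded file against an expected SHA-256 digest.
# Returns 0 on match, 1 on mismatch, 2 if the file is missing or empty.
check_download() {
    file=$1
    expected=$2
    if [ ! -s "$file" ]; then
        echo "$file is missing or empty" >&2
        return 2
    fi
    # sha256sum prints "<digest>  <name>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file: got $actual" >&2
        return 1
    fi
}

# Example: pass the tarball you downloaded and the value from sha256sums=() in the PKGBUILD.
#   check_download ./amdgpu-pro-VERSION.tar.xz "$expected_digest"
```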
@graudeejs: It does work when you set the HTTP referer correctly. That's why there is a DLAGENT set in the PKGBUILD (wget with correct referer, cURL has a --referer option too, if you prefer cURL).
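For the cURL variant, a minimal sketch that only prints the command so you can inspect it before running. `fetch_cmd` is a made-up name, and which referer URL AMD accepts at any given time is an assumption you'd have to verify yourself (the PKGBUILD's DLAGENTS line is the place to look).

```shell
# Print a curl command that downloads $2 while sending $1 as the HTTP referer.
# Printing instead of executing keeps this sketch free of network access.
fetch_cmd() {
    printf 'curl --fail --location --referer %s --remote-name %s\n' "$1" "$2"
}

# Example: show the command, then run it once it looks right.
#   fetch_cmd "$referer_url" "$tarball_url"
#   eval "$(fetch_cmd "$referer_url" "$tarball_url")"
```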
Hello! I just want to warn you that tar download doesn't work. You have to manually download tar from https://support.amd.com/en-us/kb-articles/Pages/Radeon-Software-for-Linux-Release-Notes.aspx and place it in same directory as PKGBUILD.
No, as Vega runs OpenCL via the ROC stack, so rock/amdkfd+roct+opencl instead of amdgpu+libdrm+opencl. You'll have to look for ROCm builds/variants. Those should work without switching to the pro stack, it's open source but not everything is upstreamed afaik, so you'll maybe need to switch kernels and llvm as well.
is there a package of this that works with vega without switching to full pro stack?
I've also received a out-of-date note in Russian language, which I don't understand and online translation doesn't help either. I just know it was something about checksums. I can't reproduce the problem though and the checksum is correct every time I download the driver package. Please double-check if your download was successful and put more detail in your reports (in English please). You can also always have makepkg skipping the checksum verification if you wish.
There is a problem with the sha256sums after downloading amdgpu-pro-17.50-511655.tar.xz
Vega in fact doesn't work with this. Vega runs OpenCL on ROCm (see my comment from 2017-11-03), not the "old" stack, which is targeted here.
aoowweenn
Thanks for your response. I tried that but clinfo reports:
Number of platforms 0
I'm starting to think that VEGA does not work with this.
Thanks a lot for your answer, wandinstallation. Your suggestion fixed this error for me, but Blender still does not find any GPU "Compute Device" in its preferences (on a Radeon R9 290)... frustrating!
dpack and nylnook
What resolved the issue for me was to blacklist the radeon module
sudo vim /etc/modprobe.d/blacklist-radeon.conf
and add
blacklist radeon
save and restart
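To confirm after the reboot which kernel driver actually owns the card, something like the sketch below should do. `drm_driver` is a hypothetical helper and the sysfs layout (`/sys/class/drm/cardN/device/driver`) is assumed to be the usual one; `lspci -k` gives the same information.

```shell
# Print the kernel driver bound to a DRM card, e.g. "amdgpu" or "radeon".
# $1: card name (default card0); $2: sysfs DRM root (default /sys/class/drm).
drm_driver() {
    link=${2:-/sys/class/drm}/${1:-card0}/device/driver
    [ -e "$link" ] || return 1
    # The driver entry is a symlink; its target's last component is the driver name.
    basename "$(readlink -f "$link")"
}

# Example:
#   drm_driver card0
```

If this still says radeon, the "DRM version is 2.50.0" message reported in this thread is to be expected.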
Atraii and arakmar
AFAIK, you have to set OCL_ICD_VENDORS in your .bashrc or .xprofile.
To use it directly:
$ OCL_ICD_VENDORS=amdocl64.icd clinfo
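If clinfo still reports zero platforms, it may be worth looking at which ICD files the loader can see at all. A small sketch, assuming the default vendors directory `/etc/OpenCL/vendors` used by ocl-icd; `list_icds` is a made-up helper name.

```shell
# Print each ICD file in a vendors directory together with the library it names.
# $1: vendors directory (default /etc/OpenCL/vendors).
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for icd in "$dir"/*.icd; do
        # Skip the unexpanded pattern when the directory has no .icd files.
        [ -e "$icd" ] || continue
        printf '%s -> %s\n' "${icd##*/}" "$(head -n 1 "$icd")"
    done
}

# Example:
#   list_icds
```

Each line shows an .icd file and the library it points to; if the AMD one is missing or names a library that isn't installed, that would explain the empty platform list.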
Same issue as dpack; I'm also interested in the solution ;)
The same as Atraii, no platform found with clinfo. Has anyone managed to fix this issue?
When I try to run the clinfo command I get this error.
amdgpu_device_initialize: DRM version is 2.50.0 but this driver is only compatible with 3.x.x.
Segmentation fault (core dumped)
How can I fix this? My Arch is fully up to date and I'm using kernel version 4.14.8-1-ARCH x86_64
just got time to look at the issue. In fact, some paths have changed (thanks @Noctivivans). I didn't experience regressions yet, so I pushed the update out. Sorry for the delay
@utsi: have you checked reason of segfault? I noticed that paths in 17.50 are little different which caused segfault of e.g. clinfo.
Here is my quick-ugly-fixed PKGBUILD (in my version amdgpu.ids is in /opt/amdgpu instead of /opt/amdgpu-pro, I am not sure if that's correct way to do it but at least it works; tested on xmr-stak and clinfo):
works with 4.14.8-1-ARCH kernel and amdgpu driver (without binary blob), I have no other opencl packages installed.
I'm confused on how to get the icd to load this. clinfo reports only OpenCL 1.1 Mesa 17.3.0 on Clover. I don't have the newer OpenCL 1.2. Is there a dependency I'm missing?
Edit: Looks like I'm not the only one https://bbs.archlinux.org/viewtopic.php?id=232446
I tried modifying the package to get 17.50 to work but sadly OpenCL apps segfault when you start them, so it is not just you :\
I know 17.50 has been released this week, but it currently seems broken with my setup.
I have no data for other setups. If some testers could tell me everything's fine with their newer hardware or an amd kernel, I'd push out 17.50 to the aur.
Pinned Comments
luciddream commented on 2021-12-26 15:14 (UTC) (edited on 2022-05-11 17:54 (UTC) by luciddream)
Hi all, current release is for driver version 22.10.2 and ROCM 5.1.2.
The opencl-amd package includes only the OpenCL / HIP runtime. You also need the opencl-amd-dev package for the ROCm LLVM compiler, OpenCL and HIP SDK. Please relog / reboot after installing so your PATH gets updated.
sperg512 commented on 2021-03-24 13:27 (UTC) (edited on 2021-03-24 15:02 (UTC) by sperg512)
ok so as @ATrigger and @quimkaos pointed out, 20.50 doesn't work on Polaris GPUs. If you have one, downgrade to 20.45 or install opencl-amd-polaris. could also be Mesa being fat but for now just downgrade or install the other package ok thanks