Thanks, ajgringo619. Mine's AMD, so that might help narrow it down.
ETA: The problem was having the opencl-amd and opencl-mesa packages installed concurrently. Back to normal after removing the opencl-mesa package...
Git Clone URL: | https://aur.archlinux.org/foldingathome.git (read-only) |
---|---|
Package Base: | foldingathome |
Description: | A distributed computing project for simulating protein dynamics |
Upstream URL: | https://foldingathome.org/ |
Keywords: | folding foldingathome |
Licenses: | custom |
Submitter: | dtw |
Maintainer: | rustymech |
Last Packager: | rustymech |
Votes: | 168 |
Popularity: | 0.168643 |
First Submitted: | 2007-06-28 14:55 (UTC) |
Last Updated: | 2021-01-09 21:16 (UTC) |
Using two: GTX 1050 and 1070
@ajgringo619: what GPU are you using?
Interesting...I upgraded this morning, rebooted, and FaH is working with the new llvm-libs package.
Updated llvm-libs to 13.0.1-3...now GPU does not function properly.
After running this client for nearly a month, I just discovered the foldingathome-nvidia.service. Can you please explain what it's for and why it's required? My (2) GPU system has been running fine without it, although I did enable it today.
I think you need to add StandardOutput=null to the service file; otherwise the client logs to both /var/log/foldingathome/log.txt and the journal.
@geripgeri Thought that directory existed even with no graphics card, made it optional.
Hey @jpkotta!
/dev/dri doesn't exist on my system; /usr/bin/chown and /etc/foldingathome both exist.
After I added ReadWritePaths=-/dev/dri to foldingathome.service, the service starts without any issue!
Thank you for your help!
@geripgeri: Does /dev/dri exist on your system? If not, try editing foldingathome.service to have ReadWritePaths=-/dev/dri (this should probably be changed in the package, you shouldn't need /dev/dri if you're doing CPU folding).
As for the chown, that doesn't make sense. Do /usr/bin/chown and /etc/foldingathome both exist?
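For anyone landing here with the same NAMESPACE failure, a minimal sketch of the change suggested above. It assumes the packaged unit carries a plain ReadWritePaths=/dev/dri line; the rest of the unit is not shown and should be left as shipped.

```ini
[Service]
# Before (assumed): start fails with status=226/NAMESPACE when /dev/dri is missing
# ReadWritePaths=/dev/dri
# After: the leading "-" makes systemd skip the path if it does not exist
ReadWritePaths=-/dev/dri
```

ReadWritePaths entries accumulate, so the existing line has to be changed in place (for example with systemctl edit --full foldingathome.service); adding the "-" variant from a drop-in would leave the original entry active.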
Hey! The system is up to date. I don't have a GPU in this PC, and FAH worked for me before. I'm getting this error message:
systemd[1]: Starting Folding@home distributed computing client...
systemd[2846840]: foldingathome.service: Failed to set up mount namespacing: /run/systemd/unit-root/dev/dri: No such file or directory
systemd[2846840]: foldingathome.service: Failed at step NAMESPACE spawning /usr/bin/chown: No such file or directory
systemd[1]: foldingathome.service: Control process exited, code=exited, status=226/NAMESPACE
systemd[1]: foldingathome.service: Failed with result 'exit-code'.
systemd[1]: Failed to start Folding@home distributed computing client
What could cause this issue?
@alucryd: I realise I should've tried this before, either way it does not help. Moved back sleep to the main service and re-activated the nvidia one and I ended up having to run startx manually. (as graphical.target wasn't active)
EDIT: It's now happening with only foldingathome.service, so I guess it's the sleep command then, but it used to work before? Computers yo...
@katt: Can you try putting the sleep command back to the foldingathome.service unit?
I've just discovered that with the foldingathome-nvidia service activated, graphical.target stays "inactive (dead)" until I've logged out and back in. This causes problems such as xorg not autostarting (https://wiki.archlinux.org/index.php/Xinit#Autostart_X_at_login).
This didn't happen a while back so possibly has something to do with the moved sleep command?
Yes, fully updated. After a sudo useradd fah, the service starts fine.
Correction: I found an /etc/nsswitch.conf.pacnew. My existing /etc/nsswitch.conf didn't allow systemd to control users and groups. I fixed that, removed the manually-created user, and the service started normally.
@AntiComposite: Those should be created automatically by systemd, and automatically deleted when the service exits. It's working fine here, is your system up to date? Can you try creating these manually?
Trying to start the service with sudo systemctl start foldingathome fails for me with chown[60820]: /usr/bin/chown: invalid user: ‘fah:fah’.
shellcheck shows "SC2206: Quote to prevent word splitting/globbing, or split robustly with mapfile or read -a" warning for the pkgver variable in the download url.
@Wild_Penguin I was planning to move it to the nvidia unit with the next update indeed, like you said it's most likely unnecessary for any AMD GPU.
@noalwin: Nope, we're using a dynamic user, there's no need for that.
Hi,
About the PreExec : 60 seconds delay
I hope this hack / workaround is temporary. I've decreased the default timeouts for all systemd units, since I consider anything taking tens of seconds to start as failed. Hence, foldingathome will never start the current way. Also, being an AMD user, I cannot have problems with CUDA. Maybe the delay should be added to foldingathome-nvidia instead, and have foldingathome depend on that (until a less hacky workaround is found)?
Just my comments.
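If the default start timeout has been lowered as described, one possible stopgap is a per-unit override. This is only a sketch: the file name and the 120-second value are assumptions, chosen so that the 60-second pre-start sleep fits inside the start timeout.

```ini
# /etc/systemd/system/foldingathome.service.d/timeout.conf (hypothetical file name)
[Service]
# The pre-start sleep has to finish before the start timeout expires,
# so this unit needs more than 60 seconds even if the global default is lower
TimeoutStartSec=120
```

Run systemctl daemon-reload after creating the file so the override is picked up.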
Please, add to the post install and post upgrade message that users should add the "fah" user to the "video" group (gpasswd -a fah video) or maybe the script should do it.
Thanks @alucryd and @Superice97 - that update works for me!
@rsa: Done.
@Superice97: Was hoping to do without, but I added it back. Thanks for the heads up.
@curtispf: Does it change anything for you?
My GTX1050 stopped working between 7.6.9-7 and 7.6.9-8. Adding ExecStart=/usr/bin/clinfo back to foldingathome-nvidia.service fixes it.
I've lost GPU folding on updating to the latest release :(
My logs show the following rather interesting few lines:
18:58:25: GPUs: 0
18:58:25: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:10.2
18:58:25:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.82
Not wholly sure how it's working that one out... anyone have any ideas? I have a Nvidia GeForce GTX 1060. foldingathome-nvidia is enabled, and says "Succeeded" in the logs. The 60 second wait is working, too, as I've watched it happening.
Could you replace the opencl- dependencies with opencl-driver? It's a generic name that all opencl- packages provide, and the array currently misses several providers, so it would be cleaner to use opencl-driver directly :) See foldingathome-beta.
In the end, I didn't wait for the WUs to finish before updating, since the new version basically contained the changes you had already suggested and that I had already made.
Thus, I can confirm that version 7.6.9-8 works flawlessly for me! :)
@alucryd: 7.6.9-8 still working fine for me :)
Alright, I added the 60s sleep and removed the clinfo line. Hopefully this makes it work for everybody.
@alucryd: I removed the clinfo invocation and everything is still working
@keithy @Pezlu Awesome, thanks for the feedback! 60s is probably fine for most, if not all people. One last thing I'd like you to try if you find the time, is to remove the clinfo invocation from the nvidia service, I did on my desktop, and the delay was enough to get everything working.
@alucryd: I'm still on 7.6.9-6, and after adding ExecStartPre=/usr/bin/sleep 60 to the service file everything works fine! I've not fiddled (yet) with the delay length, however. I can try reducing it later if needed.
PS: I'll update to the latest version after finishing the current WUs... considering my ancient GPU, it will take a while!
@alucryd: working perfectly now with 7.6.9-7 - Thanks :)
FYI, waiting for 60s works on my desktop.
@YanDoroshenko: And thank you for your contribution to solving the issue, whining and being a smartass is always helpful :) I suggest you write to the arch-security mailing list explaining why you think, rather why you know, it's a good idea to run this as root ;)
@keithy: Oh, that's probably me, please try the latest revision.
@Pezlu: Almost there! I've been reading on the f@h forums that there might be some race condition, and just waiting a bit before actually starting FAHClient could be enough to get it working. Can you try adding ExecStartPre=/usr/bin/sleep 60 to the unit file and report back, and maybe play with the delay a bit? I'll do the same on my desktop.
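One way to try the delay without touching the packaged unit is a drop-in. The sketch below uses a hypothetical file name; the 60-second value is the one proposed above and can be adjusted while experimenting.

```ini
# /etc/systemd/system/foldingathome.service.d/delay.conf (hypothetical file name)
[Service]
# Wait before FAHClient starts, to give the GPU driver time to come up
ExecStartPre=/usr/bin/sleep 60
```

ExecStartPre lines from drop-ins are appended to those already in the unit, so this adds to the shipped commands rather than replacing them. Run systemctl daemon-reload and restart the service to test.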
Thanks, great job on keeping my system secure! Yesterday I was able to get the GPU to work running sudo FAHClient, but I'm not anymore, so no more nasty proprietary code running with superuser access.
@alucryd: I just updated to 7.6.9-6 and enabled the foldingathome-nvidia service. I tried to reboot only once, but the issue seems partially solved: OpenCL is detected, while CUDA is not:
19:12:42: GPUs: 1
19:12:42: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:3 GK106 [GeForce GTX 660]
19:12:42: CUDA: Not detected: cuInit() returned 100
19:12:42:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.82
Restarting the foldingathome service corrects the issue as before:
19:15:36: GPUs: 1
19:15:36: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:3 GK106 [GeForce GTX 660]
19:15:36: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:3.0 Driver:10.2
19:15:36:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.82
@alucryd: My issue actually appears to be entirely to do with foldingathome.service not starting.
During startup I get this error
systemd[1]: multi-user.target: Found ordering cycle on foldingathome.service/start
systemd[1]: multi-user.target: Found dependency on foldingathome-nvidia.service/start
systemd[1]: multi-user.target: Found dependency on multi-user.target/start
systemd[1]: multi-user.target: Job foldingathome.service/start deleted to break ordering cycle starting with multi-user.target/start
Thanks, I hope this doesn't create more confusion
@cubethethird: right apologies, this is more of a hack and I hope to find a better way to initialize opencl for nvidia in the future, but for now I've added it to optdeps
It seems there is now a missing dependency for clinfo that is needed for the foldingathome-nvidia service.
@keithy: I moved the logs to /var/log/foldingathome so it's easier for you guys to access logs.
Also removed the shipped GPUs.txt, looking at the log the download failure has been fixed upstream.
@keithy can you get the full log of foldingathome-nvidia with journalctl? And what does foldingathome say? There's really no reason why even CPU folding wouldn't work after these changes.
@Buddlespit: don't know what to say, it works on both my 1050 and my 1080Ti, well enjoy giving root access to your whole system to some closed proprietary code then
@alucryd: No change with 7.6.9-4. Neither CPU nor GPU fold until I restart foldingathome.service
This is the status of the foldingathome-nvidia service in case it sheds some light!
foldingathome-nvidia.service - Folding@home helper for NVIDIA GPUs
Loaded: loaded (/usr/lib/systemd/system/foldingathome-nvidia.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Tue 2020-04-21 15:07:10 BST; 11min ago
Process: 748 ExecStart=/usr/bin/nvidia-modprobe (code=exited, status=0/SUCCESS)
Process: 763 ExecStart=/usr/bin/nvidia-modprobe -c 0 -u (code=exited, status=0/SUCCESS)
Process: 794 ExecStart=/usr/bin/clinfo (code=exited, status=0/SUCCESS)
Main PID: 794 (code=exited, status=0/SUCCESS)
Apr 21 15:07:10 ryzen-antergos clinfo[794]: clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
Apr 21 15:07:10 ryzen-antergos clinfo[794]: clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) Invalid device type for platform
Apr 21 15:07:10 ryzen-antergos clinfo[794]: clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No platform
Apr 21 15:07:10 ryzen-antergos clinfo[794]: ICD loader properties
Apr 21 15:07:10 ryzen-antergos clinfo[794]: ICD loader Name OpenCL ICD Loader
Apr 21 15:07:10 ryzen-antergos clinfo[794]: ICD loader Vendor OCL Icd free software
Apr 21 15:07:10 ryzen-antergos clinfo[794]: ICD loader Version 2.2.12
Apr 21 15:07:10 ryzen-antergos clinfo[794]: ICD loader Profile OpenCL 2.2
Apr 21 15:07:10 ryzen-antergos systemd[1]: foldingathome-nvidia.service: Succeeded.
Apr 21 15:07:10 ryzen-antergos systemd[1]: Finished Folding@home helper for NVIDIA GPUs.
To be clear, nvidia users need to enable both foldingathome.service and foldingathome-nvidia.service.
@keithy: the latest foldingathome-nvidia service consistently works for me on my desktop
@Buddlespit: Did you read the latest install notes and enable foldingathome-nvidia? Did you try restarting the service like others do?
@alucryd:
Sorry, 7.6.9-3 is a backward step for me.
Now neither CPU nor GPU fold until I restart foldingathome.service.
After restart only CPU folds :(
Edit: GPU does fold as well after service restart - it was just trying to download another WU
@xuanruiqi: Awesome, we're getting there.
@Pezlu @keithy: I can't find confirmation in the systemd documentation, but I believe ReadWritePaths is evaluated before ExecStartPre, so the first time those paths don't exist so systemd doesn't give you access to them.
I've added a new service named foldingathome-nvidia.service, if you enable both this one and foldingathome.service, the nvidia service should run first, and by the time foldingathome starts, the paths should be there. Please let me know how it goes.
@alucryd: OK, my GPU is working again...
@alucryd: I just updated to 7.6.9-2 and the client behaviour didn't change, I still have to restart the service. I'm on a desktop.
@xuanruiqi: Wonder why my 1050 works fine, may depend on the workload :/ I've added /dev/nvidia0 and /dev/nvidiactl to ReadWritePaths, please let me know if that changes anything.
@keithy: Thanks for the feedback, I'll keep looking into this as well.
@alucryd
It worked this morning without having to restart the service. Sadly that means it's random. I don't have enough data points to tell, but guessing from what I've seen so far it's 50/50 as to whether it starts or not.
I'll try the systemd service alteration at some point and see if it makes any difference.
@alucryd: Yes I have all the requirements (cuda, ocl-icd, opencl-nvidia) and it worked perfectly before the changes.
@xuanruiqi: Did it work before the recent changes? Did you install all the requirements for opencl? I have the exact same card on my server and it works just fine.
@keithy: Hmm, what if you replace After=multi-user.target with After=default.target in the systemd service?
I'm still having trouble with my GPU (NVIDIA GTX 1050), and restarting the service doesn't help. It just doesn't detect my device. Can confirm I have latest ver of the package.
@alucryd
I have the same issue as Pezlu - service restart fixes it.
Nvidia 1050 Ti on desktop
running latest 7.6.9
Thanks
Updated to 7.6.9.
@Pezlu: Is that with the very latest changes to the systemd unit? Desktop or server?
@Emil, Thanks, in an effort to make cleanup easier, I'd rather have everything in .config/fah so I added a unit to reflect that.
@igorselsking: The config is in /etc/foldingathome now. As for it to be reset, that's not exactly possible, although foldingathome itself modifies it and rotates it periodically...
@mnd999: Nvidia or AMD? Just added /dev/dri to ReadWritePaths as several sources indicate opencl on AMD needs direct access to the card. Please let me know if that works for you.
My GPU is also now not working. It was working perfectly before the config files and logs started moving around for no reason.
I think the configuration gets reset when I restart the client. Is that possible, or am I doing something wrong?
~/.config/fah/config.xml and ~/.local/share/fah/ need to be created first. This is the systemd user service I use to make idle detection work:
[Unit]
Description=Folding@home distributed computing client
After=network.target
[Service]
Type=simple
WorkingDirectory=%h/.local/share/fah
ExecStart=/opt/fah/FAHClient --config %h/.config/fah/config.xml --exec-directory=/opt/fah --data-directory=%h/.local/share/fah/
[Install]
WantedBy=default.target
Hello, I'm using a nVidia Geforce GTX 660 and I was using the root version before the latest updates. With the latest version (7.6.8-7) everything works almost as intended.
However, right after boot, CUDA and OpenCL are not detected, so the GPU does not fold. I get:
08:55:49: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:3 GK106 [GeForce GTX 660]
08:55:49: CUDA: Not detected: cuInit() returned 100
08:55:49: OpenCL: Not detected: clGetPlatformIDs() returned -1001
If I restart the service everything is recognized and works as intended:
10:28:36: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:3 GK106 [GeForce GTX 660]
10:28:36: CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:3.0 Driver:10.2
10:28:36:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.82
It seems to be a timing issue. As a workaround I'm restarting the service after boot with a startup script.
Figured I'd also poke in and say it works perfectly with the latest (7.6.8-7) changes. I'm using startx and having no issues with this starting "too soon". nvidia gpu.
not sure why, but I guess when the opencl entity is started too soon, it blocks the GPU from being accessible for display. Not sure if that is also true for nvidia cards, might be that the timing plays a huge role, and that this timing is different for Nvidia.
When using opencl-mesa, starting an OpenCL workload breaks the display output completely (image is not updated anymore - that information was last checked by me about 2 years ago so that might not be true anymore), that behaviour does not happen with opencl-amd. But that behaviour leads to my conclusion in the first sentence of this post.
Awesome, thanks for the heads up.
@alucryd Can confirm that default.target works. Hopefully that will work for a while.
WantedBy=default.target works on my server, it should be the same as WantedBy=graphical.target on your desktops.
Somehow After=multi-user.target still works when default is multi-user.
@alucryd sadly cannot confirm on headless server. Sorry should have included that in the previous comment.
@Takei: On a server? I'll probably go with it, but will change WantedBy=graphical.target to WantedBy=default.target.
@BS86 @alucryd Can confirm works with the settings from BS86.
@lesto: Please have a read, look for DynamicUser: https://www.freedesktop.org/software/systemd/man/systemd.exec.html
@andrej: You can't trust most people to know how to do that, this must come with the package. I added those lines with both ! and -. That way nvidia will work out of the box and non nvidia systems will not be impacted. If this fails on an nvidia system you probably have bigger problems to deal with anyway. Also, nvidia-modprobe is setuid, so anyone can invoke this binary as root, but it seems systemd still applies some restrictions to it so ! it is.
@BS86: I still can't figure out why the video group is needed since virtually all the filesystem is read-only thanks to DynamicUser, unless amd has some files needed for opencl that are not world-readable? Also, before I add WantedBy=graphical.target I want to make sure it still works fine on a headless server.
@YanDoroshenko: You might want to read the comments.
I'm unable to use GPU since today's update. I've tried everything I was able to find (and a couple of my own inventions) - all quite fruitless. Running FAHClient via sudo does the trick and GPU picks up right away with the same config and same command.
And Success:
With After=network.target multi-user.target and WantedBy=graphical.target booting now works as expected.
Tried After=network.target graphical.target and WantedBy=multi-user.target first, but would not work due to obvious cycle-detection. System booted, but FAH service was not loaded.
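As a sketch, the working combination reported here corresponds to these lines in the unit file. Only the two settings named in the comment are shown; everything else in the unit is assumed to stay as packaged.

```ini
[Unit]
# Start only after the network and the multi-user stage are up
After=network.target multi-user.target

[Install]
# Pulled in by graphical.target, which is itself ordered after multi-user.target,
# so no ordering cycle arises
WantedBy=graphical.target
```

Changes to the [Install] section only take effect when the unit is enabled again, for example with systemctl reenable foldingathome.service.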
ok, with SupplementaryGroups=video, OpenCL and FAH works and does find my Vega64 running with opencl-amd, but on reboot, the system breaks like expected. Had to chroot to get the system back working.
Will play around a bit with the WantedBy and After options to get fah loaded after X.
@alucryd As for NVidia, it's better to add a drop-in instead of changing the unit file. Also, I think the prefix should be !, not -. (A failure is not OK and, AFAIK, only root can modprobe.) This is only relevant for people with NVidia GPUs anyway.
On my system I've created /etc/systemd/system/foldingathome.service.d/modprobe.conf containing this:
[Service]
ExecStartPre=!/usr/bin/nvidia-modprobe
ExecStartPre=!/usr/bin/nvidia-modprobe -c0 -u
(The first nvidia-modprobe is a no-op if you run a desktop or otherwise load the module automatically.)
The drop-in will show up in the status (after systemctl daemon-reload etc.):
● foldingathome.service - Folding@home distributed computing client
Loaded: loaded (/usr/lib/systemd/system/foldingathome.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/foldingathome.service.d
└─modprobe.conf
Active: active (running) since Fri 2020-04-17 22:51:46 CEST; 10min ago
Process: 15205 ExecStartPre=/usr/bin/chown -R fah:fah /etc/foldingathome (code=exited, status=0/SUCCESS)
Process: 15207 ExecStartPre=/usr/bin/nvidia-modprobe (code=exited, status=0/SUCCESS)
Process: 15209 ExecStartPre=/usr/bin/nvidia-modprobe -c0 -u (code=exited, status=0/SUCCESS)
Main PID: 15212 (FAHClient)
Tasks: 65 (limit: 77025)
CGroup: /system.slice/foldingathome.service
...
in the service file you do: "/usr/bin/chown -R fah:fah /etc/foldingathome" I think this should be done by the install script instead, so it is done once and that is it.
also my xt5700x is seen as 5600, not sure where this GPUs.txt come from
What is a newer or older AMD GPU? I have a Radeon R7.
@alucryd sorry but it doesn't seem to work. Still the same error code even after another restart of the service. Anyways i'm off to bed.
Nevermind that, cuda depends on it. Looking at the manpage, nvidia-modprobe -u should create the device. Not sure if it's right to automatically call it with fah though, but can you try adding ExecStartPre=-/usr/bin/nvidia-modprobe -u to the systemd service?
Do you guys have nvidia-utils installed? It contains nvidia-modprobe which is a setuid binary that can create this device without sudo.
@Takei I really don't know what this application does for CUDA, I just know it works really. https://github.com/DeadSix27/waifu2x-converter-cpp
Anyway the file is only created when I've run the application using CUDA.
I never ever use cuda, still /dev/nvidia-uvm is automatically created for me, so yeah it works even after a reboot :/
@katt if true that would make it at least possible to run this as a normal user. Could you see if you can find something in that application that might help us out? Maybe it has a service file or something else from which we could get the info we need.
@alucryd Added just ReadWritePaths=/dev/nvidia-uvm and SupplementaryGroups=video and it seems to be working now, not quite sure why it was complaining that /dev/nvidia-uvm didn't exist, haven't even rebooted since..
Now to just get a new WU ;p
18:32:32: CUDA Device 0: Platform:0 Device:0 Bus:9 Slot:0 Compute:6.1 Driver:10.2
18:32:32:OpenCL Device 0: Platform:0 Device:0 Bus:9 Slot:0 Compute:1.2 Driver:440.82
EDIT: Maybe related to why it suddenly "showed up", just realised I was doing some CUDA work earlier, that's probably why. (that application was not run as root though)
@alucryd tried sudo modprobe nvidia_uvm just now. Does not create the device.
@alucryd did you also try after a reboot? If you use cuda once per session as root, it will also work for a normal user, but only root can create the device.
Just tried with ReadWritePaths=/dev/nvidia-uvm after reboot, did not work. Errormsg foldingathome.service: Failed to set up mount namespacing: /run/systemd/unit-root/dev/nvidia-uvm: No such file or directory
Actually this should work even without KMS, does sudo modprobe nvidia_uvm create the /dev/nvidia-uvm device?
@katt, @Takei: Tried on my desktop with a 1080Ti, I can reproduce.
I'm not exactly sure, but it seems you need /dev/nvidia-uvm, seems to be linked to KMS: https://wiki.archlinux.org/index.php/NVIDIA#DRM_kernel_mode_setting
Then I added it to ReadWritePaths and restarted foldingathome and it worked fine.
Can you try that before I add it to the service file? Would be great if someone could write that down on the wiki as well.
@katt Same problem as me. Only way i was able to solve it till now was to let the client run as root. Otherwise it wasn't able to create the necessary device and access cuda.
@alucryd I added /dev/nvidia0 and /dev/nvidiactl (the others dont exist) and still didnt work. Here's the output of that command though: https://gist.github.com/Kattus/70f88c424bc6450f1337ddd3fc5cb606
@Takei
17:50:38: CUDA: Not detected: cuInit() returned 999
17:50:38: OpenCL: Not detected: clGetPlatformIDs() returned -1001
You can also try installing clinfo and run sudo -u fah clinfo. Might give us some insight.
@katt What is the output for the cuda and opencl device in the log?
@katt: If that still doesn't work, you may also try the following devices:
/dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia-uvm-tools
You can add as many ReadWritePaths lines as you want, they stack up.
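A sketch of what stacking these paths could look like in a drop-in. The file name is hypothetical, and the "-" prefix is an addition not present in the suggestion above: it is borrowed from the /dev/dri fix elsewhere in this thread so that a missing device node does not abort the start.

```ini
# /etc/systemd/system/foldingathome.service.d/nvidia-devices.conf (hypothetical file name)
[Service]
# Each line appends to the list; "-" skips a path that does not exist at start
ReadWritePaths=-/dev/nvidia0
ReadWritePaths=-/dev/nvidiactl
ReadWritePaths=-/dev/nvidia-uvm
ReadWritePaths=-/dev/nvidia-uvm-tools
```

A path that is skipped because it is missing at start is simply not granted, so this avoids the NAMESPACE error but does not by itself make a late-appearing device writable.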
@jpkotta: Thanks, will switch to the newer syntax.
@katt: Can you try adding ReadWritePaths=/dev/dri to the systemd service file, on top of the SupplementaryGroups stanza?
@jpkotta Adding SupplementaryGroups=video did not help unfortunately.
Still getting this:
17:30:16:ERROR:WU00:FS01:Failed to start core: OpenCL device matching slot 1 not found, make sure the OpenCL driver is installed or try setting 'opencl-index' manually
Yes, opencl-nvidia is installed and it worked perfectly before all this.
@zegkljan: chown -R it is then.
@zero456: sigh... thanks for the heads up, shipping a GPUs.txt until upstream fixes it. That's what happens when you refuse to go open source.
@jpkotta, @BS86: I can use my GPU just fine without being in the video group, got a 1050 on my server. I never started this service as root. And no the fah user should not be created, that's what DynamicUser is all about. I can definitely add SupplementaryGroups though.
@alucryd PermissionsStartOnly is deprecated (https://github.com/systemd/systemd/blob/master/NEWS#L1878). Try ExecStartPre=!... for the chown command.
@BS86 Not sure, my GPU is too old for F@H. Maybe an After= in the unit file would fix it?
And apparently, the user "fah" also does not exist on my system because it was not required by this package until now. Only the noroot-package needed it.
@alucryd: If the xorg breakage and GPU detection is solved with the previous comments, and you want to stick with the DynamicUser approach, might you please add a installation warning that the user "fah" needs to be created?
@jpkotta: Does this also solve the xorg - breakage mentioned in the Wiki?
@BS86 @alucryd
You need to add SupplementaryGroups=video to the service file.
@alucryd There seems to be a regression in the latest version that is causing folding@home to not download the GPUs.txt file. This is what it uses to detect what GPUs it can and cannot use. Without it, it will not use any GPU.
Copying from the old directory /opt/fah/GPUs.txt to /etc/foldingathome fixes the issue.
See: https://github.com/FoldingAtHome/fah-issues/issues/1363
The new version breaks folding@home for Nvidia GPUs, because only root can create the device necessary to use cuda. At least according to what I found so far. The problem only starts after the first reboot, because the necessary device does not exist anymore. https://www.pgroup.com/userforum/viewtopic.php?t=4587
GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:8 GP104 [GeForce GTX 1080] 8873
CUDA: Not detected: cuInit() returned 999
OpenCL: Not detected: clGetPlatformIDs() returned -1001
Log when FAHclient is started as root:
GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:8 GP104 [GeForce GTX 1080] 8873
CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:10.2
OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.82
apparently, foldingathome switched to the noroot - approach with DynamicUser which breaks GPU usage (check Wiki for details: https://wiki.archlinux.org/index.php/Folding@home#Run_f@h_with_limited_privileges)
would be great to have this package here in the future, too (if foldingathome copies the approach from here, that would work too)
For some reason, with that commit today (Version 7.5.1-2) : https://aur.archlinux.org/cgit/aur.git/commit/?h=foldingathome&id=d6dac582292af9d594c9d8bf7f9b6a4f8e56fa4c
This package switched to the DynamicUser approach which is troublesome, check the wiki: https://wiki.archlinux.org/index.php/Folding@home#Run_f@h_with_limited_privileges
Quote:
To use your graphics card for this task the fah user has to be in the required group (video). But doing this will also cause the foldingathome.service to start before Xorg and breaking it in the process.
The last working version for me is 7.5.1-1
@alucryd The chown way is definitely the way to go but it still does not work because the directory /etc/foldingathome is owned by root and therefore FAHClient cannot remove the old configuration when it needs to update (yes, it first removes it and then writes a new one). There are 16:31:39:WARNING:Exception: Failed to remove '/etc/foldingathome/config.xml': Permission denied warnings in the log when it tries to update.
@alucryd
Thanks, that's fixed it.
@DNAblue2112 GPU works fine here, please describe how you can't add it. The minimal provided config should auto discover any GPU.
@keithy: remove the fahclient package, I don't know where you got that from.
Yes the default power setting is coming directly from their sample configuration, which is a sane default. Feel free to crank it up if you want.
Also, how the software itself works is not good, trying to make it sane is not exactly straightforward.
I changed the service again, it will chown the config in /etc/foldingathome so that it is writable to the fah user, now modifications by the software itself will be remembered, and we can still benefit from config backups, and potential changes in defaults from upstream.
Please let me know how it goes.
I think the default power setting changed. Set <power value="medium"/> in config.xml. For me, the CPUSchedulingPolicy in the systemd unit didn't affect it nearly as much as the power setting.
Unfortunately installing the most recent versions has totalled everything. I cant add my GPU even after downgrading to the old versions, and file permissions got screwed over so no changes stay applied. So now I can only use FAH on my laptop where I haven't updated yet because updating has made everything fall apart on my desktop.
The new way of handling config.xml is not good. If I change the configuration using FAHControl, it is updated only in the working directory /var/lib/fah but not in /etc/foldingathome.
Also, for some reason the utilized CPU power is now minuscule compared to the prior version and when it was running as root (I'm not sure which one of these is the cause).
EDIT: the cause is CPUSchedulingPolicy=idle. If I comment it out, the utilized CPU power is back to where it was before the update.
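Instead of commenting the line out in the packaged unit, the policy can probably be overridden from a drop-in. A minimal sketch, assuming a hypothetical file name; "other" is the kernel's default scheduling policy.

```ini
# /etc/systemd/system/foldingathome.service.d/scheduling.conf (hypothetical file name)
[Service]
# Replaces the idle policy set in the packaged unit with the default one
CPUSchedulingPolicy=other
```

This is a single-value setting, so the drop-in value replaces the packaged one after systemctl daemon-reload and a service restart.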
I get
error: failed to commit transaction (conflicting files)
foldingathome: /usr/bin/FAHClient exists in filesystem (owned by fahclient)
foldingathome: /usr/bin/FAHCoreWrapper exists in filesystem (owned by fahclient)
Errors occurred, no packages were upgraded.
How to fix please??
Thank you
Bumped to 7.6.8 and added a warning about the new config location.
Maybe add a warning that people will have to manually migrate their configs etc? Just got someone yelling at me for breaking their stuff in fahcontrol ;D
@katt it is, but are you sure this is the one that should be removed? The name of this package matches upstream.
EDIT: Yes this probably should be removed, but foldingathome should probably be named fahclient instead
@jpkotta I just took over the regular foldingathome package. It no longer runs as root, same as yours, which makes this package redundant, so I'll merge it into foldingathome.
Thanks for maintaining this until now!
Adopted and updated to not run as root. I'll make a systemd user service for those who rely on idle detection when I've got some time.
Isn't this a duplicate of the foldingathome package?
@jpkotta: batch should be used if you have lots of idle tasks so they don't starve of CPU. In a "normal" desktop PC I'd say idle and batch should operate the same.
@artafinde I added the idle policies. FAH already lowers its priority with nice, but setting the scheduling policy made a small but measurable difference. The difference was much bigger with CPUSchedulingPolicy=batch, but I don't understand it well enough to say it's a good idea.
Consider adding the following to the service:
[Service]
IOSchedulingClass=3
CPUSchedulingPolicy=idle
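One way to try the settings above without editing the packaged unit is a systemd drop-in. This is only a minimal sketch: it assumes the unit is named foldingathome.service, and the helper name write_sched_dropin and the file name 10-scheduling.conf are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: writes a drop-in with the idle scheduling settings
# into the directory given as $1 (created if missing).
write_sched_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/10-scheduling.conf" <<'EOF'
[Service]
IOSchedulingClass=3
CPUSchedulingPolicy=idle
EOF
}

# Real use (as root), assuming the unit name foldingathome.service:
#   write_sched_dropin /etc/systemd/system/foldingathome.service.d
#   systemctl daemon-reload && systemctl restart foldingathome
```

A drop-in survives package upgrades, and deleting the file (followed by another daemon-reload) undoes the change if CPU usage drops too far.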
Found the issue: I had opencl-headers installed and I think the program was confused.
I'm getting an error for GPU folding
Apr 09 10:52:39 tiamat FAHClient[19926]: 09:52:39:ERROR:WU01:FS00:Failed to start core: OpenCL device matching slot 0 not found, try setting 'opencl-index' manually
My GPU is an nvidia 1080 with the proprietary drivers installed and working, and the fah user is in the video group. I've tried setting opencl-index to -1 and 0 without luck. Any suggestions welcome.
You can make a systemd user service as a workaround/solution for this problem.
When running FAH as a service using the systemd file included here, the idle detection used in FAHClient breaks. See this thread for more detail: https://foldingforum.org/viewtopic.php?f=61&t=33234
We're still waiting for permission to distribute, this has been brought to F@H's attention.
I'd love to see the changes that were in the temporary official packages applied here, currently using that one instead as it's just overall better and didn't use root.
https://git.archlinux.org/svntogit/community.git/log/?h=packages/foldingathome
Hello, what's the difference from the other packages? Why are there so many?
Could you add fahviewer and fahcontrol as optional dependencies, which are the protein viewer and the GUI?
Could you add rocm-opencl-runtime, opencl-amdgpu-pro-orca, opencl-amdgpu-pro-pal as optional dependencies to complete the OpenCL options?
Your comments about opencl-mesa and opencl-amd are technically not right. Please, if you are going to add comments, base them on the ArchLinux Wiki:
- opencl-mesa: free runtime for AMDGPU and Radeon
- opencl-amd: proprietary standalone runtime for AMDGPU (pal and legacy stacks in a single package)
- rocm-opencl-runtime: part of AMD's fully open-source ROCm GPU compute stack, which supports GFX8 and later cards (Fiji, Polaris, Vega)
- opencl-amdgpu-pro-orca: proprietary runtime for AMDGPU PRO (supports legacy products older than Vega 10)
- opencl-amdgpu-pro-pal: proprietary runtime for AMDGPU PRO (supports Vega 10 and later products)
@whatshisname
It's a user service, so to manage it, append the --user argument to systemctl, e.g. systemctl --user enable --now fahclient.service.
More documentation is here: https://wiki.archlinux.org/index.php/Systemd/User
Thanks for putting up this package. This is an important project.
But no matter how I tried to run the service, I kept getting variations of "Failed to start ... service: Unit not found" error messages. This even when I tried to start the service as a user service using the "fahclient@myusername.service" syntax.
Finally, I copied the "/usr/lib/systemd/user/fahclient.service" file to "/usr/lib/systemd/system" and was able to enable and start the service.
@minore What command exactly are you running? The client runs under a special systemd environment (consisting of a user, PWD, and other state), and it won't work unless you run your command in that environment too. The fah-config script gives an example of how to do this. But when I want to pause I just systemctl stop foldingathome. The client should gracefully shut down.
@PeXArtZ Looks like permissions issues. Make sure /var/lib/fah/* is owned by user fah.
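A quick way to check the ownership mentioned above could look like the following. It is a sketch, not part of the package: the helper name list_foreign_owned is hypothetical, and the path and user are the ones named in this thread.

```shell
#!/bin/sh
# Hypothetical helper: prints every path under $1 that is not owned by
# user $2. Returns 0 if all paths match, 1 if some do not, 2 if find fails.
list_foreign_owned() {
    found=$(find "$1" ! -user "$2" -print) || return 2
    [ -z "$found" ] && return 0
    printf '%s\n' "$found"
    return 1
}

# Example, assuming the data directory and user named in this thread:
#   list_foreign_owned /var/lib/fah fah || sudo chown -R fah /var/lib/fah
```

Listing the offending paths first makes it easier to see whether a single stray file or the whole tree has the wrong owner before running chown.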
@jpkotta I don't think there is anything useful, because I always have to restore the backup from before I updated. But if you want to take a look: https://hastebin.com/hebixunito.md
Entering command --send-pause on a terminal window (for pausing all slots on an already running client) has no effect. Any ideas?
@PexArtZ What do the logs say (journalctl -b -u foldingathome)? They're having trouble with the work servers due to increased demand.
Newest update prevents my install from booting. Anyone else having this problem?
Hi @eliasjackson, thanks for the report, fixed.
Hi @whatshisname,
I don't see any new binaries on the project page.
@minore AFAIK the default is all diseases, so you don't need to do anything.
Thank you for this package. Is there a way to opt for a project, as in specifying the preference to contribute to "All Diseases" projects (the COVID-19 research)?
They can't put it in community yet. There was a comment (now deleted for some reason) that said there's a license issue and they're asking for permission.
This project has changed to focus on the coronavirus. If we download this version will it work for that purpose?
@andrej: that's actually normal according to this article: https://www.howtogeek.com/663539/how-to-fight-coronavirus-with-foldinghome-and-a-gaming-pc Also the version here is the same as the one in the .deb package on the official website, but it really should be moved to community
Yet another problem is in f@h itself: There are "cause preferences" in the control app, but coronavirus is not one of them!
+1 what @xuanruiqi said "Mother" Nature needs yet another big punch right in her face!
That also does the trick! Thanks for the quick response.
@direc85
What does ls -l /dev/nvidia* show? If possible, reboot and run the ls command both before and after running that mknod script from nvidia.
@robinc @gourdcaptain
Can you try SupplementaryGroups=video instead? That seems more correct. I don't have fancy GPU hardware so I can't test this.
@gourdcaptain This seems to be a permissions problem. Adding Group=video to the service file solved this for me (AMD Vega 56). This is what my foldingathome.service looks like now:
[Unit]
Description=Folding@home distributed computing client
After=network.target
[Service]
DynamicUser=yes
Type=simple
User=fah
Group=video
StateDirectory=fah
WorkingDirectory=/var/lib/fah
ExecStart=/opt/fah/FAHClient
[Install]
WantedBy=multi-user.target
@jpkotta Could you please update the package to include this fix?
+1 what @xuanruiqi said
+1 what @xuanruiqi said
@jpkotta It looks like foldingathome community package has been removed.
No, it isn't the 'video' group. Today I ran a script from NVIDIA CUDA Installation Guide for Linux and it seemed to have worked...
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#runfile-verifications
For whatever reason, one of the commands in the installed systemd service (ExecStartPre=/usr/bin/mkdir -p %h/.local/share/fahclient) wasn't working, so I had to create that directory manually in order to get the service to work. Might be worth adding to the PKGBUILD.
I had a hard time getting my NVIDIA GTX 1060 to work with OpenCL and/or CUDA. I kept getting OpenCL init error 999 and CUDA init error 1001. I think the final straw was adding user "fah" to group "video" and restarting the service...
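To verify the group membership described above before restarting anything, something like this could be used. It is a sketch under assumptions: the helper name in_group is hypothetical, and the user fah and group video are the names used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if user $1 is a member of group $2.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Example, assuming the service user is called fah as in this thread:
#   in_group fah video || sudo usermod -aG video fah
#   sudo systemctl restart foldingathome
```

Group changes only apply to processes started afterwards, which is why the service restart matters.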
There's a foldingathome in Community now: https://www.archlinux.org/packages/community/x86_64/foldingathome/. It uses basically the same systemd unit file that this package does, which is what enables running as a non-root user. I recommend you install that instead of this package. However, you're expected to hand edit the config.xml instead of using a script. That file is only quasi-human-writable, so I might just convert the fah-config script to its own package.
+1 what @xuanruiqi said
Anyone know why it would be failing to detect my AMD Radeon RX 5600 XT card for OpenCL? It shows in FAHControl a "Not detected: clGetDeviceIDs() returned -1". I have the opencl-amd package installed and the card shows up in clinfo (and vapoursynth plugins have been able to use it.)
Upvote for COVID-19 research. Everyone upvote and let's get on the science win train. Tell everyone.
+1 what @xuanruiqi said
+1 what @xuanruiqi said
+1 what @xuanruiqi says.
I also agree - it would be great to make this more available!
I agree with xuanruiqi. This package needs to be updated...
Given the current COVID-19 situation, I believe that effort should be made for this package to be moved into [community] ASAP, to make it much easier for Arch users to help with the effort to develop a medicine for the virus. This will be the right thing to do, IMO.
Can you add provides=('foldinghome') to the PKGBUILD and remove the replaces part? It is an alternative to the other methods of installing FAH, not a replacement or successor.
There is a software conflict between foldingathome and the opencl-amd program. When it is installed, the program no longer works.
foldingathome.service - Folding@home distributed computing client
Loaded: loaded (/usr/lib/systemd/system/foldingathome.service; enabled; vendor preset: disabled)
Active: failed (Result: core-dump) since Tue 2018-10-16 19:56:42 CEST; 5h 9min ago
Main PID: 529 (code=dumped, signal=SEGV)
oct. 16 19:56:25 AMD systemd[1]: Started Folding@home distributed computing client.
oct. 16 19:56:26 AMD FAHClient[529]: 17:56:26:INFO(1):Read GPUs.txt
oct. 16 19:56:35 AMD FAHClient[529]: amdgpu_device_initialize: DRM version is 2.50.0 but this driver is only compatible with 3.x.x.
oct. 16 19:56:41 AMD FAHClient[529]: amdgpu_device_initialize: DRM version is 2.50.0 but this driver is only compatible with 3.x.x.
oct. 16 19:56:42 AMD systemd[1]: foldingathome.service: Main process exited, code=dumped, status=11/SEGV
oct. 16 19:56:42 AMD systemd[1]: foldingathome.service: Failed with result 'core-dump'.
You may need to add vsyscall=emulate to your kernel commandline if you get segfaults and messages like vsyscall attempted with vsyscall=none. See https://bbs.archlinux.org/viewtopic.php?id=234707.
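To see which vsyscall mode the running kernel was booted with, a small check like the one below may help. It is a sketch only: the helper name vsyscall_mode is hypothetical, and "unset" here just means the parameter is absent, in which case the kernel's built-in default applies.

```shell
#!/bin/sh
# Hypothetical helper: prints the vsyscall= value found in a kernel
# command line file ($1, default /proc/cmdline), or "unset" if absent.
vsyscall_mode() {
    file=${1:-/proc/cmdline}
    mode=$(tr ' ' '\n' < "$file" | sed -n 's/^vsyscall=//p' | tail -n 1)
    printf '%s\n' "${mode:-unset}"
}

# Example: check the currently running kernel
#   vsyscall_mode
```

If the parameter appears more than once, the sketch reports the last occurrence.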
Pinned Comments
alucryd commented on 2020-04-21 14:23 (UTC) (edited on 2020-04-22 15:57 (UTC) by alucryd)
To be clear, nvidia users need to enable both foldingathome.service and foldingathome-nvidia.service.