Package Details: llama.cpp-cuda b9279-1
| Git Clone URL: | https://aur.archlinux.org/llama.cpp-cuda.git (read-only) |
|---|---|
| Package Base: | llama.cpp-cuda |
| Description: | Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations) |
| Upstream URL: | https://github.com/ggml-org/llama.cpp |
| Licenses: | MIT |
| Conflicts: | ggml, libggml, llama.cpp |
| Provides: | llama.cpp |
| Submitter: | txtsd |
| Maintainer: | fabse |
| Last Packager: | fabse |
| Votes: | 17 |
| Popularity: | 1.88 |
| First Submitted: | 2024-10-26 20:17 (UTC) |
| Last Updated: | 2026-05-22 06:00 (UTC) |
Dependencies (18)
- cuda (cuda11.1AUR, cuda-12.2AUR, cuda12.0AUR, cuda11.4AUR, cuda-12.5AUR, cuda-12.9AUR, cuda-12.8AUR, cuda-pascalAUR)
- curl (curl-gitAUR, curl-c-aresAUR)
- gcc-libs (gcc-libs-gitAUR, gccrs-libs-gitAUR, gcc-libs-snapshotAUR)
- glibc (glibc-gitAUR, glibc-eacAUR, glibc-git-native-pgoAUR)
- nvidia-utils (nvidia-410xx-utilsAUR, nvidia-440xx-utilsAUR, nvidia-430xx-utilsAUR, nvidia-340xx-utilsAUR, nvidia-510xx-utilsAUR, nvidia-utils-teslaAUR, nvidia-525xx-utilsAUR, nvidia-575xx-utilsAUR, nvidia-340xx-utils-macbookAUR, nvidia-535xx-utilsAUR, nvidia-utils-betaAUR, nvidia-470xx-utilsAUR, nvidia-390xx-utilsAUR, nvidia-550xx-utilsAUR, nvidia-vulkan-utilsAUR, nvidia-580xx-utilsAUR)
- python
- cmake (cmake3AUR, cmake-gitAUR) (make)
- cudnn (cudnn9.10-cuda12.9AUR, cudnn-pascalAUR) (make)
- git (git-gitAUR, git-glAUR, git-wd40AUR) (make)
- ninja (ninja-gitAUR, ninja-memAUR, ninja-noemacs-gitAUR, ninja-kitwareAUR, ninja-fuchsia-gitAUR, n2-ninja-symlinkAUR) (make)
- shaderc (shaderc-gitAUR, shaderc-gitAUR) (make)
- nccl (nccl-cuda12.9AUR, nccl-gitAUR) (optional) – needed for multi-GPU parallelism
- python-ggufAUR (python-gguf-gitAUR) (optional) – needed for convert_hf_to_gguf.py
- python-numpy (python-numpy-gitAUR, python-numpy-mkl-binAUR, python-numpy1AUR, python-numpy-mkl-tbbAUR, python-numpy-mklAUR) (optional) – needed for convert_hf_to_gguf.py
- python-pytorch (python-pytorch-cuda12.9AUR, python-pytorch-opt-cuda12.9AUR, python-pytorch-cuda, python-pytorch-opt, python-pytorch-opt-cuda, python-pytorch-opt-rocm, python-pytorch-rocm) (optional) – needed for convert_hf_to_gguf.py
- python-safetensors (optional) – needed for convert_hf_to_gguf.py
- python-sentencepieceAUR (python-sentencepiece-gitAUR, python-sentencepiece-binAUR) (optional) – needed for convert_hf_to_gguf.py
- python-transformersAUR (python-transformers-gitAUR) (optional) – needed for convert_hf_to_gguf.py
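Several of the optional dependencies above are listed as needed for convert_hf_to_gguf.py. Below is a minimal sketch of checking which of the corresponding Python modules can be found before running that script. The import names (`gguf`, `numpy`, `torch`, `safetensors`, `sentencepiece`, `transformers`) are an assumed mapping from the Arch package names, and `missing_modules` is a hypothetical helper, not part of llama.cpp or this package.

```python
from importlib.util import find_spec

# Assumed import names for the optional packages listed above.
CONVERT_MODULES = [
    "gguf",
    "numpy",
    "torch",
    "safetensors",
    "sentencepiece",
    "transformers",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(CONVERT_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed modules for convert_hf_to_gguf.py were found.")
```

A module that is found may still be too old or too new for the script; this only checks presence.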
Required by (5)
- llamaman-bin (requires llama.cpp) (optional)
- scmd-bin (requires llama.cpp)
- voxd (requires llama.cpp) (optional)
- voxd-bin (requires llama.cpp) (optional)
- voxd-git (requires llama.cpp) (optional)
Sources (3)
txtsd commented on 2024-12-06 14:15 (UTC)
txtsd commented on 2024-12-06 13:37 (UTC)
@v1993 I've uploaded llama.cpp-cuda-f16. Please let me know if it works as expected!
txtsd commented on 2024-12-02 02:25 (UTC)
I'll give it a look later today and see if a newer package is warranted in that case. Thanks for your input!
v1993 commented on 2024-12-01 14:53 (UTC)
To be honest, I'm not 100% sure (it's a pretty old option and tracking down its origins is kinda tricky), but I'd expect at least a performance degradation on older GPUs (Nvidia used to be really bad at fp16 on older architectures).
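The trade-off described here could be sketched as a helper that only turns on GGML_CUDA_F16 for newer GPUs. This is an illustration under assumptions, not how the package is built: `cmake_flags` is a hypothetical function, the `(7, 0)` compute-capability cutoff is a guess rather than a value from upstream, and the GGML_CUDA_F16 option may not exist in every llama.cpp revision.

```python
def cmake_flags(compute_capability, f16_min=(7, 0)):
    """Build a CMake flag list, enabling GGML_CUDA_F16 only on newer GPUs.

    compute_capability: (major, minor) tuple, e.g. (8, 6).
    f16_min: assumed cutoff below which fp16 is expected to be slow;
             this value is a guess, not taken from the package or upstream.
    """
    flags = ["-DGGML_CUDA=ON"]
    if tuple(compute_capability) >= tuple(f16_min):
        flags.append("-DGGML_CUDA_F16=ON")
    return flags


if __name__ == "__main__":
    # Print the flags chosen for an older and a newer compute capability.
    print(cmake_flags((6, 1)))
    print(cmake_flags((8, 6)))
```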
txtsd commented on 2024-12-01 14:38 (UTC)
@v1993 Does that have to be a separate package, or will making the change in this package suffice without breaking things for users of older GPUs?
v1993 commented on 2024-12-01 14:29 (UTC)
Would it be possible to have a package version with GGML_CUDA_F16 enabled? It's a nice performance boost on newer GPUs. Thank you for your work on this package!
Poscat commented on 2024-11-28 09:46 (UTC)
@txtsd thank you
txtsd commented on 2024-11-25 07:05 (UTC)
Builds are not static anymore, and the service file has been fixed.
txtsd commented on 2024-11-24 03:16 (UTC)
@Poscat Thank you for your input! The service file was inherited from a previous version and maintainer of the package. I admit that the service was not tested.
The static builds were created to allow for side-by-side installation with whisper.cpp, since they both install libggml files.
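To illustrate the kind of clash being described, here is a minimal sketch that finds paths claimed by two packages. The file lists are hypothetical examples (only /usr/bin/llama-server appears on this page), and `conflicting_files` is an illustrative helper, not a pacman API.

```python
def conflicting_files(files_a, files_b):
    """Return the sorted paths that both packages would install."""
    return sorted(set(files_a) & set(files_b))


# Hypothetical file lists, for illustration only.
llama_files = [
    "/usr/bin/llama-server",
    "/usr/lib/libggml.so",
    "/usr/lib/libllama.so",
]
whisper_files = [
    "/usr/bin/whisper-cli",
    "/usr/lib/libggml.so",
    "/usr/lib/libwhisper.so",
]

if __name__ == "__main__":
    print(conflicting_files(llama_files, whisper_files))
```

Any path in the result would make pacman refuse to install both packages side by side unless one of them stops shipping it, for example by linking the library statically.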
Poscat commented on 2024-11-24 03:12 (UTC)
diff --git a/llama.cpp.service b/llama.cpp.service
index 4678d85..be89f9b 100644
--- a/llama.cpp.service
+++ b/llama.cpp.service
@@ -7,7 +7,7 @@ Type=simple
EnvironmentFile=/etc/conf.d/llama.cpp
ExecStart=/usr/bin/llama-server $LLAMA_ARGS
ExecReload=/bin/kill -s HUP $MAINPID
-Restart=never
+Restart=no
[Install]
WantedBy=multi-user.target
Also your systemd service file is wrong. Did you even test your package?
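For context on the diff above: `never` is not among the values systemd documents for `Restart=`, while `no` is. A minimal sketch of a check that would catch this kind of mistake follows; `invalid_restart_values` is a hypothetical helper, and the simple line-based parsing is an assumption that ignores continuation lines and drop-in overrides.

```python
# Values documented for Restart= in systemd.service(5).
VALID_RESTART = {
    "no",
    "on-success",
    "on-failure",
    "on-abnormal",
    "on-watchdog",
    "on-abort",
    "always",
}


def invalid_restart_values(unit_text):
    """Return every Restart= value in the unit text that systemd would reject."""
    bad = []
    for line in unit_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "Restart" and value.strip() not in VALID_RESTART:
            bad.append(value.strip())
    return bad


if __name__ == "__main__":
    print(invalid_restart_values("[Service]\nType=simple\nRestart=never\n"))
```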
Pinned Comments
txtsd commented on 2024-10-26 20:17 (UTC) (edited on 2024-12-06 14:15 (UTC) by txtsd)
Alternate versions
llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-fp16
llama.cpp-sycl-fp32
llama.cpp-cuda
llama.cpp-cuda-f16
llama.cpp-hip