| Git Clone URL: | https://aur.archlinux.org/llama.cpp-cuda.git (read-only) |
|---|---|
| Package Base: | llama.cpp-cuda |
| Description: | Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations) |
| Upstream URL: | https://github.com/ggerganov/llama.cpp |
| Licenses: | MIT |
| Conflicts: | llama.cpp |
| Provides: | llama.cpp |
| Submitter: | txtsd |
| Maintainer: | txtsd |
| Last Packager: | txtsd |
| Votes: | 8 |
| Popularity: | 1.76 |
| First Submitted: | 2024-10-26 20:17 (UTC) |
| Last Updated: | 2025-07-05 08:38 (UTC) |
Hi all, thanks for this package! Wondering if we can open it up to other build env vars.
I would like to build the web UI via LLAMA_BUILD_SERVER=1 - I know I can clone and change, lazy question :)
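For anyone who wants to try this locally in the meantime, here is a rough sketch of what passing the upstream flag through the PKGBUILD's build() step could look like. This assumes a CMake-based configure; the source directory name and the other flags are placeholders, not the package's actual option list.

```sh
# Sketch only: enable the server/web UI build via the upstream CMake flag.
# Directory name and surrounding flags are illustrative placeholders.
build() {
  cmake -B build -S llama.cpp \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_INSTALL_PREFIX=/usr \
    -DGGML_CUDA=ON \
    -DLLAMA_BUILD_SERVER=ON
  cmake --build build
}
```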
This package now uses the system libggml, so it should work alongside whisper.cpp.
Building of tests and examples has been turned off.
Kompute has been removed.
Why are you using Kompute, particularly from the fork (https://github.com/nomic-ai/kompute.git instead of the mainline https://github.com/KomputeProject/kompute repo), specifically for the llama.cpp CUDA version? I think it would be more convenient to create a separate package for the Kompute backend and have llama.cpp-cuda depend only on CUDA-related dependencies, without the Kompute backend dependencies.
llama.cpp-cuda: /usr/lib/libggml-base.so exists in the file system (owned by whisper.cpp-cuda)
llama.cpp-cuda: /usr/lib/libggml-cpu.so exists in the file system (owned by whisper.cpp-cuda)
llama.cpp-cuda: /usr/lib/libggml-cuda.so exists in the file system (owned by whisper.cpp-cuda)
llama.cpp-cuda: /usr/lib/libggml.so exists in the file system (owned by whisper.cpp-cuda)
An error occurred, and no packages were updated.
-> Error during installation: [/home/chi/.cache/yay/llama.cpp-cuda/llama.cpp-cuda-b4762-1-x86_64.pkg.tar.zst] - exit status 1
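If you hit this conflict, you can at least confirm where the overlap comes from with standard pacman queries before deciding what to do; whisper.cpp-cuda here is just the owner named in the log above.

```sh
# Which installed package owns the conflicting ggml libraries?
pacman -Qo /usr/lib/libggml.so /usr/lib/libggml-base.so
# Which ggml files does whisper.cpp-cuda ship?
pacman -Ql whisper.cpp-cuda | grep libggml
```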
I recommend including the model template files in the package:
https://github.com/ggml-org/llama.cpp/tree/master/models/templates
so we can choose a model template file directly, with no need to download them again.
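A rough sketch of how package() could ship them; the destination path /usr/share/llama.cpp/templates is only a suggested convention, not something the PKGBUILD currently defines.

```sh
# Sketch only: install the upstream chat template files into a shared data dir.
# The destination path is an assumed convention, not the package's current layout.
package() {
  install -Dm644 llama.cpp/models/templates/*.jinja \
    -t "$pkgdir/usr/share/llama.cpp/templates/"
}
```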
You should export CUDA_PATH and NVCC_CCBIN.
Check /etc/profile.d/cuda.sh
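For context, these are roughly the values that file exports on Arch; the exact host compiler depends on which GCC version the cuda package currently supports, so treat the g++ path below as a placeholder.

```sh
# Roughly what /etc/profile.d/cuda.sh provides (compiler version is a placeholder).
export CUDA_PATH=/opt/cuda
export NVCC_CCBIN=/usr/bin/g++-14
```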
To get this to pass cmake I had to edit the PKGBUILD and add cmake options:
-DCMAKE_CUDA_COMPILER=/opt/cuda/bin/nvcc
-DCMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-13
I tried pointing it to nvcc via environment variables, but that ended up using the wrong GCC version, which caused compiler errors at CMakeDetermineCompilerId.cmake:865.
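In context, the edited configure call looked roughly like this; the source directory and the CUDA flag are illustrative, only the two compiler options come from the comment above.

```sh
# The two extra options from above, added to the PKGBUILD's cmake call.
cmake -B build -S llama.cpp \
  -DGGML_CUDA=ON \
  -DCMAKE_CUDA_COMPILER=/opt/cuda/bin/nvcc \
  -DCMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-13
```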
@txtsd, setting CMAKE_CUDA_ARCHITECTURES to my hardware's architecture number fixes this problem.
The error appears at the build stage, so it can be reproduced without a video card.
@ioctl Sorry, I don't have the necessary hardware to test. Does it work correctly if CMAKE_CUDA_ARCHITECTURES is left unset?
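For anyone following along, a sketch of what ioctl's fix amounts to; 86 (RTX 30xx) is only an example value, and "native" also works but requires the GPU to be present in the build machine.

```sh
# Sketch: pin the CUDA architecture instead of letting CMake detect it.
cmake -B build -S llama.cpp \
  -DGGML_CUDA=ON \
  -DCMAKE_CUDA_ARCHITECTURES=86   # or "native" on a machine with the card installed
```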
Pinned Comments
txtsd commented on 2024-10-26 20:17 (UTC) (edited on 2024-12-06 14:15 (UTC) by txtsd)
Alternate versions
llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-fp16
llama.cpp-sycl-fp32
llama.cpp-cuda
llama.cpp-cuda-f16
llama.cpp-hip