Package Details: llama.cpp-cuda-f16 b5753-1

Git Clone URL: https://aur.archlinux.org/llama.cpp-cuda-f16.git (read-only)
Package Base: llama.cpp-cuda-f16
Description: Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations and F16)
Upstream URL: https://github.com/ggerganov/llama.cpp
Licenses: MIT
Conflicts: llama.cpp
Provides: llama.cpp
Submitter: txtsd
Maintainer: txtsd
Last Packager: txtsd
Votes: 3
Popularity: 0.136982
First Submitted: 2024-12-06 13:35 (UTC)
Last Updated: 2025-06-24 19:13 (UTC)

Pinned Comments

txtsd commented on 2024-12-06 14:15 (UTC)

Alternate versions

llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-fp16
llama.cpp-sycl-fp32
llama.cpp-cuda
llama.cpp-cuda-f16
llama.cpp-hip

Latest Comments

txtsd commented on 2025-06-16 11:54 (UTC)

This package now uses the system libggml, so it should work alongside whisper.cpp.

Building of tests and examples has been turned off.
kompute has been removed.
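
For reference, a minimal sketch of the kind of cmake configuration these changes imply. This is not the actual PKGBUILD; LLAMA_USE_SYSTEM_GGML is an assumption based on the upstream option name, and the exact flags may differ:

    # Hypothetical configure step reflecting the changes described above.
    # GGML_CUDA / GGML_CUDA_F16 enable the CUDA backend with F16 math;
    # LLAMA_USE_SYSTEM_GGML (assumed option name) links against system libggml;
    # tests and examples are switched off.
    cmake -B build \
        -DGGML_CUDA=ON \
        -DGGML_CUDA_F16=ON \
        -DLLAMA_USE_SYSTEM_GGML=ON \
        -DLLAMA_BUILD_TESTS=OFF \
        -DLLAMA_BUILD_EXAMPLES=OFF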

JamesMowery commented on 2025-06-12 22:49 (UTC) (edited on 2025-06-12 22:51 (UTC) by JamesMowery)

Fails to build for me, unfortunately. NVIDIA RTX 4090:

-- Found CUDAToolkit: /opt/cuda/targets/x86_64-linux/include (found version "12.8.93")
-- CUDA Toolkit found
-- Using CUDA architectures: native
CMake Error at /usr/share/cmake/Modules/CMakeDetermineCompilerId.cmake:909 (message):
  Compiling the CUDA compiler identification source file
  "CMakeCUDACompilerId.cu" failed.

  Compiler: /opt/cuda/bin/nvcc

  Build flags:

  Id flags: --keep;--keep-dir;tmp -v

  The output was:

  2

  nvcc warning : Support for offline compilation for architectures prior to
  '<compute/sm/lto>_75' will be removed in a future release (Use
  -Wno-deprecated-gpu-targets to suppress warning).

  #$ _NVVM_BRANCH_=nvvm
  #$ _SPACE_=
  #$ _CUDART_=cudart
  #$ _HERE_=/opt/cuda/bin
  #$ _THERE_=/opt/cuda/bin
  #$ _TARGET_SIZE_=
  #$ _TARGET_DIR_=
  #$ _TARGET_DIR_=targets/x86_64-linux
  #$ TOP=/opt/cuda/bin/..
  #$ CICC_PATH=/opt/cuda/bin/../nvvm/bin
  #$ NVVMIR_LIBRARY_DIR=/opt/cuda/bin/../nvvm/libdevice
  #$ LD_LIBRARY_PATH=/opt/cuda/bin/../lib:/opt/cuda/lib64:/opt/cuda/lib64
  #$ PATH=/opt/cuda/bin/../nvvm/bin:/opt/cuda/bin:/opt/cuda/bin:/opt/miniconda3/bin:/opt/cuda/bin:/home/james/miniforge3/condabin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/james/.lmstudio/bin:/home/james/.lmstudio/bin
  #$ INCLUDES="-I/opt/cuda/bin/../targets/x86_64-linux/include"
  #$ LIBRARIES= "-L/opt/cuda/bin/../targets/x86_64-linux/lib/stubs" "-L/opt/cuda/bin/../targets/x86_64-linux/lib"
  #$ CUDAFE_FLAGS=
  #$ PTXAS_FLAGS=
  #$ rm tmp/a_dlink.reg.c
  #$ gcc -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -E -x c++ -D__CUDACC__ -D__NVCC__ "-I/opt/cuda/bin/../targets/x86_64-linux/include" -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=8 -D__CUDACC_VER_BUILD__=93 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=8 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -D__CUDACC_DEVICE_ATOMIC_BUILTINS__=1 -include "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp4.ii"

...
...
...

  /usr/include/c++/15.1.1/type_traits(3521): error: type name is not allowed
      inline constexpr bool is_volatile_v = __is_volatile(_Tp);
                                                          ^

  /usr/include/c++/15.1.1/type_traits(3663): error: type name is not allowed
      inline constexpr size_t rank_v = __array_rank(_Tp);
                                                    ^

  /usr/include/c++/15.1.1/bits/stl_algobase.h(1239): error: type name is not allowed
        || __is_pointer(_ValueType1)
                        ^

  /usr/include/c++/15.1.1/bits/stl_algobase.h(1412): error: type name is not allowed
      && __is_pointer(_II1) && __is_pointer(_II2)
                      ^

  /usr/include/c++/15.1.1/bits/stl_algobase.h(1412): error: type name is not allowed
      && __is_pointer(_II1) && __is_pointer(_II2)
                                            ^

  25 errors detected in the compilation of "CMakeCUDACompilerId.cu".

  # --error 0x2 --

Call Stack (most recent call first):
  /usr/share/cmake/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
  /usr/share/cmake/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
  /usr/share/cmake/Modules/CMakeDetermineCUDACompiler.cmake:139 (CMAKE_DETERMINE_COMPILER_ID)
  ggml/src/ggml-cuda/CMakeLists.txt:43 (enable_language)

-- Configuring incomplete, errors occurred!
==> ERROR: A failure occurred in build().
    Aborting...
error: failed to build 'llama.cpp-cuda-f16-b5648-1': 
error: packages failed to build: llama.cpp-cuda-f16-b5648-1

AI says this is the reason (not sure if it's true):

The error occurs because your CUDA toolkit version (12.8.93) is incompatible with your GCC version (15.1.1). CUDA has strict requirements for supported host compiler versions, and GCC 15 is too new for CUDA 12.x.
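
If that diagnosis is correct, the usual workaround is to point nvcc at an older host compiler until CUDA supports GCC 15. A minimal sketch, assuming an older GCC is installed (the g++-14 path assumes Arch's gcc14 package; substitute whichever CUDA-supported GCC you have):

    # Force nvcc to use GCC 14 as the host compiler instead of the default GCC 15.
    # NVCC_CCBIN is read by nvcc itself (CUDA 11.5+).
    export NVCC_CCBIN=/usr/bin/g++-14
    # For CMake-driven builds, CUDAHOSTCXX seeds CMAKE_CUDA_HOST_COMPILER:
    export CUDAHOSTCXX=/usr/bin/g++-14
    makepkg -si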

v1993 commented on 2024-12-27 22:05 (UTC)

Thank you for making this one! It's nice to have.
