Package Details: llama.cpp-sycl-f16 b6039-1

Git Clone URL: https://aur.archlinux.org/llama.cpp-sycl-f16.git (read-only)
Package Base: llama.cpp-sycl-f16
Description: Port of Facebook's LLaMA model in C/C++ (with Intel SYCL GPU optimizations and F16)
Upstream URL: https://github.com/ggml-org/llama.cpp
Licenses: MIT
Conflicts: libggml, llama.cpp
Provides: llama.cpp
Submitter: txtsd
Maintainer: heyrict
Last Packager: txtsd
Votes: 2
Popularity: 0.023375
First Submitted: 2024-10-26 18:11 (UTC)
Last Updated: 2025-07-30 22:52 (UTC)
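
For reference, a sketch of the standard AUR workflow for this package (build in a clean shell; see the comments below about oneAPI environment interactions):

git clone https://aur.archlinux.org/llama.cpp-sycl-f16.git
cd llama.cpp-sycl-f16
makepkg -si  # sync build dependencies, build, and install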

Pinned Comments

txtsd commented on 2024-10-26 20:15 (UTC) (edited on 2024-12-06 14:15 (UTC) by txtsd)

Alternate versions

llama.cpp
llama.cpp-vulkan
llama.cpp-sycl-f16
llama.cpp-sycl-f32
llama.cpp-cuda
llama.cpp-cuda-f16
llama.cpp-hip

Latest Comments


ioctl commented on 2025-09-05 07:55 (UTC) (edited on 2025-09-05 08:09 (UTC) by ioctl)

Build error (appears only when building after running . /opt/intel/oneapi/setvars.sh in the same terminal):

...

:: WARNING: setvars.sh has already been run. Skipping re-execution.
   To force a re-execution of setvars.sh, use the '--force' option.
   Using '--force' can result in excessive use of your environment variables.

usage: source setvars.sh [--force] [--config=file] [--help] [...]
  --force        Force setvars.sh to re-run, doing so may overload environment.
  --config=file  Customize env vars using a setvars.sh configuration file.
  --help         Display this help message and exit.
  ...            Additional args are passed to individual env/vars.sh scripts
                 and should follow this script's arguments.

  Some POSIX shells do not accept command-line options. In that case, you can pass
  command-line options via the SETVARS_ARGS environment variable. For example:

  $ SETVARS_ARGS="--config=config.txt" ; export SETVARS_ARGS
  $ . path/to/setvars.sh

  The SETVARS_ARGS environment variable is cleared on exiting setvars.sh.

The oneAPI toolkits no longer support 32-bit libraries, starting with the 2025.0 toolkit release. See the oneAPI release notes for more details.

...
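
A possible workaround, sketched under the assumption that the PKGBUILD sources setvars.sh itself: setvars.sh records completion in the SETVARS_COMPLETED environment variable and skips re-execution while it is set, so either build from a fresh terminal or clear the guard first:

unset SETVARS_COMPLETED  # let the build's own 'source setvars.sh' run cleanly
makepkg -sf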

ioctl commented on 2025-08-31 08:54 (UTC)

It seems this build cannot find any SYCL device or GPU:

$ llama-server -m gpt-oss-20b-mxfp4.gguf -ngl 99 --jinja
warning: no usable GPU found, --gpu-layers option will be ignored
warning: one possible reason is that llama.cpp was compiled without GPU support
warning: consult docs/build.md for compilation instructions
build: 6332 (bbbf5eccc) with Intel(R) oneAPI DPC++/C++ Compiler 2025.0.4 (2025.0.4.20241205) for x86_64-unknown-linux-gnu
system info: n_threads = 4, n_threads_batch = 4, total_threads = 12

system_info: n_threads = 4 (n_threads_batch = 4) / 12 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | OPENMP = 1 | REPACK = 1 | 

main: binding port with default address family
main: HTTP server is listening, hostname: 127.0.0.1, port: 8080, http threads: 11
main: loading model

...

Meanwhile, /opt/intel/oneapi/2025.0/bin/sycl-ls sees 4 SYCL devices, including level_zero:gpu and opencl:gpu.
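
One environment check worth trying, as a sketch (ONEAPI_DEVICE_SELECTOR is a standard oneAPI runtime variable; that it fixes this particular setup is an assumption): source the oneAPI environment in the same shell before launching, so the SYCL runtime can enumerate the GPU:

source /opt/intel/oneapi/setvars.sh
ONEAPI_DEVICE_SELECTOR=level_zero:gpu llama-server -m gpt-oss-20b-mxfp4.gguf -ngl 99 --jinja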

txtsd commented on 2025-06-16 08:31 (UTC)

!buildflags did the trick! Thanks @heyrict!
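
(For context: buildflags is the makepkg option that controls whether the compiler flags from makepkg.conf are exported into the build, so the fix presumably amounts to a PKGBUILD line like this sketch:)

options=(!buildflags)  # do not export CFLAGS/CXXFLAGS/LDFLAGS from makepkg.conf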

heyrict commented on 2025-06-16 08:22 (UTC)

@txtsd Thanks for having me as the maintainer, and sorry for the delayed response; I have been busy preparing for graduation recently.

I pushed an update to this package based on my previous investigation and the makepkg guidelines. Could you please check if it compiles for you?

txtsd commented on 2025-06-16 06:31 (UTC)

I'm still unable to get this to build with @heyrict's changes. Anyone have a working PKGBUILD?

txtsd commented on 2025-05-25 05:17 (UTC)

I'm on vacation until next month. I'll add @heyrict as co-maintainer to adjust this package until then.

bionade24 commented on 2025-05-14 14:29 (UTC)

@txtsd: Could you please implement the build fixes provided by @heyrict, or disown this package? It has not been building in a clean build environment for months. Offering this PKGBUILD on the AUR is just misleading.

heyrict commented on 2025-03-16 02:48 (UTC) (edited on 2025-03-20 07:47 (UTC) by heyrict)

I am able to build the package with manual modifications. Steps posted below:

  1. Fetch resources and extract the files: makepkg -o
  2. Manually build the package:
cd src
source /opt/intel/oneapi/setvars.sh
_cmake_options=(
    -B build
    -S llama.cpp
    #-DCMAKE_BUILD_TYPE=None
    -DCMAKE_INSTALL_PREFIX='/usr'
    -DGGML_ALL_WARNINGS=OFF
    -DGGML_ALL_WARNINGS_3RD_PARTY=OFF
    #-DBUILD_SHARED_LIBS=OFF
    #-DGGML_STATIC=ON
    #-DGGML_LTO=ON
    -DGGML_RPC=ON
    -DLLAMA_CURL=ON
    -DGGML_BLAS=ON
    -DCMAKE_C_COMPILER=icx
    -DCMAKE_CXX_COMPILER=icpx
    -DGGML_SYCL=ON
    -DGGML_SYCL_F16=ON # Comment this out to build the F32 version
    -Wno-dev
)
cmake "${_cmake_options[@]}"
cmake --build build --config Release -j -v
  3. Patch the PKGBUILD, as there are no lib*.a in a non-static build.
diff --git a/PKGBUILD b/PKGBUILD
index 7999ec8..51cc1bc 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -73,7 +73,7 @@ build() {
 package() {
   DESTDIR="${pkgdir}" cmake --install build
   rm "${pkgdir}/usr/include/"ggml*
-  rm "${pkgdir}/usr/lib/"lib*.a
+  #rm "${pkgdir}/usr/lib/"lib*.a

   install -Dm644 "${_pkgname}/LICENSE" "${pkgdir}/usr/share/licenses/${pkgname}/LICENSE"
  4. Package the build and install: makepkg -Ri

Notes:

  1. BUILD_SHARED_LIBS and GGML_STATIC were commented out because of the inference issue. The built package will have libggml.so shared libraries, which may conflict with other packages.
  2. Switching GGML_LTO off fixes the build error libggml-sycl.a: error adding symbols: archive has no index; run ranlib to add one. I think this is a limitation of icx/icpx.
  3. I've noticed some differences in build/compile_commands.json between the manual build and the pacman build, with the same build flags. Probably related to the SYCL_EXTERNAL issue.
An example of the differences in compile_commands.json:
--- /tmp/pacman.json    2025-03-16 10:41:15.426354158 +0800
+++ /tmp/manual.json    2025-03-16 10:41:06.154835418 +0800
@@ -1,31 +1,47 @@
 {
   "directory": "/home/heyrict/.cache/paru/clone/llama.cpp-sycl-f32/src/build/ggml/src",
   "command": "/opt/intel/oneapi/compiler/2025.0/bin/icx
  -DGGML_BUILD
  -DGGML_SCHED_MAX_COPIES=4
  -DGGML_SHARED
  -D_GNU_SOURCE
  -D_XOPEN_SOURCE=600
  -Dggml_base_EXPORTS
  -I/home/heyrict/.cache/paru/clone/llama.cpp-sycl-f32/src/llama.cpp/ggml/src/.
  -I/home/heyrict/.cache/paru/clone/llama.cpp-sycl-f32/src/llama.cpp/ggml/src/../include
+ -march=x86-64
+ -mtune=generic
+ -O2
+ -pipe
+ -fno-plt
+ -fexceptions
+ -Wp,-D_FORTIFY_SOURCE=3
+ -Wformat
+ -Werror=format-security
+ -fstack-clash-protection
+ -fcf-protection
+ -fno-omit-frame-pointer
+ -mno-omit-leaf-frame-pointer
+ -g
+ -ffile-prefix-map=/home/heyrict/.cache/paru/clone/llama.cpp-sycl-f32/src=/usr/src/debug/llama.cpp-sycl-f32
+ -flto=auto
  -O3
  -DNDEBUG
  -std=gnu11
  -fPIC
  -Wshadow
  -Wstrict-prototypes
  -Wpointer-arith
  -Wmissing-prototypes
  -Werror=implicit-int
  -Werror=implicit-function-declaration
  -Wall
  -Wextra
  -Wpedantic
  -Wcast-qual
  -Wno-unused-function
  -o CMakeFiles/ggml-base.dir/ggml.c.o
  -c /home/heyrict/.cache/paru/clone/llama.cpp-sycl-f32/src/llama.cpp/ggml/src/ggml.c",
   "file": "/home/heyrict/.cache/paru/clone/llama.cpp-sycl-f32/src/llama.cpp/ggml/src/ggml.c",
   "output": "ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o"
 },
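
The added flags in the diff above (-march=x86-64 through -flto=auto) correspond to the CFLAGS, DEBUG_CFLAGS, and LTOFLAGS that makepkg takes from makepkg.conf; to compare against the defaults on your own system (standard Arch path):

grep -E '^(CFLAGS|CXXFLAGS|LTOFLAGS)=' /etc/makepkg.conf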