Package Details: python-vllm-rocm 0.20.0-1
| Git Clone URL: | https://aur.archlinux.org/python-vllm-rocm.git (read-only) |
|---|---|
| Package Base: | python-vllm-rocm |
| Description: | high-throughput and memory-efficient inference and serving engine for LLMs (ROCm support) |
| Upstream URL: | https://github.com/vllm-project/vllm |
| Licenses: | Apache-2.0 |
| Submitter: | davispuh |
| Maintainer: | davispuh |
| Last Packager: | davispuh |
| Votes: | 3 |
| Popularity: | 0.96 |
| First Submitted: | 2026-02-24 22:16 (UTC) |
| Last Updated: | 2026-04-30 00:06 (UTC) |
Dependencies (57)
- amdsmi (opencl-amd [AUR], rocm-bin [AUR], rocm-gfx101x-bin [AUR], rocm-gfx103x-bin [AUR], rocm-gfx110x-bin [AUR], rocm-gfx120x-bin [AUR], rocm-gfx1150-bin [AUR], rocm-gfx1151-bin [AUR], rocm-gfx1152-bin [AUR], rocm-nightly-gfx120x-all-bin [AUR], rocm-nightly-gfx1151-bin [AUR], rocm-nightly-gfx110x-bin [AUR])
- numactl (numactl-git [AUR])
- python-aiohttp
- python-blake3 [AUR]
- python-cachetools
- python-cbor2
- python-cloudpickle
- python-diskcache [AUR]
- python-einops [AUR]
- python-fastapi
- python-gguf [AUR] (python-gguf-git [AUR])
- python-huggingface-hub (python-huggingface-hub-git [AUR])
- python-ijson
- python-importlib-metadata
- python-mistral-common [AUR] (python-mistral-common-git [AUR])
- python-msgspec
- python-openai
- python-opencv (python-opencv-cuda)
- python-partial-json-parser [AUR] (python-partial-json-parser-git [AUR])
- python-prometheus-fastapi-instrumentator [AUR]
- (37 more dependencies not shown)
Required by (1)
- python-vllm-omni (optional)
Latest Comments
davispuh commented on 2026-04-30 00:10 (UTC)
@Orion-zhen I don't think that's a good idea. All ROCm packages will have that issue, and it's different for every shell. That would only work for bash. It's better for your build script to do that.
@chiz that happens because `intel-oneapi-mkl` was updated, so you also need to update `python-pytorch-opt-rocm` (or rebuild your PyTorch).
chiz commented on 2026-04-26 16:38 (UTC)
There is an error:
Orion-zhen commented on 2026-04-06 13:21 (UTC) (edited on 2026-04-06 13:24 (UTC) by Orion-zhen)
Please add this in `build()` in case a newly installed ROCm can't be recognized:
I failed to build this package in a GitHub Actions runner.
davispuh commented on 2026-04-05 15:41 (UTC)
`python-cbor2` is indeed required; I added it as a dependency.
But I don't have the others installed, and `Qwen3.5-4B` works fine without any issues. I guess it depends on what model you want to run.
Also, I tested that `python-compressed-tensors` builds and installs with `--nocheck` (only the check step fails for me). It's needed for `gpt-oss-20b`, but I couldn't get it to run on my RX 7900 XTX (No MXFP4 MoE backend supports the deployment configuration).
bnjbvr commented on 2026-03-30 20:47 (UTC) (edited on 2026-03-30 20:48 (UTC) by bnjbvr)
Fwiw, I also had to install the following packages to try to make it work:
- `python-cbor2`
- `python-openai-harmony`
- `python-model-hosting-container-standards`
- `python-jmespath`
- `python-compressed-tensors`
Otherwise, the command `vllm serve …` wouldn't work. That being said, `python-compressed-tensors` didn't seem to install successfully, and the package is marked as orphaned, so I eventually had to drop the whole thing, unfortunately.
davispuh commented on 2026-03-26 20:03 (UTC) (edited on 2026-03-26 20:05 (UTC) by davispuh)
Did an LLM write that...?
Anyway, I fixed it a bit and pushed an updated version.
The main issue was that `rocminfo` has another place where it shows `gfx`, so for me it was `gfx1100;gfx11` because of:
This updated/fixed version handles that correctly, and:
- You can specify your GPUs in either `PYTORCH_ROCM_ARCH` or `ROCM_ARCH`
- You can leave both unset, and auto-detection will build for your GPUs
- You can specify an empty `ROCM_ARCH=` to build for all ROCm architectures/all GPUs
cmhacks commented on 2026-03-26 13:54 (UTC) (edited on 2026-03-26 13:58 (UTC) by cmhacks)
Hi davispuh, thank you so much for your quick response and for being open to this! Your willingness to collaborate and improve the package is truly appreciated — maintaining AUR packages takes a lot of effort and dedication, and it doesn't go unnoticed.
Here's a patch that implements both the ROCM_ARCH environment variable support and auto-detection as you suggested:
What it does:
- If the `ROCM_ARCH` environment variable is set, it uses that value directly (e.g., `ROCM_ARCH="gfx1201" makepkg -si`).
- If `ROCM_ARCH` is not set, it auto-detects the installed GPU architectures using `rocminfo` and builds only for those.
This is fully backward-compatible. Existing users who don't set ROCM_ARCH and don't have rocminfo available at build time will get the same behavior as before. For everyone else, it can reduce build times by up to 8x.
Thank you again for all the time and effort you put into maintaining this package — it makes ROCm + vLLM accessible to the entire Arch community. Looking forward to your feedback!
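The auto-detection step described in this comment could look roughly like the sketch below. It is not the patch that was posted (the patch itself is not reproduced on this page), and the helper name `detect_gfx_targets` is hypothetical. It also assumes that `rocminfo` prints each agent's architecture on a line of the form `Name: gfx…`; that output format is an assumption and may differ between ROCm releases.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the PKGBUILD): reads rocminfo-style text
# on stdin and prints the unique gfx targets joined by ';'.
# Assumption: architecture lines look like "  Name:   gfx1100".
detect_gfx_targets() {
    # Accept only lines whose first field is "Name:" and whose second field is
    # exactly a gfx identifier, so strings that merely contain "gfx" are skipped.
    awk '$1 == "Name:" && $2 ~ /^gfx[0-9a-f]+$/ { print $2 }' |
        sort -u |
        paste -sd ';' -
}

# Run the detection only when rocminfo is actually installed.
if command -v rocminfo >/dev/null 2>&1; then
    detected=$(rocminfo | detect_gfx_targets)
    echo "Detected targets: ${detected:-none}"
fi
```

Matching the whole second field rather than any occurrence of `gfx` is a deliberate choice here, since the maintainer's follow-up reports that `rocminfo` shows `gfx` in more than one place.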
davispuh commented on 2026-03-26 12:29 (UTC) (edited on 2026-03-26 12:30 (UTC) by davispuh)
Right now you can simply edit the PKGBUILD and set PYTORCH_ROCM_ARCH for your GPUs.
Also, I use ccache, and with a 24-core CPU it builds quite fast.
But yes, I think it could be implemented so that if the ROCM_ARCH env variable is set, it is used for PYTORCH_ROCM_ARCH.
And if it's not set, then yeah, auto-detection could be nice.
But my TODO list is already so long that I won't have time to work on this in the near future, so if you send a PKGBUILD patch I can apply it.
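The precedence suggested here (an explicit ROCM_ARCH wins, otherwise fall back to auto-detection) can be sketched as a small shell helper. The function name `select_rocm_arch`, its argument, and the final fallback to a full list are assumptions made for illustration, not code from the PKGBUILD; the full list is the set of eight architectures quoted in the original request.

```shell
#!/bin/sh
# All targets quoted in the thread; used here only as a fallback (assumption).
ALL_ROCM_ARCHS="gfx906;gfx908;gfx90a;gfx942;gfx1100;gfx1101;gfx1200;gfx1201"

# Hypothetical helper: prints the value to export as PYTORCH_ROCM_ARCH.
#   $1 = auto-detected targets (may be empty if detection found nothing)
select_rocm_arch() {
    if [ -n "${ROCM_ARCH:-}" ]; then
        printf '%s\n' "$ROCM_ARCH"        # explicit user choice wins
    elif [ -n "${1:-}" ]; then
        printf '%s\n' "$1"                # otherwise use what was detected
    else
        printf '%s\n' "$ALL_ROCM_ARCHS"   # nothing known: build for everything
    fi
}

# Example: pretend detection found a single GPU.
PYTORCH_ROCM_ARCH=$(select_rocm_arch "gfx1100")
export PYTORCH_ROCM_ARCH
echo "Building for: $PYTORCH_ROCM_ARCH"
```

This sketch treats an empty ROCM_ARCH the same as an unset one. The version the maintainer later pushed is described as treating an empty `ROCM_ARCH=` as "build for all architectures", which would need a separate check such as `${ROCM_ARCH+x}` to tell the two cases apart.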
cmhacks commented on 2026-03-26 10:52 (UTC)
Hi, thanks for maintaining this package!
Currently, PYTORCH_ROCM_ARCH is set to compile for all 8 GPU architectures (gfx906, gfx908, gfx90a, gfx942, gfx1100, gfx1101, gfx1200, gfx1201), which results in extremely long build times — often over an hour — since every HIP kernel is compiled once per target.
Most users only have one GPU and only need a single architecture. Would it be possible to either:
- Split into per-architecture packages (e.g., python-vllm-rocm-gfx906, python-vllm-rocm-gfx1201, etc.) so users can install only the one matching their hardware, or
- Auto-detect the system GPU at build time using rocminfo or AMDGPU_TARGETS to compile only for the installed hardware?
This would drastically reduce build times (up to ~8x faster) and resource usage for end users.
Thanks for considering this!