Package Details: python-vllm-rocm 0.21.0-1

Git Clone URL: https://aur.archlinux.org/python-vllm-rocm.git (read-only)
Package Base: python-vllm-rocm
Description: high-throughput and memory-efficient inference and serving engine for LLMs (ROCm support)
Upstream URL: https://github.com/vllm-project/vllm
Licenses: Apache-2.0
Provides: python-vllm
Submitter: davispuh
Maintainer: davispuh
Last Packager: davispuh
Votes: 3
Popularity: 0.88
First Submitted: 2026-02-24 22:16 (UTC)
Last Updated: 2026-05-23 21:08 (UTC)

Latest Comments

cmhacks commented on 2026-03-26 10:52 (UTC)

Hi, thanks for maintaining this package!

Currently, PYTORCH_ROCM_ARCH is set to compile for all 8 GPU architectures (gfx906, gfx908, gfx90a, gfx942, gfx1100, gfx1101, gfx1200, gfx1201). This results in extremely long build times, often over an hour, since every HIP kernel is compiled once per target.

Most users only have one GPU and only need a single architecture. Would it be possible to either:

1. Split into per-architecture packages (e.g., python-vllm-rocm-gfx906, python-vllm-rocm-gfx1201, etc.) so users can install only the one matching their hardware, or
2. Auto-detect the system GPU at build time using rocminfo or AMDGPU_TARGETS to compile only for the installed hardware?

This could drastically reduce build times (up to ~8x faster in the best case, since each kernel would be compiled for one target instead of eight) and resource usage for end users.
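
To make the auto-detect idea concrete, here is a rough sketch of how the target list could be derived from rocminfo output. It is only an illustration, not the package's actual build logic. The helper names (parse_gfx_targets, rocm_arch_value, detect_rocm_arch) are hypothetical. It assumes rocminfo prints agent lines of the form "Name: gfx1100" and that PYTORCH_ROCM_ARCH takes a semicolon-separated list. If no GPU is found, it falls back to the eight targets listed above.

```python
import re
import shutil
import subprocess

# Fallback: the eight targets named in this comment (assumed to be the current default).
ALL_TARGETS = [
    "gfx906", "gfx908", "gfx90a", "gfx942",
    "gfx1100", "gfx1101", "gfx1200", "gfx1201",
]


def parse_gfx_targets(rocminfo_output):
    """Return unique gfx targets in order of first appearance.

    Assumes agent lines look like '  Name:   gfx1100'. ISA lines such as
    'Name: amdgcn-amd-amdhsa--gfx1100' and 'Marketing Name:' lines are ignored.
    """
    found = re.findall(
        r"^\s*Name:\s+(gfx[0-9a-f]+)\s*$", rocminfo_output, flags=re.MULTILINE
    )
    return list(dict.fromkeys(found))


def rocm_arch_value(rocminfo_output, fallback=ALL_TARGETS):
    """Build a semicolon-separated value suitable for PYTORCH_ROCM_ARCH."""
    targets = parse_gfx_targets(rocminfo_output) or list(fallback)
    return ";".join(targets)


def detect_rocm_arch():
    """Run rocminfo if it is installed; otherwise use the full fallback list."""
    if shutil.which("rocminfo") is None:
        return ";".join(ALL_TARGETS)
    result = subprocess.run(
        ["rocminfo"], capture_output=True, text=True, check=False
    )
    return rocm_arch_value(result.stdout)


if __name__ == "__main__":
    print(detect_rocm_arch())
```

A PKGBUILD could export the printed value as PYTORCH_ROCM_ARCH before building. Users building in a clean chroot without GPU access would still get the full list, so an explicit override would probably be needed as well.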

Thanks for considering this!