path: root/sglang@.service
Age | Commit message | Author
4 days | Add qwen3.6_27b_text_nvfp4 config + HF-cache fixup mechanism | Will Handley
The text-only NVFP4 sibling of Qwen3.6-27B-VL ships without preprocessor_config.json, which sglang's loader still expects. Add a generic, idempotent cache-fixup helper invoked via ExecStartPre that injects missing files from /usr/share/sglang/cache-fixups/ into the matching HF Hub snapshot dirs, and ship the preprocessor_config.json needed by Qwen3.6-27B-Text-NVFP4-MTP as the first consumer. Also bumps pkgver to r12460.b6b9145c9.
2026-05-07 | sglang-git: bump to r12421.8ee8a8f92, force gcc-15 host compiler in unit | Will Handley
Adds Environment=NVCC_PREPEND_FLAGS=--compiler-bindir=/usr/bin/gcc-15 to sglang@.service so flashinfer / sgl_kernel JIT extensions stop failing to build under Arch's gcc 16 default (nvcc 13.x can't parse libstdc++ 16). The setting lives in the unit (not in the backup= conf) so pacman can update it freely without requiring a .pacnew merge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-14 | Bake --sleep-on-idle into service file | Will Handley
Prevents the scheduler's busy-wait from burning a CPU core at idle. Also updates gemma configs to use gemma4 parser and bumps pkgver.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 | Template service, per-model configs, sleep-on-idle default | Will Handley
- Replace sglang.service with sglang@.service template unit
- Add per-model config files for Gemma 4 and Qwen 3.5 variants
- Default to --sleep-on-idle to reduce CPU usage when idle
- Update sglang.conf as global config with SGLANG_OPTS/SGLANG_ARGS split
- Point source to JustinTong0323/sglang new-model-gg branch
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
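
The qwen3.6_27b_text_nvfp4 entry describes a generic, idempotent cache-fixup helper that injects missing files into matching HF Hub snapshot dirs. The helper itself is not shown in this log, so the following is only a minimal sketch of how such a step might work. The function name `apply_cache_fixups` and the per-repo layout of the fixups directory (one subdirectory per model, named like the Hub cache directory, e.g. `models--org--name`) are assumptions made for illustration, not the packaged implementation.

```python
import shutil
from pathlib import Path


def apply_cache_fixups(fixups_root: Path, hub_cache: Path) -> list[Path]:
    """Copy files missing from HF Hub snapshot dirs; return the paths created.

    Assumed (hypothetical) layout: fixups_root/<repo-dir>/<file>, where
    <repo-dir> follows the Hub cache naming scheme, e.g. models--org--name.
    """
    created: list[Path] = []
    if not fixups_root.is_dir():
        return created
    for repo_fixups in sorted(p for p in fixups_root.iterdir() if p.is_dir()):
        snapshots = hub_cache / repo_fixups.name / "snapshots"
        if not snapshots.is_dir():
            continue  # model not downloaded yet: nothing to fix
        for snapshot in sorted(p for p in snapshots.iterdir() if p.is_dir()):
            for src in sorted(p for p in repo_fixups.iterdir() if p.is_file()):
                dest = snapshot / src.name
                if dest.exists() or dest.is_symlink():
                    continue  # never overwrite: this keeps reruns harmless
                shutil.copyfile(src, dest)
                created.append(dest)
    return created
```

Because existing files are never overwritten, running the function a second time creates nothing, which is the property that would make it safe to call from ExecStartPre on every service start.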