Package Details: python-transformers 5.7.0-1

Git Clone URL: https://aur.archlinux.org/python-transformers.git (read-only)
Package Base: python-transformers
Description: State-of-the-art pretrained models for inference and training
Upstream URL: https://github.com/huggingface/transformers
Keywords: huggingface transformers
Licenses: Apache-2.0
Submitter: filipg
Maintainer: daskol
Last Packager: daskol
Votes: 18
Popularity: 1.21
First Submitted: 2021-10-23 09:30 (UTC)
Last Updated: 2026-04-28 19:28 (UTC)

Sources (1)

Latest Comments


davispuh commented on 2026-05-06 19:03 (UTC)

Thanks @lalala_233, that works. But it's crazy that a patch-level bump (0.23.0 => 0.23.1) is this incompatible...

TheGreatAndyChow commented on 2026-05-02 14:10 (UTC)

This tokenizers/transformers issue has been going on for years now. This package was built 2026-04-28 19:28 (UTC) and python-tokenizers at 2026-04-28 21:24 (UTC), so there was roughly a two-hour window in which the install could work, and anyone updating after that has been unable to update for four days now. I get that everyone is busy and this is volunteer work, and I appreciate all of it, but there should be some coordination between these packages. It's a known issue that pops up every few months, and that's not normal.

lalala_233 commented on 2026-05-02 07:05 (UTC) (edited on 2026-05-02 07:06 (UTC) by lalala_233)

I've hit the maximum character limit, so please read these two comments together.

Edit PKGBUILD

source=(
  "python-transformers-$pkgver.tar.gz"::"https://github.com/huggingface/transformers/archive/refs/tags/v$pkgver.tar.gz"
  "remove-tokenizers-upper-bound.patch"
  "rename-arguments.patch"
)
sha256sums=('39c29ea1a0533c8667106cb005064c64ab2fcd95fb91ccb95922a032da1de395' 'SKIP' 'SKIP')

Add these to rename-arguments.patch

diff --git a/src/transformers/convert_slow_tokenizer.py b/src/transformers/convert_slow_tokenizer.py
index 1d96d1c..43934ca 100644
--- a/src/transformers/convert_slow_tokenizer.py
+++ b/src/transformers/convert_slow_tokenizer.py
@@ -482,7 +482,7 @@ class HerbertConverter(Converter):
         tokenizer.decoder = decoders.BPEDecoder(suffix=token_suffix)
         tokenizer.post_processor = processors.BertProcessing(
             sep=(self.original_tokenizer.sep_token, self.original_tokenizer.sep_token_id),
-            cls=(self.original_tokenizer.cls_token, self.original_tokenizer.cls_token_id),
+            cls_token=(self.original_tokenizer.cls_token, self.original_tokenizer.cls_token_id),
         )

         return tokenizer
@@ -553,7 +553,7 @@ class RobertaConverter(Converter):
         tokenizer.decoder = decoders.ByteLevel()
         tokenizer.post_processor = processors.RobertaProcessing(
             sep=(ot.sep_token, ot.sep_token_id),
-            cls=(ot.cls_token, ot.cls_token_id),
+            cls_token=(ot.cls_token, ot.cls_token_id),
             add_prefix_space=ot.add_prefix_space,
             trim_offsets=True,  # True by default on Roberta (historical)
         )
@@ -1455,7 +1455,7 @@ class CLIPConverter(Converter):
         # Hack to have a ByteLevel and TemplateProcessor
         tokenizer.post_processor = processors.RobertaProcessing(
             sep=(self.original_tokenizer.eos_token, self.original_tokenizer.eos_token_id),
-            cls=(self.original_tokenizer.bos_token, self.original_tokenizer.bos_token_id),
+            cls_token=(self.original_tokenizer.bos_token, self.original_tokenizer.bos_token_id),
             add_prefix_space=False,
             trim_offsets=False,
         )
diff --git a/src/transformers/models/clip/tokenization_clip.py b/src/transformers/models/clip/tokenization_clip.py
index 018c630..739bc22 100644
--- a/src/transformers/models/clip/tokenization_clip.py
+++ b/src/transformers/models/clip/tokenization_clip.py
@@ -116,7 +116,7 @@ class CLIPTokenizer(TokenizersBackend):

         self._tokenizer.post_processor = processors.RobertaProcessing(
             sep=(str(eos_token), self.eos_token_id),
-            cls=(str(bos_token), self.bos_token_id),
+            cls_token=(str(bos_token), self.bos_token_id),
             add_prefix_space=False,
             trim_offsets=False,
         )
diff --git a/src/transformers/models/herbert/tokenization_herbert.py b/src/transformers/models/herbert/tokenization_herbert.py
index eb05431..2e5bfa2 100644
--- a/src/transformers/models/herbert/tokenization_herbert.py
+++ b/src/transformers/models/herbert/tokenization_herbert.py
@@ -104,7 +104,7 @@ class HerbertTokenizer(TokenizersBackend):

         self._tokenizer.post_processor = processors.BertProcessing(
             sep=(self.sep_token, 2),
-            cls=(self.cls_token, 0),
+            cls_token=(self.cls_token, 0),
         )


diff --git a/src/transformers/models/layoutlmv3/tokenization_layoutlmv3.py b/src/transformers/models/layoutlmv3/tokenization_layoutlmv3.py
index cda7c0b..652db6c 100644
--- a/src/transformers/models/layoutlmv3/tokenization_layoutlmv3.py
+++ b/src/transformers/models/layoutlmv3/tokenization_layoutlmv3.py
@@ -227,7 +227,7 @@ class LayoutLMv3Tokenizer(TokenizersBackend):

         self._tokenizer.post_processor = processors.RobertaProcessing(
             sep=(sep, sep_token_id),
-            cls=(cls, cls_token_id),
+            cls_token=(cls, cls_token_id),
             add_prefix_space=add_prefix_space,
             trim_offsets=True,
         )
diff --git a/src/transformers/models/roberta/tokenization_roberta.py b/src/transformers/models/roberta/tokenization_roberta.py
index 40b4e78..ccd699f 100644
--- a/src/transformers/models/roberta/tokenization_roberta.py
+++ b/src/transformers/models/roberta/tokenization_roberta.py
@@ -169,7 +169,7 @@ class RobertaTokenizer(TokenizersBackend):
         )
         self._tokenizer.post_processor = processors.RobertaProcessing(
             sep=(str(sep_token), self.sep_token_id),
-            cls=(str(cls_token), self.cls_token_id),
+            cls_token=(str(cls_token), self.cls_token_id),
             add_prefix_space=add_prefix_space,
             trim_offsets=trim_offsets,
         )

I'm still waiting for their update. They said they will support tokenizers 0.23.1 in the next release.

lalala_233 commented on 2026-05-01 12:20 (UTC) (edited on 2026-05-02 07:06 (UTC) by lalala_233)

Add these lines to the PKGBUILD

prepare() {
  cd "transformers-$pkgver"
  patch -Np1 -i "${srcdir}/remove-tokenizers-upper-bound.patch"
  patch -Np1 -i "${srcdir}/rename-arguments.patch"
}

and these to remove-tokenizers-upper-bound.patch

diff --git a/src/transformers/dependency_versions_table.py b/src/transformers/dependency_versions_table.py
index 1a721ca..f4ca49d 100644
--- a/src/transformers/dependency_versions_table.py
+++ b/src/transformers/dependency_versions_table.py
@@ -74,7 +74,7 @@ deps = {
     "tomli": "tomli",
     "tiktoken": "tiktoken",
     "timm": "timm>=1.0.23",
-    "tokenizers": "tokenizers>=0.22.0,<=0.23.0",
+    "tokenizers": "tokenizers>=0.22.0",
     "torch": "torch>=2.4",
     "torchaudio": "torchaudio",
     "torchvision": "torchvision",

It just removes the check. I'm not sure if 0.23.1 is compatible with transformers.

shayaknyc commented on 2026-04-29 14:19 (UTC)

Came here to say the same thing as @malium. Not sure how to fix this...?

malium commented on 2026-04-29 08:40 (UTC) (edited on 2026-04-29 08:54 (UTC) by malium)

It seems that python-tokenizers got bumped to 0.23.1, but python-transformers is not compatible with that version:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import transformers
  File "/home/user/.cache/yay/python-transformers/src/transformers-5.7.0/src/transformers/__init__.py", line 30, in <module>
    from . import dependency_versions_check
  File "/home/user/.cache/yay/python-transformers/src/transformers-5.7.0/src/transformers/dependency_versions_check.py", line 56, in <module>
    require_version_core(deps[pkg])
    ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/home/user/.cache/yay/python-transformers/src/transformers-5.7.0/src/transformers/utils/versions.py", line 116, in require_version_core
    return require_version(requirement, hint)
  File "/home/user/.cache/yay/python-transformers/src/transformers-5.7.0/src/transformers/utils/versions.py", line 110, in require_version
    _compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/.cache/yay/python-transformers/src/transformers-5.7.0/src/transformers/utils/versions.py", line 43, in _compare_versions
    raise ImportError(
        f"{requirement} is required for a normal functioning of this module, but found {pkg}=={got_ver}.{hint}"
    )
ImportError: tokenizers>=0.22.0,<=0.23.0 is required for a normal functioning of this module, but found tokenizers==0.23.1.
Try: `pip install transformers -U` or `pip install -e '.[dev]'` if you're working with git main

daskol commented on 2026-02-05 21:18 (UTC)

@shayaknyc I have bumped the pkgrel of python-safetensors to trigger a rebuild (or at least to signal that the package should be rebuilt).

shayaknyc commented on 2026-02-02 21:35 (UTC)

@lightdot - yes, you are correct. I manually had to recompile python-safetensors, and then this just worked. Thank you!

lightdot commented on 2026-02-02 21:33 (UTC) (edited on 2026-02-02 21:34 (UTC) by lightdot)

@shayaknyc, I suspect the python-safetensors package in your build environment wasn't rebuilt after Python was updated to 3.14. This has to be done manually, since the package version didn't change.

shayaknyc commented on 2026-02-02 19:45 (UTC)

Anyone else getting these build errors?

...
...
adding 'transformers/utils/versions.py'
adding 'transformers-5.0.0.dist-info/licenses/LICENSE'
adding 'transformers-5.0.0.dist-info/METADATA'
adding 'transformers-5.0.0.dist-info/WHEEL'
adding 'transformers-5.0.0.dist-info/entry_points.txt'
adding 'transformers-5.0.0.dist-info/top_level.txt'
adding 'transformers-5.0.0.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
Successfully built transformers-5.0.0-py3-none-any.whl
==> Starting check()...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import transformers
  File "/home/user/git/python-transformers/src/transformers-5.0.0/src/transformers/__init__.py", line 30, in <module>
    from . import dependency_versions_check
  File "/home/user/git/python-transformers/src/transformers-5.0.0/src/transformers/dependency_versions_check.py", line 16, in <module>
    from .utils.versions import require_version, require_version_core
  File "/home/user/git/python-transformers/src/transformers-5.0.0/src/transformers/utils/__init__.py", line 22, in <module>
    from .auto_docstring import (
    ...<10 lines>...
    )
  File "/home/user/git/python-transformers/src/transformers-5.0.0/src/transformers/utils/auto_docstring.py", line 30, in <module>
    from .generic import ModelOutput
  File "/home/user/git/python-transformers/src/transformers-5.0.0/src/transformers/utils/generic.py", line 47, in <module>
    from ..model_debugging_utils import model_addition_debugger_context
  File "/home/user/git/python-transformers/src/transformers-5.0.0/src/transformers/model_debugging_utils.py", line 29, in <module>
    from safetensors.torch import save_file
ModuleNotFoundError: No module named 'safetensors'
==> ERROR: A failure occurred in check().
    Aborting...