Package Details: python-pdf2doi 1.7-1

Git Clone URL: https://aur.archlinux.org/python-pdf2doi.git (read-only)
Package Base: python-pdf2doi
Description: A python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a pdf file.
Upstream URL: https://github.com/MicheleCotrufo/pdf2doi
Licenses: unknown
Submitter: sga013
Maintainer: sga013
Last Packager: sga013
Votes: 1
Popularity: 0.135727
First Submitted: 2024-11-17 18:25 (UTC)
Last Updated: 2024-11-17 18:25 (UTC)

Latest Comments

ra1nb0w commented on 2025-11-27 18:22 (UTC)

tar.gz: oops, my fault; I hadn't cleaned my local copy, so you are right. You can therefore use the URL https://files.pythonhosted.org/packages/source/p/pdf2doi/pdf2doi-1.7.tar.gz
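For reference, that URL follows the predictable `packages/source/<first letter>/<name>/<name>-<version>.tar.gz` layout on files.pythonhosted.org, so it does not have to be copied by hand for each release. A minimal sketch in Python; the function name `pypi_sdist_url` is made up for illustration, and it assumes the sdist file name uses the project name unchanged, which holds for pdf2doi 1.7 but not necessarily for every project:

```python
def pypi_sdist_url(name: str, version: str) -> str:
    """Build the stable files.pythonhosted.org URL for a project's sdist.

    Assumption: the sdist is named exactly "<name>-<version>.tar.gz".
    """
    return (
        "https://files.pythonhosted.org/packages/source/"
        f"{name[0]}/{name}/{name}-{version}.tar.gz"
    )


print(pypi_sdist_url("pdf2doi", "1.7"))
```

In the PKGBUILD the same pattern can be expressed with the existing `${_pkgname}` and `${pkgver}` variables instead of a hash-based path.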

The settings.ini is already available in the source but it is not included in the whl; only the tar.gz has it.
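One way to check this kind of claim locally is to list the members of both archives. The helper below is only a sketch: `archive_has_file` is a hypothetical name, and the file paths in the usage note are assumptions about what a local build produces.

```python
import tarfile
import zipfile


def archive_has_file(archive_path: str, filename: str) -> bool:
    """Return True if a wheel/zip or tar archive has a member named filename."""
    if zipfile.is_zipfile(archive_path):
        # Wheels are plain zip archives.
        with zipfile.ZipFile(archive_path) as zf:
            names = zf.namelist()
    else:
        # tarfile.open() detects gzip compression on its own.
        with tarfile.open(archive_path) as tf:
            names = tf.getnames()
    # Compare only the last path component of each member.
    return any(name.split("/")[-1] == filename for name in names)
```

For example, `archive_has_file("dist/pdf2doi-1.7-py3-none-any.whl", "settings.ini")` versus `archive_has_file("pdf2doi-1.7.tar.gz", "settings.ini")`, assuming those are the file names on disk.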

About maintainership: thank you for the offer, but I don't have the time to follow another package, and honestly I only gave it a try.

sga013 commented on 2025-11-27 17:25 (UTC) (edited on 2025-11-27 17:40 (UTC) by sga013)

Also, an additional note:

The source file should ideally stay the PyPI-hosted one, mostly because the one from GitHub releases is exactly a git clone, which includes an examples folder with roughly 50 MiB of PDFs. That is completely wasteful.

I get that the PyPI URL is weird, and one has to copy it with each release, but I kind of expect them to, well, exist better (also less dependence on GitHub).

Also, can you please separately provide the patch file? I tried to read the diff you added, and seemingly it just adds

package_data={'package':['settings.ini']},

which is fine, but I do not think this alone will generate the settings file, or will it? Looking at the file from my current installation:

[DEFAULT]
verbose = True
separator = /
save_identifier_metadata = True

And I am fairly sure this was autogenerated on first run.
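To illustrate what "autogenerated on first run" could look like, here is a minimal sketch using Python's `configparser`. This is not pdf2doi's actual code: the function name `load_or_create_settings` and the `DEFAULTS` dict are hypothetical, and the default values are simply copied from the file shown above.

```python
import configparser
import os
import tempfile

# Defaults copied from the settings.ini quoted above (illustrative only).
DEFAULTS = {
    "verbose": "True",
    "separator": "/",
    "save_identifier_metadata": "True",
}


def load_or_create_settings(path: str) -> configparser.ConfigParser:
    """Read settings from path, writing a default file first if none exists."""
    config = configparser.ConfigParser()
    if not os.path.exists(path):
        config["DEFAULT"] = DEFAULTS
        with open(path, "w") as f:
            config.write(f)
    config.read(path)
    return config


# Demo in a temporary directory so nothing is written system-wide.
with tempfile.TemporaryDirectory() as tmp:
    settings_path = os.path.join(tmp, "settings.ini")
    cfg = load_or_create_settings(settings_path)
    with open(settings_path) as f:
        generated = f.read()
    print(generated)
```

If a tool writes such a file into its own install directory, a system-wide install leaves that directory owned by root, which would be consistent with the first run needing sudo.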

sga013 commented on 2025-11-27 17:04 (UTC)

Thank you very much ra1nb0w. I will merge the suggested fixes (and will definitely credit you).

I made the PKGBUILD with pip2pkgbuild, hence it is more generic and does not handle the settings.ini file (the workaround was to just run the command with sudo once, so that it creates the file itself).

Truth is, I no longer use the package (to be specific, I no longer install pdf2doi from the AUR package and use a Python venv instead). I had thought of abandoning the package, but I know that there have not been any releases from upstream, so it is not broken (other than the settings.ini issue), hence I have no urgency to abandon it. I have an RSS feed for package updates, so if it updates, I would still update the package.

Would you be willing to take over the maintainership? There is no pressure, and you can decline if you do not want to. But it would be great if you took over, since you actually use it.

ra1nb0w commented on 2025-11-27 08:47 (UTC) (edited on 2025-11-27 08:47 (UTC) by ra1nb0w)

A diff to include settings.ini, plus some minor fixes:

diff --git a/PKGBUILD b/PKGBUILD
index 9963671..28d1868 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -1,26 +1,32 @@
 # Maintainer: sga013
-pkgname='python-pdf2doi'
-_module='pdf2doi'
-_src_folder='pdf2doi-1.7'
-pkgver='1.7'
-pkgrel=1
+
+_pkgname="pdf2doi"
+pkgname="python-${_pkgname}"
+pkgver=1.7
+_src_folder="${_pkgname}-${pkgver}"
+pkgrel=2
 pkgdesc="A  python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a pdf file."
 url="https://github.com/MicheleCotrufo/pdf2doi"
 depends=('python' 'python-google' 'python-requests' 'python-pypdf2' 'python-pdftitle' 'python-feedparser' 'python-pyperclip' 'python-easygui' 'python-pdfminer' 'python-pymupdf' 'python-pypdf')
 makedepends=('python-build' 'python-installer' 'python-wheel')
-license=('unknown')
+license=('MIT')
 arch=('any')
-source=("https://files.pythonhosted.org/packages/c5/6f/867dd20a2467f5f7a17ef10b514219fcc7e6b2ae872e1f792ca22f2fb1e1/pdf2doi-1.7.tar.gz")
-sha256sums=('54d257bce59397ef577c588c8bc8a35930ffd1e7d05e7c3c454423bf5679bf2e')
+source=("add-settings.ini.patch"
+   "${_pkgname}-${pkgver}.tar.gz::https://github.com/MicheleCotrufo/${_pkgname}/archive/refs/tags/v${pkgver}.tar.gz")
+sha256sums=('0693e51daf75dfe17ceac30d0dd8e862950bd2b04ba32b10be16bf46465b1964'
+            '54d257bce59397ef577c588c8bc8a35930ffd1e7d05e7c3c454423bf5679bf2e')
+
+prepare() {
+    patch --directory="${_src_folder}" --forward --binary --strip=0 --input="${srcdir}/add-settings.ini.patch"
+}

 build() {
     cd "${srcdir}/${_src_folder}"
-    python -m build --wheel --no-isolation
+    python -m build --wheel
 }

 package() {
-
     cd "${srcdir}/${_src_folder}"
-    python -m installer --destdir="${pkgdir}" dist/*.whl
+    python -m installer --destdir="${pkgdir}" dist/${_pkgname}*.whl
 }

diff --git a/add-settings.ini.patch b/add-settings.ini.patch
new file mode 100644
index 0000000..e748411
--- /dev/null
+++ b/add-settings.ini.patch
@@ -0,0 +1,11 @@
+--- setup.py.orig  2025-11-27 08:52:15.967095697 +0100
++++ setup.py   2025-11-27 08:52:35.781316953 +0100
+@@ -21,5 +21,7 @@
+         'console_scripts': ["pdf2doi = pdf2doi.main:main"],
+       },
+       packages=['pdf2doi'],
++      package_data={'package':['settings.ini']},
++      include_package_data=True,
+       install_requires = required_packages,
+       zip_safe = False)
+\ No newline at end of file
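
A note on the `package_data` line in the patch above: per the setuptools documentation, the keys of `package_data` are package names (with `''` meaning "all packages"), so a key of `'package'` would not match `packages=['pdf2doi']`. The snippet below is a simplified model of that lookup, not setuptools' real implementation; `patterns_for` is a hypothetical helper.

```python
from itertools import chain


def patterns_for(package_data: dict, package: str) -> list:
    """Simplified model of the lookup: patterns under '' apply to every
    package, other keys apply only to the package with that exact name."""
    return list(chain(package_data.get("", []), package_data.get(package, [])))


as_posted = {"package": ["settings.ini"]}  # key as written in the patch
by_name = {"pdf2doi": ["settings.ini"]}    # key matching packages=['pdf2doi']

print(patterns_for(as_posted, "pdf2doi"))  # no patterns match
print(patterns_for(by_name, "pdf2doi"))    # settings.ini is selected
```

Whether `include_package_data=True` on its own still picks up the file depends on the source tree (for example a MANIFEST.in entry), which is not verified here.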