-rw-r--r--  .CHANGELOG  90
-rw-r--r--  .SRCINFO    26
-rw-r--r--  PKGBUILD    53
3 files changed, 145 insertions, 24 deletions
diff --git a/.CHANGELOG b/.CHANGELOG
index d0ccdd897c22..2e544239cf9d 100644
--- a/.CHANGELOG
+++ b/.CHANGELOG
@@ -1,8 +1,96 @@
 # Changelog
 
+## v0.19.0
+
+*Jan 25, 2021*
+
+This release comes with major improvements to the text analysis
+module. It is now much more configurable, has improved results and can
+learn tags from all categories. Additionally, more languages for
+document processing have been added and it's now easier to add more.
+Please open an issue if want more languages to be included.
+
+- text analysis improvements (#263, #570)
+  - docspell can now learn from all your tag categories
+  - the detection for correspondents/concerned entities has been
+    improved by using the classifier for this, too
+  - all text analysis steps are now configurable that makes it
+    possible to adapt it better to your data and machine.
+  - The docs have been updated with some details
+    [here](https://docspell.org/docs/configure/#file-processing) and
+    [here](https://docspell.org/docs/joex/file-processing/#text-analysis).
+- more languages (#488)
+  - Adds: Spanish, Italian, Portuguese, Czech, Dutch, Danish, Finnish,
+    Norwegian, Swedish, Russian, Romanian
+  - languages have different support for text-analysis, but there is
+    some basic support for all
+  - there is extended support for English, German and French through
+    [Stanford CoreNLP](https://stanfordnlp.github.io/CoreNLP/) nlp
+    models (as before)
+- scan mailbox change (#576)
+  - The change from last version (#551) has been moved behind a flag
+    in the "scan mailbox settings". Please review your scan mailbox
+    tasks in your user settings.
+  - The scan mailbox settings form view has been organized into tabs,
+    as it grew too large for a single form.
+- nix tools package fixed (#584)
+  - If you are using docspell tools package for nix, it has now been
+    fixed in that all scripts are available. They are now all prefixed
+    by `ds-` (except the `ds` script)
+- fix deleting organization (#578)
+  - Due to the new relationship of a person to an organization,
+    deleting an organization whith references a person was not
+    possible. This is now fixed.
+- base url fix (#579)
+  - The `baseurl` setting is optional, but when specified it was
+    required to omit a trailing slash. This is now fixed in that it is
+    always rendered without the trailing slash to the client, no
+    matter what is in the config
+- tag category case sensitive search fix (#568)
+  - This was a bug introduced by the last release. When tag categories
+    can now be spelled upper- or lower-case. In 0.18.0 you had to
+    spell them lowercase, otherwise the search doesn't work.
+- adds a workaround for mails that don't specify their used charset (#591)
+
+### Breaking Changes
+
+- The joex configuration changed around text analysis. If you had some
+  custom settings there, please review these wrt the new default
+  config.
+- When using the nix package manager: the tools package renamed the
+  scripts to be better distinguishable, since they all end up in
+  `$PATH`. They are now prefixed by `ds-`.
+- The path of the consumedir script changed in the consumedir docker
+  image
+- The settings of the scan-mailbox task has been extended by another
+  flag. It controls when to apply the post-processing (moving or
+  deleting). If you were relying that all mails (even those excluded
+  by a subject filter) where moved away, you need to check your
+  scan-mailbox task settings.
+
+### REST Api Changes
+
+- the data structure for `ClassifierSettings` changed to allow
+  specfiying a blacklist or whitelist of tag categories and the
+  `enabled` flag has been removed.
+
+
+### Configuration Changes
+
+- joex
+  - the config regarding text analysis changed, there are new config
+    options, like `nlp.mode` and the `max-due-date-years` has been
+    moved inside `text-anlysis`. Please have a look at the new
+    [default config](https://docspell.org/docs/configure/#joex) if you
+    changed something there.
+  - The `regex-ner` section has changed: the `enabled` flag has been
+    removed, you can now limit the number of entries using
+    `max-entries` to apply and `0` means to disable it.
+
+
 ## v0.18.0
 
-*Jan 11, 2021 *
+*Jan 11, 2021*
 
 - Feature: Results summary and updated tag count (#496, #333)
   - A search summary can be displayed that shows the overall result
diff --git a/.SRCINFO b/.SRCINFO
--- a/.SRCINFO
+++ b/.SRCINFO
@@ -1,24 +1,25 @@
 pkgbase = docspell
 	pkgdesc = Assists in organizing your piles of documents, resulting from scanners, e-mails and other sources with miminal effort.
-	pkgver = 0.18.0
+	pkgver = 0.19.0
 	pkgrel = 1
-	url = https://github.com/eikek/docspell
+	url = https://docspell.org/
 	changelog = .CHANGELOG
 	arch = any
 	groups = docspell
 	license = GPL3
-	depends = java-runtime-headless
-	optdepends = solr: provide fulltext search
-	source = docspell-0.18.0-restserver.zip::https://github.com/eikek/docspell/releases/download/v0.18.0/docspell-restserver-0.18.0.zip
-	source = docspell-0.18.0-joex.zip::https://github.com/eikek/docspell/releases/download/v0.18.0/docspell-joex-0.18.0.zip
+	makedepends = python
+	source = docspell-0.19.0-restserver.zip::https://github.com/eikek/docspell/releases/download/v0.19.0/docspell-restserver-0.19.0.zip
+	source = docspell-0.19.0-joex.zip::https://github.com/eikek/docspell/releases/download/v0.19.0/docspell-joex-0.19.0.zip
+	source = docspell-0.19.0-tools.zip::https://github.com/eikek/docspell/releases/download/v0.19.0/docspell-tools-0.19.0.zip
 	source = docspell-joex.sh
 	source = docspell-restserver.sh
 	source = docspell-joex.service
 	source = docspell-restserver.service
 	source = docspell.sysusers
 	source = docspell.tmpfiles
-	sha512sums = 7c56b72970d85be635fe47098f917a9c1356b788c59c5abbdeb60eb065bad856b0b7066a5adcc4ccc6c37a837090a7bf558ce8ab9beca94ad04688127fceb4d1
-	sha512sums = 56172c3d0da239280b48c5b4e3356283ed9f6edcf5e70ad9ac7e9be78de203142e53ceb8abd6a9320e4694ca18d88f4875cd82c7d39bf4549d4d2323147daddf
+	sha512sums = 1fd070456dde479d160fdd6179762ad7928e10eb721824dfdf5524101cf7a926374bc3f1794d53dc764c172c7b3137f27a41b49050634b1284077eccec2634cc
+	sha512sums = 1f91bdb47c3ea154423ee4e7096e4975b7c79509f26eaa6a3315130dbc3747af532a5741a27f4c59519b5e9664c18b65282ed51fcc8c205476ffc67eecbac295
+	sha512sums = 115cbbf8bfc2ef234fba7b98381dee04354ff6bc50302b285eb16ef51497f4a695aeed790d78c401c63baad06c2a35910d969b9b35a0c76879d77bd859533a62
 	sha512sums = 6ab8b24eb76f02b68e4fa4194b8771ef4f57c8375b34bf7bf914563528e347ea127beb5547e432910911d4fd15982cccdd1df50aeb76058129b909824ce49093
 	sha512sums = 0b8b08f47f1cb46a3bfc16df4b0574cebfb4a851562d134fcba3c4bf80fb011443499a549c3a04480456c048346d09f36fbcbc9d792810001c9c8b370d3926a8
 	sha512sums = f63f0fa58715b7da01aa265a7bec72eb24f0e98c354eed479b6034bc33b2ccdaef87db8a7630af1d5a6ac43fadf11a0f0a3fb3de5e183aa64d838a69b67125f9
@@ -28,17 +29,22 @@ pkgbase = docspell
 
 pkgname = docspell-joex
 	pkgdesc = Assists in organizing your piles of documents, resulting from scanners, e-mails and other sources with miminal effort. (Job executer)
-	depends = java-runtime-headless
 	depends = ghostscript
+	depends = java-runtime-headless
 	depends = tesseract
 	depends = unoconv
 	depends = wkhtmltopdf
-	optdepends = solr: provide fulltext search
 	optdepends = ocrmypdf: adds an OCR layer to scanned PDF files to make them searchable
 	optdepends = unpaper: pre-processes images to yield better results when doing ocr
 	backup = etc/docspell/joex.conf
 
 pkgname = docspell-restserver
 	pkgdesc = Assists in organizing your piles of documents, resulting from scanners, e-mails and other sources with miminal effort. (Server)
+	depends = java-runtime-headless
+	optdepends = solr: provide fulltext search
 	backup = etc/docspell/restserver.conf
 
+pkgname = docspell-tools
+	pkgdesc = Collection of tools to interact with Docspell
+	depends = python
+
diff --git a/PKGBUILD b/PKGBUILD
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -2,27 +2,27 @@
 # shellcheck disable=SC2034,2154,2148
 
 pkgbase=docspell
-pkgname=('docspell-joex' 'docspell-restserver')
-pkgver=0.18.0
+pkgname=('docspell-joex' 'docspell-restserver' 'docspell-tools')
+pkgver=0.19.0
 pkgrel=1
 changelog=.CHANGELOG
 arch=('any')
-url="https://github.com/eikek/docspell"
+url="https://docspell.org/"
 pkgdesc="Assists in organizing your piles of documents, resulting from scanners, e-mails and other sources with miminal effort."
 license=('GPL3')
 groups=('docspell')
-depends=('java-runtime-headless')
-optdepends=('solr: provide fulltext search')
 source=("$pkgbase-$pkgver-restserver.zip::https://github.com/eikek/$pkgbase/releases/download/v$pkgver/$pkgbase-restserver-$pkgver.zip"
         "$pkgbase-$pkgver-joex.zip::https://github.com/eikek/$pkgbase/releases/download/v$pkgver/$pkgbase-joex-$pkgver.zip"
+        "$pkgbase-$pkgver-tools.zip::https://github.com/eikek/$pkgbase/releases/download/v$pkgver/$pkgbase-tools-$pkgver.zip"
         "${pkgname[0]}.sh"
         "${pkgname[1]}.sh"
         "${pkgname[0]}.service"
         "${pkgname[1]}.service"
         "$pkgbase.sysusers"
         "$pkgbase.tmpfiles")
-sha512sums=('7c56b72970d85be635fe47098f917a9c1356b788c59c5abbdeb60eb065bad856b0b7066a5adcc4ccc6c37a837090a7bf558ce8ab9beca94ad04688127fceb4d1'
-            '56172c3d0da239280b48c5b4e3356283ed9f6edcf5e70ad9ac7e9be78de203142e53ceb8abd6a9320e4694ca18d88f4875cd82c7d39bf4549d4d2323147daddf'
+sha512sums=('1fd070456dde479d160fdd6179762ad7928e10eb721824dfdf5524101cf7a926374bc3f1794d53dc764c172c7b3137f27a41b49050634b1284077eccec2634cc'
+            '1f91bdb47c3ea154423ee4e7096e4975b7c79509f26eaa6a3315130dbc3747af532a5741a27f4c59519b5e9664c18b65282ed51fcc8c205476ffc67eecbac295'
+            '115cbbf8bfc2ef234fba7b98381dee04354ff6bc50302b285eb16ef51497f4a695aeed790d78c401c63baad06c2a35910d969b9b35a0c76879d77bd859533a62'
             '6ab8b24eb76f02b68e4fa4194b8771ef4f57c8375b34bf7bf914563528e347ea127beb5547e432910911d4fd15982cccdd1df50aeb76058129b909824ce49093'
             '0b8b08f47f1cb46a3bfc16df4b0574cebfb4a851562d134fcba3c4bf80fb011443499a549c3a04480456c048346d09f36fbcbc9d792810001c9c8b370d3926a8'
             'f63f0fa58715b7da01aa265a7bec72eb24f0e98c354eed479b6034bc33b2ccdaef87db8a7630af1d5a6ac43fadf11a0f0a3fb3de5e183aa64d838a69b67125f9'
@@ -32,9 +32,12 @@ sha512sums=('7c56b72970d85be635fe47098f917a9c1356b788c59c5abbdeb60eb065bad856b0b
 
 prepare() {
   # shellcheck disable=2016
-  sed -i 's@url = "jdbc:h2:\/\/"\${java\.io\.tmpdir}"@url = "jdbc:h2:///var/lib/docspell@' \
+  sed -i -e 's@url = "jdbc:h2:\/\/"\${java\.io\.tmpdir}"@url = "jdbc:h2:///var/lib/docspell@' \
     "${pkgname[0]}-$pkgver/conf/${pkgname[0]}.conf" \
     "${pkgname[1]}-$pkgver/conf/$pkgbase-server.conf"
+
+  sed -i -e 's@/usr/local/share/docspell/native.py@/usr/share/docspell-tools/native.py@' \
+    "${pkgname[2]}-$pkgver/firefox/native/app_manifest.json"
 }
 
 # You do not need to compile Java applications from source.
@@ -47,9 +50,9 @@ prepare() {
 
 package_docspell-joex() {
   pkgdesc+=" (Job executer)"
-  depends+=('ghostscript' 'tesseract' 'unoconv' 'wkhtmltopdf')
-  optdepends+=('ocrmypdf: adds an OCR layer to scanned PDF files to make them searchable'
-               'unpaper: pre-processes images to yield better results when doing ocr')
+  depends=('ghostscript' 'java-runtime-headless' 'tesseract' 'unoconv' 'wkhtmltopdf')
+  optdepends=('ocrmypdf: adds an OCR layer to scanned PDF files to make them searchable'
+              'unpaper: pre-processes images to yield better results when doing ocr')
   backup=("etc/docspell/joex.conf")
 
   install -Dm 755 "${pkgname[0]}.sh" "$pkgdir/usr/bin/${pkgname[0]}"
@@ -67,7 +70,7 @@ package_docspell-joex() {
   # make directories
   mkdir -p "$pkgdir/usr/share/java/${pkgname[0]}"
 
-  # copy documentary
+  # copy java libs
   cp -dpr --no-preserve=ownership \
     `# SRCFILES:` \
     "lib/." \
@@ -77,6 +80,8 @@ package_docspell-joex() {
 
 package_docspell-restserver() {
   pkgdesc+=" (Server)"
+  depends=('java-runtime-headless')
+  optdepends=('solr: provide fulltext search')
   backup=("etc/docspell/restserver.conf")
 
   install -Dm 755 "${pkgname[1]}.sh" "$pkgdir/usr/bin/${pkgname[1]}"
@@ -94,10 +99,32 @@ package_docspell-restserver() {
   # make directories
   mkdir -p "$pkgdir/usr/share/java/${pkgname[1]}"
 
-  # copy documentary
+  # copy java libs
   cp -dpr --no-preserve=ownership \
     `# SRCFILES:` \
     "lib/." \
     `# DSTDIR:` \
     "$pkgdir/usr/share/java/${pkgname[1]}/"
 }
+
+makedepends+=('python')
+package_docspell-tools() {
+  pkgdesc="Collection of tools to interact with Docspell"
+  depends=('python')
+
+  cd "${pkgname[2]}-$pkgver" || return
+
+  # Firefox extension and native messaging host
+  mkdir -p "$pkgdir/usr/share/${pkgname[2]}"
+  mkdir -p "$pkgdir/usr/lib/mozilla/native-messaging-hosts"
+  install -Dm 644 "firefox/$pkgbase-extension.xpi" "$pkgdir/usr/lib/firefox/browser/extensions/docspell@eikek.github.io.xpi"
+  install -Dm 755 "firefox/native/native.py" "$pkgdir/usr/share/${pkgname[2]}/firefox/native/native.py"
+  ln -s "/usr/share/${pkgname[2]}/firefox/native/app_manifest.json" "$pkgdir/usr/lib/mozilla/native-messaging-hosts/$pkgbase.json"
+
+  # https://wiki.archlinux.org/index.php/Python_package_guidelines#Reproducible_bytecode
+  export PYTHONHASHSEED=0
+  python -O -m compileall "$pkgdir/usr/share/${pkgname[2]}/firefox/native/native.py"
+
+  # Scripts
+  find . -type f -name "*.sh" -exec sh -c 'install -Dm 755 "$3" "$1/usr/bin/$2-$(basename $3)"' _ "$pkgdir" "$pkgbase" {} \;
+}
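The `find ... -exec sh -c` line at the end of `package_docspell-tools()` installs every `*.sh` file under a name prefixed with `$pkgbase`. A minimal sketch of that pattern follows, run against a throwaway directory. The temporary directories and the script names `ds.sh` and `consumedir.sh` are assumptions for illustration and are not taken from the tools archive. `basename "$3"` is quoted here so that paths containing spaces survive, which the line in the diff does not do.

```shell
#!/bin/sh
# Sketch: install every *.sh below a source tree as <prefix>-<name> in a staging dir.
# All paths and script names are hypothetical stand-ins for $srcdir and $pkgdir.
set -eu

demo_root=$(mktemp -d)
src="$demo_root/src"
pkgdir="$demo_root/pkg"
pkgbase=docspell

# Two dummy scripts in separate subdirectories.
mkdir -p "$src/ds" "$src/consumedir"
printf '#!/bin/sh\necho ds\n' > "$src/ds/ds.sh"
printf '#!/bin/sh\necho consumedir\n' > "$src/consumedir/consumedir.sh"

# Inside sh -c: $0 is the placeholder "_", $1 the staging dir,
# $2 the prefix, and $3 the path that find substitutes for {}.
(
  cd "$src"
  find . -type f -name "*.sh" -exec sh -c \
    'install -Dm 755 "$3" "$1/usr/bin/$2-$(basename "$3")"' _ "$pkgdir" "$pkgbase" {} \;
)

ls "$pkgdir/usr/bin"
```

With this pattern the installed names keep their `.sh` suffix and take the `docspell-` prefix, so a file named `ds.sh` ends up as `docspell-ds.sh`. That differs from the `ds-` prefix the changelog describes for the nix package.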
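The second `sed` call added to `prepare()` rewrites the path of the native messaging host inside `app_manifest.json`. The sketch below shows the substitution mechanics on a stand-in file. The manifest content is an assumption, since the real file is not part of this diff; only the two path strings come from the PKGBUILD.

```shell
#!/bin/sh
# Sketch: rewrite an absolute path inside a JSON manifest with sed.
# The manifest body below is hypothetical; only the "path" value matters.
set -eu

work=$(mktemp -d)
manifest="$work/app_manifest.json"

cat > "$manifest" <<'EOF'
{
  "name": "docspell",
  "path": "/usr/local/share/docspell/native.py",
  "type": "stdio"
}
EOF

# Using '@' as the s-command delimiter avoids escaping every '/' in the paths.
sed -i -e 's@/usr/local/share/docspell/native.py@/usr/share/docspell-tools/native.py@' \
  "$manifest"

grep '"path"' "$manifest"
```

Note that the diff installs `native.py` under `/usr/share/docspell-tools/firefox/native/`, so the path written into the manifest may need to match wherever the file finally lives. The sketch covers only the text substitution.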