aboutsummarylogtreecommitdiffstats
path: root/zstd-udpate-fixes.patch
diff options
context:
space:
mode:
Diffstat (limited to 'zstd-udpate-fixes.patch')
-rw-r--r--zstd-udpate-fixes.patch241
1 files changed, 0 insertions, 241 deletions
diff --git a/zstd-udpate-fixes.patch b/zstd-udpate-fixes.patch
deleted file mode 100644
index 70310ffd8409..000000000000
--- a/zstd-udpate-fixes.patch
+++ /dev/null
@@ -1,241 +0,0 @@
-From 9a8cd176f6ee90a3605d5dab75a4ab778e73f6f0 Mon Sep 17 00:00:00 2001
-From: Scott B <arglebargle@arglebargle.dev>
-Date: Sat, 20 Nov 2021 12:14:56 -0800
-Subject: [PATCH] zstd udpate fixes
-
-Squashed commit of the following:
-
-commit 82290fbdaffe7519c8451d5a8fe20af6faddd665
-Author: Nick Terrell <terrelln@fb.com>
-Date: Tue Nov 16 15:11:39 2021 -0800
-
- lib: zstd: Don't add -O3 to cflags
-
- After the update to zstd-1.4.10 passing -O3 is no longer necessary to
- get good performance from zstd. Using the default optimization level -O2
- is sufficient to get good performance.
-
- I've measured no significant change to compression speed, and a ~1%
- decompression speed loss, which is acceptable.
-
- This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
- gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
- stacks that are ~3KB. With -O2 these same functions generate stacks in
- the < 100B, completely fixing the problem. Function size deltas are
- listed below:
-
- ZSTD_compressBlock_fast_extDict_generic: 3800 -> 68
- ZSTD_compressBlock_fast: 2216 -> 40
- ZSTD_compressBlock_fast_dictMatchState: 1848 -> 64
- ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -> 76
- ZSTD_fillDoubleHashTable: 3252 -> 0
- ZSTD_compressBlock_doubleFast: 5856 -> 36
- ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -> 84
- ZSTD_copmressBlock_lazy2: 2420 -> 72
-
- Additionally, this improves the reported code bloat [1]. With gcc-11
- bloat-o-meter shows an 80KB code size improvement:
-
- ```
- > ../scripts/bloat-o-meter vmlinux.old vmlinux
- add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
- Total: Before=6418562, After=6336372, chg -1.28%
- ```
-
- Compared to before the zstd-1.4.10 update we see a total code size
- regression of 105KB, down from 374KB at v5.16-rc1:
-
- ```
- > ../scripts/bloat-o-meter vmlinux.old vmlinux
- add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
- Total: Before=6228850, After=6336372, chg +1.73%
- ```
-
- [0] https://lkml.org/lkml/2021/11/15/710
- [1] https://lkml.org/lkml/2021/11/14/189
-
- Link: https://lore.kernel.org/r/20211117014949.1169186-4-nickrterrell@gmail.com/
- Link: https://lore.kernel.org/r/20211117201459.1194876-4-nickrterrell@gmail.com/
-
- Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
- Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
- Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
- Signed-off-by: Nick Terrell <terrelln@fb.com>
-
-commit e764e672bf43ad4a3a3a85e79b267daaad406990
-Author: Nick Terrell <terrelln@fb.com>
-Date: Mon Nov 15 20:33:08 2021 -0800
-
- lib: zstd: Don't inline functions in zstd_opt.c
-
- `zstd_opt.c` contains the match finder for the highest compression
- levels. These levels are already very slow, and are unlikely to be used
- in the kernel. If they are used, they shouldn't be used in latency
- sensitive workloads, so slowing them down shouldn't be a big deal.
-
- This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
- I've also opened an issue upstream [1] so that we can properly tackle
- the code size issue in `zstd_opt.c` for all users, and can hopefully
- remove this hack in the next zstd version we import.
-
- Bloat-o-meter output on x86-64:
-
- ```
- > ../scripts/bloat-o-meter vmlinux.old vmlinux
- add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
- Function old new delta
- ZSTD_compressBlock_opt_generic.constprop - 7559 +7559
- ZSTD_insertBtAndGetAllMatches - 6304 +6304
- ZSTD_insertBt1 - 1731 +1731
- ZSTD_storeSeq - 693 +693
- ZSTD_BtGetAllMatches - 255 +255
- ZSTD_updateRep - 128 +128
- ZSTD_updateTree 96 99 +3
- ZSTD_insertAndFindFirstIndexHash3 81 - -81
- ZSTD_setBasePrices.constprop 98 - -98
- ZSTD_litLengthPrice.constprop 138 - -138
- ZSTD_count 362 181 -181
- ZSTD_count_2segments 1407 938 -469
- ZSTD_insertBt1.constprop 2689 - -2689
- ZSTD_compressBlock_btultra2 19990 423 -19567
- ZSTD_compressBlock_btultra 19633 15 -19618
- ZSTD_initStats_ultra 19825 - -19825
- ZSTD_compressBlock_btopt 20374 12 -20362
- ZSTD_compressBlock_btopt_extDict 29984 12 -29972
- ZSTD_compressBlock_btultra_extDict 30718 15 -30703
- ZSTD_compressBlock_btopt_dictMatchState 32689 12 -32677
- ZSTD_compressBlock_btultra_dictMatchState 33574 15 -33559
- Total: Before=6611828, After=6418562, chg -2.92%
- ```
-
- [0] https://lkml.org/lkml/2021/11/14/189
- [1] https://github.com/facebook/zstd/issues/2862
-
- Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
- Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/
-
- Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
- Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
- Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
- Signed-off-by: Nick Terrell <terrelln@fb.com>
-
-commit e5fa8f8334f188f496e02c0c5dcb1f9234fec2fb
-Author: Nick Terrell <terrelln@fb.com>
-Date: Mon Nov 15 19:08:19 2021 -0800
-
- lib: zstd: Fix unused variable warning
-
- The variable `litLengthSum` is only used by an `assert()`, so when
- asserts are disabled the compiler doesn't see any usage and warns.
-
- This issue is already fixed upstream by PR #2838 [0]. It was reported
- by the Kernel test robot in [1].
-
- Another approach would be to change zstd's disabled `assert()`
- definition to use the argument in a disabled branch, instead of
- ignoring the argument. I've avoided this approach because there are
- some small changes necessary to get zstd to build, and I would
- want to thoroughly re-test for performance, since that is slightly
- changing the code in every function in zstd. It seems like a
- trivial change, but some functions are pretty sensitive to small
- changes. However, I think it is a valid approach that I would
- like to see upstream take, so I've opened Issue #2868 to attempt
- this upstream.
-
- Lastly, I've chosen not to use __maybe_unused because all code
- in lib/zstd/ must eventually be upstreamed. Upstream zstd can't
- use __maybe_unused because it isn't portable across all compilers.
-
- [0] https://github.com/facebook/zstd/pull/2838
- [1] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
- [2] https://github.com/facebook/zstd/issues/2868
-
- Link: https://lore.kernel.org/r/20211117014949.1169186-2-nickrterrell@gmail.com/
- Link: https://lore.kernel.org/r/20211117201459.1194876-2-nickrterrell@gmail.com/
-
- Reported-by: kernel test robot <lkp@intel.com>
- Signed-off-by: Nick Terrell <terrelln@fb.com>
----
- lib/zstd/Makefile | 2 --
- lib/zstd/common/compiler.h | 7 +++++++
- lib/zstd/compress/zstd_compress_superblock.c | 2 ++
- lib/zstd/compress/zstd_opt.c | 12 ++++++++++++
- 4 files changed, 21 insertions(+), 2 deletions(-)
-
-diff --git a/lib/zstd/Makefile b/lib/zstd/Makefile
-index 65218ec5b8f2..fc45339fc3a3 100644
---- a/lib/zstd/Makefile
-+++ b/lib/zstd/Makefile
-@@ -11,8 +11,6 @@
- obj-$(CONFIG_ZSTD_COMPRESS) += zstd_compress.o
- obj-$(CONFIG_ZSTD_DECOMPRESS) += zstd_decompress.o
-
--ccflags-y += -O3
--
- zstd_compress-y := \
- zstd_compress_module.o \
- common/debug.o \
-diff --git a/lib/zstd/common/compiler.h b/lib/zstd/common/compiler.h
-index a1a051e4bce6..f5a9c70a228a 100644
---- a/lib/zstd/common/compiler.h
-+++ b/lib/zstd/common/compiler.h
-@@ -16,6 +16,7 @@
- *********************************************************/
- /* force inlining */
-
-+#if !defined(ZSTD_NO_INLINE)
- #if (defined(__GNUC__) && !defined(__STRICT_ANSI__)) || defined(__cplusplus) || defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L /* C99 */
- # define INLINE_KEYWORD inline
- #else
-@@ -24,6 +25,12 @@
-
- #define FORCE_INLINE_ATTR __attribute__((always_inline))
-
-+#else
-+
-+#define INLINE_KEYWORD
-+#define FORCE_INLINE_ATTR
-+
-+#endif
-
- /*
- On MSVC qsort requires that functions passed into it use the __cdecl calling conversion(CC).
-diff --git a/lib/zstd/compress/zstd_compress_superblock.c b/lib/zstd/compress/zstd_compress_superblock.c
-index ee03e0aedb03..b0610b255653 100644
---- a/lib/zstd/compress/zstd_compress_superblock.c
-+++ b/lib/zstd/compress/zstd_compress_superblock.c
-@@ -411,6 +411,8 @@ static size_t ZSTD_seqDecompressedSize(seqStore_t const* seqStore, const seqDef*
- const seqDef* sp = sstart;
- size_t matchLengthSum = 0;
- size_t litLengthSum = 0;
-+ /* Only used by assert(), suppress unused variable warnings in production. */
-+ (void)litLengthSum;
- while (send-sp > 0) {
- ZSTD_sequenceLength const seqLen = ZSTD_getSequenceLength(seqStore, sp);
- litLengthSum += seqLen.litLength;
-diff --git a/lib/zstd/compress/zstd_opt.c b/lib/zstd/compress/zstd_opt.c
-index 70adc7c2cc5e..09483f518dc3 100644
---- a/lib/zstd/compress/zstd_opt.c
-+++ b/lib/zstd/compress/zstd_opt.c
-@@ -8,6 +8,18 @@
- * You may select, at your option, one of the above-listed licenses.
- */
-
-+/*
-+ * Disable inlining for the optimal parser for the kernel build.
-+ * It is unlikely to be used in the kernel, and where it is used
-+ * latency shouldn't matter because it is very slow to begin with.
-+ * We prefer a ~180KB binary size win over faster optimal parsing.
-+ *
-+ * TODO(https://github.com/facebook/zstd/issues/2862):
-+ * Improve the code size of the optimal parser in general, so we
-+ * don't need this hack for the kernel build.
-+ */
-+#define ZSTD_NO_INLINE 1
-+
- #include "zstd_compress_internal.h"
- #include "hist.h"
- #include "zstd_opt.h"
---
-2.34.0
-