path: root/0008-x86-MCE-AMD-Allow-Reserved-types-to-be-overwritten-i.patch
Diffstat (limited to '0008-x86-MCE-AMD-Allow-Reserved-types-to-be-overwritten-i.patch')
-rw-r--r-- 0008-x86-MCE-AMD-Allow-Reserved-types-to-be-overwritten-i.patch | 96
1 files changed, 96 insertions, 0 deletions
diff --git a/0008-x86-MCE-AMD-Allow-Reserved-types-to-be-overwritten-i.patch b/0008-x86-MCE-AMD-Allow-Reserved-types-to-be-overwritten-i.patch
new file mode 100644
index 000000000000..67d5d92c7092
--- /dev/null
+++ b/0008-x86-MCE-AMD-Allow-Reserved-types-to-be-overwritten-i.patch
@@ -0,0 +1,96 @@
+From 14e0c3a2f956421e0731a1b5e474b3428c8bda24 Mon Sep 17 00:00:00 2001
+From: Yazen Ghannam <yazen.ghannam@amd.com>
+Date: Thu, 21 Nov 2019 08:15:08 -0600
+Subject: [PATCH 08/20] x86/MCE/AMD: Allow Reserved types to be overwritten in
+ smca_banks[]
+
+Each logical CPU in Scalable MCA systems controls a unique set of MCA
+banks in the system. These banks are not shared between CPUs. The bank
+types and ordering will be the same across CPUs on currently available
+systems.
+
+However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
+other CPUs do not. In this case, the bank seen as Reserved on one CPU is
+assumed to be the same type as the bank seen as a known type on another
+CPU.
+
+In general, this occurs when the hardware represented by the MCA bank
+is disabled, e.g. disabled memory controllers on certain models, etc.
+The MCA bank is disabled in the hardware, so there is no possibility of
+getting an MCA/MCE from it even if it is assumed to have a known type.
+
+For example:
+
+Full system:
+  Bank  |  Type seen on CPU0  |  Type seen on CPU1
+  ------------------------------------------------
+   0    |         LS          |          LS
+   1    |         UMC         |          UMC
+   2    |         CS          |          CS
+
+System with hardware disabled:
+  Bank  |  Type seen on CPU0  |  Type seen on CPU1
+  ------------------------------------------------
+   0    |         LS          |          LS
+   1    |         UMC         |          RAZ
+   2    |         CS          |          CS
+
+For this reason, there is a single, global struct smca_banks[] that is
+initialized at boot time. This array is initialized on each CPU as it
+comes online. However, the array will not be updated if an entry already
+exists.
+
+This works as expected when the first CPU (usually CPU0) has all
+possible MCA banks enabled. But if the first CPU has a subset, then it
+will save a "Reserved" type in smca_banks[]. Successive CPUs will then
+not be able to update smca_banks[] even if they encounter a known bank
+type.
+
+This may result in unexpected behavior. Depending on the system
+configuration, a user may observe issues enumerating the MCA
+thresholding sysfs interface. The issues may be as trivial as sysfs
+entries not being available, or as severe as system hangs.
+
+For example:
+
+  Bank  |  Type seen on CPU0  |  Type seen on CPU1
+  ------------------------------------------------
+   0    |         LS          |          LS
+   1    |         RAZ         |          UMC
+   2    |         CS          |          CS
+
+Extend the smca_banks[] entry check to return if the entry is a
+non-reserved type. Otherwise, continue so that CPUs that encounter a
+known bank type can update smca_banks[].
+
+Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type")
+Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
+Signed-off-by: Borislav Petkov <bp@suse.de>
+Cc: "H. Peter Anvin" <hpa@zytor.com>
+Cc: Ingo Molnar <mingo@kernel.org>
+Cc: linux-edac <linux-edac@vger.kernel.org>
+Cc: <stable@vger.kernel.org>
+Cc: Thomas Gleixner <tglx@linutronix.de>
+Cc: Tony Luck <tony.luck@intel.com>
+Cc: x86-ml <x86@kernel.org>
+Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com
+---
+ arch/x86/kernel/cpu/mce/amd.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
+index c7ab0d38af79..259f3f4e2e5f 100644
+--- a/arch/x86/kernel/cpu/mce/amd.c
++++ b/arch/x86/kernel/cpu/mce/amd.c
+@@ -266,7 +266,7 @@ static void smca_configure(unsigned int bank, unsigned int cpu)
+ 	smca_set_misc_banks_map(bank, cpu);
+
+ 	/* Return early if this bank was already initialized. */
+-	if (smca_banks[bank].hwid)
++	if (smca_banks[bank].hwid && smca_banks[bank].hwid->hwid_mcatype != 0)
+ 		return;
+
+ 	if (rdmsr_safe(MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) {
+--
+2.24.1
+