Age | Commit message (Collapse) | Author |
|
|
|
[ Upstream commit 475c58e1a471e9b873e3e39958c64a2d278275c8 ]
Enabling -Wstringop-overflow globally exposes a warning for a common bug
in the usage of strncat():
drivers/edac/thunderx_edac.c: In function 'thunderx_ocx_com_threaded_isr':
drivers/edac/thunderx_edac.c:1136:17: error: 'strncat' specified bound 1024 equals destination size [-Werror=stringop-overflow=]
1136 | strncat(msg, other, OCX_MESSAGE_SIZE);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
1145 | strncat(msg, other, OCX_MESSAGE_SIZE);
...
1150 | strncat(msg, other, OCX_MESSAGE_SIZE);
...
Apparently the author of this driver expected strncat() to behave the
way that strlcat() does, which uses the size of the destination buffer
as its third argument rather than the length of the source buffer. The
result is that there is no check on the size of the allocated buffer.
Change it to strlcat().
[ bp: Trim compiler output, fixup commit message. ]
Fixes: 41003396f932 ("EDAC, thunderx: Add Cavium ThunderX EDAC driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20231122222007.3199885-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
[ Upstream commit 71b1e3ba3fed5a34c5fac6d3a15c2634b04c1eb7 ]
The current DRAM row address mapping arrays skx_{open,close}_row[]
only support ranks with sizes up to 16G. Decoding a rank address
to a DRAM row address for a 32G rank by using either one of the
above arrays by the skx_edac driver, will result in an overflow on
the array.
For a 32G rank, the most significant DRAM row address bit (the
bit17) is mapped from the bit34 of the rank address. Add this new
mapping item to both arrays to fix the overflow issue.
Fixes: 4ec656bdf43a ("EDAC, skx_edac: Add EDAC driver for Skylake")
Reported-by: Feng Xu <feng.f.xu@intel.com>
Tested-by: Feng Xu <feng.f.xu@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230211011728.71764-1-qiuxu.zhuo@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
# Conflicts:
# drivers/dma/dmaengine.c
|
|
commit 977c6ba624f24ae20cf0faee871257a39348d4a9 upstream.
The memory for llcc_driv_data is allocated by the LLCC driver. But when
it is passed as the private driver info to the EDAC core, it will get freed
during the qcom_edac driver release. So when the qcom_edac driver gets probed
again, it will try to use the freed data leading to the use-after-free bug.
Hence, do not pass llcc_driv_data as pvt_info but rather reference it
using the platform_data pointer in the qcom_edac driver.
Fixes: 27450653f1db ("drivers: edac: Add EDAC driver support for QCOM SoCs")
Reported-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Steev Klimaszewski <steev@kali.org> # Thinkpad X13s
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Cc: <stable@vger.kernel.org> # 4.20
Link: https://lore.kernel.org/r/20230118150904.26913-4-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit cec669ff716cc83505c77b242aecf6f7baad869d upstream.
The EDAC drivers may optionally pass the poll_msec value. Use that value
if available, else fall back to 1000ms.
[ bp: Touchups. ]
Fixes: e27e3dac6517 ("drivers/edac: add edac_device class")
Reported-by: Luca Weiss <luca.weiss@fairphone.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Steev Klimaszewski <steev@kali.org> # Thinkpad X13s
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8540p-ride
Cc: <stable@vger.kernel.org> # 4.9
Link: https://lore.kernel.org/r/COZYL8MWN97H.MROQ391BGA09@otso
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e7a293658c20a7945014570e1921bf7d25d68a36 ]
When devres_open_group() fails, it returns -ENOMEM without freeing memory
allocated by edac_mc_alloc().
Call edac_mc_free() on the error handling path to avoid a memory leak.
[ bp: Massage commit message. ]
Fixes: a1b01edb2745 ("edac: add support for Calxeda highbank memory controller")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20221229054825.1361993-1-linmq006@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
commit e84077437902ec99eba0a6b516df772653f142c7 upstream.
Fix period calculation in case user sets a value of 1000. The input of
round_jiffies_relative() should be in jiffies and not in milli-seconds.
[ bp: Use the same code pattern as in edac_device_workq_setup() for
clarity. ]
Fixes: c4cf3b454eca ("EDAC: Rework workqueue handling")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20221020124458.22153-1-farbere@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9c8921555907f4d723f01ed2d859b66f2d14f08e ]
As the comment of pci_get_domain_bus_and_slot() says, it returns
a PCI device with refcount incremented, so it doesn't need to
call an extra pci_dev_get() in pci_get_dev_wrapper(), and the PCI
device needs to be put in the error path.
Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20221128065512.3572550-1-yangyingliang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
commit e2932d1f6f055b2af2114c7e64a26dc1b5593d0c upstream.
Currently, the error count is read wrongly from the status register. Read
the count from the proper error count register (ERRCNT).
[ bp: Massage. ]
Fixes: b500b4a029d5 ("EDAC, synopsys: Add ECC support for ZynqMP DDR controller")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220414102813.4468-1-shubhrajyoti.datta@xilinx.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
commit f8efca92ae509c25e0a4bd5d0a86decea4f0c41e upstream.
Do alignment logic properly and use the "ptr" local variable for
calculating the remainder of the alignment.
This became an issue because struct edac_mc_layer has a size that is not
zero modulo eight, and the next offset that was prepared for the private
data was unaligned, causing an alignment exception.
The patch in Fixes: which broke this actually wanted to "what we
actually care about is the alignment of the actual pointer that's about
to be returned." But it didn't check that alignment.
Use the correct variable "ptr" for that.
[ bp: Massage commit message. ]
Fixes: 8447c4d15e35 ("edac: Do alignment logic properly in edac_align_ptr()")
Signed-off-by: Eliav Farber <farbere@amazon.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220113100622.12783-2-farbere@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
commit dfd0dfb9a7cc04acf93435b440dd34c2ca7b4424 upstream.
The driver overrides error codes returned by platform_get_irq_optional()
to -EINVAL for some strange reason, so if it returns -EPROBE_DEFER, the
driver will fail the probe permanently instead of the deferred probing.
Switch to propagating the proper error codes to platform driver code
upwards.
[ bp: Massage commit message. ]
Fixes: 0d4429301c4a ("EDAC: Add APM X-Gene SoC EDAC driver")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220124185503.6720-3-s.shtylyov@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 279eb8575fdaa92c314a54c0d583c65e26229107 upstream.
The driver overrides the error codes returned by platform_get_irq() to
-ENODEV for some strange reason, so if it returns -EPROBE_DEFER, the
driver will fail the probe permanently instead of the deferred probing.
Switch to propagating the proper error codes to platform driver code
upwards.
[ bp: Massage commit message. ]
Fixes: 71bcada88b0f ("edac: altera: Add Altera SDRAM EDAC support")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220124185503.6720-2-s.shtylyov@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
[ Upstream commit bd1d6da17c296bd005bfa656952710d256e77dd3 ]
Version 2.40a supports DDR_ECC_INTR_SUPPORT for a quirk, so use that
quirk to determine a call to setup_address_map().
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lkml.kernel.org/r/20211012190709.1504152-1-dinguyen@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
[ Upstream commit 9f4873fb6af7966de8fcbd95c36b61351c1c4b1f ]
AMD Rome systems and later support interleaving between three identical
ranks within a channel.
Check for this mode by counting the number of enabled chip selects and
comparing their masks. If there are exactly three enabled chip selects
and their masks are identical, then three rank interleaving is enabled.
The size of a rank is determined from its mask value. However, three
rank interleaving doesn't follow the method of swapping an interleave
bit with the most significant bit. Rather, the interleave bit is flipped
and the most significant bit remains the same. There is only a single
interleave bit in this case.
Account for this when determining the chip select size by keeping the
most significant bit at its original value and ignoring any zero bits.
This will return a full bitmask in [MSB:1].
Fixes: e53a3b267fb0 ("EDAC/amd64: Find Chip Select memory size using Address Mask")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211005154419.2060504-1-yazen.ghannam@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 537bddd069c743759addf422d0b8f028ff0f8dbc upstream.
The computation of TOHM is off by one bit. This missed bit results in
too low a value for TOHM, which can cause errors in regular memory to
incorrectly report:
EDAC MC0: 1 CE Error at MMIOH area, on addr 0x000000207fffa680 on any memory
Fixes: 50d1bb93672f ("sb_edac: add support for Haswell based systems")
Cc: stable@vger.kernel.org
Reported-by: Meeta Saggi <msaggi@purestorage.com>
Signed-off-by: Eric Badger <ebadger@purestorage.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20211010170127.848113-1-ebadger@purestorage.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
commit d9b7748ffc45250b4d7bcf22404383229bc495f5 upstream.
The number of correctable errors is displayed as uncorrectable
errors because the "SBE" error count is passed to both calls of
edac_mc_handle_error().
Pass the correct uncorrectable error count to the second
edac_mc_handle_error() call when logging uncorrectable errors.
[ bp: Massage commit message. ]
Fixes: 7f6998a41257 ("ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC")
Signed-off-by: Hans Potsch <hans.potsch@nokia.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211006121332.58788-1-hans.potsch@nokia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
commit 5297cfa6bdf93e3889f78f9b482e2a595a376083 upstream.
dimm->edac_mode contains values of type enum edac_type - not the
corresponding capability flags. Fix that.
Issue caught by Coverity check "enumerated type mixed with another
type."
[ bp: Rewrite commit message, add tags. ]
Fixes: ae9b56e3996d ("EDAC, synps: Add EDAC support for zynq ddr ecc controller")
Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com>
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20210818072315.15149-1-shubhrajyoti.datta@xilinx.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
|
|
[ Upstream commit 2294a7299f5e51667b841f63c6d69474491753fb ]
MCDDRCFG is a per-channel register and uses bit{0,1} to indicate
the NVDIMM presence on DIMM slot{0,1}. Current i10nm_edac driver
wrongly uses MCDDRCFG as per-DIMM register and fails to detect
the NVDIMM.
Fix it by reading MCDDRCFG as per-channel register and using its
bit{0,1} to check whether the NVDIMM is populated on DIMM slot{0,1}.
Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Reported-by: Fan Du <fan.du@intel.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210818175701.1611513-2-tony.luck@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
[ Upstream commit f0a029fff4a50eb01648810a77ba1873e829fdd4 ]
There's little to no point in loading an EDAC driver running in a guest:
1) The CPU model reported by CPUID may not represent actual h/w
2) The hypervisor likely does not pass in access to memory controller devices
3) Hypervisors generally do not pass corrected error details to guests
Add a check in each of the Intel EDAC drivers for X86_FEATURE_HYPERVISOR
and simply return -ENODEV in the init routine.
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210615174419.GA1087688@agluck-desk2.amr.corp.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0a37f32ba5272b2d4ec8c8d0f6b212b81b578f7e ]
The module misses MODULE_DEVICE_TABLE() for of_device_id tables and thus
never autoloads on ID matches.
Add the missing declaration.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Tero Kristo <kristo@kernel.org>
Link: https://lkml.kernel.org/r/20210512033727.26701-1-cuibixuan@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 77429eebd9b1af516bf1b6898e63b098ed748374 upstream.
Fixes: 10590a9d4f23 ("EDAC/igen6: Add EDAC driver for Intel client SoCs using IBECC")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20201123031850.GA20416@aef56166e5fc
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Liwei Song <liwei.song@windriver.com>
|
|
commit 2223d8c781a0c1a8cf26b1d8f13aff84557ecbfc upstream.
Add debugfs support to fake memory correctable errors to test the
error reporting path and the error address decoding logic in the
igen6_edac driver.
Please note that the fake errors are also reported to EDAC core and
then the CE counter in EDAC sysfs is also increased.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Liwei Song <liwei.song@windriver.com>
|
|
commit 10590a9d4f23e0a519730d79d39331df60ad2079 upstream.
This driver supports Intel client SoC with integrated memory controller
using In-Band ECC(IBECC). The memory correctable and uncorrectable errors
are reported via NMIs. The driver handles the NMIs and decodes the memory
error address to platform specific address. The first IBECC-supported SoC
is Elkhart Lake.
[Tony: s/#include <linux/nmi.h>/#include <asm/nmi.h>/ to fix randconfig build]
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Liwei Song <liwei.song@windriver.com>
|
|
commit 3b20369313a486246582c8ef6ff5d1d0b9c34613 upstream.
There are {Low-Power DDR3/4, WIO2} types of memory.
Add new entries to 'enum mem_type' and new strings to
'edac_mem_types[]' for the new types.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Liwei Song <liwei.song@windriver.com>
|
|
|
|
commit 706657b1febf446a9ba37dc51b89f46604f57ee9 upstream.
In order to setup its PCI component, the driver needs any node private
instance in order to get a reference to the PCI device and hand that
into edac_pci_create_generic_ctl(). For convenience, it uses the 0th
memory controller descriptor under the assumption that if any, the 0th
will be always present.
However, this assumption goes wrong when the 0th node doesn't have
memory and the driver doesn't initialize an instance for it:
EDAC amd64: F17h detected (node 0).
...
EDAC amd64: Node 0: No DIMMs detected.
But looking up node instances is not really needed - all one needs is
the pointer to the proper device which gets discovered during instance
init.
So stash that pointer into a variable and use it when setting up the
EDAC PCI component.
Clear that variable when the driver needs to unwind due to some
instances failing init to avoid any registration imbalance.
Cc: <stable@vger.kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201122150815.13808-1-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 83ff51c4e3fecf6b8587ce4d46f6eac59f5d7c5a upstream.
Instead of raw access, use readl() to access MMIO registers of
memory controller to avoid possible compiler re-ordering.
Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Cc: <stable@vger.kernel.org>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8de0c9917cc1297bc5543b61992d5bdee4ce621a ]
The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.
In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().
However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact.
But the NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.
Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.
Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20201109210659.754018-4-Yazen.Ghannam@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 3e0fdec858d82c829774f271e88b5ceb17051551 upstream.
... because no one should be interested in spurious MCEs anyway. Make
the filtering unconditional and move it to amd_filter_mce().
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200407163414.18058-2-bp@alien8.de
Signed-off-by: Liwei Song <liwei.song@windriver.com>
|
|
|
|
[ Upstream commit 66077adb70a2a9e92540155b2ace33ec98299c90 ]
platform_get_irq() returns a negative error number on error. In such a
case, comparison to 0 would pass the check therefore check the return
value properly, whether it is negative.
[ bp: Massage commit message. ]
Fixes: 86a18ee21e5e ("EDAC, ti: Add support for TI keystone and DRA7xx EDAC")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tero Kristo <t-kristo@ti.com>
Link: https://lkml.kernel.org/r/20200827070743.26628-2-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit afce6996943be265fa39240b67025cfcb1bcdfb1 ]
platform_get_irq() returns a negative error number on error. In such a
case, comparison to 0 would pass the check therefore check the return
value properly, whether it is negative.
[ bp: Massage commit message. ]
Fixes: 9b7e6242ee4e ("EDAC, aspeed: Add an Aspeed AST2500 EDAC driver")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Stefan Schaeckeler <schaecsn@gmx.net>
Link: https://lkml.kernel.org/r/20200827070743.26628-1-krzk@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 857a3139bd8be4f702c030c8ca06f3fd69c1741a ]
When pci_get_device_func() fails, the driver doesn't need to execute
pci_dev_put(). mci should still be freed, though, to prevent a memory
leak. When pci_enable_device() fails, the error injection PCI device
"einj" doesn't need to be disabled either.
[ bp: Massage commit message, rename label to "bail_mc_free". ]
Fixes: 52608ba205461 ("i5100_edac: probe for device 19 function 0")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200826121437.31606-1-dinghao.liu@zju.edu.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
|
|
[ Upstream commit 251c54ea26fa6029b01a76161a37a12fde5124e4 ]
With CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, a system would try to probe,
unregister and probe again a driver.
When ghes_edac is attempted to be loaded on a system which is not on
the safe platforms list, ghes_edac_register() would return early. The
unregister counterpart ghes_edac_unregister() would still attempt to
unregister and exit early at the refcount test, leading to the refcount
underflow below.
In order to not do *anything* on the unregister path too, reuse the
force_load parameter and check it on that path too, before fumbling with
the refcount.
ghes_edac: ghes_edac_register: entry
ghes_edac: ghes_edac_register: return -ENODEV
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 10 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xb9/0x100
Modules linked in:
CPU: 10 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #12
Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
RIP: 0010:refcount_warn_saturate+0xb9/0x100
Code: 82 e8 fb 8f 4d 00 90 0f 0b 90 90 c3 80 3d 55 4c f5 00 00 75 88 c6 05 4c 4c f5 00 01 90 48 c7 c7 d0 8a 10 82 e8 d8 8f 4d 00 90 <0f> 0b 90 90 c3 80 3d 30 4c f5 00 00 0f 85 61 ff ff ff c6 05 23 4c
RSP: 0018:ffffc90000037d58 EFLAGS: 00010292
RAX: 0000000000000026 RBX: ffff88840b8da000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff8216b24f RDI: 00000000ffffffff
RBP: ffff88840c662e00 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000046 R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88840ee80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000800002211000 CR4: 00000000003506e0
Call Trace:
ghes_edac_unregister
ghes_remove
platform_drv_remove
really_probe
driver_probe_device
device_driver_attach
__driver_attach
? device_driver_attach
? device_driver_attach
bus_for_each_dev
bus_add_driver
driver_register
? bert_init
ghes_init
do_one_initcall
? rcu_read_lock_sched_held
kernel_init_freeable
? rest_init
kernel_init
ret_from_fork
...
ghes_edac: ghes_edac_unregister: FALSE, refcount: -1073741824
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200911164950.GB19320@zn.tnic
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
|
|
[ Upstream commit 45bc6098a3e279d8e391d22428396687562797e2 ]
IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto
the stack as part of machine check delivery is valid or not.
Various drivers copied a code fragment that uses the RIPV bit to
determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED
or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where
RIPV is set as "FATAL").
Reverse the tests so that the error is marked fatal when RIPV is not set.
Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200707194324.14884-1-tony.luck@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f05390d30e20cccd8f8de981dee42bcdd8d2d137 ]
drivers/edac/skx_common.c: In function ‘skx_mce_output_error’:
drivers/edac/skx_common.c:478:8: warning: variable ‘type’ set but not used [-Wunused-but-set-variable]
478 | char *type, *optype;
| ^~~~
Acked-by: Borislav Petkov <bp@alien8.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|