summaryrefslogtreecommitdiffstats
path: root/drivers/edac
AgeCommit message (Collapse)Author
2017-09-25EDAC: Add owner check to the x86 platform driversToshi Kani
Change x86 EDAC platform drivers to verify the module owner at the beginning of their module init functions. This allows them to fail their init immediately when ghes_edac is enabled. Similar change can be made to other edac drivers if necessary. Also, remove ".c" from module names of pnp2_edac, sb_edac, and skx_edac. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-6-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25EDAC: Add helper which returns the loaded platform driverToshi Kani
Only a single EDAC platform driver can be loaded. When ghes_edac is enabled, an EDAC platform driver still attempts to register itself and fails in edac_mc_add_mc(). Add edac_get_owner() so that EDAC platform drivers can check the owner first. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-5-toshi.kani@hpe.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25EDAC, ghes: Add platform checkToshi Kani
The ghes_edac driver was introduced in 2013 [1], but it has not been enabled by any distro yet. This driver obtains error info from firmware interfaces (APEI), which are not properly implemented on many platforms, as the driver says on load: This EDAC driver relies on BIOS to enumerate memory and get error reports. Unfortunately, not all BIOSes reflect the memory layout correctly. So, the end result of using this driver varies from vendor to vendor. If you find incorrect reports, please contact your hardware vendor to correct its BIOS. To get out from this situation, add a platform check to selectively enable the driver on platforms that are known to have proper APEI firmware implementation. "ghes_edac.force_load=1" skips this platform check. [1]: https://lkml.kernel.org/r/cover.1360931635.git.mchehab@redhat.com Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-acpi@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170823225447.15608-4-toshi.kani@hpe.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25EDAC, ghes: Model a single, logical memory controllerBorislav Petkov
We're enumerating the DIMMs through a DMI walk and since we can't get any more detailed topological information about which DIMMs belong to which memory controller, convert it to a single, logical controller which contains all the DIMMs. The error reporting path from GHES ghes_edac_report_mem_error() doesn't get called in NMI context but add a warning about it to catch any changes in the future as if so, our locking scheme will be insufficient then. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-25EDAC, ghes: Remove symbol exportsBorislav Petkov
They're called from builtin code so no need for the exports. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-09-21EDAC: Handle return value of kasprintf()Arvind Yadav
kasprintf() can fail and we must check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: linux-edac@vger.kernel.org [ Merged into a single patch, small formatting fixups. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-21EDAC, mce_amd: Get rid of local var in amd_filter_mce()Borislav Petkov
... and use the macro for that. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-21EDAC, mce_amd: Get rid of most struct cpuinfo_x86 usesBorislav Petkov
struct mce.cpuid contains CPUID(1).EAX which contains family, model and stepping and thus has enough information for our purposes. Thus get rid of some external dependencies which are not really needed. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-21EDAC, mce_amd: Rename decode_smca_errors() to decode_smca_error()Borislav Petkov
Singular fits better because it decodes a single error. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-20EDAC: Make device_type constBhumika Goyal
Make these const as they are only stored in the type field of a device structure, which is const. Done using Coccinelle. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1503130946-2854-2-git-send-email-bhumirks@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-19EDAC, pnd2: Properly toggle hidden state for P2SB PCI deviceQiuxu Zhuo
Properly handle hidden state of P2SB PCI device (DEV:D, FUN:0) for Apollo Lake. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170814154905.21707-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-19EDAC, pnd2: Conditionally unhide/hide the P2SB PCI device to read BARQiuxu Zhuo
On Deverton server, the P2SB PCI device (DEV:1F, FUN:1) is used by multiple device drivers. If it's hidden by some device driver (e.g. with the i801 I2C driver, the commit 9424693035a5 ("i2c: i801: Create iTCO device on newer Intel PCHs") unconditionally hid the P2SB PCI device wrongly) it will make the pnd2_edac driver read out an invalid BAR value of 0xffffffff and then fail on ioremap(). Therefore, store the presence state of P2SB PCI device before unhiding it for reading BAR and restore the presence state after reading BAR. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-i2c@vger.kernel.org Link: http://lkml.kernel.org/r/20170814154845.21663-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-19EDAC, pnd2: Mask off the lower four bits of a BARQiuxu Zhuo
Bit[0] of BAR is always zero. Bit[2:1] and bit[3] of BAR contain the information of 'type' and the 'prefetchable' accordingly. Therefore, mask the lower four bits to retrieve the actual base address of a BAR. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170814154813.21619-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-18EDAC, thunderx: Fix error handling path in thunderx_lmc_probe()Christophe JAILLET
Return the proper error value if ioremap() fails (and not 0). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: David Daney <david.daney@cavium.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20170816045821.14165-1-christophe.jaillet@wanadoo.fr [ Massage commit message, remove newline. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-18EDAC, altera: Fix error handling path in altr_edac_device_probe()Christophe JAILLET
Return the proper error value if devm_ioremap() fails (and not 0). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170816050506.14541-1-christophe.jaillet@wanadoo.fr [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-04EDAC, pnd2: Build in a minimal sideband driver for Apollo LakeTony Luck
I've been waing a long time for the generic sideband driver to appear. Patience has run out, so include the minimum here to just read registers. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Aristeu Rozanski <arozansk@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Patrick Geary <patrickg@supermicro.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170803210536.5662-1-tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-08-02EDAC, sb_edac: Classify memory mirroring modesQiuxu Zhuo
Basically, there are full memory mirroring and address range partial memory mirroring (supported by Haswell EX and Broadwell EX) modes. a) In full memory mirroring, the memory behind each memory controller is mirrored, i.e. the memory is split into two identical mirrors (primary and secondary), half of the memory is reserved for redundancy. b) In address range partial memory mirroring, the memory size (range) of primary and secondary behind each memory controller can be user defined by the TAD0 register. The rest of memory ranges defined by TAD1/TAD2/... in that memory controller are non-mirrored. For more detail on memory mirroring, see the following link written by Tony Luck: https://01.org/lkp/blogs/tonyluck/2016/address-range-partial-memory-mirroring-linux Currently the sb_edac driver only supports address decoding in full memory mirroring and non-mirroring modes. In address range partial memory mirroring mode, it may fail to decode an address that falls in a non-mirroring area (the following was one of this kind of failed logs). mce: Uncorrected hardware memory error in user-access at 566d53a400 Memory failure: 0x566d53a: Killing einj_mem_uc:4647 due to hardware memory corruption Memory failure: 0x566d53a: recovery action for dirty LRU page: Recovered mce: [Hardware Error]: Machine check events logged EDAC sbridge MC1: HANDLING MCE MEMORY ERROR EDAC sbridge MC1: CPU 48: Machine Check Event: 0 Bank 7: ec00000000010090 EDAC sbridge MC1: TSC 4b914aa5a99dab EDAC sbridge MC1: ADDR 566d53a400 EDAC sbridge MC1: MISC 1443a0c86 EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1499712764 SOCKET 2 APIC 80 EDAC MC1: 0 UE Can't discover the memory rank for ch addr 0x7fb54e900 on any memory ( page:0x0 offset:0x0 grain:32) mce: [Hardware Error]: Machine check events logged Therefore, classify memory mirroring modes and make the address decoding in address range partial memory mode correct. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170730180651.30060-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-19EDAC, cpc925, ppc4xx: Convert to using %pOF instead of full_nameRob Herring
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: devicetree@vger.kernel.org Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170718214339.7774-19-robh@kernel.org Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-17EDAC: Get rid of mci->mod_verBorislav Petkov
It is a write-only variable so get rid of it. Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Robert Richter <rric@kernel.org> Acked-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Thor Thayer <thor.thayer@linux.intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Mark Gross <mark.gross@intel.com> Cc: Tim Small <tim@buttersideup.com> Cc: Ranganathan Desikan <ravi@jetztechnologies.com> Cc: "Arvind R." <arvino55@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: "Sören Brinkmann" <soren.brinkmann@xilinx.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Daney <david.daney@cavium.com> Cc: Loc Ho <lho@apm.com> Cc: linux-edac@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mips@linux-mips.org
2017-07-17EDAC: Constify attribute_group structuresArvind Yadav
attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> CC: linux-edac@vger.kernel.org Link: http://lkml.kernel.org/r/776cb8265509054abd01b0b551624cc0da3b88e7.1499078335.git.arvind.yadav.cs@gmail.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-07-17EDAC, mce_amd: Use cpu_to_node() to find the node IDYazen Ghannam
Using the homegrown amd_get_nb_id() to find a node ID on AMD was fine while the L3 to node mapping was 1:1. And Zen topology broke this. So let's start slowly moving away from it and use the topology interfaces instead. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/1490041614-90057-2-git-send-email-Yazen.Ghannam@amd.com [ Massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-29EDAC, pnd2: Fix Apollo Lake DIMM detectionTony Luck
Non-existent or empty DIMM slots result in error return from RD_REGP(). But we shouldn't give up on failure. So long as we find at least one DIMM we can continue. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170628234407.21521-1-tony.luck@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-29EDAC, i5000, i5400: Fix definition of NRECMEMB registerJérémy Lefaure
In the i5000 and i5400 drivers, the NRECMEMB register is defined as a 16-bit value, which results in wrong shifts in the code, as reported by sparse. In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21), this register is a 32-bit register. A u32 value for the register fixes the wrong shifts warnings and matches the datasheet. Also fix the mask to access to the CAS bits [27:16] in the i5000 driver. [1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf [2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-26EDAC, pnd2: Make function sbi_send() staticColin Ian King
The function sbi_send() is local to just pnd2_edac.c and does not need to be in global scope, so make it static. Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170623084855.9197-1-colin.king@canonical.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-23EDAC, pnd2: Return proper error value from apl_rd_reg()Gustavo A. R. Silva
Add code comment to make it clear that the fall-through is intentional and, OR ret with its previous value to avoid overwriting it so that callers can check the correct return value. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170622220535.GA4896@embeddedgus [ Massage a bit. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-14EDAC, altera: Simplify calculation of total memoryChris Packham
Use of_address_to_resource() and resource_size() instead of manually parsing the "reg" property from the "memory" node(s). Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170606235500.22772-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-14EDAC, sb_edac: Avoid creating SOCK memory controllerQiuxu Zhuo
Xiaolong Ye reported the following failure on Broadwell D server: EDAC sbridge: Some needed devices are missing EDAC MC: Removed device 0 for sbridge_edac.c Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0 EDAC sbridge: Couldn't find mci handler EDAC sbridge: Failed to register device with error -19. Broadwell D (only IMC0 per socket) and Broadwell X (IMC0 and IMC1 per socket) use the same PCI device IDs for IMC0 per socket, then they share pci_dev_descr_broadwell_table (n_imcs_per_sock=2). In this case, Broadwell D wrongly creates the nonexistent SOCK EDAC memory controller and reports above error messages, since it has no IMC1 per socket. Avoid creating the nonexistent SOCK memory controller. Reported-and-tested-by: Xiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170608113351.25323-1-qiuxu.zhuo@intel.com [ Massage. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-12EDAC, mce_amd: Fix typo in SMCA error descriptionYazen Ghannam
Fix typo in "poison consumption" error description. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1497286703-62853-1-git-send-email-Yazen.Ghannam@amd.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-09EDAC, mv64x60: Sanity check edac_op_state before registeringChris Packham
edac_op_state is a module parameter which affects the behaviour of the driver probe which can potentially be invoked as soon as the platform driver registration happens. Because of this we need to ensure that we sanity check the module parameter before calling platform_register_drivers(). Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170607215530.8604-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>
2017-06-01EDAC, thunderx: Fix a warning during l2c debugfs node creationVadim Lomovtsev
Compare the number of debugfs entries created by thunderx_create_debugfs_nodes() with the requested number of entries to properly determine whether to print a warning. Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linux-mips@linux-mips.org Link: http://lkml.kernel.org/r/20170531155157.93583-1-stemerkhanov@cavium.com Signed-off-by: Sergey Temerkhanov <s.temerkhanov@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-30EDAC, mv64x60: Check driver registration successChris Packham
Check the return status of platform_driver_register() in mv64x60_edac_init(). Only output messages and initialise the edac_op_state if the registration is successful. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170529212142.25572-2-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-28EDAC, ie31200: Add Intel Kaby Lake CPU supportJason Baron
Kaby Lake seems to work just like Skylake. Reported-and-tested-by: Doug Thompson <bc.tdw@recursor.net> Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Tony Luck <tony.luck@intel.com Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1495823683-32569-1-git-send-email-jbaron@akamai.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-26EDAC, mv64x60: Replace in_le32()/out_le32() with readl()/writel()Chris Packham
To allow this driver to be used on non-powerpc platforms it needs to use io accessors suitable for all platforms. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170518083135.28048-4-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-26EDAC, mv64x60: Fix pdata->nameChris Packham
Change this from mpc85xx_pci_err to mv64x60_pci_err. The former is likely a hangover from when this driver was created. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20170518083135.28048-3-chris.packham@alliedtelesis.co.nz Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Bump driver version and do some cleanupsQiuxu Zhuo
Collapse 'case:' in *_mci_bind_devs() and update driver version from 1.1.1 to 1.1.2. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000934.87971-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Check if ECC enabled when at least one DIMM is presentQiuxu Zhuo
This is based on previous work by Patrick Geary, see Link. Additional cleanups ontop: - Remove the code to read MCMTR from pci_ha1_ta and CHN_TO_HA macro, now that TA0 and TA1 are unified. - Remove get_pdev_same_bus(), since in get_dimm_config() the variable "pvt->pci_ta" for KNL is also ready, we can simply use pci_read_config_dword(pvt->pci_ta, KNL_MCMTR, &pvt->info.mcmtr) to read MCMTR. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: https://lkml.kernel.org/r/57884350.1030401@supermicro.com Link: http://lkml.kernel.org/r/20170523000910.87925-1-qiuxu.zhuo@intel.com [ Make __populate_dimms() return int. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4Qiuxu Zhuo
We don't need this quirk anymore now that the EDAC memory controller representation matches the hardware. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000834.87881-1-qiuxu.zhuo@intel.com [ Commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Carve out dimm-populating loopBorislav Petkov
... to slim down get_dimm_config(). No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Fix mod_nameBorislav Petkov
It is called "sb_edac.c" now. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Assign EDAC memory controller per h/w controllerQiuxu Zhuo
Tony pointed out: "currently the driver pretends there is one big 8-channel memory controller per socket instead of 2 4-channel controllers. This is fine with all memory controller populated with symmetrical DIMM configurations, but runs into difficulties on asymmetrical setups". Restructure the driver to assign an EDAC memory controller to each real h/w memory controller to resolve the issue. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000731.87793-1-qiuxu.zhuo@intel.com [ Break some lines at convenient points. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Don't use "Socket#" in the memory controller nameTony Luck
EDAC assigns logical memory controller numbers in the order that we find memory controllers, which depends on which PCI bus they are on. Some systems end up with MC0 on socket0, others (e.g Haswell) have MC0 on socket3. All this is made more confusing for users because we use the string "Socket" while generating names for memory controllers, but the number that we attach there is the memory controller number. E.g. EDAC MC0: Giving out device to module sbridge_edac.c controller Haswell Socket#0: DEV 0000:ff:12.0 (INTERRUPT) Change the names to say "SrcID#%d" (where the number we use is read from the h/w associated with the memory controller instead of some logical number internal to the EDAC driver). New message: EDAC MC0: Giving out device to module sbridge_edac.c controller Haswell SrcID#3: DEV 0000:ff:12.0 (INTERRUPT) Reported-by: Andrey Korolyov <andrey@xdel.ru> Reported-by: Patrick Geary <patrickg@supermicro.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000603.87748-1-qiuxu.zhuo@intel.com Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-25EDAC, sb_edac: Classify PCI-IDs by topologyQiuxu Zhuo
Each of the PCI device IDs belongs to a CPU socket, or to one of the integrated memory controllers. Provide an enum to specify the domain of each, and distinguish the resource number in each domain: the number of the PCI device IDs per integrated memory controller/socket, and the number of integrated memory controllers per socket. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170523000533.87704-1-qiuxu.zhuo@intel.com [ Realign pci_dev_descr_knl members. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-24EDAC, altera: Constify irq_domain_opsTobias Klauser
struct irq_domain_ops is not modified, so it can be made const. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Cc: Thor Thayer <thor.thayer@linux.intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/20170524133505.1233-1-tklauser@distanz.ch Signed-off-by: Borislav Petkov <bp@suse.de>
2017-05-03EDAC, amd64: Fix reporting of Chip Select sizes on Fam17hYazen Ghannam
The wrong index into the csbases/csmasks arrays was being passed to the function to compute the chip select sizes, which resulted in the wrong size being computed. Address that so that the correct values are computed and printed. Also, redo how we calculate the number of pages in a CS row. Reported-by: Benjamin Bennett <benbennett@gmail.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: <stable@vger.kernel.org> # 4.10.x Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded integer math comment, minor cleanups. ] Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-27EDAC, ghes: Do not enable it by defaultBorislav Petkov
Leave it to the user to decide whether to enable this or not. Otherwise, platform-specific drivers won't initialize (currently, EDAC supports only a single platform driver loaded). Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Rename report status accessorsBorislav Petkov
Change them to have the edac_ prefix. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Delete edac_stub.cBorislav Petkov
Move the remaining functionality to edac_mc.c. Convert "edac_report=" to a module parameter. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Update Kconfig help textBorislav Petkov
Remove the old URLs. Signed-off-by: Borislav Petkov <bp@suse.de>
2017-04-10EDAC: Remove EDAC_MM_EDACBorislav Petkov
Move all the EDAC core functionality behind CONFIG_EDAC and get rid of that indirection. Update defconfigs which had it. While at it, fix dependencies such that EDAC depends on RAS for the tracepoints. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: linux-edac@vger.kernel.org
2017-04-10EDAC: Issue tracepoint only when it is definedBorislav Petkov
... and this happens only when CONFIG_RAS is enabled. Signed-off-by: Borislav Petkov <bp@suse.de>