aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/pci/pcie
AgeCommit message (Collapse)Author
2019-08-02PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratumStefan Mätje
commit 4ec73791a64bab25cabf16a6067ee478692e506d upstream. Due to an erratum in some Pericom PCIe-to-PCI bridges in reverse mode (conventional PCI on primary side, PCIe on downstream side), the Retrain Link bit needs to be cleared manually to allow the link training to complete successfully. If it is not cleared manually, the link training is continuously restarted and no devices below the PCI-to-PCIe bridge can be accessed. That means drivers for devices below the bridge will be loaded but won't work and may even crash because the driver is only reading 0xffff. See the Pericom Errata Sheet PI7C9X111SLB_errata_rev1.2_102711.pdf for details. Devices known as affected so far are: PI7C9X110, PI7C9X111SL, PI7C9X130. Add a new flag, clear_retrain_link, in struct pci_dev. Quirks for affected devices set this bit. Note that pcie_retrain_link() lives in aspm.c because that's currently the only place we use it, but this erratum is not specific to ASPM, and we may retrain links for other reasons in the future. Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu> [bhelgaas: apply regardless of CONFIG_PCIEASPM] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: stable@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02PCI: Factor out pcie_retrain_link() functionStefan Mätje
commit 86fa6a344209d9414ea962b1f1ac6ade9dd7563a upstream. Factor out pcie_retrain_link() to use for Pericom Retrain Link quirk. No functional change intended. Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: stable@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-04-12PCI/PME: Fix hotplug/sysfs remove deadlock in pcie_pme_remove()Rafael J. Wysocki
commit 95c80bc6952b6a5badc7b702d23e5bf14d251e7c upstream. Dongdong reported a deadlock triggered by a hotplug event during a sysfs "remove" operation: pciehp 0000:00:0c.0:pcie004: Slot(0-1): Link Up # echo 1 > 0000:00:0c.0/remove PME and hotplug share an MSI/MSI-X vector. The sysfs "remove" side is: remove_store pci_stop_and_remove_bus_device_locked pci_lock_rescan_remove pci_stop_and_remove_bus_device ... pcie_pme_remove pcie_pme_suspend synchronize_irq # wait for hotplug IRQ handler pci_unlock_rescan_remove The hotplug side is: pciehp_ist pciehp_handle_presence_or_link_change pciehp_configure_device pci_lock_rescan_remove # wait for pci_unlock_rescan_remove() INFO: task bash:10913 blocked for more than 120 seconds. # ps -ax |grep D PID TTY STAT TIME COMMAND 10913 ttyAMA0 Ds+ 0:00 -bash 14022 ? D 0:00 [irq/745-pciehp] # cat /proc/14022/stack __switch_to+0x94/0xd8 pci_lock_rescan_remove+0x20/0x28 pciehp_configure_device+0x30/0x140 pciehp_handle_presence_or_link_change+0x324/0x458 pciehp_ist+0x1dc/0x1e0 # cat /proc/10913/stack __switch_to+0x94/0xd8 synchronize_irq+0x8c/0xc0 pcie_pme_suspend+0xa4/0x118 pcie_pme_remove+0x20/0x40 pcie_port_remove_service+0x3c/0x58 ... pcie_port_device_remove+0x2c/0x48 pcie_portdrv_remove+0x68/0x78 pci_device_remove+0x48/0x120 ... pci_stop_bus_device+0x84/0xc0 pci_stop_and_remove_bus_device_locked+0x24/0x40 remove_store+0xa4/0xb8 dev_attr_store+0x44/0x60 sysfs_kf_write+0x58/0x80 It is incorrect to call pcie_pme_suspend() from pcie_pme_remove() for two reasons. First, pcie_pme_suspend() calls synchronize_irq(), which will wait for the native hotplug interrupt handler as well as for the PME one, because they share one IRQ (as per the spec). That may deadlock if hotplug is signaled while pcie_pme_remove() is running and the latter calls pci_lock_rescan_remove() before the former. Second, if pcie_pme_suspend() figures out that wakeup needs to be enabled for the port, it will return without disabling the interrupt as expected by pcie_pme_remove() which was overlooked by commit c7b5a4e6e8fb ("PCI / PM: Fix native PME handling during system suspend/resume"). To fix that, rework pcie_pme_remove() to disable the PME interrupt, clear its status and prevent the PME worker function from re-enabling it before calling free_irq() on it, which should be sufficient. Fixes: c7b5a4e6e8fb ("PCI / PM: Fix native PME handling during system suspend/resume") Link: https://lore.kernel.org/linux-pci/c7697e7c-e1af-13e4-8491-0a3996e6ab5d@huawei.com Reported-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [bhelgaas: add URL and deadlock details from Dongdong] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-01-30PCI/AER: Remove duplicate PCI_EXP_AER_FLAGS definitionBjorn Helgaas
commit 944d58595be02634cc295e341306ccda2365554d upstream. PCI_EXP_AER_FLAGS was defined twice (with identical definitions), once under #ifdef CONFIG_ACPI_APEI, and again at the top level. This looks like my merge error from these commits: fd3362cb73de ("PCI/AER: Squash aerdrv_core.c into aerdrv.c") 41cbc9eb1a82 ("PCI/AER: Squash ecrc.c into aerdrv.c") Remove the duplicate PCI_EXP_AER_FLAGS definition. Fixes: 41cbc9eb1a82 ("PCI/AER: Squash ecrc.c into aerdrv.c") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2018-11-13PCI/ASPM: Fix link_state teardown on device removalLukas Wunner
commit aeae4f3e5c38d47bdaef50446dc0ec857307df68 upstream. Upon removal of the last device on a bus, the link_state of the bridge leading to that bus is sought to be torn down by having pci_stop_dev() call pcie_aspm_exit_link_state(). When ASPM was originally introduced by commit 7d715a6c1ae5 ("PCI: add PCI Express ASPM support"), it determined whether the device being removed is the last one by calling list_empty() on the bridge's subordinate devices list. That didn't work because the device is only removed from the list slightly later in pci_destroy_dev(). Commit 3419c75e15f8 ("PCI: properly clean up ASPM link state on device remove") attempted to fix it by calling list_is_last(), but that's not correct either because it checks whether the device is at the *end* of the list, not whether it's the last one *left* in the list. If the user removes the device which happens to be at the end of the list via sysfs but other devices are preceding the device in the list, the link_state is torn down prematurely. The real fix is to move the invocation of pcie_aspm_exit_link_state() to pci_destroy_dev() and reinstate the call to list_empty(). Remove a duplicate check for dev->bus->self because pcie_aspm_exit_link_state() already contains an identical check. Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: stable@vger.kernel.org # v2.6.26 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-26PCI/AER: Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRSTAlexandru Gagniuc
[ Upstream commit 7af02fcd84c16801958936f88b848944c726ca07 ] According to the documentation, "pcie_ports=native", linux should use native AER and DPC services. While that is true for the _OSC method parsing, this is not the only place that is checked. Should the HEST list PCIe ports as firmware-first, linux will not use native services. This happens because aer_acpi_firmware_first() doesn't take 'pcie_ports' into account. This is wrong. DPC uses the same logic when it decides whether to load or not, so fixing this also fixes DPC not loading. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: return "false" from bool function (from kbuild robot)] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-26PCI/AER: Work around use-after-free in pcie_do_fatal_recovery()Thomas Tai
When an fatal error is received by a non-bridge device, the device is removed, and pci_stop_and_remove_bus_device() deallocates the device structure. The freed device structure is used by subsequent code to send uevents and print messages. Hold a reference on the device until we're finished using it. This is not an ideal fix because pcie_do_fatal_recovery() should not use the device at all after removing it, but that's too big a project for right now. Fixes: 7e9084b36740 ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices") Signed-off-by: Thomas Tai <thomas.tai@oracle.com> [bhelgaas: changelog, reduce get/put coverage] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-11PCI/AER: Use "PCI Express" consistently in Kconfig textBjorn Helgaas
Use "PCI Express" consistently in Kconfig text. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/Bjorn Helgaas
Hoist aerdrv.c, aer_inject.c up to drivers/pci/pcie/ so they're next to other PCIe service drivers. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Squash Kconfig.debug into KconfigBjorn Helgaas
Squash Kconfig.debug into Kconfig. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Move private AER things to aerdrv.cBjorn Helgaas
Most of the things in aerdrv.h are only used in aerdrv.c, so move them there. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Move aer_irq() declaration to portdrv.hBjorn Helgaas
The aer_irq() declaration is the only thing needed by aer_inject.c. Move it to portdrv.h so we eventually get rid of aerdrv.h completely. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Move pcie_aer_get_firmware_first() to portdrv.hBjorn Helgaas
Move pcie_aer_get_firmware_first() to portdrv.h, where it can be more easily shared between AER and DPC. Then DPC no longer needs to include aer/aerdrv.h. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Remove duplicate pcie_port_bus_type declarationBjorn Helgaas
pcie_port_bus_type is already declared in portdrv.h, so remove the unnecessary duplicate declaration in aerdrv.h. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Squash ecrc.c into aerdrv.cBjorn Helgaas
Squash ecrc.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Squash aerdrv_acpi.c into aerdrv.cBjorn Helgaas
Squash aerdrv_acpi.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Squash aerdrv_errprint.c into aerdrv.cBjorn Helgaas
Squash aerdrv_errprint.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Squash aerdrv_core.c into aerdrv.cBjorn Helgaas
Squash aerdrv_core.c into aerdrv.c. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-11PCI/AER: Reorder code to group probe/remove stuff togetherBjorn Helgaas
Reorder code to group probe/remove stuff together. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-08PCI/AER: Remove forward declarationsBjorn Helgaas
Reorder code to remove forward declarations. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-06Merge branch 'pci/portdrv'Bjorn Helgaas
- remove unused pcie_port_acpi_setup() and portdrv_acpi.c (Bjorn Helgaas) * pci/portdrv: PCI/portdrv: Remove unused pcie_port_acpi_setup()
2018-06-06Merge branch 'pci/hotplug'Bjorn Helgaas
- fix use-before-set error in ibmphp (Dan Carpenter) - fix pciehp timeouts caused by Command Completed errata (Bjorn Helgaas) - fix refcounting in pnv_php hotplug (Julia Lawall) - clear pciehp Presence Detect and Data Link Layer Status Changed on resume so we don't miss hotplug events (Mika Westerberg) - only request pciehp control if we support it, so platform can use ACPI hotplug otherwise (Mika Westerberg) - convert SHPC to be builtin only (Mika Westerberg) - request SHPC control via _OSC if we support it (Mika Westerberg) - simplify SHPC handoff from firmware (Mika Westerberg) * pci/hotplug: PCI: Improve "partially hidden behind bridge" log message PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc PCI: Move resource distribution for single bridge outside loop PCI: Account for all bridges on bus when distributing bus numbers ACPI / hotplug / PCI: Drop unnecessary parentheses ACPI / hotplug / PCI: Mark stale PCI devices disconnected ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug PCI: hotplug: Add hotplug_is_native() PCI: shpchp: Add shpchp_is_native() PCI: shpchp: Fix AMD POGO identification PCI: shpchp: Use dev_printk() for OSHP-related messages PCI: shpchp: Remove get_hp_hw_control_from_firmware() wrapper PCI: shpchp: Remove acpi_get_hp_hw_control_from_firmware() flags PCI: shpchp: Rely on previous _OSC results PCI: shpchp: Request SHPC control via _OSC when adding host bridge PCI: shpchp: Convert SHPC to be builtin only PCI: pciehp: Make pciehp_is_native() stricter PCI: pciehp: Rename host->native_hotplug to host->native_pcie_hotplug PCI: pciehp: Request control of native hotplug only if supported PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume PCI: pnv_php: Add missing of_node_put() PCI: pciehp: Add quirk for Command Completed errata PCI: Add Qualcomm vendor ID PCI: ibmphp: Fix use-before-set in get_max_bus_speed() # Conflicts: # drivers/acpi/pci_root.c
2018-06-06Merge branch 'pci/dpc'Bjorn Helgaas
- clear interrupt status in top half to avoid interrupt storm (Oza Pawandeep) * pci/dpc: PCI/DPC: Clear interrupt status in interrupt handler top half
2018-06-06Merge branch 'pci/aspm'Bjorn Helgaas
- disable ASPM L1.2 substate if we don't have LTR (Bjorn Helgaas) - respect platform ownership of LTR (Bjorn Helgaas) * pci/aspm: PCI/ACPI: Request LTR control from platform before using it PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR
2018-06-05PCI/AER: Replace struct pcie_device with pci_devKeith Busch
The AER driver only needed the pcie_device to get to the port pci_dev. Save the pci_dev pointer directly in struct aer_rpc and remove the unnecessary indirection. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-05PCI/AER: Remove unused parametersKeith Busch
Remove unused "struct pcie_device *" parameters to handle_error_source() and aer_process_err_devices(). No functional change intended. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-02PCI/AER: Decode Error Source Requester IDBjorn Helgaas
Decode the Requester ID from the AER Error Source Register into domain/ bus/device/function format to match other logging. In cases where the ID matches the device used for pci_err(), drop the extra ID completely so we don't print it twice. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-02PCI/AER: Remove aer_recover_work_func() forward declarationBorislav Petkov
Just move the actual function up so that it is visible to its user aer_recover_queue(). No functional changes. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-02PCI/DPC: Use the generic pcie_do_fatal_recovery() pathOza Pawandeep
Our goal is to handle ERR_FATAL errors similarly, whether they are reported via AER or via DPC. A previous commit changed AER so it handles ERR_FATAL by calling driver .remove() methods and resetting the Link. DPC already does that (although the Link reset is done automatically by hardware and happens before we call the driver .remove() methods). Restructure the DPC code so it calls the same pcie_do_fatal_recovery() interface used by AER. This makes it clearer that we want to use the same path. Implement the .reset_link() method used by pcie_do_fatal_recovery(). For DPC, the actual reset is done automatically by hardware, so we really only have to wait for the Link to be inactive, then release the Port from DPC. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, DPC_FATAL is not a bitfield, can be sequential] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-02PCI/AER: Pass service type to pcie_do_fatal_recovery()Oza Pawandeep
Pass the service type to pcie_do_fatal_recovery() instead of assuming AER. We will make DPC also use pcie_do_fatal_recovery(), and it needs to do things a little differently for AER and DPC. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-02PCI/DPC: Disable ERR_NONFATAL handling by DPCOza Pawandeep
PCIe ERR_NONFATAL errors mean a particular transaction is unreliable but the Link is otherwise fully functional (PCIe r4.0, sec 6.2.2). The AER driver handles these by logging the error details and calling driver-supplied pci_error_handlers callbacks. It does not reset downstream devices, does not remove them from the PCI subsystem, does not re-enumerate them, and does not call their driver .remove() or .probe() methods. But DPC driver previously enabled DPC on ERR_NONFATAL, so if the hardware supports DPC, these errors caused a Link reset (performed automatically by the hardware), followed by the DPC driver removing affected devices (which calls their .remove() methods), bringing the Link back up, and re-enumerating (which calls driver .probe() methods). Disable ERR_NONFATAL DPC triggering so these errors will only be handled by AER. This means drivers won't have to deal with different usage of their pci_error_handlers callbacks and .probe() and .remove() methods based on whether the platform has DPC support. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-06-02PCI/portdrv: Add generic pcie_port_find_device()Oza Pawandeep
Add generic pcie_port_find_device() routine. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-06-02PCI: pciehp: Make pciehp_is_native() stricterMika Westerberg
Previously pciehp_is_native() returned true for any PCI device in a hierarchy where _OSC says we can use pciehp. This is incorrect because bridges without PCI_EXP_SLTCAP_HPC capability should be managed by acpiphp instead. Improve pciehp_is_native() to return true only when PCI_EXP_SLTCAP_HPC is set and the pciehp driver is present. In any other case return false to let acpiphp handle those. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: remove NULL pointer check] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-06-02PCI: pciehp: Rename host->native_hotplug to host->native_pcie_hotplugMika Westerberg
Rename host->native_hotplug to host->native_pcie_hotplug to make room for a similar flag for SHPC hotplug. Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-17PCI/portdrv: Add generic pcie_port_find_service()Oza Pawandeep
Add generic pcie_port_find_service() routine. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-17PCI/AER: Factor out error reporting to drivers/pci/pcie/err.cOza Pawandeep
Move the error reporting callbacks from aerdrv_core.c to err.c, where they can be used by DPC in addition to AER. As part of aerdrv_core.c, these callbacks were built under CONFIG_PCIEAER. Moving them to the new err.c means they will now be built under CONFIG_PCIEPORTBUS, so adjust the definition of pci_uevent_ers() to match. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: in reset_link(), initialize "driver" even if CONFIG_PCIEAER is unset, update pci_uevent_ers() #ifdef wrapper] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-17PCI/AER: Rename error recovery interfaces to generic PCI namingOza Pawandeep
Rename error recovery interfaces with "pcie_" prefix so they can be made non-static. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: move declaration to later patch, leave functions static] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-17PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devicesOza Pawandeep
PCIe ERR_FATAL errors mean the Link is unreliable. Components on the Link may need to be reset to return to reliable operation (PCIe r4.0, sec 6.2.2). We previously handled these errors much differently depending on whether the platform supports Downstream Port Containment (DPC) (PCIe r4.0, sec 6.2.10) or not. The AER driver has historically logged the error details, called driver-supplied pci_error_handlers callbacks, and reset the Link. This reset downstream devices, but did not remove them from the PCI subsystem, re-enumerate them, or call their driver .remove() or .probe() methods. DPC is different because the hardware automatically disables the Link when it detects ERR_FATAL, which resets downstream devices. There's no opportunity for pci_error_handlers callbacks before resetting the Link. The DPC driver removes affected devices (which calls their driver .remove() methods), brings the Link back up, and re-enumerates (which calls driver .probe() methods). Align AER ERR_FATAL handling with DPC by resetting the Link in software, skipping the driver pci_error_handlers callbacks, removing the devices from the PCI subsystem, and re-enumerating. The idea is that drivers and devices should see the same behavior for ERR_FATAL events, regardless of whether they're handled by AER or DPC. Here are the basic ERR_FATAL recovery steps, showing the previous AER behavior, the AER behavior after this patch, and the DPC behavior: AER AER DPC previous new behavior -------- --- -------- Log error yes yes yes (minimal) drv.error_detected() yes no no Reset Link yes yes yes drv.mmio_enabled() yes no no drv.slot_reset() yes no no drv.resume() yes no no Remove PCI devices no yes yes (calls drv.remove()) Re-enumerate no yes yes (calls drv.probe()) N.B. With DPC, the Link reset happens before the driver .remove() calls, while with AER, the reset happens *after* the .remove() calls. The goal is to eventually do the reset before .remove() for AER as well. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, squash doc patch into this, remove unused "result_data"] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-17PCI: Add generic pcie_wait_for_link() interfaceOza Pawandeep
Clients such as hotplug and Downstream Port Containment (DPC) both need to wait until a link becomes active or inactive. Add a generic pcie_wait_link_active() interface and use it instead of duplicating the code. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-16PCI/DPC: Clear interrupt status in interrupt handler top halfOza Pawandeep
The generic IRQ handling code ensures that an interrupt handler runs with its interrupt masked or disabled. If the interrupt is level-triggered, the interrupt handler must tell its device to stop asserting the interrupt before returning. If it doesn't, we will immediately take the interrupt again when the handler returns and the generic code unmasks the interrupt. The driver doesn't know whether its interrupt is edge- or level-triggered, so it must clear its interrupt source directly in its interrupt handler. Previously we cleared the DPC interrupt status in the bottom half, i.e., in deferred work, which can cause an interrupt storm if the DPC interrupt happens to be level-triggered, e.g., if we're using INTx instead of MSI. Clear the DPC interrupt status bit in the interrupt handler, not in the deferred work. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <helgaas@kernel.org> Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-10PCI/AER: Add TLP header information to tracepointThomas Tai
When a PCIe AER error occurs, the TLP header information is printed in the kernel message but it is missing from the tracepoint. A userspace program can use this information in the tracepoint to better analyze problems. To enable the tracepoint: echo 1 > /sys/kernel/debug/tracing/events/ras/aer_event/enable Example tracepoint output: $ cat /sys/kernel/debug/tracing/trace aer_event: 0000:01:00.0 PCIe Bus Error: severity=Uncorrected, non-fatal, Completer Abort TLP Header={0x0,0x1,0x2,0x3} Signed-off-by: Thomas Tai <thomas.tai@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-05-07PCI/AER: Unify error bit printing for native and CPER reportingAlexandru Gagniuc
AER errors can be reported natively (Linux AER driver fields interrupts and reads error state directly from hardware) or via the ACPI/APEI/GHES/CPER path (platform firmware reads error state from hardware and sends it to Linux via ACPI interfaces). Previously the same error would produce different output depending on whether it was reported natively or via ACPI. The CPER path resulted in hard-to-understand messages, without a prefix. Instead use __aer_print_error() for both native AER and CPER to provide a more consistent log format. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-05-02PCI/portdrv: Remove unused pcie_port_acpi_setup()Bjorn Helgaas
02bfeb484230 ("PCI/portdrv: Simplify PCIe feature permission checking") removed the only call of pcie_port_acpi_setup() and removed portdrv_acpi.o from the Makefile, but I forgot to remove pcie_port_acpi_setup() itself. Remove pcie_port_acpi_setup() and the drivers/pci/pcie/portdrv_acpi.c file. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-04-18PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTRBjorn Helgaas
When in the ASPM L1.0 state (but not the PCI-PM L1.0 state), the most recent LTR value and the LTR_L1.2_THRESHOLD determines whether the link enters the L1.2 substate. If we don't have LTR enabled, prevent the use of ASPM L1.2. PCI-PM L1.2 may still be used because it doesn't depend on LTR_L1.2_THRESHOLD (see PCIe r4.0, sec 5.5.1). Tested-by: Srinath Mannam <srinath.mannam@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-04-04Merge branch 'pci/portdrv'Bjorn Helgaas
- move pcieport_if.h to drivers/pci/pcie/ to encapsulate it (Frederick Lawler) - merge pcieport_if.h into portdrv.h (Bjorn Helgaas) - move workaround for BIOS PME issue from portdrv to PCI core (Bjorn Helgaas) - completely disable portdrv with "pcie_ports=compat" (Bjorn Helgaas) - remove portdrv link order dependency (Bjorn Helgaas) - remove support for unused VC portdrv service (Bjorn Helgaas) - simplify portdrv feature permission checking (Bjorn Helgaas) - remove "pcie_hp=nomsi" parameter (use "pci=nomsi" instead) (Bjorn Helgaas) - remove unnecessary "pcie_ports=auto" parameter (Bjorn Helgaas) - use cached AER capability offset (Frederick Lawler) - don't enable DPC if BIOS hasn't granted AER control (Mika Westerberg) - rename pcie-dpc.c to dpc.c (Bjorn Helgaas) * pci/portdrv: PCI/DPC: Rename from pcie-dpc.c to dpc.c PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOS PCI/AER: Use cached AER Capability offset PCI/portdrv: Rename and reverse sense of pcie_ports_auto PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h> PCI/portdrv: Simplify PCIe feature permission checking PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC PCI/portdrv: Remove pcie_port_bus_type link order dependency PCI/portdrv: Disable port driver in compat mode PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver PCI/PM: Move pcie_clear_root_pme_status() to core PCI/portdrv: Merge pcieport_if.h into portdrv.h PCI/portdrv: Move pcieport_if.h to drivers/pci/pcie/ Conflicts: drivers/pci/pcie/Makefile drivers/pci/pcie/portdrv.h
2018-04-04Merge branch 'pci/misc'Bjorn Helgaas
- use PCI_EXP_DEVCTL2_COMP_TIMEOUT in rapidio/tsi721 (Bjorn Helgaas) - remove possible NULL pointer dereference in of_pci_bus_find_domain_nr() (Shawn Lin) - report quirk timings with dev_info (Bjorn Helgaas) - report quirks that take longer than 10ms (Bjorn Helgaas) - add and use Altera Vendor ID (Johannes Thumshirn) - tidy Makefiles and comments (Bjorn Helgaas) * pci/misc: PCI: Always define the of_node helpers PCI: Tidy comments PCI: Tidy Makefiles mcb: Add Altera PCI ID to mcb-pci PCI: Add Altera vendor ID PCI: Report quirks that take more than 10ms PCI: Report quirk timings with pci_info() instead of pr_debug() PCI: Fix NULL pointer dereference in of_pci_bus_find_domain_nr() rapidio/tsi721: use PCI_EXP_DEVCTL2_COMP_TIMEOUT macro
2018-03-31PCI/DPC: Rename from pcie-dpc.c to dpc.cBjorn Helgaas
Rename pcie-dpc.c to dpc.c. The path "drivers/pci/pcie/pcie-dpc.c" has more occurrences of "pci" than necessary. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-30PCI/DPC: Do not enable DPC if AER control is not allowed by the BIOSMika Westerberg
Commit eed85ff4c0da ("PCI/DPC: Enable DPC only if AER is available") made DPC control dependent whether AER is enabled in the OS. However, it does not take into account situations where BIOS has not given OS control of AER: acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI] acpi PNP0A08:00: _OSC: platform does not support [AER] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME PCIeCapability] I think here it is better not to enable DPC even if the capability is available because then it would be against what "Determination of DPC Control" note in PCIe 4.0 sec 6.1.10 recommends. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
2018-03-30PCI/AER: Use cached AER Capability offsetFrederick Lawler
Replace pci_find_ext_capability(..., PCI_EXT_CAP_ID_ERR) calls with pci_dev->aer_cap. pci_dev->aer_cap is initialized in pci_init_capabilities(), which happens before any of these users of the AER Capability. Signed-off-by: Frederick Lawler <fred@fredlawl.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-30PCI/portdrv: Rename and reverse sense of pcie_ports_autoBjorn Helgaas
The platform may restrict the OS's use of PCIe services, e.g., via the ACPI _OSC method. The user may use "pcie_ports=native" to force the port driver to use PCIe services even if the platform asked us not to. The "pcie_ports=native" parameter determines the setting of pcie_ports_auto. Rename this to pcie_ports_native and reverse the sense to simplify the code. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>