.. SPDX-License-Identifier: GPL-2.0

===============
Boot Interrupts
===============

:Author: - Sean V Kelley <sean.v.kelley@linux.intel.com>

Overview
========

On PCI Express, interrupts are represented with either MSI or inbound
interrupt messages (Assert_INTx/Deassert_INTx). The integrated IO-APIC in a
given Core IO converts the legacy interrupt messages from PCI Express to
MSI interrupts.  If the IO-APIC is disabled (via the mask bits in the
IO-APIC table entries), the messages are routed to the legacy PCH. This
in-band interrupt mechanism was traditionally necessary for systems that
did not support the IO-APIC and for boot. Intel in the past has used the
term "boot interrupts" to describe this mechanism. Further, the PCI Express
protocol describes this in-band legacy wire-interrupt INTx mechanism for
I/O devices to signal PCI-style level interrupts. The subsequent paragraphs
describe problems with the Core IO handling of INTx message routing to the
PCH and mitigation within BIOS and the OS.


Issue
=====

When in-band legacy INTx messages are forwarded to the PCH, they in turn
trigger a new interrupt for which the OS likely lacks a handler. Interrupts
that repeatedly go unhandled are tracked by the Linux kernel as spurious
interrupts. Once the unhandled count reaches a specific threshold, the
kernel disables the IRQ and reports the error "nobody cared". Disabling the
IRQ also blocks any valid interrupt source that happens to share the IRQ
line::

  irq 19: nobody cared (try booting with the "irqpoll" option)
  CPU: 0 PID: 2988 Comm: irq/34-nipalk Tainted: 4.14.87-rt49-02410-g4a640ec-dirty #1
  Hardware name: National Instruments NI PXIe-8880/NI PXIe-8880, BIOS 2.1.5f1 01/09/2020
  Call Trace:

  <IRQ>
   ? dump_stack+0x46/0x5e
   ? __report_bad_irq+0x2e/0xb0
   ? note_interrupt+0x242/0x290
   ? nNIKAL100_memoryRead16+0x8/0x10 [nikal]
   ? handle_irq_event_percpu+0x55/0x70
   ? handle_irq_event+0x4f/0x80
   ? handle_fasteoi_irq+0x81/0x180
   ? handle_irq+0x1c/0x30
   ? do_IRQ+0x41/0xd0
   ? common_interrupt+0x84/0x84
  </IRQ>

  handlers:
  irq_default_primary_handler threaded usb_hcd_irq
  Disabling IRQ #19

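The "nobody cared" accounting described above can be modelled in user
space. The following is a minimal sketch, not the kernel's code: the
``irq_model`` structure and the ``model_*`` helpers are hypothetical names,
the two thresholds are recalled from ``kernel/irq/spurious.c`` and should
be checked against your tree, and the kernel's time-based reset of the
unhandled counter is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed thresholds, recalled from kernel/irq/spurious.c. */
#define MODEL_IRQ_WINDOW	100000u
#define MODEL_UNHANDLED_LIMIT	99900u

/* Hypothetical per-IRQ bookkeeping; not the kernel's struct irq_desc. */
struct irq_model {
	unsigned int irq_count;
	unsigned int irqs_unhandled;
	bool disabled;
};

/* Account for one interrupt. Returns true if this call disabled the line. */
static bool model_note_interrupt(struct irq_model *m, bool handled)
{
	assert(m != NULL);

	if (m->disabled)
		return false;
	if (!handled)
		m->irqs_unhandled++;
	if (++m->irq_count < MODEL_IRQ_WINDOW)
		return false;

	/* A full window has elapsed: evaluate it, then start a new one. */
	m->irq_count = 0;
	if (m->irqs_unhandled > MODEL_UNHANDLED_LIMIT) {
		m->disabled = true;	/* the kernel reports "nobody cared" here */
		return true;
	}
	m->irqs_unhandled = 0;
	return false;
}

/* Feed n interrupts with the same outcome. Returns the disabled state. */
static bool model_storm(struct irq_model *m, unsigned int n, bool handled)
{
	while (n--)
		model_note_interrupt(m, handled);
	return m->disabled;
}
```

In this model, a window of 100000 interrupts in which every one goes
unhandled disables the line, while the same number of handled interrupts
leaves it enabled. This is why an unclaimed boot interrupt arriving on a
shared line can take a working device down with it.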

Conditions
==========

The use of threaded interrupts is the most likely condition to trigger
this problem today. Threaded interrupts may not be reenabled after the IRQ
handler wakes. These "one shot" conditions mean that the threaded interrupt
needs to keep the interrupt line masked until the threaded handler has run.
Especially when dealing with high data rate interrupts, the thread needs to
run to completion; otherwise some handlers will end up in stack overflows
since the interrupt of the issuing device is still active.

Affected Chipsets
=================

The legacy interrupt forwarding mechanism exists today in a number of
devices including but not limited to chipsets from AMD/ATI, Broadcom, and
Intel. Changes made through the mitigations below have been applied to
drivers/pci/quirks.c.

Starting with ICX there are no longer any IO-APICs in the Core IO's
devices.  IO-APIC is only in the PCH.  Devices connected to the Core IO's
PCIe Root Ports will use native MSI/MSI-X mechanisms.

Mitigations
===========

The mitigations take the form of PCI quirks. The preference has been to
first identify and make use of a means to disable the routing to the PCH.
In such a case a quirk to disable boot interrupt generation can be
added. [1]_

Intel® 6300ESB I/O Controller Hub
  Alternate Base Address Register:
   BIE: Boot Interrupt Enable

	  ==  ===========================
	  0   Boot interrupt is enabled.
	  1   Boot interrupt is disabled.
	  ==  ===========================

Intel® Sandy Bridge through Skylake based Xeon servers:
  Coherent Interface Protocol Interrupt Control
   dis_intx_route2pch/dis_intx_route2ich/dis_intx_route2dmi2:
	  When this bit is set, local INTx messages received from the
	  Intel® Quick Data DMA/PCI Express ports are not routed to legacy
	  PCH - they are either converted into MSI via the integrated IO-APIC
	  (if the IO-APIC mask bit is clear in the appropriate entries)
	  or cause no further action (when mask bit is set).

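Both register-based mitigations amount to a read-modify-write of a single
PCI configuration register. The sketch below models that in user space
against a fake configuration space. ``fake_pci_dev`` and the ``cfg_*``
helpers are hypothetical; the register offsets and bit positions are
recalled from ``drivers/pci/quirks.c`` rather than taken from this
document, so treat them as assumptions and verify them against the
datasheets before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, recalled from drivers/pci/quirks.c; verify before use. */
#define INTEL_6300_IOAPIC_ABAR		0x40
#define INTEL_6300_DISABLE_BOOT_IRQ	(1u << 14)
#define INTEL_CIPINTRC_CFG_OFFSET	0x14C
#define INTEL_CIPINTRC_DIS_INTX_ICH	(1u << 25)

#define FAKE_CFG_SPACE_SIZE		4096

/* Hypothetical stand-in for a device's little-endian config space. */
struct fake_pci_dev {
	uint8_t cfg[FAKE_CFG_SPACE_SIZE];
};

static uint16_t cfg_read16(const struct fake_pci_dev *d, size_t off)
{
	assert(d != NULL && off + 2 <= FAKE_CFG_SPACE_SIZE);
	return (uint16_t)(d->cfg[off] | ((uint16_t)d->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_pci_dev *d, size_t off, uint16_t v)
{
	assert(d != NULL && off + 2 <= FAKE_CFG_SPACE_SIZE);
	d->cfg[off] = (uint8_t)(v & 0xff);
	d->cfg[off + 1] = (uint8_t)(v >> 8);
}

static uint32_t cfg_read32(const struct fake_pci_dev *d, size_t off)
{
	return (uint32_t)cfg_read16(d, off) |
	       ((uint32_t)cfg_read16(d, off + 2) << 16);
}

static void cfg_write32(struct fake_pci_dev *d, size_t off, uint32_t v)
{
	cfg_write16(d, off, (uint16_t)(v & 0xffff));
	cfg_write16(d, off + 2, (uint16_t)(v >> 16));
}

/* 6300ESB: set BIE (1 = boot interrupt disabled), keep all other bits. */
static void disable_boot_irq_6300esb(struct fake_pci_dev *d)
{
	uint16_t v = cfg_read16(d, INTEL_6300_IOAPIC_ABAR);

	v |= INTEL_6300_DISABLE_BOOT_IRQ;
	cfg_write16(d, INTEL_6300_IOAPIC_ABAR, v);
}

/* Xeon Core IO: set the dis_intx_route2* bit, keep all other bits. */
static void disable_boot_irq_xeon(struct fake_pci_dev *d)
{
	uint32_t v = cfg_read32(d, INTEL_CIPINTRC_CFG_OFFSET);

	v |= INTEL_CIPINTRC_DIS_INTX_ICH;
	cfg_write32(d, INTEL_CIPINTRC_CFG_OFFSET, v);
}
```

The read-modify-write matters because each register carries unrelated
fields alongside the disable bit; writing a constant would clobber them.
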
In the absence of a way to directly disable the routing, another approach
has been to make use of PCI Interrupt pin to INTx routing tables to
redirect the interrupt handler to the rerouted interrupt line by default.
Therefore, on chipsets where this INTx routing cannot be disabled, the
Linux kernel will reroute the valid interrupt to its legacy interrupt. This
redirection of the handler prevents the spurious interrupt detection from
triggering, which would ordinarily disable the IRQ line due to excessive
unhandled counts. [2]_

The config option X86_REROUTE_FOR_BROKEN_BOOT_IRQS exists to enable (or
disable) the redirection of the interrupt handler to the PCH interrupt
line. The option can be overridden by either pci=ioapicreroute or
pci=noioapicreroute. [3]_

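The interaction between the build-time default and the two ``pci=``
overrides can be sketched as a small state function. This is an
illustrative user-space model, not the kernel's parser: the ``model_*``
function names are hypothetical, and representing the setting as a
"no reroute" flag (0 = reroute, 1 = do not reroute) is an assumption about
how the kernel stores it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Build-time default: rerouting is on only when
 * X86_REROUTE_FOR_BROKEN_BOOT_IRQS is enabled.
 * Returns the "no reroute" flag: 0 = reroute, 1 = do not reroute.
 */
static int model_default_noioapicreroute(int config_enabled)
{
	return config_enabled ? 0 : 1;
}

/*
 * Apply one option as it would appear after "pci=" on the command line.
 * Unrelated options leave the flag unchanged.
 */
static int model_apply_pci_option(int noioapicreroute, const char *opt)
{
	assert(opt != NULL);

	if (strcmp(opt, "ioapicreroute") == 0)
		return 0;
	if (strcmp(opt, "noioapicreroute") == 0)
		return 1;
	return noioapicreroute;
}
```

For example, a kernel built with the option enabled but booted with
``pci=noioapicreroute`` ends up with rerouting off, and a kernel built
without it but booted with ``pci=ioapicreroute`` ends up with rerouting on.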

More Documentation
==================

There is an overview of the legacy interrupt handling in several datasheets
(6300ESB and 6700PXH below). While largely the same, they provide insight
into how its handling evolved across chipsets.

Example of disabling of the boot interrupt
------------------------------------------

      - Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
	5.7.3 Boot Interrupt
	https://www.intel.com/content/dam/doc/datasheet/6300esb-io-controller-hub-datasheet.pdf

      - Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
	Datasheet - Volume 2: Registers (Document # 330784-003)
	6.6.41 cipintrc Coherent Interface Protocol Interrupt Control
	https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf

Example of handler rerouting
----------------------------

      - Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
	2.15.2 PCI Express Legacy INTx Support and Boot Interrupt
	https://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf


If you have any legacy PCI interrupt questions that aren't answered, email me.

Cheers,
    Sean V Kelley
    sean.v.kelley@linux.intel.com

.. [1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
.. [2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
.. [3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/