<Pali>
NishanthMenon: Hi! I would like to ask whether you have any more details about J721E erratum i2086 (PCIe - PCIe: MMA Unsupported Request (UR) or Configuration Request Retry Status (CRS) in Configuration Completion Response Packets Results in External Abort), as documented in SPRZ455?
<Pali>
Or do you know somebody who has details and possible workarounds for this issue?
<Pali>
I guess it comes from the PCIe IP, which IIRC is the one from Cadence.
<NishanthMenon>
Pali: sorry, i got into a discussion with maz on kvm and gic-v2 emulation.. checking
<NishanthMenon>
Pali: as far as I can see: not-affected: AM64x_SR1.0, J721s2 SR1.0, J7200 SR1.0, J7200 SR2.0; impacted: J721e SR1.0, J721e SR1.1
<NishanthMenon>
Workaround per internal notes: "PCIe user guide needs to be updated to include software workaround options for UR/AXI slave error response during enumeration" -> I don't know how that can actually be worked around in Linux
<Pali>
"options for UR/AXI slave error response during enumeration" --> this is something configurable by registers and is PCIe IP or SoC specific
<Pali>
but I have not found which register could configure this
<NishanthMenon>
Pali: copying kishon's response: "For the current version of the IP, there are no feasible workarounds. We are asking Cadence to fix this in the next version of IP by adding a bit in the Cadence Local Management register that can be configured by SW when we don't want to propagate slave errors."
<Pali>
ah :-(
<Pali>
So there is no configuration register.
<NishanthMenon>
not on SR1.0 and SR1.1, looks like
<Pali>
This is a common error in many real-world PCIe implementations...
<NishanthMenon>
yup.. i wonder how cdns instances on other SoCs behave..
<NishanthMenon>
funny why we were the ones to catch it.. hmmm...
<Pali>
Broadcom has the same issues (not sure if they use Cadence), and there are reports about those crashes on linux-pci.
<NishanthMenon>
uggh
<Pali>
Renesas too!
<Pali>
And some Marvell SoCs too!
<NishanthMenon>
oh gosh... TI plays janitor ;)
<Pali>
I have one Marvell SoC here (Armada 3720) which has exactly the same issue as J721E.
<NishanthMenon>
at least looks fixed on newer SoCs .. if there is a trigger for a new SR with all level change, j721e might get it, but i doubt that..
<Pali>
I think this is a common issue when HW people misunderstand how all the parts of an SoC work together, or misinterpret the PCIe base spec.
<NishanthMenon>
yup
<Pali>
PCIe response codes can be mapped to AXI responses, so HW people do exactly that.
<Pali>
But an AXI slave error is fatal on most ARM cores.
<NishanthMenon>
yep
<Pali>
So the correct way is to translate a PCIe error response into an AXI OK response with a 0xffffffff read value.
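[Editor's note] The two mappings discussed above can be sketched in C as a small software model. This is only an illustration, not how the Cadence or any other IP is actually implemented: the names `map_naive`, `map_safe`, and `struct axi_read` are hypothetical, and CRS retry handling in the root complex is deliberately not modeled. Only the numeric encodings of the PCIe Completion Status field and the AXI read response are taken from the respective specifications.

```c
#include <stdint.h>

/* PCIe Completion Status field encodings. */
enum cpl_status {
    CPL_SC  = 0x0, /* Successful Completion */
    CPL_UR  = 0x1, /* Unsupported Request */
    CPL_CRS = 0x2, /* Configuration Request Retry Status */
    CPL_CA  = 0x4  /* Completer Abort */
};

/* AXI read response (RRESP) encodings. */
enum axi_resp {
    AXI_OKAY   = 0x0,
    AXI_EXOKAY = 0x1,
    AXI_SLVERR = 0x2,
    AXI_DECERR = 0x3
};

/* Hypothetical container for what the CPU sees on the AXI read channel. */
struct axi_read {
    enum axi_resp resp;
    uint32_t data;
};

/*
 * Problematic mapping (assumed model): any non-successful completion is
 * forwarded as SLVERR, which most ARM cores take as an external abort.
 */
struct axi_read map_naive(enum cpl_status st, uint32_t data)
{
    struct axi_read r = { AXI_OKAY, data };

    if (st != CPL_SC) {
        r.resp = AXI_SLVERR;
        r.data = 0;
    }
    return r;
}

/*
 * Mapping described in the discussion: always answer OKAY and hand back
 * all-ones data when the completion was not successful.
 */
struct axi_read map_safe(enum cpl_status st, uint32_t data)
{
    struct axi_read r = { AXI_OKAY, data };

    if (st != CPL_SC)
        r.data = 0xffffffffu;
    return r;
}
```

With the second mapping, a configuration read of a missing device yields 0xffffffff with an OKAY response, which enumeration code can treat as "no device here" instead of the CPU taking an abort.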
<Pali>
Synopsys DesignWare PCIe IPs have registers for configuring these mappings.
<Pali>
And nowadays DesignWare is the most commonly used PCIe IP in ARM SoCs.
<Pali>
So a lot of new ARM SoCs are not affected by this PCIe issue.
* NishanthMenon
keeps mouth shut on snps and upstream stance
<Pali>
So you have not found any way to do a SW workaround for those aborts?
<Pali>
If kishon has some ideas, I would like to discuss them, as a _generic_ solution for the Linux kernel is really needed because the issue spans multiple vendors.
<Pali>
NishanthMenon: I see that kishon is not online, so please forward the above message.
<NishanthMenon>
Pali: will do, but I doubt we have figured out a way to do this