#armlinux on 2022-01-15 — irc logs at libera.irclog.whitequark.org

2021-05-27 16:22 ChanServ changed the topic of #armlinux to: ARM kernel talk [Upstream kernel, find your vendor forums for questions about their kernels] | https://libera.irclog.whitequark.org/armlinux

01:17 Tokamak has quit [Read error: Connection reset by peer]

01:22 Tokamak has joined #armlinux

01:29 narmstrong has quit [Read error: Connection reset by peer]

01:29 steev has quit [Ping timeout: 240 seconds]

01:29 nohit has quit [Ping timeout: 240 seconds]

01:29 narmstrong has joined #armlinux

01:30 nohit has joined #armlinux

01:30 steev has joined #armlinux

01:30 Crofton_ has joined #armlinux

01:30 zx2c4_ has joined #armlinux

01:30 Xogium_ has joined #armlinux

01:39 Crofton has quit [Ping timeout: 240 seconds]

01:39 zx2c4 has quit [Ping timeout: 240 seconds]

01:39 Crofton_ is now known as Crofton

01:39 Xogium has quit [Ping timeout: 240 seconds]

01:39 zx2c4_ is now known as zx2c4

01:39 Xogium_ is now known as Xogium

01:43 milkylainen has quit [Ping timeout: 256 seconds]

01:46 prabhakarlad has joined #armlinux

02:20 Emantor has quit [Quit: ZNC - http://znc.in]

02:21 Emantor has joined #armlinux

02:32 apritzel_ has quit [Ping timeout: 256 seconds]

03:03 lag has quit [Read error: Connection reset by peer]

03:03 vkoul has quit [Read error: Connection reset by peer]

03:03 vireshk has quit [Read error: Connection reset by peer]

03:03 shawnguo has quit [Read error: Connection reset by peer]

03:03 ajb-linaro has quit [Read error: Connection reset by peer]

03:05 shawnguo has joined #armlinux

03:06 shawnguo has quit [Client Quit]

03:07 ajb-linaro has joined #armlinux

03:08 lag has joined #armlinux

03:10 vireshk has joined #armlinux

03:10 vkoul has joined #armlinux

03:33 Pali has quit [Ping timeout: 250 seconds]

05:05 amitk has joined #armlinux

06:01 Tokamak has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

06:05 amitk has quit [Ping timeout: 256 seconds]

07:21 System_Error has quit [Ping timeout: 276 seconds]

07:24 System_Error has joined #armlinux

08:44 apritzel_ has joined #armlinux

08:51 <ardb> arnd, tmlind: this looks like a real error caught by vmap'ed stacks, not a regression, i think

08:51 <ardb> https://storage.kernelci.org//ardb/for-kernelci/v5.16-9702-g1df6e064cf9e/arm/davinci_all_defconfig/gcc-10/lab-baylibre/baseline-da850-lcdk.html

08:52 apritzel_ has quit [Ping timeout: 240 seconds]

08:53 <arnd> nice catch ardb! I wonder how it normally terminates the recursion, maybe the driver runs into an error condition after too many attempts to register a child device and unwinds from there

08:54 <arnd> I haven't looked at the code here, but I've seen similar traces from drivers that register a platform_device child and set child->of_node=parent->of_node before registering

08:54 <arnd> which then makes the driver core call into the same driver again
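
The recursion arnd describes can be sketched with a toy user-space model. This is not kernel code: every name in it (`toy_device`, `toy_glue`, `toy_core`, the `vendor,toy-glue` compatible string, the depth cap) is hypothetical, and it assumes only that a devicetree match is tried before the name-based fallback. When the child shares the parent's DT node, it matches the glue driver again instead of the driver it was named for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy user-space model, not kernel code. Every name here is hypothetical. */
struct toy_device {
    const char *name;      /* platform device name */
    const char *of_compat; /* compatible string of the attached DT node, or NULL */
};

struct toy_driver {
    const char *name;      /* used for the name-based fallback match */
    const char *of_compat; /* compatible string the driver matches, or NULL */
};

#define TOY_MAX_DEPTH 8 /* stand-in for running out of kernel stack */

static const struct toy_driver toy_glue = { "toy-glue", "vendor,toy-glue" };
static const struct toy_driver toy_core = { "toy-core", NULL };

/* A DT match is tried first; the device name is only a fallback. */
static int toy_match(const struct toy_device *dev, const struct toy_driver *drv)
{
    if (dev->of_compat && drv->of_compat &&
        strcmp(dev->of_compat, drv->of_compat) == 0)
        return 1;
    return strcmp(dev->name, drv->name) == 0;
}

static int toy_register(const struct toy_device *dev, int share_of_node, int depth);

/* The glue probe registers a "toy-core" child, optionally sharing its DT node. */
static int toy_glue_probe(const struct toy_device *parent, int share_of_node, int depth)
{
    struct toy_device child = { "toy-core", share_of_node ? parent->of_compat : NULL };

    if (depth >= TOY_MAX_DEPTH)
        return depth; /* the real system would have overflowed its stack here */
    return toy_register(&child, share_of_node, depth);
}

/*
 * Returns how many nested glue probes ran before a non-glue driver bound,
 * or TOY_MAX_DEPTH if the cap was hit.
 */
static int toy_register(const struct toy_device *dev, int share_of_node, int depth)
{
    assert(dev != NULL);
    if (toy_match(dev, &toy_glue)) /* the glue driver is tried first */
        return toy_glue_probe(dev, share_of_node, depth + 1);
    (void)toy_match(dev, &toy_core); /* intended driver binds; no further nesting */
    return depth;
}
```

Registering a top-level `{ "toy-glue", "vendor,toy-glue" }` device with `share_of_node` set to 0 runs the glue probe once; with it set to 1 the nesting continues until the cap.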

08:55 <ardb> yeah that is what it looks like to me

08:58 <tmlind> hmm

09:01 <tmlind> narmstrong: any ideas where the above might be coming from?

09:02 <arnd> or maybe here it's the PLATFORM_DEVID_AUTO bit that sets the device name rather than the of_node

09:06 <arnd> no, that makes no sense. Instead I suspect it's robher's cf081d009c44 ("usb: musb: Set the DT node on the child device") that caused a regression

09:07 <arnd> so it would crash without vmap-stack as well, we just get a more readable stacktrace this way

09:08 <ardb> i would assume so yes

09:08 <arnd> the way that some usb host drivers insert devices to model generic vs soc-specific bits just doesn't work too well with our usual driver model, it keeps causing problems

09:11 <arnd> of_node_reused from 2c1ea6abde88 ("platform: set of_node in platform_device_register_full()") was apparently meant to avoid the recursion, but fails to do the right thing here

09:11 <arnd> do we have a CI log from a machine with sunxi-musb?

09:12 nsaenz has quit [Remote host closed the connection]

09:17 <tmlind> pinephone would have that

09:22 <tmlind> not seeing issues with the musb 2430 glue at least with commit 2c1ea6abde88

09:22 alpernebbi has quit [Ping timeout: 240 seconds]

09:24 <tmlind> sorry i mean with commit cf081d009c44

09:26 alpernebbi has joined #armlinux

09:29 System_Error has quit [Ping timeout: 276 seconds]

09:31 System_Error has joined #armlinux

09:50 tlwoerner has quit [Ping timeout: 256 seconds]

09:57 tlwoerner has joined #armlinux

10:26 apritzel_ has joined #armlinux

11:00 System_Error has quit [Ping timeout: 276 seconds]

11:16 System_Error has joined #armlinux

11:23 System_Error has quit [Ping timeout: 276 seconds]

11:25 System_Error has joined #armlinux

11:36 <tmlind> but that was with next-20220107; next-20220115 only boots on some of my machines, the others just hang with no errors

11:50 Pali has joined #armlinux

11:58 System_Error has quit [Ping timeout: 276 seconds]

12:36 headless has joined #armlinux

13:09 headless has quit [Quit: Konversation terminated!]

15:08 jlinton has quit [Quit: Client closed]

15:34 <robher> chipidea usb has also regressed with the same change.

15:40 <robher> of_node_reused has no effect other than with pinctrl.

15:42 <robher> Is the problem that setting the of_node causes a match on the parent driver instead of a match by driver name? If so, I think we should be able to check of_node_reused in the DT matching function and not match when set.

15:42 <robher> arnd, tmlind: ^^^
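
robher's suggestion can be illustrated with a minimal user-space sketch. It is an assumption-laden toy, not the kernel's matching code: the `sketch_*` types and functions are hypothetical, and only the `of_node_reused` field name is taken from the discussion. The DT match refuses any device whose node is flagged as reused, so such a device can only bind by name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a platform device and driver. */
struct sketch_device {
    const char *name;      /* platform device name */
    const char *of_compat; /* compatible string of the DT node, or NULL */
    bool of_node_reused;   /* node is borrowed from another device */
};

struct sketch_driver {
    const char *name;      /* used for the name-based fallback match */
    const char *of_compat; /* compatible string the driver matches, or NULL */
};

/* DT match that skips devices which merely reuse another device's node. */
static bool sketch_of_match(const struct sketch_device *dev,
                            const struct sketch_driver *drv)
{
    if (dev->of_node_reused)
        return false;
    if (!dev->of_compat || !drv->of_compat)
        return false;
    return strcmp(dev->of_compat, drv->of_compat) == 0;
}

/* DT match first, then fall back to comparing device and driver names. */
static bool sketch_platform_match(const struct sketch_device *dev,
                                  const struct sketch_driver *drv)
{
    assert(dev != NULL && drv != NULL);
    if (sketch_of_match(dev, drv))
        return true;
    return strcmp(dev->name, drv->name) == 0;
}
```

With the flag set, a child named `toy-core` that carries the parent's compatible string no longer matches the parent's driver and still matches a driver of the same name.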

16:38 apritzel_ is now known as apritzel

17:23 <marex> arnd: hi, maybe you can give me a hint ... I've got this PCI IP, if I readl() from config space and the link is down, I get an Imprecise External Abort, so I cannot "fix it up" in a hook and restart the exact instruction which triggered it, the instruction pointer is a few instructions down the line in the fault handler hook

17:24 <marex> arnd: is there any way I can force it into "precise" abort, so I would get the right instruction address to restart ? or is this imprecise abort due to speculation and thus unfixable ?

17:39 <marex> s@precise@synchronous@

17:48 <ardb> marex: there are other SOCs with the same issue

17:48 <ardb> marex: does the read return the correct value in this case? (all 1 bits)

17:49 <marex> ardb: nope

17:50 <marex> ardb: it returns zeroes

17:50 <ardb> this is a rather severe integration issue, and the only way to paper over it is to only expose the host bridge if the link is up, and pray it doesn't go down

17:50 <marex> ardb: is there a way to block the bridge from ever dropping into L1 link state ?

17:51 <marex> ardb: I didn't find any way to do it

17:51 <marex> ardb: that might really be my only way out now

17:52 <marex> (besides somehow turning the abort into synchronous one and fixing the return value up in the hook)

17:52 <marex> I can detect the link is in L1 state before doing the config space access in the kernel, but what if someone uses pci-utils setpci in userspace ...
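
The check-before-access idea can be sketched as a toy user-space model. Everything in it is hypothetical (`toy_pcie`, `toy_cfg_read32`, the simulated config array); it assumes only what is said above, namely that the link state can be tested before the access and that all 1 bits is the value a read should return when the link is down. It does not close the window where the link drops between the check and the access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a host bridge with a simulated config space. */
struct toy_pcie {
    bool link_up;        /* result of the link-state check */
    const uint32_t *cfg; /* simulated config space, in 32-bit words */
    size_t cfg_words;    /* number of valid words in cfg */
};

/*
 * Return all 1 bits instead of touching the hardware when the link is down.
 * The bounds check is only there to keep the toy model safe.
 */
static uint32_t toy_cfg_read32(const struct toy_pcie *p, size_t word)
{
    assert(p != NULL);
    if (!p->link_up || word >= p->cfg_words)
        return UINT32_MAX;
    return p->cfg[word];
}
```

A caller sees `0xffffffff` whenever `link_up` is false, and the stored word otherwise.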

18:13 <arnd> marex: have you tried doing the read access using an inline asm with the load plus a barrier, with a fixup handler on both? If you are lucky, the barrier instruction would reliably trigger the fault

18:38 <marex> arnd: I tried a few barriers, yes, none of them triggered the fault though

18:39 <marex> arnd: have you got a barrier instruction in mind ? dsb or dmb I guess ?

18:39 <arnd> marex: I'm not an expert on barriers, I was thinking isb though, which should flush the pipeline

18:40 <marex> arnd: I had isb() there already, but, lemme double-check that

18:40 <arnd> or alternatively something that has a dependency on the data value

18:44 <marex> arnd: I think that dependency is what triggered the abort in my case indeed

18:57 mcoquelin has quit [Ping timeout: 250 seconds]

19:10 mcoquelin has joined #armlinux

19:17 headless has joined #armlinux

19:28 <marex> arnd: that isb in hand-rolled assembler might just be it. it's gonna look great in driver code too :)

19:28 <marex> arnd: thank you

19:29 <arnd> marex: anything's better than just disabling the exceptions as some pci host drivers do

19:30 <marex> arnd: pci ... sigh ...

20:10 amitk has joined #armlinux

21:22 System_Error has joined #armlinux

21:25 headless has quit [Quit: Konversation terminated!]

21:26 amitk has quit [Ping timeout: 256 seconds]

21:26 amitk has joined #armlinux

22:01 Tokamak has joined #armlinux

22:14 amitk has quit [Ping timeout: 250 seconds]

22:56 Tokamak has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]

23:20 mripard has quit [Read error: Connection reset by peer]

23:25 djrscally has joined #armlinux

23:30 mripard has joined #armlinux