NishanthMenon changed the topic of #linux-ti to: Linux development for TI SoCs | Logs: https://libera.irclog.whitequark.org/linux-ti/| paste logs in https://pastebin.ubuntu.com/ | Let it rock! Vendor SDK/kernel: Also see e2e.ti.com
minash has quit [Remote host closed the connection]
minash has joined #linux-ti
florian_kc has joined #linux-ti
florian_kc has quit [Ping timeout: 272 seconds]
ikarso has quit [Quit: Connection closed for inactivity]
tomba has joined #linux-ti
rsalveti has quit [Quit: Connection closed for inactivity]
ikarso has joined #linux-ti
tlwoerner has quit [Ping timeout: 268 seconds]
tlwoerner has joined #linux-ti
<jluthra> tomba: yes i've seen similar behaviour, and i too suspected it may be caused by power domain driver
<jluthra> tomba: looking into drivers/base/power/domain.c i do see an option which can disable this behaviour for "unused" devices
rob_w has joined #linux-ti
<jluthra> i would guess that should work for this case as well but not sure
manchaw has joined #linux-ti
Kubu_work has joined #linux-ti
<tomba> jluthra: yep... but that's for debugging purposes, and affects all PDs
<tomba> where should k3conf patches be sent?
<jluthra> tomba: true, it's not an actual solution, and idk where such a fix should go: the tisci pd driver or the driver core
<jluthra> tomba: i've seen people post k3conf patches both as internal bitbucket PRs, and on the internal patch review list with prefix [k3conf PATCH]
<jluthra> latter should work for you i think, linux-patch-review [at] list.[ti's domain]
<tomba> jluthra: ok, thanks
rob_w has quit [Remote host closed the connection]
<tmlind> tomba: if dss has sysconfig regs, you could configure ti-sysc in the dts, not sure if that would help here though
<tomba> tmlind: that's only for omap platforms, isn't it? and even if it's also for K3s, I don't think it helps. This is about things external to the DSS.
<javierm> tomba: I think this is a known issue in general, that's why for example Chromebooks (e.g: the Snapdragon HP X2 I have) disable the display in the firmware before booting Linux
<tomba> javierm: ok, interesting. too bad the whole point of this exercise is to keep the boot splash screen enabled until the userspace takes over =).
<javierm> tomba: yeah, in my experience for non-EFI platforms, you can only get that with "clk_ignore_unused pd_ignore_unused"
devarsh_ has joined #linux-ti
<javierm> and I've added a regulator_ignore_unused too for this reason. It landed in v6.8 I think
<tomba> javierm: it's not quite that bad. if you enable the simple fb, it will keep the HW enabled until a proper driver takes over. well, for more complex pipelines with display bridges that might not be enough...
<javierm> tomba: right, if you do have a simple-framebuffer DT node, then you can grub the needed resources but in Fedora we try to boot using the u-boot EFI stub
<javierm> and that will provide an EFI-GOP table, but that doesn't take the needed resources :(
<javierm> s/grub/grab
<javierm> tomba: so yeah, with simple-framebuffer you can force the power-domain to remain enabled even if tidss fails to probe
florian has joined #linux-ti
<javierm> tomba: there's also this .sync_state infra https://www.kernel.org/doc/html/latest/driver-api/driver-model/driver.html but I haven't fully grasped how it works yet
<tomba> javierm: but overall, I think there's the big problem of how to manage the whole boot-splash-screen thing in Linux. but here I'm looking at a more specific part, the fact that the whole board will hang if DSS IP is disabled while it's streaming video. I hope those kinds of issues are not very common =).
<tomba> javierm: thanks for the pointer to .sync_state. after reading the doc, I also don't quite understand how it works =).
<javierm> tomba: I remember that I had an issue in the past when the bootloader left the display enabled and the DMA engine doing scanouts, but that caused IOMMU page faults when the IOMMU controller was enabled by Linux
<javierm> tomba: yeah, between sync_state, device_links and probe deferral. It's getting all too complicated :)
<javierm> looking at my notes, in that platform I also had to use "initcall_blacklist=iommu_subsys_init" besides the {clk,pd}_ignore_unused
<javierm> tomba: that's why I think that having a flicker free boot and a firmware/bootloader display -> boot splash (using simple{fb,drm}) -> real DRM driver transition is only possible with collaboration from the FW
<tmlind> tomba: we need to use ti-sysc for devices with sysconfig register at least for wake-up events like for wkup_uart0 (pending patches)
<tomba> Do we turn off unused clocks, powerdomains, etc on K3 platforms? I think I don't see that happening, and I thought it's a standard thing.
<javierm> tomba: did you check if the unused clocks that are not gated, are not marked as CLK_IGNORE_UNUSED ?
<javierm> or CLK_IS_CRITICAL
<javierm> I believe in those two cases the common clock framework won't disable them
<javierm> I actually have a BeaglePlay over my desk, let's boot it :)
<tomba> javierm: well, I haven't really looked at this. but if the kernel would turn off unused power-domains, it would mean that DSS would be always broken for me (with boot splash-screen), as I have the tidss as a module and I load it manually.
<tomba> and if the kernel would disable the DSS clocks, then the display should go black, but it doesn't.
<javierm> tomba: so you have flicker free display from bootloader up to user-space with tidss as a module ?
<javierm> [ 4.700362] clk: Disabling unused clocks
<javierm> [ 11.089406] [drm] Initialized tidss 1.0.0 20180215 for 30200000.dss on minor 0
<javierm> [ 11.373392] tidss 30200000.dss: [drm] fb0: tidssdrmfb frame buffer device
<javierm> this is with tidss as a module too. But my u-boot doesn't enable display...
<tomba> no. dealing with the tidss and splash screen is a separate thing =). I have some patches that keep the picture on the screen quite well, but it's hacky.
<tomba> but here, when talking about disabling the unused resources, you could consider that I have the whole DRM subsystem disabled in the kernel (I don't, but it's all modules and I don't load them).
<tomba> The bootloader sets up the display. The kernel boots, but there are no display drivers loaded. And yet the display stays enabled, i.e. DSS is operating.
<tomba> javierm: hrm... why don't I get that print at all ("clk: Disabling unused clocks")...
<tomba> ah, I see. because TI's 6.1 kernel doesn't have that print at all.
<javierm> tomba: about power domains, I guess that is because since dss is a module, then no "unused" power domains exist by late_initcall() time and then once tidss is loaded and dss matched, then the dev is attached to the pd
<tomba> javierm: well, I added the print, I can see the function called on this 6.1 kernel too. the clocks do stay on, though. clk_summary shows the use counts are zero, but also that "hardware enable" is Y. I'm not sure if that means anything here...
<javierm> tomba: so you don't get that attach dev, probe deferral and detach that causes the pd to be unused
<tomba> javierm: hmm sorry, I don't understand. isn't the DSS power-domain unused, if e.g. the tidss driver is never loaded?
<javierm> tomba: no. I meant that the problem you had (IIUC), was that 1) the tidss driver that was built-in matched your dss device 2) the dss device was attached to a power-domain (which enabled it)
<javierm> 3) the tidss driver probe failed and was deferred 4) the dss device was detached from the PD and finally 5) the PD was disabled
<javierm> but if tidss is a module, then the PD is never attached and enabled before the attempt to disable unused PDs
<javierm> so the kernel just doesn't attempt to disable it
<javierm> tomba: and I guess it is similar for the clocks: those are registered by OF but never enabled, and so are not disabled as unused
<tomba> javierm: no, the built-in vs module doesn't affect the original problem. but I wasn't really talking about that here. There are many problems, so it gets confusing =).
<javierm> tomba: Ok, sorry for the confusion then
<tomba> so here I'm talking of a case where the bootloader enables the display, but the kernel does not have any display driver at all. when the kernel boots, I would presume that DSS clocks and power-domains will get disabled, as they are unused. But that doesn't happen.
<tmlind> tomba: setting up ti-sysc for the top level dss sysconfig would help with that :)
<tomba> tmlind: hmm okay, why? =) I don't see the DSS sysconfig being related to this.
<javierm> tomba: so I think we are talking about the same thing at the end :)
<tomba> (I'm not familiar with ti-sysc, or what it does)
<javierm> tomba: what I meant is that the disable of unused clocks, regulators, PDs are only relevant for _enabled_ resources
<tomba> javierm: could be =). as I said, there are many issues here, some only appear in certain use cases, some in others.
<javierm> if their enable count is 0, then it is a no-op
<javierm> tomba: in other words, if no device attempts to grab those resources before late_initcall (which is when all these subsystems try to disable unused resources) then the kernel won't touch them
<tomba> javierm: sorry, I still don't follow =). how is a resource unused if the enable count is > 0? (what exactly is enable_count?)
<tomba> javierm: I thought the whole point of the disable unused resources was to turn off things the bootloader had enabled, but no one in the kernel is using
<javierm> tomba: I don't think that's the case but I may be wrong on this
<javierm> AFAIK it is to disable any resources that were enabled but are left unused
<javierm> enabled by the kernel I meant
<javierm> but if the kernel didn't attempt to enable it before, it won't try to disable it due to being "unused"
<tomba> javierm: how does that happen? if it was enabled, then... it's in use? until someone disables the clock. and if someone disables it, then it's already powered off. I feel like I'm missing some critical piece here =)
<javierm> tomba: correct. That's why I brought the built-in vs module cases
<javierm> because if it is built-in, the tidss will try to use some power-domain, clocks, regulators and enable them (but later the probe can fail due to some deps not being present yet)
<javierm> and from that point in time, the kernel will be aware of those resources being enabled and will disable them as unused if tidss can't probe
<javierm> but if it is a module, it can be loaded very late (e.g: it is not in the initrd but in the rootfs) and so by the time the kernel has to decide about disabling "unused" resources, those weren't enabled yet
<javierm> tomba: dunno if what I wrote makes sense or not :)
<javierm> tomba: clocks are tricky though, because I think that the whole subtree gets disabled. So it could be that some clocks that are not enabled by the kernel can be gated due to sharing a parent with an enabled clk
<tomba> javierm: sorry, I still don't follow. say, the tidss probes, and gets and enables a clock. if it then fails due to EPROBE_DEFER, the tidss probe will disable the clock before it returns. so any resources tidss enabled, will be disabled in the error handling path. so what's there for the kernel to disable as "unused"?
<tmlind> tomba: ti-sysc would manage things in a generic way to enable and disable the top level module, so resets, clocks, domains, sysconfig. it would be idle until the dss ip related modules load
<javierm> tomba: correct. But what about that clock parent? The kernel needs to disable it if no child clk is enabled anymore
<javierm> tomba: same for the power-domain. The driver won't disable the parent domain, it is the pmdomain subsystem that needs to disable it if it doesn't have any dev attached anymore
<tomba> javierm: yes. I thought it's disabled when the last child is disabled (by tidss probe).
<tomba> tmlind: hmm I see, so it's much more than just the IP's sysconfig.
<javierm> tomba: that's my understanding, but I could be completely wrong on this
<tmlind> tomba: yeah see here the pending wkup_uart0 patch for am62: https://lore.kernel.org/linux-arm-kernel/20230912153819.fzp6feqkspczci45@dhruva/T/
<tmlind> additionally you'd need to add the revision register match mask to drivers/bus/ti-sysc.c
<tmlind> and then the module level stuff would be active if any of the child ip drivers probes
<tomba> tmlind: we would still have issues, though, even if ti-sysc would do magic. on OMAP DSS we had the issue that the DSS video output needs to be disabled before the DSS can be turned off. We seem to have a similar case here. So it's not enough to manage clocks, powerdomains, etc, but we also need to manage the DSS's internal operation, if we want to safely turn the DSS off (outside the DSS driver).
<tmlind> yeah ok, the module level stuff could still be managed by ti-sysc, then you'd have a child dss control module
<tmlind> hmm i think i did implement some dss quirk to drivers/bus/ti-sysc.c to drop the old hwmod stuff a few years ago
<tmlind> tomba: similar to what's done in sysc_pre_reset_quirk_dss()?
<javierm> tomba: https://elixir.bootlin.com/linux/v6.8-rc3/source/drivers/clk/clk.c#L1406 <- this is what I meant that clocks with an enable_count == 0 are just ignored
<tomba> tmlind: yes, I think something like that. although the whole issue is still under study, but it definitely looks like things go very bad if the FW turns off the DSS device while the DSS is active.
<tmlind> tomba: ok
<javierm> tomba, tmlind: I think what is missing in this case is a .is_enabled callback in struct clk_ops sci_clk_ops
<javierm> that's why the common clock framework ignores the clocks that have not been enabled, it has no way to know the state left by the bootloader
<javierm> since https://elixir.bootlin.com/linux/v6.8-rc3/source/drivers/clk/clk.c#L226 checks if core->ops->is_enabled exists and falls back to returning core->enable_count (that is 0)
<tmlind> ok
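[editor's sketch] The fallback javierm describes above can be modeled in a few lines. This is a toy Python model, not kernel code: `ClkModel`, `hw_enabled` and `is_enabled_op` are names made up for illustration. Only the idea comes from the discussion: use the driver's `.is_enabled` callback if it exists, otherwise fall back to the software `enable_count`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ClkModel:
    """Toy stand-in for a clock core; not the kernel's real data layout."""
    name: str
    enable_count: int = 0      # software usage counter kept by the framework
    hw_enabled: bool = False   # hypothetical: state left behind by the bootloader
    # Hypothetical stand-in for the optional .is_enabled driver callback.
    is_enabled_op: Optional[Callable[["ClkModel"], bool]] = None


def core_is_enabled(clk: ClkModel) -> bool:
    """Ask the driver if it can tell us; otherwise trust the software counter."""
    if clk.is_enabled_op is None:
        return clk.enable_count > 0
    return clk.is_enabled_op(clk)


if __name__ == "__main__":
    # A clock the bootloader left running, which no kernel driver ever enabled.
    dss = ClkModel("dss_clk", hw_enabled=True)
    print(core_is_enabled(dss))   # no callback: the framework cannot see the hardware state

    # Same clock, but now the driver can report the hardware state.
    dss.is_enabled_op = lambda c: c.hw_enabled
    print(core_is_enabled(dss))
```

Without the callback, the model reports the clock as not enabled even though the hardware is running, which is the blind spot being discussed here.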
<tomba> javierm: maybe because is_enabled is not allowed to sleep. is_prepared is implemented.
<javierm> tomba: and ti_sci_cmd_dev_is_on() can sleep? Which is what is_prepared seems to use
<javierm> tomba: but I was just explaining why for this platform only clocks that were prepared_enabled would be disabled as unused
<javierm> tomba: I know that the driver may disable in the error path, but that depends on how early probe bailed out with a deferral
<javierm> if that happens before the clock was enabled, then nothing will touch whatever state was left by the bootloader
<tomba> javierm: I don't know if it can sleep. But it is sending messages to the FW, and I presume it waits for replies.
<javierm> so what I was trying to say is that clocks enabled by the bootloader will only be disabled in clk_disable_unused() if their clock drivers implement the .is_enabled() callback
<javierm> otherwise the clock framework has no way to know if they need to be gated. And if they were never enabled, their enable_count will be zero and so they will be skipped
<javierm> tomba: ahh, I see what you meant. The clk_unprepare_unused_subtree() function is also called by clk_disable_unused()
<javierm> tomba: sorry for the noise, the clks enabled by the bootloader should be disabled indeed
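[editor's sketch] A toy Python model of the conclusion javierm reaches here: the unused-clock sweep has a disable pass and an unprepare pass, so a driver that implements only `.is_prepared` can still have a bootloader-enabled clock turned off. The class, the field names and the flag's bit value are assumptions made for illustration; this is not the kernel's code, and it does not try to explain why tomba's clocks stayed on in practice.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

CLK_IGNORE_UNUSED = 1 << 0  # bit value chosen for this model only


@dataclass
class Clk:
    name: str
    prepare_count: int = 0
    enable_count: int = 0
    flags: int = 0
    hw_on: bool = False            # hypothetical: hardware state left by the bootloader
    has_is_prepared: bool = False  # driver provides an .is_prepared callback
    has_is_enabled: bool = False   # driver provides an .is_enabled callback
    children: List["Clk"] = field(default_factory=list)


def is_prepared(clk: Clk) -> bool:
    # Fall back to the software counter when the driver cannot report state.
    return clk.hw_on if clk.has_is_prepared else clk.prepare_count > 0


def is_enabled(clk: Clk) -> bool:
    return clk.hw_on if clk.has_is_enabled else clk.enable_count > 0


def _disable_subtree(clk: Clk, log: List[Tuple[str, str]]) -> None:
    for child in clk.children:
        _disable_subtree(child, log)
    if clk.enable_count or (clk.flags & CLK_IGNORE_UNUSED):
        return  # in use, or explicitly protected
    if is_enabled(clk):
        clk.hw_on = False
        log.append(("disable", clk.name))


def _unprepare_subtree(clk: Clk, log: List[Tuple[str, str]]) -> None:
    for child in clk.children:
        _unprepare_subtree(child, log)
    if clk.prepare_count or (clk.flags & CLK_IGNORE_UNUSED):
        return
    if is_prepared(clk):
        clk.hw_on = False
        log.append(("unprepare", clk.name))


def disable_unused(roots: List[Clk]) -> List[Tuple[str, str]]:
    """Run the disable pass, then the unprepare pass, over every root clock."""
    log: List[Tuple[str, str]] = []
    for root in roots:
        _disable_subtree(root, log)
    for root in roots:
        _unprepare_subtree(root, log)
    return log


if __name__ == "__main__":
    # Driver with .is_prepared only, clock left on by the bootloader.
    boot_clk = Clk("dss_clk", hw_on=True, has_is_prepared=True)
    print(disable_unused([boot_clk]), boot_clk.hw_on)
```

In this model a clock is left alone in three cases: something holds a prepare or enable count, the clock carries `CLK_IGNORE_UNUSED`, or the driver has neither callback and so the sweep cannot see the hardware state.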
<javierm> tomba: anyways, thanks for bringing this topic. I learned a lot this morning and apologies for the false statements :)
<tomba> javierm: no, thanks to you =). I learned a lot. Too bad I don't feel I'm any closer to a perfect solution to all-things-related-to-boot-splash =)
<tomba> Btw, does anyone know what's going on with 6.8-rcs? It feels horribly slow at least on TI and Xilinx platforms.
<javierm> a colleague bisected the perf regression to the mentioned commit and confirmed that this patch helps: https://lore.kernel.org/lkml/CAK4VdL3Bg70ycz5vd4RfwNYa3KcYU8rdPX==i7znzQFw_EgTjA@mail.gmail.com/
<tomba> javierm: hey it helps (on xilinx board, which happened to be on my desk now)! thanks again! =)
<javierm> tomba: you are welcome! But all credits to Erico who did the bisection
Kubu_work has quit [Quit: Leaving.]
_whitelogger has joined #linux-ti
mripard has joined #linux-ti
ikarso has quit [Quit: Connection closed for inactivity]
pjw_ has quit []
pjw_ has joined #linux-ti
pjw_ is now known as pjw
devarsh_ has quit [Quit: Connection closed for inactivity]
florian has quit [Quit: Ex-Chat]
tomba has quit [Ping timeout: 272 seconds]
florian_kc has joined #linux-ti
ikarso has joined #linux-ti
eballetbo has quit [Quit: Connection closed for inactivity]
florian_kc has quit [Ping timeout: 272 seconds]
ikarso has quit [Quit: Connection closed for inactivity]
florian_kc has joined #linux-ti
florian_kc has quit [Ping timeout: 272 seconds]