redbrain has quit [Read error: Connection reset by peer]
redbrain has joined #u-boot
<lvrp16>
i think a recent change to leds broke default-state device tree node, the DM_FLAG_PROBE_AFTER_BIND doesn't trigger anything so the LEDs never change
<lvrp16>
the LEDs never get probed and does not change according to default-state
sng has quit [Read error: Connection reset by peer]
sng_ has quit [Remote host closed the connection]
Perflosopher has joined #u-boot
pivi has joined #u-boot
apritzel has joined #u-boot
<marex>
lvrp16: which commit ?
ius has quit [Server closed connection]
<marex>
lvrp16: which U-Boot version does not work for you ?
ius has joined #u-boot
<pivi>
marex: back to the "dark side" as you suggested last week
<marex>
pivi: oh hey, was good meeting you
sng has joined #u-boot
<pivi>
marex: same for me!
<marex>
pivi: I am still hoping for rc6 today, with spl: spl_legacy: Fix spl_end address included
<marex>
Tartarus: ^
<marex>
pivi: would you be able to run the CI on that HEAD ?
<marex>
(once it is out)
<pivi>
marex: our CI is currently screwed because of the ARM dts move to subdirectories :-/
<marex>
pivi: ah sigh, that would be a mess indeed
<pivi>
marex: and I have no easy way to just take a stable kernel and HEAD of U-Boot. it's just head of everything. But this fix looks just correct and you tested it in multiple people
<pivi>
marex: what I still do not understand, and I had no time to try yet, is what is going on colibri-imx7 (DRAM: initcall sequence 8786eafc failed at call 8781b351 (err=-3)). This is not related, but it's either a regression or some random failure I have no idea.
<pivi>
marex: I was wondering if this could be related to the unaligned fix that was just merged. today for me is hell, I'll try to find 30mins to test latest HEAD ...
<pivi>
Tartarus: I somehow just managed to get at hit. I was able to reproduce the issue on my desk (U-Boot 2023.07-rc5-00018-gac29400f1f4a), with a different hardware compared to the one at the CI. It is really some regression.
<Tartarus>
OK
<pivi>
git bisect is going to be painful. What I could do is to take Fabio's series, but I am pretty much sure this is not the reason
<Tartarus>
Lets see...
<Tartarus>
Well, wait, this doesn't use SPL right? So you shouldn't need Fabio's series
mmu_man has quit [Ping timeout: 246 seconds]
<pivi>
Tartarus: correct, let me try to find a way to try to test new u-boot in an efficient way and run git bisect over it. previous release was fine for sure
<Tartarus>
OK, thanks
<Tartarus>
Lets see what you dig up and we'll move from there
mckoan is now known as mckoan|away
Net147 has quit [Ping timeout: 240 seconds]
mmu_man has joined #u-boot
<pivi>
Tartarus: here we are, regression from 8c103c33fb14 ("dm: dts: Convert driver model tags to use new schema"). let me triple check
<Tartarus>
pivi: Well, OK, what the heck. That's _not_ a new commit
<Tartarus>
That's something I merged in just after v2023.04 since it was in -next
<pivi>
Tartarus: yes, it's a regression, but not in -rc5. bad on me that I noticed only last week.
<Tartarus>
Lets see what it takes to resolve that, then
goliath has joined #u-boot
<pivi>
Tartarus: so, if it helps. this is happening on imx7s and not on imx7d. this is quite confusing at first. the different is that imx7s has one USB host interface, not two, and one cortex-A core, and not two
persmule has quit [Remote host closed the connection]
persmule has joined #u-boot
vagrantc has joined #u-boot
Net147 has joined #u-boot
Net147 has joined #u-boot
Net147 has quit [Changing host]
<pivi>
anyway, unfortunately I need to stop working on that now - life calling - I will reply to the original report with this additional piece of information
ldevulder has quit [Read error: Connection reset by peer]
ikarso has quit [Quit: Connection closed for inactivity]
ldevulder has joined #u-boot
sng_ has joined #u-boot
sng has quit [Read error: Connection reset by peer]
sng_ has quit [Remote host closed the connection]
<vagrantc>
oh, rc6 today
apritzel has joined #u-boot
apritzel has quit [Ping timeout: 246 seconds]
Gravis has quit [Ping timeout: 246 seconds]
Gravis has joined #u-boot
Gravis has quit [Ping timeout: 260 seconds]
Gravis has joined #u-boot
vagrantc has quit [Quit: leaving]
<marex>
<- is to blame for that obv
pivi has joined #u-boot
<pivi>
marex: so, it's fdt_status_disabled that seems to fail
___nick___ has quit [Ping timeout: 246 seconds]
<pivi>
marex, Tartarus : and the actual code there explains why do we see the issue only on the imx7s, this is because this code is not executed for the dual
<marex>
pivi: hello again
<marex>
pivi: board_fdt_fix() in colibri_imx7.c ?
<pivi>
marex: I guess you saw the email. in any case in colibri_imx7.c:board_fix_fdt() we call fdt_path_offset()
<pivi>
this returns something valid (37504)
<pivi>
and after that we call fdt_status_disabled() and this is where it gets screwed
<pivi>
now, I this is related to the other change, well, that would be the solution of the bug, I guess
<pivi>
I do not really care the result there. either it's working, or I cannot really do anything. halting the boot is just worst
<marex>
wurst
<marex>
pivi: and tbh I just do not know why you didnt trigger it sooner
<marex>
pivi: my guess is, the return value was something else due to nodes being slightly reordered ?
<marex>
i.e. the node offset was different
<pivi>
marex: I gonna spend a few more mins on it, at this point I want to really understand it ...
<pivi>
- bootph-all;
<pivi>
+ u-boot,dm-pre-reloc;
<pivi>
this simple change, in arch/arm/dts/imx7d-colibri-eval-v3-u-boot.dtsi
<pivi>
prevent this issue to appear at all
<pivi>
and this is what was introduced in 8c103c33fb14
<pivi>
I'll do a couple more test, see you in a while or you'll get the patch on the ML ...
<marex>
pivi: can you print the return value of fdt_status_disabled() before / after that change ?
<pivi>
marex: yes, this is my plan now
<marex>
pivi: please send the patch, it is your discovery
<marex>
pivi: :)
<pivi>
marex: I will, no worries
<pivi>
marex: ok. before that change fdt_set_node_status() -> fdt_setprop_string() was retuing 0. this is all good, no issue. after that change it returns FDT_ERR_NOSPACE
<pivi>
marex: in the end it seems reasonable to check that return value. however we need to take care of this now
<marex>
pivi: fdt_resize ?
<pivi>
ye
<pivi>
marex: that is somehow not 100% clear to me however - why we should have now such a change - I wonder if this is all expected
<Tartarus>
pivi: cc Simon (sjg)
<marex>
pivi: if the DT does not have enough space, you cannot create new entries in it
<pivi>
marex: clear, this code is changing `status="okay"` to `status="disabled"`. Now, I have no idea on how this is encoded in the binary blob, but I get your point
<pivi>
marex: nobody seems to care about this situation in the current u-boot code, however
<pivi>
marex: it's not just this board that does not resize the fdt in this cse
<marex>
pivi: there is a bit of extra padding encoded in the DT when it is compiled I think
<marex>
dtc -S 4096 in Linux I think
<pivi>
marex: yep, but currently nobody in u-boot code take care of this error situation. I expect other board to have the exact same issue
<pivi>
marex: disable_fdt_node(void *blob, int nodeoffset)
<pivi>
marex: in imx8/fdt.c is the "correct" example
<pivi>
marex: but nobody is using such a helper ...
<pivi>
marex, Tartarus: I'll definetely write some details over email and sleep over it, for sure Simon will have more insight
ldevulder has quit [Remote host closed the connection]
ldevulder has joined #u-boot
apritzel has joined #u-boot
<marex>
board/CZ.NIC/turris_omnia/turris_omnia.c: if (fdt_status_disabled(blob, node) < 0)
<marex>
also correct
<pivi>
marex: what do you mean with correct? it will not run fdt_increase_size() in case of FDT_ERR_NOSPACE.
pivi has quit [Quit: gone]
goliath has quit [Quit: SIGSEGV]
<marex>
I think you should run that before doing any modifications to the DT
<Tartarus>
Sigh
<Tartarus>
On the one hand, I do want to see what Simon says since, oh right, we must have something in general going on such that we don't constantly trip over this, I want to say because we don't allocate (at run time) exactly the dtb size, but some slight rounding of said size
<Tartarus>
On the next, I don't see (but I could have missed) where in v6.4-rc4 the kernel passes -S to dtc, and arm/arm64 aren't (but powerpc does) passing -p N for N additional padding either
<Tartarus>
We can't just pad everyone's dtb too, as a way to avoid this as now we could run in to the case where the dtb doesn't fit where it's supposed to reside, esp if we go for a big number of byte padding like some cases do
<Tartarus>
So it comes back to what's the reason why many times we can do a little dtb growth and not run out of space
wooosaiiii has quit [Ping timeout: 246 seconds]
<Tartarus>
As indeed, fdt_setprop_string is one of the functions that doesn't check for out of space and then allocate a little more and leaves it up to the caller to handle