#yocto on 2021-06-12 — irc logs at libera.irclog.whitequark.org

2021-06-01 12:48 dl9pf changed the topic of #yocto to: Welcome to the Yocto Project | Learn more: http://www.yoctoproject.org | Join the community: http://www.yoctoproject.org/community | Channel logs available at https://www.yoctoproject.org/irc/ and https://libera.irclog.whitequark.org/yocto/ | Having difficulty on the list, or with someone on the list? Contact YP community mgr Nicolas Dechesne (ndec)

00:10 BCMM has quit [Quit: Konversation terminated!]

00:10 prabhakarlad has quit [Quit: Client closed]

01:19 goliath has quit [Quit: SIGSEGV]

01:20 Emantor has quit [Quit: ZNC - http://znc.in]

01:20 Emantor has joined #yocto

01:28 halstead has quit [Ping timeout: 272 seconds]

01:48 rcw has quit [Quit: Leaving]

02:03 hpsy has joined #yocto

02:04 hpsy1 has quit [Ping timeout: 268 seconds]

02:34 sakoman has quit [Quit: Leaving.]

03:28 Vonter has joined #yocto

04:25 camus has joined #yocto

04:39 paulg has quit [Ping timeout: 252 seconds]

05:15 halstead has joined #yocto

07:59 Vonter has quit [Ping timeout: 252 seconds]

08:20 Vonter has joined #yocto

08:29 <ant_> RP: when you're finished bisecting, there is another inexplicable issue here, much simpler I hope

08:30 <ant_> for collie, just for this machine, the cpio is not created.

08:30 <ant_> the initramfs image, instead, is jffs2 and tar.gz

08:31 <ant_> since eons we include the same .inc for all machines, setting

08:31 <ant_> INITRAMFS_FSTYPES ?= "cpio.gz cpio.xz"

08:31 <ant_> something is really broken if it fails just for armv4

08:33 <ant_> both bitbake -e and the json data have the right value inside

08:33 <ant_> "IMAGE_FSTYPES": "tar.gz jffs2 jffs2.sum",

08:33 <ant_> "IMAGE_FSTYPES_DEBUGFS": "tar.gz",

08:33 <ant_> "IMAGE_FSTYPES_collie": "tar.gz jffs2 jffs2.sum",

08:33 <ant_> "INITRAMFS_FSTYPES": "cpio.gz cpio.xz",

08:33 <ant_> boh !

08:54 <RP> ant_: this is with master?

08:54 <ant_> yes

08:54 <ant_> -next actually

08:55 <RP> ant_: to reproduce that I'd need meta-handheld, a collie build and it would fail at which point?

08:55 <ant_> kernel cannot find cpio

08:55 <ant_> it is not built, just the jffs2 and tar.gz are in deploydir

08:55 <ant_> seems I fix it adding INITRAMFS_FSTYPES_collie ?= "cpio.gz cpio.xz"

08:56 <ant_> wait a min

09:00 <RP> ant_: ERROR: Layer meta-handheld is not compatible with the core layer which only supports these series: hardknott honister (layer is compatible with sumo thud)

09:01 <ant_> ah, paul has pull requests since some time...

09:02 <ant_> ok, seems the issue is the weak assignment

09:02 <ant_> inherit image

09:02 <ant_>

09:02 <ant_> IMAGE_FSTYPES = "${INITRAMFS_FSTYPES}"

09:02 <ant_> +IMAGE_FSTYPES_collie ?= "cpio.gz cpio.xz"

09:02 <RP> ant_: the maintainers file doesn't even have a correct email address for paul :/

09:02 <ant_> even adding this to the image it fails

09:03 <ant_> I must use _collie =

09:03 <ant_> now see setting it back in zaurus.inc

09:05 <RP> ant_: have you a branch somewhere which makes this work with master? also, perhaps you should just have push access to this layer? :)

09:05 <ant_> https://github.com/andrea-adami/meta-handheld

09:06 <ant_> https://github.com/bluelightning/meta-handheld/pulls

09:06 <ant_> for the rest no changes, just add layer name (and do not exceed max length ;)

09:08 <ant_> no, the setting in the .inc are intercepted :

09:08 <ant_> some class does reset the IMAGE_FSTYPES

09:09 <ant_> that's wrong

09:13 <RP> ant_: ok, I think I now understand. These are the images in meta-initramfs ?

09:14 <RP> ant_: it is entirely expected that if something does: IMAGE_FSTYPES = "${INITRAMFS_FSTYPES}" then "IMAGE_FSTYPES_collie": "tar.gz jffs2 jffs2.sum" from configuration would override the INITRAMFS_FSTYPES setting :(

09:15 <RP> ant_: python () { d.setVar("IMAGE_FSTYPES", d.getVar("INITRAMFS_FSTYPES")) } in those image recipes would likely fix that

09:17 * RP notes that setting an incompatible layer causes bitbake to hang

09:18 <ant_> but why was it ok before?

09:18 <ant_> I did build linux-kexecboot for collie back 2017-2018

09:19 <ant_> note bitbake -e says "INITRAMFS_FSTYPES": "cpio.gz cpio.xz"

09:22 <RP> ant_: that doesn't matter, what matters is "MACHINE=collie bitbake initramfs-kexecboot-image -e | grep IMAGE_FSTYPES"

09:22 <RP> it is the value that IMAGE_FSTYPES gets set to which is the issue

09:23 <ant_> then it must be weakened in the image

09:23 <ant_> IMAGE_FSTYPES ?= "${INITRAMFS_FSTYPES}"

09:24 <RP> ant_: the problem is the machine override is winning and that won't help. You need the anonymous python I mentioned to "win" compared to the machine override
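The precedence RP describes can be illustrated with a toy lookup. This is only a sketch of the idea, not BitBake's real datastore: the `lookup` helper, the `store` dictionary and the `arm` and `poodle` override names are assumptions made for illustration. It shows why weakening the recipe assignment to `?=` cannot help: once `collie` is an active override, the `_collie` value is chosen no matter how the plain variable was assigned.

```python
# Toy model of override lookup; NOT BitBake's actual implementation.
# A value stored as VAR_<override> is preferred over plain VAR whenever
# <override> is active, regardless of how plain VAR was assigned.

def lookup(store, var, overrides):
    """Return the value for the highest-priority active override, else the base value."""
    for override in reversed(overrides):  # rightmost override has highest priority
        key = f"{var}_{override}"
        if key in store:
            return store[key]
    return store.get(var)

store = {
    # machine configuration special case mentioned in the log
    "IMAGE_FSTYPES_collie": "tar.gz jffs2 jffs2.sum",
    # image recipe: IMAGE_FSTYPES = "${INITRAMFS_FSTYPES}", already expanded
    "IMAGE_FSTYPES": "cpio.gz cpio.xz",
}

# "arm" and "poodle" are hypothetical override names for this sketch.
print(lookup(store, "IMAGE_FSTYPES", ["arm", "collie"]))  # tar.gz jffs2 jffs2.sum
print(lookup(store, "IMAGE_FSTYPES", ["arm", "poodle"]))  # cpio.gz cpio.xz
```

In this model the only ways to get cpio output for collie are to remove the `_collie` entry or to set the final value after override resolution, which matches the two options discussed in the log (dropping the special case in zaurus.inc, or the anonymous python).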

09:26 <ant_> RP: I start to remember... collie had not enough blocks for ubi so I special-cased it

09:26 <ant_> ah ha

09:26 <ant_> my bad :)

09:26 <ant_> after that I possibly did not compile the image anymore

09:26 <ant_> (for collie)

09:26 <ant_> ok, I better remove that in zaurus.inc

09:27 <RP> ant_: well, I think that assignment in the image recipe is dangerous and would be better as anon python too

09:29 <ant_> https://git.openembedded.org/meta-handheld/commit/conf/machine/include/zaurus.inc?id=4ef8f82e5db28f50901ce87f7ce786675aee6adf

09:29 <ant_> sigh

09:30 <ant_> RP: afais core-image-minimal-initramfs does the same

09:31 <ant_> RP: fix this one pls, so we then fix meta-initramfs

09:36 <ant_> RP: btw ubi on collie rules :)

09:37 <RP> ant_: I'll test http://git.yoctoproject.org/cgit.cgi/poky/commit/?h=master-next&id=f93a0ddcbf86f2bac442e130859b2e25d8f1de71

09:51 <ant_> I'm testing it, thanks

09:52 <ant_> RP: I must better manage the build of ubifs exceeding the available eraseblocks

09:52 <ant_> iirc now build continues if one fstype fails, unsure

09:53 <ant_> now = 5 yrs later

09:53 <ant_> :)

10:36 <ant_> RP: doesn't seem to solve :/

10:40 <RP> ant_: you applied that to the two other initramfs recipes?

10:41 <ant_> yes

10:41 <ant_> https://pastebin.com/MHnbYqFZ

10:46 <RP> ant_: gives IMAGE_FSTYPES="cpio.gz cpio.xz" as the value now

10:47 <RP> ant_: what problem are you seeing?

10:48 <ant_> I don't have any cpio in deploydir, the image is built as IMAGE_FSTYPES jffs2/tar.gz

10:48 <ant_> initramfs-kexecboot-klibc-ima~e-20210612104354.rootfs.jffs2  14336K  giu 12 12:44

10:48 <ant_> initramfs-kexecboot-klibc-ima~210612104354.rootfs.jffs2.sum  262144  giu 12 12:44

10:48 <ant_> initramfs-kexecboot-klibc-ima~0210612104354.rootfs.manifest  121  giu 12 12:44

10:48 <ant_> initramfs-kexecboot-klibc-ima~-20210612104354.rootfs.tar.gz  117835  giu 12 12:44

10:48 <ant_> initramfs-kexecboot-klibc-ima~-20210612104354.testdata.json

10:49 <ant_> the original sin is my commit in meta-handheld :)

10:49 <ant_> but this discovers some other issues I think

10:52 <RP> ant_: try putting the anon python before the inherit image part

10:52 <ant_> yea, I was doing that :)

10:52 <ant_> that's the problem

10:52 <ant_> now it is ok

10:53 <ant_> btw while here I will inherit core-image

10:53 <ant_> (didn't exist back then prolly)

10:55 <ant_> RP: it's always a surprise to exactly catch when a var is set/evaluated with overrides :)

11:01 <ant_> RP: ha ha, with latest tc we have inflated

11:01 <ant_> | DEBUG: Executing shell function do_sizecheck

11:01 <ant_> | WARNING: This kernel zImage (size=1024(K) > 1024(K)) is too big for your device.

11:01 <ant_> did still fit some years ago

11:07 <ant_> RP: this is the meta-initramfs part then https://pastebin.com/zzzQyjwQ

11:15 <RP> ant_: if image works, just use image

11:16 <RP> ant_: but yes, that looks good.

11:16 <RP> ant_: btw, the kexec image fails with package_rpm

11:17 <RP> ant_: updated the patch in master-next

11:23 <ant_> thanks

11:24 <ant_> KERNEL_IMAGE_MAXSIZE_collie = "1024"

11:24 <ant_>

11:24 <ant_> *zImage │1047288│

11:26 <ant_> I think these few bytes are the added empty dirs (h

11:27 <ant_> boh -rwxr-xr-x 1 andrea andrea 1023K giu 12 12:59 zImage

11:27 <ant_> why does test now fail?

11:50 <ant_> RP: it seems in the years xz has improved and needs much less ram for kernel-decompression

11:51 <ant_> I did set this at the time

11:51 <ant_> XZ_COMPRESSION_LEVEL = "-2e"

11:51 <ant_> because poodle (32MB ram) could not decompress the kernel otherwise

11:56 <ant_> well, with XZ_COMPRESSION_LEVEL_collie = "-7e" it is even bigger *zImage │1047376

11:57 <ant_> nevertheless it is always < 1048576, test is bogus

11:57 <ant_> hell

12:03 <ant_> RP: it is rounded up to kb so it fails

12:04 <ant_> INTEGER1 -ge INTEGER2

12:04 <ant_> INTEGER1 is greater than or equal to INTEGER2

12:04 <ant_> if [ $size -ge ${KERNEL_IMAGE_MAXSIZE} ]; then

12:05 <ant_> once it was on bytes iirc, now the equal part is disturbing

12:05 <ant_> (I know this is probably the smallest, unique, OE kernel with 1MiB limit :)
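The arithmetic behind the failing check can be sketched as follows. The log only says the size is "rounded up to kb"; the sketch assumes the size is counted in whole 4 KiB blocks (as disk usage typically is), so `BLOCK` and the `size_in_k` helper are assumptions, while the byte count and the 1024 limit come from the log.

```python
import math

BLOCK = 4096  # assumed block size in bytes; not stated in the log

def size_in_k(nbytes, block=BLOCK):
    """Size in KiB after rounding up to a whole number of blocks."""
    return math.ceil(nbytes / block) * block // 1024

zimage_bytes = 1047288  # zImage size reported in the log
maxsize_k = 1024        # KERNEL_IMAGE_MAXSIZE_collie

size = size_in_k(zimage_bytes)
print(size)                             # 1024
print(size >= maxsize_k)                # True: a -ge test flags the image
print(size > maxsize_k)                 # False: a -gt test would accept it
print(zimage_bytes < maxsize_k * 1024)  # True: the image is under 1 MiB in bytes
```

Under that assumption the image is 1288 bytes below the 1 MiB limit, yet it rounds up to exactly 1024K, so only the "equal" half of the greater-or-equal comparison triggers the warning.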

12:16 Shaun_ is now known as Shaun

12:19 <ant_> RP: final note, do_sizecheck() was born for collie :)

12:19 <ant_> bbl

13:14 paulg has joined #yocto

14:00 <vmeson> RP: I finally have another BUG: in the bisect, on: 7c4c016a3d linux-yocto/5.10: update to v5.10.37 -- I'll continue with bisect.

14:00 BCMM has joined #yocto

14:05 <RP> vmeson: cool. I will admit I've focused on other things today, I have a backlog of other issues

14:21 dlan has quit [Ping timeout: 252 seconds]

14:22 droman has quit [Ping timeout: 252 seconds]

14:22 LocutusOfBorg has quit [Ping timeout: 252 seconds]

14:22 LocutusOfBorg has joined #yocto

14:22 stkw0 has joined #yocto

14:22 dlan has joined #yocto

14:40 goliath has joined #yocto

14:47 ant_ has quit [Ping timeout: 265 seconds]

14:51 kpo_ has quit [Read error: Connection reset by peer]

14:52 kpo_ has joined #yocto

14:54 BCMM has quit [Ping timeout: 264 seconds]

15:17 <vmeson> RP only 1 failure out of 15 runs! I'm going to do another 15 at this commit id.

16:09 ant_ has joined #yocto

16:15 <RP> vmeson: well, once you have a fail, you know it is bad

16:20 <ant_> RP: should I increment the KERNEL_MAXSIZE of poor collie or adjust the greater/equal check?

16:21 * paulg is looking at dentry code that contains gems like "name->name.name = name->inline_name;"

16:22 <ant_> RP: it is really (lower) border-case

16:24 <paulg> FWIW, I got another "long" run w/o ever getting a failure - so it seems you can get VM boots that are immune ; testing with "-c testimage" seems to be required.

16:24 <RP> ant_: the test probably needs tweaking if it isn't working correctly but I'm not 100% sure what the issue is

16:24 <ant_> rounding

16:25 <RP> ant_: so can we rework it to avoid the rounding?

16:25 <paulg> vmeson, testimage takes some timeouts, vs manually watching for a hang ; I've added these to local.conf but not (yet) started another run with them...

16:25 <RP> paulg: fun. I just couldn't face any more of it this weekend

16:25 <paulg> TEST_QEMUBOOT_TIMEOUT = "200"

16:25 <paulg> TEST_OVERALL_TIMEOUT = "360"

16:25 <ant_> accepting the equal case?

16:25 <ant_> This kernel zImage (size=1024(K) > 1024(K)) is too big

16:26 <RP> paulg: I did also notice there is an ltp option to inject the test name into the kernel log, been meaning to try that

16:26 <paulg> vmeson, the 360 is 'cause on ala3 the test always has completed in under 5m

16:26 <RP> ant_: I'd accept that

16:26 <paulg> RP, afaik it is enabled, I've been seeing that since the get-go

16:26 <RP> paulg: hmm, I'm not

16:27 <paulg> well, i've been seeing it on my manual runs, on the console...

16:27 <paulg> you are right ; they aren't in my testimage qemu logs tho..

16:28 <paulg> [54843.160382] LTP: starting memcg_usage_in_bytes (memcg_usage_in_bytes_test.sh)

16:28 <paulg> [54845.518415] LTP: starting memcg_control (memcg_control_test.sh)

16:28 <paulg> [54851.574800] LTP: starting cpuset_regression_test (cpuset_regression_test.sh)

16:28 <paulg> [54851.621707] LTP: starting cgroup_xattr

16:28 <paulg> [54851.626250] new mount options do not match the existing superblock, will be ignored

16:28 <paulg> etc etc

16:28 <ant_> -if [ $size -ge ${KERNEL_IMAGE_MAXSIZE} ]; then

16:28 <ant_> +if [ $size -gt ${KERNEL_IMAGE_MAXSIZE} ]; then

16:28 <RP> paulg: right, different ltp commands I guess

16:29 <RP> ant_: the autobuilder blew up with my initramfs patch :(

16:30 <ant_> how so?

16:31 <RP> ant_: Adding IMAGE_FSTYPES += ' hddimg' to local.conf, then "MACHINE=genericx86 bitbake -p" blows up in parsing

16:32 <RP> https://autobuilder.yoctoproject.org/typhoon/#/builders/58/builds/3505

16:34 <ant_> I admit never building that one

16:34 <RP> ant_: qemux86 does it too

16:35 <RP> ant_: it's another ordering issue

16:36 <ant_> RP: then do :/oe/oe-core/meta$ grep -R INITRAMFS_FSTYPE

16:36 <ant_> there are more IMAGE_FSTYPES = "${INITRAMFS_FSTYPES}"

16:36 <vmeson> another 10 runs at this commit and no error. I expect/wonder if it's more common under heavy load...

16:36 * vmeson starts a separate world build in a loop to find out.

16:37 <RP> vmeson: I would have been running multiple builds on my machine

16:40 <ant_> overrides and includes are indeed a fragile thing

16:41 <RP> ant_: it is a mess, there isn't actually any way to do this as even := won't clear the overrides

16:42 <RP> kergoth: what is your view on XXX_machine = "X"\n XXX := "Y" with machine in OVERRIDES?

16:43 <RP> We don't appear to have a test case for that

16:44 <ant_> in C it's easier...#ifndef

16:46 <ant_> RP: after surviving the sizecheck I still have the obstacle in kernel.bbclass, do_deploy does check packaging even if packaging is disabled

16:46 <vmeson> paulg: hope you don't mind the background builds on ala3... load is b/w 50-90% busy. poor monster machine.

16:48 <ant_> once you guys solve the kernel issue I'll come back with these minor things

16:48 <paulg> vmeson, for the moment I've been just USTL and not testing today

16:48 <paulg> coding up some debug stuff to better get the kernel to tell us what corrupted (hopefully)

16:49 <vmeson> In other news, after years of our build cluster running, I've decided that since we often build w/o sstate and then rm -rf, we should try to avoid sstate generation completely.

16:49 <vmeson> paulg: k, let me know if you want the machine to be less busy.

16:50 * vmeson goes away for an hour or so

16:54 <RP> kergoth: I think our current behaviour is wrong, or would be better as the other anyway...

16:54 * RP wonders how much that would break

16:55 <paulg> vmeson, ha ha ha.

16:55 <paulg> [06/01 11:44] <paulg> sstate could vanish overnight and I'd probably not notice, or perhaps even be happier for it.

16:55 <paulg> [06/01 11:49] <paulg> at least they aren't like autoconf - a solution to a 1985 problem. B-)

16:55 <paulg> [06/01 11:45] <paulg> kinda like distcc and ccache type stuff. Seem like dated solutions to a 2005-ish problem.

16:56 <paulg> Just don't let RP hear you throwing shade on sstate.... ;-)

17:04 <RP> paulg: Let's just say I disagree about sstate. I'm not so keen on distcc/ccache though

17:06 <paulg> :-) Couldn't resist poking some fun, seeing as I'm sitting here suffering anyway.

17:11 camus has quit [Quit: camus]

17:17 <vmeson> RP, is there a flag to avoid generating sstate already? I looked but didn't see one. I expect it'd cut build times in our cluster by a few %. I can create a bugzilla enhancement if needed.

17:17 ant_ has quit [Ping timeout: 264 seconds]

17:18 <paulg> I think this one is new... https://paste.debian.net/1200977/ at least to me.

17:19 <paulg> core rcu code is running and the code page vanishes?!? wtf.

17:19 <paulg> qemu "hardware" sure seems baked.

17:51 <paulg> vmeson, I think you killed our test box.

17:59 BCMM has joined #yocto

18:09 <vmeson> paulg: yikes. shells are hung, can't ssh but the box is pingable. I've txted Konrad asking if he has time to reset it.

18:14 <paulg> doesn't answer pings ; maybe he reset it

18:20 <vmeson> yep, it's back, continuing with one world build this time...

18:25 <paulg> except the reboot broke networking

18:27 <vmeson> it had completed 32 tests total before the reset - still only 1 BUG: -- that's not good news for being able to identify where this bug was introduced.

18:27 * vmeson ploughs ahead with the bisect

18:30 <vmeson> I guess I should call this one 'bad' but given the probability of an accurate signal, this is likely pointless.

18:30 * vmeson calls it bad and ploughs on.

18:34 <vmeson> paulg: networking, you mean tun/tap?

18:36 Guest22 has joined #yocto

18:38 <vmeson> paulg: fixed

18:41 <paulg> thanks

18:47 Guest22 has quit [Ping timeout: 250 seconds]

18:55 camus has joined #yocto

18:59 camus has quit [Ping timeout: 252 seconds]

18:59 camus has joined #yocto

19:13 davidinux has quit [Ping timeout: 252 seconds]

19:15 davidinux has joined #yocto

19:43 ant_ has joined #yocto

19:48 <kanavin> RP, fray, I'm hitting a prelink issue on ppc32: https://autobuilder.yoctoproject.org/typhoon/#/builders/63/builds/3513/steps/13/logs/stdio

19:49 <kanavin> one of the wayland libraries is incorrectly relocated at do_image time

19:49 <kanavin> which begs the question: how useful is prelink? How about disabling it?

19:50 <vmeson> summary of what I've seen on this ltp -> kernel BUG: : https://paste.debian.net/1200991/

19:50 * vmeson takes a few hours off

19:57 <kanavin> I see prelink was disabled/enabled a couple times in the past, so maybe it's time to do that again...

20:23 <kanavin> I filed https://bugzilla.yoctoproject.org/show_bug.cgi?id=14429 for it

20:32 Vonter has quit [Ping timeout: 252 seconds]

21:15 camus has quit [Remote host closed the connection]

21:22 <RP> kanavin: for better or worse it is useful in that it reduces memory usage...

21:25 <RP> vmeson: I really dislike the idea of allowing sstate generation to be turned off

21:25 <RP> vmeson: I'd bet you can hack it to remove the file generation just by zeroing out some functions

21:26 <RP> vmeson: in fact the more I think about it, the more I think you know not what you ask for. I can/will explain but not now

21:33 <kanavin> I tend to agree with RP, this 'we don't need sstate' heresy has to stop

21:36 <moto-timo> Pretty sure #freenocommonsense doesn’t want to exist anymore. I already dropped from any relevant community channels but now I’m kicked (and burned the bridge)

21:37 <ant_> heh, just one day of bisecting makes you ramble

21:39 <moto-timo> We need sstate.

21:40 <RP> I can understand why vmeson wants to disable generation of it if it isn't used but it's just going to complicate the test matrix and generate new bugs where odd things happen if we don't generate it

21:57 <moto-timo> CFP for ELC closes tomorrow. What would you like me to talk about?

22:05 vmeson has quit [Ping timeout: 244 seconds]

22:14 vmeson has joined #yocto

22:32 Vonter has joined #yocto

22:45 sakoman has joined #yocto

23:02 BCMM has quit [Quit: Konversation terminated!]

23:26 hpsy1 has joined #yocto

23:26 hpsy has quit [Read error: Connection reset by peer]

23:38 goliath has quit [Quit: SIGSEGV]