zyga has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
tnovotny has quit [Quit: Leaving]
zyga has joined #yocto
<RP>
pseudo changes to work with uninative break it for eSDK :(
* RP
wonders if we have anyone who fancies trying to fix it?
ant_ has quit [Quit: Leaving]
<RP>
vmeson, zeddiii: This *may be*, just maybe, an issue between qemu 5.1.0 and 5.2.0
mihai has joined #yocto
kpo_ has quit [Read error: Connection reset by peer]
kpo_ has joined #yocto
fiorentinoing has joined #yocto
pbaptista has quit [Ping timeout: 250 seconds]
* zeddiii
RP, I'd hope fray might be the guy to work on the pseudo thing
<zeddiii>
RP: interesting on the qemu, I saw your swat email update. I'll tweak my kernel config more here. My test ran overnight with no issues.
<RP>
zeddiii: I can't tell if this is the same kernel bug or not with gatesgarth :/
<RP>
It is looking like vanilla 5.2.0 qemu is ok but 5.2.0 with CVE fixes breaks
<RP>
only 27 patches to choose from
prabhakarlad has quit [Ping timeout: 250 seconds]
<zeddiii>
and our qemu 6.0 in master inherits that same issue that was introduced in that window. For a second, I thought you meant we had backported patches and it was only on those versions
<RP>
zeddiii: it looks like setting CONFIG_SCHED_DEBUG "fixes" the kernel and stops the error happening
<RP>
zeddiii: so I guess that means the situation is defconfig sensitive
<zeddiii>
some sort of scribbler could be avoided with the extra debug space or different code path. I'll have a look at what that really changes code wise.
<zeddiii>
hah. I just noticed now that I got an extra 'i' ...
zeddiii is now known as zeddii
zeddii is now known as zedd
* zedd
has defeated the iis
<RP>
zedd: do we know you?
<zedd>
in my haste to get to libera, it never dawned on me that zedd might not be registered :D
<JPEW>
Hmm, apparently I've been banned from freenode completely
<zedd>
bad behaviour.
<RP>
JPEW: Crofton can tell you about that I think
prabhakarlad has joined #yocto
<Crofton>
reconnect it will go through
<Crofton>
I assume they are cleaning unattended accounts
<Crofton>
well, that is one theory
<yannd>
I'm surprised, in gatesgarth/hardknott mesa.bb does not provide virtual/egl, virtual/libgles2 and friends, though several packages depend on it - shouldn't mesa-with-panfrost take care of that?
hpsy has quit [Ping timeout: 264 seconds]
<RP>
zedd: for fun I turned on default debug stuff apart from SCHED_DEBUG and it still breaks so there is something about that option
<yannd>
duh, bad PACKAGECONFIG probably, sry for the noise
<fiorentinoing>
coming back to icedtea7-native, does anyone have the URL of a MIRROR of this package from meta-java? (I'm stuck on morty till August)
<JPEW>
Crofton: Ya, that worked
* RP
is starting to hate bisection
fiorentinoing has quit [Quit: Client closed]
* vmeson
plods ahead with the bisect: scripts/buildhistory_analysis: Avoid tracebacks from file comparision code (Thu Oct 29 15:21:35 2020) has no BUG
<vmeson>
next step of 10! 41f96b141e oeqa/ethernet_ip_connman : add test for network connections
hpsy has joined #yocto
paulg has joined #yocto
<RP>
vmeson: will be interesting if you reach the same conclusion I do about the cause. I appear to have narrowed it to a single qemu patch. It is just rather odd
<paulg>
sounds like I missed the "big reveal"...
<RP>
paulg: it's weird enough I'm not prepared to say :)
<paulg>
I can add this interesting data point - going to what I'd said elsewhere last night - that my "good" boot survived 50 runs in a row overnight, booting exactly what crashed earlier in short order.
<paulg>
[06/10 23:23] <paulg> I'm wondering if you get a "good boot" -- i.e. a fortuitous alignment or similar ; then it stays working - this is where "-c testimage" can let you down ; it reboots each time.
<v0n>
which package provides systemd-escape on the host? (for DEPENDS)
<paulg>
does our build of qemu6 even make an object file for old IDE crap? I can never keep track of how old the hardware qemu is pretending to be ; for all I know it still might be a 1998 i440bx chipset (still!)
zyga-mbp has joined #yocto
zyga-mbp has quit [Client Quit]
<paulg>
the "big" machine seems to be the most reliable reproducer for me ; anyone tried reverting that on master / qemu6 and retesting yet?
<RP>
paulg: it is built and there is a message about ATAPI during boot
<RP>
paulg: the revert is my next test
<paulg>
ok, I'll go get coffee and popcorn then.
<paulg>
and look at the damn patch in more detail.
<zedd>
if our kernel configs are turning on devices that are not really used, but yet trigger the crash when exercised, we can turn them off .. a little sweeping under the rug :P
<zedd>
which just means we'll trip on the rug later and hurt ourselves
zyga has quit [Ping timeout: 264 seconds]
<v0n>
Is there a systemd-extra-utils-native package?
sakoman has joined #yocto
<RP>
zedd: let's see if reverting this really does "fix" it. If it does we should likely be removing the cdrom support from the qemu kernels :)
<RP>
and report this upstream to qemu
<vmeson>
v0n: I don't think there's a systemd-extra-utils-native package. If you explain why you're asking maybe someone can help .
chrfle_ has quit [Remote host closed the connection]
chrfle_ has joined #yocto
<v0n>
vmeson: there's none indeed. I'll escape things myself ^^
* RP
wonders why half the world is rebuilding and decides he doesn't want to know
zyga has joined #yocto
* paulg
copies all qemu-* aside as backups before vandalizing things
goliath has quit [Quit: SIGSEGV]
<RP>
that patch looks ok to me, the only bit I can't quite tell with is the change to unsigned
<RP>
gah, confirmed reverting that from 6.0.0 does not fix things
<RP>
so I've messed up the bisection somewhere
* vmeson
continues to bisect, no on: 43c40ea7e0 libdnf: replace a musl fix with a better one
<paulg>
if you guys are both bisecting qemu ; you probably should pool your qemu "bad" points, since a bad is a bad, but a ten run pass is still just a "maybe ok, not sure."
dti has joined #yocto
Vonter has joined #yocto
<vmeson>
paulg: not sure about RP but I'm bisecting poky!
<paulg>
whee.
dtometzki has quit [Ping timeout: 264 seconds]
<RP>
paulg: I'm bisecting qemu patches on top of 5.2.0 so quite different
<paulg>
I was thinking whether it makes sense to convert qemu from tarball to git and build within yocto; or build it independently...
dti is now known as dtometzki
<RP>
paulg: I was going to do that until I found deleting the 5.2.0 cve patches seemed to work
<paulg>
I recall doing the former for some gfx libs with interdependencies for zedd like 5+ years ago
<JPEW>
I usually do externalsrc for bisecting
<paulg>
of course, life being what it is, I'm sure older qemu bisect points will blow up with gcc-11 just to keep life interesting... :-/
<paulg>
would also want to tell qemu to not build arm/mips/blah blah blah.... in order to speed things up, I'd think....
* RP
suspects 5.2.0 is probably bad on gatesgarth but I just can't provoke it
<paulg>
the fact that I was able to do a run of 50 w/o issue on the identical load that went splat in less than 3 just prior does mean only the bad points are known to be truly bad
<paulg>
everything else is a "maybe ok".
<paulg>
I've got four fails in 21 runs on the big machine. So anything less than 10 runs w/o fail isn't really a useful data point.
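A rough back-of-envelope sketch of why a short clean streak says so little. It assumes each boot fails independently at the 4-in-21 rate paulg observed; that independence is an assumption (the "good boot stays good" observation above suggests real runs may be correlated), and the function name is purely illustrative.

```python
from fractions import Fraction


def chance_of_clean_streak(fail_rate, runs):
    """Probability that `runs` consecutive boots all pass on a build that is
    actually bad, assuming each boot fails independently at `fail_rate`."""
    return float((1 - fail_rate) ** runs)


# Observed rate from the log: 4 failures in 21 runs (assumed representative).
observed = Fraction(4, 21)

for n in (1, 5, 10, 20):
    pct = chance_of_clean_streak(observed, n)
    print(f"{n:2d} clean runs on a bad build: {pct:.1%}")
```

Under those assumptions, five clean runs would still show up on a bad build roughly a third of the time, and ten clean runs about one time in eight, so only the "bad" points in a bisect can be trusted.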
<vmeson>
Yikes...
<vmeson>
paulg: maybe it's cosmic rays!
* vmeson
runs
pbaptista has joined #yocto
<paulg>
oh yay - qemu uses submodules!
pbaptista has quit [Quit: Client closed]
ncaidin_lf has joined #yocto
<RP>
plain 5.2.0 does look like it has the bug under gatesgarth
<vmeson>
RP: :(
<RP>
vmeson: I still think 5.1.0 is "clean" so it could be some delta between 5.1 and 5.2
* RP
should try 5.1.0 with master
<vmeson>
RP okay. Btw, how many times are you running your tests? Can you confirm paulg's 4 of 21 stat?
<vmeson>
I'm testing 5 times at each bisect.
<RP>
vmeson: I'm more at the 5 times but I did test gatesgarth specifically much more
<vmeson>
k, currently on: 5d9a91a2ae uboot: Deploy default symlinks with fitImage -- Bisecting: 253 revisions left to test after this (roughly 8 steps) - Fun!
<RP>
vmeson: will be interesting to see where that puts the issue
LetoThe2nd has quit [Quit: Connection closed for inactivity]
<vmeson>
RP: Yep, that commit has qemu_5.2 so it should show up....
zyga has quit [Quit: Leaving]
<paulg>
ended up building qemu outside of yocto once I saw it used submodules...
kpo_ has quit [Read error: Connection reset by peer]
kpo_ has joined #yocto
<paulg>
can one get to the qemu console during testimage? I'd like to just be sure with a version check I'm running what I built and not the old binary or the distro binary somehow.
<RP>
paulg: not easily
<RP>
paulg: I see what you mean about submodules. Ick :/
* RP
throws gitms:// at it
<RP>
gitsm://
<paulg>
I know I'm running my qemu binary 'cause I apparently borked gfx - but testimage is still running and I can ssh into the instance. So who cares about gfx!!!
<paulg>
I built top of tree for starters...
<RP>
paulg: fwiw it looks like the gitsm fetcher can handle this
<paulg>
I wasn't that brave
<paulg>
I gave qemu its own empty install bin dir and then just symlinked those babies into tmp/work/x86_64-linux/qemu-helper-native/1.0-r1/recipe-sysroot-native/usr/bin
<paulg>
ugly, but seems to be good enough for what I want here.
<paulg>
kept the originals of course, in case I want to switch back.
<paulg>
and lo and behold... 1st run with qemu at top of tree..
<paulg>
I've already got all the poo ; I just want to checkout at different points ; not reclone each friggin bisect point.
<paulg>
qemu$git checkout v5.1.0
<paulg>
warning: unable to rmdir 'meson': Directory not empty
<paulg>
M slirp
<paulg>
M ui/keycodemapdb
<paulg>
M capstone
<paulg>
Note: switching to 'v5.1.0'.
<paulg>
Not liking those "M" though
<rburton>
i hate submodules
* paulg
does too, but resorts to RTFM anyway
Vonter has quit [*.net *.split]
dtometzki has quit [*.net *.split]
Emantor[m] has quit [*.net *.split]
shoragan[m] has quit [*.net *.split]
barath has quit [*.net *.split]
cody has quit [*.net *.split]
Andrei[m] has quit [*.net *.split]
dev1990 has quit [*.net *.split]
sgw has quit [*.net *.split]
FO2 has quit [*.net *.split]
JPEW has quit [*.net *.split]
Tartarus has quit [*.net *.split]
CosmicPenguin has quit [*.net *.split]
angolini has quit [*.net *.split]
Shaun has quit [*.net *.split]
goliath has joined #yocto
<gjohnson>
I want to have a variable in my local.conf called SSH_ROOT_LOGIN, based on this I want to have openssh and refpolicy targeted to be configured to allow root to login over ssh for debugging purposes. I can't figure out how to get openssh to rebuild when the SSH_ROOT_LOGIN variable changes. Here is the last thing I tried https://termbin.com/4hsx5
Vonter_ is now known as Vonter
<gjohnson>
I can see the do_configure hash changes when the variable changes but that isn't causing the recipe to be rebuilt.
robbawebba has joined #yocto
mckoan is now known as mckoan|away
<paulg>
"git submodule update" took friggin forever, but now I'm "M" free.
<paulg>
qemu$git checkout v5.1.0
<paulg>
HEAD is now at d0ed6a69d3 Update version for v5.1.0 release
<paulg>
QEMU emulator version 5.1.0 (v5.1.0-dirty)
<paulg>
Not sure why it thinks it is dirty ...
<RP>
paulg: it uses submodules? :)
<paulg>
heh, could be. Anyway "git submodule foreach git status" and git status at top show nuttin' so I'm ignoring it for now
<RP>
paulg, vmeson: with master I just had 5.1.0 fail :(
<paulg>
I'm doing several runs at top-of-tree before moving on to testing ... well maybe not testing v5.1.0....
<paulg>
I guess to be fair, we've not concretely blamed qemu at this point -- vs. being the prime suspect.
<paulg>
2nd run at top-of-tree didn't barf, so stats there stand at 50/50 with an error margin of 100%. :-/
<RP>
paulg: what is weird is that the NULL derefs happen at about 210s on master yet at 44s on gatesgarth
angolini has joined #yocto
CosmicPenguin has joined #yocto
Tartarus has joined #yocto
JPEW has joined #yocto
<vmeson>
RP: paulg. with 1 run, qemu-5.2 did NOT fail for me. Doing more tests now. I might do 10-20. over lunch break.
<RP>
vmeson: I'm getting varying results. I have seen 5.1.0 fail now
<vmeson>
Yeah, this is going to be annoying...
<paulg>
not to mention a giant time sink.
<RP>
I need to go and get food, think I've done what I can with this today
<vmeson>
This is probably obvious but all the BUG warnings look like a pointer offset problem in that the address given is a small number of bytes: 8, 12, etc.
<paulg>
shame there isn't an unmapped page at the bottom to catch all these foo->bar where foo = NULL and flag them as null deref.... oh wait.
<vmeson>
paulg: sure, so some pointer in the kernel is getting over-written with NULL, could one (i.e. paulg) put a gdb HW breakpoint on the address that is 'usually' involved to see what is doing the corrupting?
<vmeson>
I haven't done that with qemu and the guest kernel but it seems like it might be worth trying. Waste of time paulg?
<vmeson>
my test at: 5d9a91a2ae uboot: Deploy default symlinks with fitImage -- which has qemu_5.2.0 did not get a BUG: 5 times. Running 15 more while breaking for lunch.
leon-anavi has quit [Quit: Leaving]
shoragan[m] has joined #yocto
barath has joined #yocto
Emantor[m] has joined #yocto
BCMM has quit [Ping timeout: 268 seconds]
<v0n>
Is there a way to not install the *-py3.9.egg-info directories?
ant_ has joined #yocto
Andrei[m] has joined #yocto
ncaidin_lf has quit [Quit: Client closed]
cody has joined #yocto
robbawebba has quit [Quit: WeeChat 3.0.1]
robbawebba has joined #yocto
bps has quit [Remote host closed the connection]
BCMM has joined #yocto
<paulg>
RP, if you happen to stumble by again, can you run this on your testimage boot logs?
<paulg>
testimage$for i in `grep -l BUG qemu*` ; do grep -A 10 BUG $i | grep RIP: | head -n1 ; done
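For anyone who wants a tally instead of eyeballing the output, the same search can be sketched in Python. This is only an approximation of the shell loop above: it assumes the boot logs are plain-text files matching qemu* in the current directory, and the helper names are made up for illustration.

```python
from collections import Counter
from pathlib import Path


def first_rip_after_bug(lines, context=10):
    """Return the first 'RIP:' text found on a line containing 'BUG' or within
    `context` lines after one (like grep -A 10 BUG | grep RIP: | head -n1)."""
    remaining = -1  # context lines still open after the latest BUG line
    for line in lines:
        if "BUG" in line:
            remaining = context
        elif remaining > 0:
            remaining -= 1
        else:
            continue
        if "RIP:" in line:
            # Drop any timestamp prefix so identical RIPs compare equal.
            return line[line.index("RIP:"):].strip()
    return None


def tally_rips(log_dir="."):
    """Count how often each first-RIP line appears across the qemu* logs."""
    counts = Counter()
    for path in sorted(Path(log_dir).glob("qemu*")):
        if path.is_file():
            text = path.read_text(errors="replace")
            rip = first_rip_after_bug(text.splitlines())
            if rip:
                counts[rip] += 1
    return counts


if __name__ == "__main__":
    for rip, count in tally_rips().most_common():
        print(f"{count:3d}  {rip}")
```

If the same function keeps showing up at the top of the tally, that supports the "same RIP every time" idea; a spread of unrelated functions would argue against it.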
<paulg>
and why I'm looking at RP 's additional bootarg is the dentry free where you'll see...
<paulg>
/* if dentry was never visible to RCU, immediate free is OK */
<paulg>
we've been trying to brute force the answer out of this turd so far, and I figured it was about time I should look into what it is trying to tell us and make a better educated guess.
<zedd>
ahah
<paulg>
keyword is still "guess" at this point
<paulg>
my baremetal testing was obviously not using any funky rcu bootargs...
<paulg>
neither is 99.99993721347% of the rest of the world...
* zedd
nods
<RP>
paulg: I thought I'd removed that argument and still seen it :/
<paulg>
I guess if vmeson got a "bad" in his global bisect that predates the RP band-aid fix above, then that would vindicate RP.
<paulg>
...or that. :)
<RP>
paulg: do you still need that grep over the logs. I need to power the machine back :)
<RP>
It is far too hot here atm :/
<paulg>
RP, I was just trying to confirm the "dentry appears too close to the RIP every time
<paulg>
....every time" theory.
<paulg>
I think ATM it is pretty hard to ignore that fact, so no rush to confirm that.
<RP>
paulg: I'd agree it seems related
<paulg>
I've got my local machine running the udelay hack and in a loop of 10 runs while I go do an errand ; will BBL 'cause this is pissing me off and I want it solved.
<wesm>
my git server uses gitolite and specifies repos with a colon, like <server>:<repo>. The git fetcher chokes on the : expecting an int port number
<wesm>
I specified protocol=ssh but that still tries to parse the port
<abelloni>
use a / instead of the :
<abelloni>
i.e. git://git@git.example.com/linux;protocol=ssh will fetch git@git.example.com:linux
<wesm>
abelloni: thank you, I didn't see that in the docs
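A small sketch of the rewrite abelloni describes, turning an scp-style remote into the form the git fetcher accepts. The helper name is hypothetical and it only handles the simple user@host:path case from the example; it is not part of BitBake.

```python
import re


def scp_to_bitbake_git_url(remote):
    """Convert an scp-style remote such as 'git@git.example.com:linux' into
    the git://...;protocol=ssh form, replacing the ':' with a '/'."""
    match = re.fullmatch(
        r"(?:(?P<user>[^@/:]+)@)?(?P<host>[^/:]+):(?P<path>.+)", remote
    )
    if match is None:
        raise ValueError(f"not an scp-style remote: {remote!r}")
    user = f"{match['user']}@" if match["user"] else ""
    path = match["path"].lstrip("/")  # avoid a doubled slash after the host
    return f"git://{user}{match['host']}/{path};protocol=ssh"


print(scp_to_bitbake_git_url("git@git.example.com:linux"))
```

The printed value is what would go into SRC_URI in place of the colon form that the fetcher tries to parse as a port number.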
<RP>
paulg: mailed you the output
* paulg
probably should check e-mail more than once a day....
<paulg>
heh. my "ran 50x overnight" local box died on the 1st run after adding the udelay hack-patch.