dl9pf changed the topic of #yocto to: Welcome to the Yocto Project | Learn more: http://www.yoctoproject.org | Join the community: http://www.yoctoproject.org/community | Channel logs available at https://www.yoctoproject.org/irc/ and https://libera.irclog.whitequark.org/yocto/ | Having difficulty on the list, or with someone on the list? Contact YP community mgr Nicolas Dechesne (ndec)
alejandr1 has joined #yocto
nerdboy_ is now known as nerdboy
nerdboy has joined #yocto
nerdboy has quit [Changing host]
sakoman has quit [Read error: Connection reset by peer]
jpuhlman__ has joined #yocto
jpuhlman_ has quit [Ping timeout: 258 seconds]
sakoman has joined #yocto
sakoman has quit [Quit: Leaving.]
camus has joined #yocto
georgem has quit [Quit: Connection closed for inactivity]
Vonter has quit [Ping timeout: 265 seconds]
camus1 has joined #yocto
camus has quit [Remote host closed the connection]
camus1 is now known as camus
paulg has quit [Ping timeout: 272 seconds]
davidinux1 has joined #yocto
davidinux1 is now known as davidinux
manuel_ has quit [Ping timeout: 246 seconds]
rob_w has joined #yocto
jonah1024 has joined #yocto
goliath has joined #yocto
Schlumpf has joined #yocto
camus1 has joined #yocto
camus has quit [Read error: Connection reset by peer]
camus1 is now known as camus
Guest32 has joined #yocto
<Guest32> Hi,
davidinux has quit [Ping timeout: 265 seconds]
<Guest32> I tried to upgrade the Yocto version from Dunfell to Hardknott. But after upgrading to hardknott I don't see the libpcre2 package under /usr/lib/. Below are the missing so files with hardknott installed:
mckoan|away is now known as mckoan
<mckoan> good morning
<Guest32> libpcre2-8.so.0
davidinux has joined #yocto
<Guest32> libpcre2-posix.so.2 libpcre2-8.so.0.9.0 libpcre2-posix.so.2.0.3
<Guest32> good morning
<Guest32> any input on this?
frieder has joined #yocto
Guest3216 has joined #yocto
Guest3216 has quit [Client Quit]
RKBH has joined #yocto
Guest32 has quit [Ping timeout: 246 seconds]
rfried has quit [Quit: The Lounge - https://thelounge.github.io]
rfried has joined #yocto
cquast has joined #yocto
florian has joined #yocto
zpfvo has joined #yocto
prabhakarlad has joined #yocto
manuel_ has joined #yocto
florian has quit [Ping timeout: 252 seconds]
tnovotny has joined #yocto
RKBH has quit [Quit: Client closed]
Schlumpf has quit [Quit: Client closed]
leon-anavi has joined #yocto
manuel_ is now known as Manuel1985
ant__ has quit [Remote host closed the connection]
Manuel1985 is now known as manuel1985
leonanavi has joined #yocto
leon-anavi has quit [Ping timeout: 258 seconds]
leonanavi is now known as leon-anavi
leon-anavi has quit [Client Quit]
leon-anavi has joined #yocto
Schlumpf has joined #yocto
leon-anavi has quit [Remote host closed the connection]
leon-anavi has joined #yocto
mihai has joined #yocto
<RP> paulbarker: that patchset makes things worse I'm afraid: https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2293
<RP> paulbarker: looks like it is on the older distros
mranostaj has quit [Remote host closed the connection]
mranostaj has joined #yocto
<paulbarker> RP: At least it's a quick failure now!
<paulbarker> RP: Is there a quick way to tell which python version bitbake is running under?
<paulbarker> As Debian 8 will be using the buildtools tarball, I guess older distro actually means newer python
frieder has quit [Ping timeout: 268 seconds]
<kanavin> why is the debian 8 builder even still active?
frieder has joined #yocto
<RP> paulbarker: right, a lot of those (all?) would be using buildtools, yes
<paulbarker> RP: I'll see if I can grab the latest buildtools tarball and run a build with that locally
<RP> paulbarker: you can see which one it is using from the helper
<RP> (it says which hosts at the end too)
<paulbarker> RP: Thank you! I'll grab that and set it up here
<paulbarker> It'll be on opensuse-15.3 but I think the issue here may be due to the Python version so as long as that matches I should hopefully see it fail
<paulbarker> If it all works fine I guess it's time for a Debian 8 VM, though that will take longer
<RP> paulbarker: I'm not sure what the trigger was but that seems the logical place to start...
<RP> paulbarker: could be as simple as something missing from buildtools :/
<paulbarker> RP: Just to confirm - is the hashserv instance running from the same commit of bitbake during these tests? Or is that running from a known-good commit?
<RP> paulbarker: the hashserv is a single instance autobuilder-wide and unchanged during these tests - that would be upgraded separately as it is standalone
<RP> unless tests use a local one
<paulbarker> RP: Ok, that will narrow down where the failure could be
zyga-mbp has joined #yocto
gourve_l has quit [Ping timeout: 258 seconds]
zyga-mbp has quit [Read error: Connection reset by peer]
zyga-mbp has joined #yocto
gourve_l has joined #yocto
florian has joined #yocto
frieder has quit [Ping timeout: 272 seconds]
<kanavin> rburton, RP: zstd decompresses 10 times faster than xz. I'll look into switching rpm compression to that in 4.17 timeframe, as we could get drastically faster do_rootfs and do_populate_sdk from it.
<kanavin> (rpm 4.17 that is, currently in rc)
camus1 has joined #yocto
<kanavin> compression times are similar
<RP> kanavin: sounds nice! :)
<rburton> awesome
<kanavin> RP, rburton: I tested with a 2.5 GB tarball, 6 GB uncompressed
camus has quit [Ping timeout: 272 seconds]
camus1 is now known as camus
<RP> rburton: I just did some stats collection. 18 ptest AB-INT failures, 8 only ever seen on arm host :/
<kanavin> xz took 110 seconds, zstd 10 seconds (!!!)
<rburton> nice!
<rburton> RP: ouch
<rburton> RP: load, i imagine? weaker host?
<RP> rburton: the stats are deceptive as where it occurred once on x86 I didn't mark as arm specific but the arm failures are much more frequent :/
<RP> rburton: I'm not sure of the cause, we did back off the load on the arm worker but it didn't seem to improve things
<RP> kanavin: that is pretty neat.
<RP> rburton: we need some kind of a plan for the ptest issues as they're about 40% of the open AB-INT issues
<rburton> glancing at the list, i'm half debugging 14244 so i'll finish that off
<perdmann_> I want to create a lib from the min protocol, but I get: ERROR: libmin-1.0-r0 do_install: oe_soinstall: libmin.so.1.0 is missing ELF tag 'SONAME'.
<rburton> your makefile is broken
frieder has joined #yocto
* RP has closed 11 of the AB-INT bugs, down to 46 of them now
<perdmann_> rburton: i dont have a makefile ... https://dpaste.org/Z2WR
<RP> rburton: in the interests of closing bugs - https://bugzilla.yoctoproject.org/show_bug.cgi?id=13999 - the remaining issue is overlap of files. Does the sstate code not detect that? Maybe it didn't due to the quoting issue?
<rburton> perdmann_: please write a makefile, and delete most of that recipe
<rburton> perdmann_: there's a perfectly good cmakelists in the repo you're cloning, why are you building by hand?
davidinux has quit [Ping timeout: 272 seconds]
<rburton> perdmann_: your recipe can most likely be just SRC_URI/S assignments, and inherit cmake
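A sketch of the slimmed-down recipe rburton is describing; SRC_URI, SRCREV and the license checksum are placeholders to fill in for the real min repo:

```
SUMMARY = "min protocol library"
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://LICENSE;md5=<fill-in>"

SRC_URI = "git://github.com/example/min.git;protocol=https;branch=master"
SRCREV = "<fill-in>"
S = "${WORKDIR}/git"

inherit cmake
```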
davidinux has joined #yocto
bunk has joined #yocto
<RP> rburton: hmm, you're right about the directory race :/
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<RP> rburton: code even says " # We can race against another package populating directories as we're removing them so we ignore errors here." :/
florian_kc has joined #yocto
<RP> that isn't enough :(
<rburton> what race?
<rburton> obviously, i'm always right
<RP> rburton: https://bugzilla.yoctoproject.org/show_bug.cgi?id=13999 - sstate can be running sstate_clean_manifest whilst another task extracts files
<RP> rburton: always :)
<rburton> oh that, yeah
<rburton> just bite the bullet and put a read/write lock on pkgdata
<rburton> or sstate in general
<RP> rburton: its a general sstate problem though :(
<RP> rburton: the read/write locks are painful on performance
<rburton> worth benchmarking though?
<RP> rburton: I have before a long time ago
<RP> rburton: imagine a build where all the setscene tasks end up serialised :/
zyga-mbp has joined #yocto
camus1 has joined #yocto
camus has quit [Read error: Connection reset by peer]
camus1 is now known as camus
<perdmann_> rburton: that cmake file does not create a .so
<perdmann_> rburton: ohhh it is, i see... i am sorry. i will inherit cmake and try again, thanks
<zedd> RP: 5.13 dropped, so I'm finalizing the libc-headers and reference recipes now ... I'll triple check that our LTP and rcu stall changes are there (I've been testing them on 5.13 already), since they won't be mainline quite yet.
<rburton> perdmann_: delete 99% of the recipe in the process, most of that recipe is redundant or actively harmful
<RP> zedd: thanks. We managed to close a lovely number of AB-INT bugs with those :)
<RP> zedd: 58 down to 46
<rburton> is that rcu lock the core problem? and now it will stall on load but not crash and die?
<zedd> awesome. and I hope the remaining are less annoying, :D hopefully no more kernel ones.
<zedd> but we obviously should document those as "why the yocto AB stress testing helps the world"
<RP> rburton: we'll get warnings now but not hangs, the hang was the problem
<rburton> right
<rburton> as the stalls are not massively unexpected when on heavy load, that's fine
<RP> rburton: exactly
<RP> we can live with the odd stall. Locking up the VM is antisocial though
<perdmann_> rburton: so i don't need SOFILE and the SO-related stuff?
<RP> zedd: it means I can't ignore the "bitbake server timeout" issue for much longer :(
<RP> zedd: I'd rather debug the kernel than try and fix that :(
<rburton> perdmann_: no, that's all default
<rburton> and the insane skips were because you were building wrong
<RP> zedd: I've closed most of the qemu weirdness bugs on the basis we should reopen new ones with "good" data
<zedd> RP: indeed. I'm thinking it is even harder to reproduce for debugging as it's in the guts of things
<RP> zedd: I know what the bitbake server issue is. I could just increase the timeout as it is an IO problem. I just don't like doing that :/
<zedd> RP: yah, that's the most efficient way to get real ones to pop back up.
<RP> Really the whole bitbake server thing needs rewriting
<zedd> aha
<RP> zedd: torn between hacking around it or doing some nasty rewrite
* zedd is always tempted by rewrites :D
* RP remembers the large number of races we fixed in this code already
<perdmann_> rburton: thanks...
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
georgem has joined #yocto
zyga-mbp has joined #yocto
dmoseley has quit [Quit: ZNC 1.8.2 - https://znc.in]
argonautx has joined #yocto
<perdmann_> rburton: ok, i removed evything and added the line "inherit cmake"
<perdmann_> But then bitbake tells me its missing an install task, so i readded the install task but then i get some SONAME Error
<rburton> inherit cmake will provide an install task
<rburton> pastebin your recipe?
<perdmann_> rburton: of course
<rburton> maybe the cmakelists is broken too
<rburton> you just need to respect CC CPPFLAGS CFLAGS LDFLAGS etc, all in the environment
<rburton> cmake does that normally, but people can write bad cmakefiles that explicitly don't
<rburton> only so much you can do when people actively break stuff
<rburton> RP: think i fixed the util-linux one
dmoseley has joined #yocto
<rburton> perdmann_: you can remove FILESEXTRAPATHS and all your FILES_
<rburton> cmake.bbclass definitely has an install task
<rburton> unless the cmake doesn't have an install action, which is what you mean
<perdmann_> rburton: | ninja: error: unknown target 'install'
<rburton> yeah their cmake doesn't provide an install then
<perdmann_> rburton: yes, so do_install just calls this install section, which i dont have
<rburton> and they didn't set a soname in the library either
<perdmann_> That's why it only builds an .a file?
<rburton> oh if it only builds a .a then that's exactly why there's no soname, just install the .a
<rburton> not using the soinstall as that's for Shared Objects, not archives
<perdmann_> ok, i had the idea that i wanted to link that dynamically
<rburton> fix the cmakelist to build a shared library then
<perdmann_> with a patch?
<rburton> yeah
<rburton> easier, and you get to send it upstream too
<perdmann_> rburton: sounds like a good idea
<perdmann_> i will, i just need to find out how to do that in cmake
<rburton> iirc you just add SHARED in the build library statement
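Roughly what that CMake patch would look like (target and source names are invented); the VERSION/SOVERSION properties are what give the library the SONAME that oe_soinstall was asking for:

```cmake
# build a shared library instead of a static archive
add_library(min SHARED min.c)
# produces libmin.so.1.0 with SONAME libmin.so.1
set_target_properties(min PROPERTIES VERSION 1.0 SOVERSION 1)
```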
<rburton> RP: turns out util-linux ptest wasn't testing most of util-linux
<RP> rburton: Why am I not surprised :(
<rburton> this was a relatively recent change but it should have been spotted in ptest regressions
<perdmann_> rburton: yes. Lets see. :)
<rburton> ooh util-linux can now build with meson
<RP> rburton: I worry we don't handle regression tests correctly :(
<rburton> me too
<rburton> i think the lack of a decent machine readable format doesn't help
<rburton> qa should be able to generate a table of all the tests and their results
<perdmann_> rburton: it works
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<perdmann_> install still missing... what's your suggestion: a CMake patch adding install, or a do_install task?
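Patching the CMakeLists to add an install target is usually the cleaner option, since `inherit cmake` then provides a working do_install with no recipe code; a sketch assuming a target named `min`:

```cmake
include(GNUInstallDirs)
install(TARGETS min
        LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
        ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR})
install(FILES min.h DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
```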
<RP> rburton: we have a machine readable format?
<rburton> well, sort of :)
<RP> rburton: the hard part is finding the one to compare against automatically
<rburton> you do quite often see mangled test names as it got all confused
<RP> rburton: they should at least get mangled consistently
zyga-mbp has joined #yocto
paulg has joined #yocto
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
zyga-mbp has joined #yocto
<RP> rburton: I think https://bugzilla.yoctoproject.org/show_bug.cgi?id=14379 is related too
davidinux has quit [Ping timeout: 268 seconds]
davidinux has joined #yocto
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
<perdmann_> rburton: IT WORKED! thanks a lot.
Schlumpf has quit [Quit: Client closed]
camus1 has joined #yocto
camus has quit [Ping timeout: 272 seconds]
camus1 is now known as camus
Falital has joined #yocto
<jonesv[m]> Is it not possible to have recipes point to private git repos? I was hoping that bitbake would just try to use my user ssh key, but it appears it does not 😕
<jonesv[m]> Or maybe it does not work with a passphrase-protected key?
<jonesv[m]> <jonesv[m] "Or maybe it does not work with a"> oooh, if I use `ssh-agent` it works. It just does not want to ask for my password apparently
<rburton> yeah, agents work, escaping from several layers of abstraction to ask for a password less so
<rburton> also agents work in automated builds, asking a non-existent user to enter a password doesn't work well
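For reference, a SRC_URI for a private repo fetched over ssh looks something like this (host and path are placeholders); the key just needs to be usable non-interactively, e.g. loaded into ssh-agent before bitbake runs:

```
SRC_URI = "git://git@git.example.com/private/repo.git;protocol=ssh;branch=main"
SRCREV = "<fill-in>"
```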
otavio has quit [Remote host closed the connection]
sakoman has joined #yocto
otavio has joined #yocto
zyga-mbp has joined #yocto
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
tlwoerner has quit [Remote host closed the connection]
tlwoerner has joined #yocto
tnovotny has quit [Quit: Leaving]
sakoman has quit [Remote host closed the connection]
sakoman has joined #yocto
jonah1024 has quit [Quit: Connection closed for inactivity]
jonah1024 has joined #yocto
davidinux has quit [Ping timeout: 268 seconds]
davidinux has joined #yocto
BCMM has joined #yocto
manuel1985 has quit [Quit: Leaving]
<override> morning, can someone link me to some systemd service recipe templates? About to write my first one, so I need something to go off of.
<override> just a template that'll help me figure out what to inherit and all maybe
<override> thanks!
<override> mckoan: thanks!
<override> mckoan: what's a good way to figure out if i've got systemd enabled by default on my image, as opposed to SysV init?
zyga-mbp has joined #yocto
argonautx has quit [Ping timeout: 252 seconds]
<mckoan> override: seen from the Yocto build point of view or from the target system?
argonautx has joined #yocto
<override> build pov, mckoan:
<mckoan> override: if you have DISTRO_FEATURES_append = " systemd"
<override> got it, thanks!
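For completeness, making systemd the actual init manager (rather than merely enabling the feature) usually takes the full set of distro settings, of which mckoan's line is the key one:

```
DISTRO_FEATURES_append = " systemd"
VIRTUAL-RUNTIME_init_manager = "systemd"
DISTRO_FEATURES_BACKFILL_CONSIDERED = "sysvinit"
VIRTUAL-RUNTIME_initscripts = ""
```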
frieder has quit [Remote host closed the connection]
<jonesv[m]> There is something I don't get yet. An image is a group of packages. I can define multiple different images, and build them with `bitbake flavor1-image` or `bitbake flavor2-image`, right? And I define flavor1 in say `meta-flavor1`, and flavor2 in say `meta-flavor2`. So in my bblayers.conf, I have both those layers included. And if they both contain a `bbappend` (say they both create a config file for hostapd), then those two bbappend conflict
<jonesv[m]> with each other.
<jonesv[m]> Is there a way to not have meta-flavor1 look into the meta-flavor2 layer?
<jonesv[m]> My guess is that they should be both in the same layer, say `meta-myproject`, and there I should define two images: `recipes-flavor1` and `recipes-flavor2`. But in my case, I have images that are quite different from each other (i.e. different projects), so they don't feel like they belong to the same layer. However, I don't want to checkout a new poky setup for each project, because then it will take a ton of disk space and I need to rebuild
<jonesv[m]> everything from scratch for each new project 😕
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
frieder has joined #yocto
frieder_ has joined #yocto
frieder has quit [Ping timeout: 265 seconds]
zpfvo has quit [Remote host closed the connection]
frieder_ has quit [Remote host closed the connection]
mckoan is now known as mckoan|away
florian_kc has quit [Quit: Ex-Chat]
* paulg is almost afraid to ask if the autobuilder is ok, or still spitting out random RCU implicated spews...
florian has quit [Ping timeout: 272 seconds]
Vineela has joined #yocto
jonah1024 has quit [Quit: Connection closed for inactivity]
<jonesv[m]> I tried to formalize my question here, if somebody is interested: https://stackoverflow.com/questions/68167244/image-specific-layers
<Tartarus> JPEW: Hey, mingw tangent question. Is SDK_ARCHIVE_TYPE expected to be set in local.conf ? It's not in BB_ENV_EXTRAWHITE_OE under scripts/oe-buildenv-internal
<Tartarus> or is tar.xz really just easy enough to work with in Windows these days it doesn't matter? I haven't shuffled + rebooted for this quick PoC I built yet :)
<JPEW> Tartarus: It should be set in local.conf... not sure if it should be in EXTRAWHITE
<Tartarus> OK, easy enough, thanks.
<JPEW> Last I checked, tar.gz had some troubles on Windows, but TBH we don't use either that *or* zip and have a self-extracting python file.... which I _still_ need to upstream
<override> anyone know what layer oe keeps nginx recipe under?
<Tartarus> JPEW: I was a little surprised there wasn't an installer like Linux, but only a little. Just doing a PoC for a quote for a customer atm anyhow
<RP> paulg: much happier
<JPEW> Tartarus: Ya. I wanted to do a unified replacement for the installer that used python so it would be the same on both MinGW and Linux
<JPEW> But still a work in progress
<RP> paulg: I closed about 12 open bugs on the basis that several were related...
<JPEW> Tartarus: The basic idea is to create a tar.gz file with the SDK contents, then use Python to extract it and do the pre/post processing
<JPEW> Tartarus: There is a pretty interesting trick that Python has where if the archive is a zip file, it will extract the contents to a temporary directory and execute __main__.py from them; this would allow you to efficiently package the SDK tar.gz with the extraction script in a single file
<Tartarus> Neat
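The trick JPEW describes is easy to demo: pass a zip archive containing a `__main__.py` to the interpreter and Python executes it directly, so an SDK payload plus extraction script can ship as one file. A self-contained sketch (the "payload" here is just a print):

```python
# Build a zip with a __main__.py, then run it with the interpreter.
import os
import subprocess
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "installer.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        # A real installer would also bundle the SDK tar.gz here and
        # have __main__.py unpack it; this one just prints a marker.
        zf.writestr("__main__.py", "print('extracting SDK payload...')\n")
    # Python runs __main__.py straight from the archive
    out = subprocess.run([sys.executable, archive],
                         capture_output=True, text=True)

print(out.stdout.strip())
```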
<override> rburton: in a service file for systemd, would something like Requires=nginx.service work, or would I have to use a variable or something for nginx?
<override> the service i'm working with has a lot going on for nginx, so i'm trying to see how that would work
<paulg> RP, well that is good news.
<RP> paulg: yes, its made me a lot happier
Falital has quit [Ping timeout: 272 seconds]
<override> basically when I bring in nginx using a recipe, can I be writing services with stuff like Requires=nginx.service?
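For reference, a unit shipped by one recipe can depend on another recipe's unit by name; no variable is needed. A hypothetical example:

```
[Unit]
Description=My web application
Requires=nginx.service
After=nginx.service

[Service]
ExecStart=/usr/bin/myapp

[Install]
WantedBy=multi-user.target
```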
<paulg> RP, was that 12 just for RCU dain-bramage alone, or also including earlier LTP/cgroup wreckage?
<JPEW> override: Recipes with systemd support will list the services they install in the SYSTEMD_SERVICE variables
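A sketch of the recipe side, using the pre-honister override syntax of this era (unit name and paths are hypothetical):

```
inherit systemd

SRC_URI += "file://myapp.service"

SYSTEMD_SERVICE_${PN} = "myapp.service"
SYSTEMD_AUTO_ENABLE = "enable"

do_install_append() {
    install -d ${D}${systemd_system_unitdir}
    install -m 0644 ${WORKDIR}/myapp.service ${D}${systemd_system_unitdir}
}

RDEPENDS_${PN} += "nginx"
```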
<JPEW> Hmm, I had several jobs timeout when trying to share DL_DIR over NFS. It looks like they all deadlocked trying to flock the .lock file
<JPEW> Anyone else see such a thing?
BCMM has quit [Ping timeout: 258 seconds]
creich has quit [Remote host closed the connection]
camus has quit [Read error: Connection reset by peer]
camus has joined #yocto
ant__ has joined #yocto
xmn has joined #yocto
camus1 has joined #yocto
camus has quit [Ping timeout: 268 seconds]
camus1 is now known as camus
<chrfle> Does anyone know if block devices which are not mounted are automatically synced upon reboot?
Guest15 has quit [Quit: Client closed]
<rburton> if they're not mounted... how will they have pending writes?
<chrfle> rburton: e.g. dd if=my_fancy_firmware of=/dev/sda6
Spooster has joined #yocto
florian has joined #yocto
<marc1> chrfle: systemd will call sync() at shutdown which in turn will flush all fs and block devices caches, see: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/sync.c#n123
<chrfle> marc1: do you know where in systemd that call is made?
<chrfle> systemd-shutdown I presume, will go have a look
yates has joined #yocto
davidinux has quit [Read error: Connection reset by peer]
davidinux has joined #yocto
<smurray> chrfle: one option there is to tell dd to use direct writes
mattofak has quit [Remote host closed the connection]
<marc1> chrfle: look at shutdown.c in SD sources
<chrfle> marc1: yeah, found it in async.c called from shutdown.c, thanks
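The combination marc1 and smurray describe can be shown in miniature: an fsync on the written file makes that data durable, and a system-wide sync (what systemd does at shutdown) flushes everything else. Temp files stand in for the real block device here:

```python
# fsync one file's data, then flush everything else system-wide.
import os
import tempfile

image = b"firmware image contents"
fd, path = tempfile.mkstemp()
try:
    os.write(fd, image)
    os.fsync(fd)   # data for this descriptor is durable before close
finally:
    os.close(fd)
os.sync()          # like sync(1): flush all remaining dirty buffers
written = os.path.getsize(path)
os.unlink(path)
print("flushed", written, "bytes")
```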
BCMM has joined #yocto
Falital has joined #yocto
Falital has quit [Client Quit]
jpuhlman__ has quit [Quit: Leaving]
jpuhlman has joined #yocto
Vineela has quit [Quit: Leaving.]
Vineela has joined #yocto
zyga-mbp has joined #yocto
leonanavi has joined #yocto
leon-anavi has quit [Ping timeout: 272 seconds]
camus has quit [Ping timeout: 265 seconds]
camus has joined #yocto
zyga-mbp has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]
davidinux has quit [Ping timeout: 268 seconds]
<Spooster> I added a kernel cfg fragment... and it looks like the recipe picked it up... and I see that it showed up in the workdir... aside from verifying the behavior by running the kernel... is there another way to verify that the new kernel was built with the options?
<Spooster> my fear is I just copied a random file that ends in .cfg, and it won't do anything
davidinux has joined #yocto
<smurray> Spooster: look at the .config in the kernel build directory under ${WORKDIR}?
<Spooster> I see my .cfg fragment, a file named defconfig.cfg, and a couple others popping up in ./build/tmp/work/raspberrypi4_64-poky-linux/linux-raspberrypi/1_5.4.72+gitAUTOINC+5d52d9eea9_154de7bbd5-r0
<Spooster> but I don't know enough about the meta-raspberrypi recipe to know if "that's it" or if that's the ${workdir}
<smurray> there's a build output directory under there, for linux-raspberrypi, it'll be something like linux-raspberrypi4_64-standard-build, in there will be the final .config
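Two quick checks, roughly (recipe name and config option are examples): grep the final .config, or let the kernel-yocto class audit whether the fragments took effect:

```shell
# from the linux-*-build directory smurray mentions:
grep CONFIG_MY_OPTION .config

# or run the config audit, which warns about options that didn't apply:
bitbake linux-raspberrypi -c kernel_configcheck -f
```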
rob_w has quit [Read error: Connection reset by peer]
<RP> paulg: covered both but ltp was 2-3 of that total
<RP> abelloni: looks like there is an arm ltp hang again
<RP> paulbarker: Looks like a prserver hang: https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/2268 :/
<RP> paulbarker: is there any debug we want from that?
<Spooster> +1 smurray tyvm. Found and confirmed what I was hoping to see
goliath has quit [Quit: SIGSEGV]
<paulbarker> RP: Damn. I have no idea where it is hanging now if there's no backtrace at all. May be worth a look at the cookerdaemon log at least
leonanavi has quit [Remote host closed the connection]
leonanavi has joined #yocto
florian has quit [Ping timeout: 272 seconds]
<RP> paulbarker: no traceback, last command completed successfully, last command looked to be "bitbake -R conf/prexport.conf -p"
camus has quit [Ping timeout: 272 seconds]
camus has joined #yocto
<RP> paulbarker: that looks like something bitbake-prserv-tool would run
<paulbarker> RP: Ok, so that bitbake command completed successfully but the corresponding test (likely test_import_export_override_db) never finished
<RP> paulbarker: The cooker log says the command completed, I'm not sure the server exits
<RP> paulbarker: agreed, yes
<RP> $ ps ax | grep prser
<RP> 1441773 ? S 0:00 /bin/sh -c bitbake-prserv-tool export /home/pokybuild/yocto-worker/oe-selftest-ubuntu/build/build-st-507461/export.inc
<paulbarker> So it's probably stuck waiting for the server to shutdown, somewhere where there is no timeout
<RP> paulbarker: it kind of looks like bitbake's main loop thinks something is still active
<paulbarker> RP: Is there any way to figure out which bitbake pid is the prservice server? Maybe run `lsof` with the path to prserv.sqlite3
<RP> sh(1441773)───bash(1441776)───KnottyUI(1443232)───{KnottyUI}(1444017)
<RP> paulbarker: pstree -p 1441773
<RP> paulbarker: what looks bad is that there are a ton of parser worker zombie processes
<RP> 1444977 ? Z 0:03 [Parser-2] <defunct>
<RP> 1444980 ? Z 0:03 [Parser-3] <defunct>
<RP> but 58 of them
<paulbarker> Ouch
<paulbarker> So I'm guessing the main bitbake process is stuck at http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/bitbake/lib/prserv/serv.py?h=master-next#n365
<RP> paulbarker: I can see 1444905 is the prserv (or it at least has the sqlite open)
<paulbarker> If you kill that pid I'd like to see if the bitbake server shuts down cleanly
<RP> so you want me to kill it?
<paulbarker> Yes just that pid
<RP> paulbarker: now also a zombie
Spooster has quit [Remote host closed the connection]
Spooster has joined #yocto
<paulbarker> Well that's disappointing
florian has joined #yocto
Spooster has quit [Ping timeout: 258 seconds]
<paulbarker> RP: So I started with the assumption that the way cooker spawns hashserv (http://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/bitbake/lib/bb/cooker.py#n389) is well validated
<paulbarker> But that doesn't get exercised on the autobuilder as it uses a separate hashserv daemon
<RP> paulbarker: correct
<paulbarker> That code creates the asyncio loop in the main process then runs it (??) in a subprocess
<paulbarker> I wonder if the next step is to rip that out for prserv so the subprocess is started first then all the prserv work (opening database, initialising asyncio loop, etc) occurs within the subprocess
<RP> paulbarker: this is what we used to do, as it is hard to ensure the subprocesses don't hold the wrong resources. It's not very pythonic though :/
<RP> paulbarker: what is odd is that there is one parser thread that is still "alive" :/
<paulbarker> Hanging code isn't very pythonic either haha
cquast has quit [Ping timeout: 268 seconds]
<RP> paulbarker: I get a lot of complaints about the fact we use old fashioned fork() calls ;-)
<RP> but yes, I like old/simple in many ways for this reason
<paulbarker> What does bother me is that I've never been able to replicate the issue here. I guess it's due to a lower level of parallelism
<RP> the autobuilder does seem to find things at scale that most people don't :/
<paulbarker> I tried to run oe-selftest in parallel but it triggered the OOM killer
<paulbarker> `oe-selftest -j12 ...`, BB_NUMBER_THREADS=12 and PARALLEL_MAKE=-j12.
<paulbarker> Ate through 64GB RAM, 8GB swap and then the kernel started chomping processes
<paulbarker> I saw a load average >700 on this 6 core/12 thread machine
<RP> paulbarker: ok, I installed python3-dbg and we have some backtraces on the processes. Let me try and dump this into an email
Vineela has quit [Ping timeout: 272 seconds]
florian has quit [Ping timeout: 252 seconds]
<RP> paulbarker: I've mailed it over to you. It looks to me like when bitbake forks off the parser worker threads, the worker threads are inheriting the asyncio in progress from the parent :/
<paulbarker> RP: Ah that would definitely break everything!
argonautx has quit [Quit: Leaving]
florian has joined #yocto
<RP> paulbarker: looking more closely I'm wrong about that. It is a parser thread sitting in async client connection code
camus has quit [Read error: Connection reset by peer]
camus1 has joined #yocto
<paulbarker> RP: I'll take a look at those dumps tomorrow
Spooster has joined #yocto
camus1 is now known as camus
<RP> paulbarker: it's sitting in prserv_dump_db() but since we killed the server now, I'm not sure what it would do. It is data at least, happy to have the stack traces
florian has quit [Ping timeout: 258 seconds]
leonanavi has quit [Quit: Leaving]
<RP> abelloni, rburton: with the arm worker ltp bug, I ssh'd in and it was stuck on proc01. I installed strace, attached to the stuck process and it unblocked it and everything started running again
<RP> I lost the log as it scrolled off my terminal buffer :(
<RP> it is reading /proc/kmsg
<jonesv[m]> hmm I thought I could use `${IMAGE_BASENAME}` to enable my bbappend only for a specific image, but that does not seem possible... (details here: https://stackoverflow.com/questions/68167244/image-specific-layers)
<abelloni> yeah, so proc01 is an issue on arm
<abelloni> and it is blocked on a read
camus1 has joined #yocto
camus has quit [Ping timeout: 268 seconds]
camus1 is now known as camus
<jonesv[m]> Would it make sense to add a COMPATIBLE_IMAGE variable, similar to COMPATIBLE_MACHINE or COMPATIBLE_HOST, that would allow me to write bbappends that are ignored on incompatible images?
Vineela has joined #yocto
fullstop_ has joined #yocto
fullstop has quit [Ping timeout: 244 seconds]
fullstop_ is now known as fullstop
abelloni has quit [Ping timeout: 244 seconds]
abelloni has joined #yocto
xantoz has quit [Ping timeout: 244 seconds]
xantoz has joined #yocto