<cyborg_ar>
my scope doesnt have per-channel bandwidth limiting, it is a global setting for the whole scope
<d1b2>
<david.rysk> @azonenberg @johnsel without very much caching a GH hosted CI run takes about 20min
<d1b2>
<david.rysk> Will have best-case numbers with caching soon
<azonenberg>
cyborg_ar: innnteresting
<azonenberg>
file a ticket, thats not something we can cleanly express in our current api
<cyborg_ar>
for now im just ignoring it, i dont normally use it. for you this scope is DC anyway, 150MHz :P
<azonenberg>
we have worked with some instruments where a particular setting was either per channel or global based on the model, but have never seen that for bandwidth limit
<azonenberg>
yeah but still good to know this is a thing that exists
<cyborg_ar>
ugh i wish parsing strings in C++ wasnt such a pain. im so spoiled with python...
<cyborg_ar>
kind of bothers me that the driver is doing its own parsing with sscanf; i should be able to pass the transport a format spec and get back a parsed response, with built-in error handling. here most calls with sscanf are unguarded, so a bad response can mess things up or crash
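[A minimal sketch of that idea, for illustration only: one guarded query-and-parse helper instead of unguarded sscanf calls scattered through a driver. The ITransport interface and QueryDouble name are hypothetical, not the actual scopehal transport API.]
```cpp
// Sketch only: centralize reply parsing with error handling, instead of
// unguarded sscanf calls inside each driver. ITransport and QueryDouble are
// hypothetical names for illustration, not the scopehal transport API.
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical transport interface: send a command, return the raw reply
struct ITransport
{
    virtual ~ITransport() = default;
    virtual std::string SendCommandWithReply(const std::string& cmd) = 0;
};

// Query and parse in one place; a malformed reply throws instead of leaving
// the output variable full of garbage
double QueryDouble(ITransport& t, const std::string& cmd)
{
    std::string reply = t.SendCommandWithReply(cmd);
    double value = 0;
    if (std::sscanf(reply.c_str(), "%lf", &value) != 1)
        throw std::runtime_error("Malformed reply to \"" + cmd + "\": \"" + reply + "\"");
    return value;
}
```
[A driver would then call something like QueryDouble(transport, ":TIMEBASE:SCALE?") and get either a valid number or a catchable error, never a silently garbage value.]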
<azonenberg>
Yeah improving robustness would be nice to do. but I'm less worried about input parsing bugs in this particular context
<azonenberg>
you're talking to hopefully trusted hardware that should be on an isolated network
<azonenberg>
like, anyone who can tamper with the probably-unencrypted-and-unauthenticated connection can change your power supply voltage and fry your DUT
<azonenberg>
being able to segfault ngscopeclient with malformed traffic is the least of your worries in that situation
<azonenberg>
making it more robust to user error would certainly be nice, i'm absolutely open to improvements in that regard
<cyborg_ar>
for pure SCPI instruments some sort of auto-probing would be nice too, since each driver should be able to know if they can drive an instrument for a given *IDN? response. non-scpi instruments of course dont get that
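[A sketch of what such auto-probing could look like, assuming a hypothetical driver registry with a CanHandle() hook keyed off the *IDN? reply; this is not current scopehal API.]
```cpp
// Sketch only: picking a driver from the *IDN? response. The DriverInfo
// registry and CanHandle() hook are hypothetical, not current scopehal API.
#include <functional>
#include <string>
#include <vector>

struct DriverInfo
{
    std::string name;
    // Returns true if this driver recognizes the instrument's *IDN? reply
    // (typically by matching the manufacturer and model fields of the
    // comma-separated "<maker>,<model>,<serial>,<firmware>" string)
    std::function<bool(const std::string& idn)> CanHandle;
};

// Return the first registered driver that claims the instrument, or an empty
// string so the UI can fall back to manual driver selection
std::string ProbeDriver(const std::vector<DriverInfo>& drivers, const std::string& idn)
{
    for (const auto& d : drivers)
        if (d.CanHandle && d.CanHandle(idn))
            return d.name;
    return "";
}
```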
<azonenberg>
Yes. That is on the distant wishlist
<azonenberg>
some way to connect to an unknown transport and figure out what driver to invoke on it
<azonenberg>
Also improving robustness if you accidentally select the wrong driver. generally more exceptions and clean error handling will be nice to have eventually
<cyborg_ar>
ah yeah whats the policy on exceptions here?
<azonenberg>
Not currently used, except by the Vulkan wrapper library and I think the YAML parser
<cyborg_ar>
C++ exceptions are kinda ugly so some projects dont even use them
<cyborg_ar>
ah ok
<azonenberg>
but we are probably going to start using them as it seems better than the way we do things now
<azonenberg>
in particular for instrument creation failures
<azonenberg>
there's no good way to get feedback that a connection failed or a driver couldn't initialize in the current model
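[For illustration, a sketch of the exception-based creation flow being described, with hypothetical stand-in types; the real scopehal classes differ.]
```cpp
// Sketch only: instrument creation that either fully succeeds or throws,
// so the caller never receives a half-initialized object. All types here
// are hypothetical stand-ins, not the real scopehal classes.
#include <memory>
#include <stdexcept>
#include <string>

struct Transport { /* connection details elided */ };

struct Instrument
{
    explicit Instrument(std::unique_ptr<Transport> t) : m_transport(std::move(t)) {}
    bool Initialize() { return m_transport != nullptr; }  // placeholder init
    std::unique_ptr<Transport> m_transport;
};

struct ConnectionFailedError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Hypothetical factory: returns nullptr on failure in this sketch
std::unique_ptr<Transport> ConnectTransport(const std::string& path)
{
    return path.empty() ? nullptr : std::make_unique<Transport>();
}

std::unique_ptr<Instrument> CreateInstrument(const std::string& path)
{
    auto transport = ConnectTransport(path);
    if (!transport)
        throw ConnectionFailedError("Could not connect to " + path);

    auto inst = std::make_unique<Instrument>(std::move(transport));
    if (!inst->Initialize())
        throw std::runtime_error("Driver failed to initialize on " + path);
    return inst;
}
```
[The connect dialog would wrap the call in try/catch and show the message, rather than crashing or keeping a dead instrument around.]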
<azonenberg>
Backstory: this code is the descendant of something i wrote during my thesis to "just get my work done" and only went public in a form that was buildable by other people (no hard coded paths etc) circa 2020
<azonenberg>
and it took a while for momentum to build and more people to get interested in working with it
<azonenberg>
So there's a fair bit of legacy cruft that needs cleaning out
<azonenberg>
the frontend is generally nicer code because it was completely rewritten recently
<cyborg_ar>
yeah i noticed the program would just crash or you would get a partially initialized instrument
<_whitenotifier-3>
[scopehal-apps] azonenberg cc3acb1 - Merge pull request #677 from d235j/headless-tests Prevent glfw init for tests as glfw requires an X/Wayland server
sgstair has quit [Read error: Connection reset by peer]
sgstair has joined #scopehal
bvernoux has joined #scopehal
<_whitenotifier-e>
[scopehal] azonenberg bfdbe24 - I2CDecoder: removed ascii output since we now have hexdump and ascii output in the protocol analyzer directly from the binary data
<_whitenotifier-3>
[scopehal-apps] ... and 3 more commits.
<azonenberg>
Finally finished
<azonenberg>
This was a massive improvement on my test dataset, like a literal order-of-magnitude difference in framerate
<azonenberg>
went from "barely possible to use but super annoying and jerky" to butter smooth 60 FPS
<azonenberg>
on a 1-minute-long bus capture with ~75K packets in the analyzer view
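[For context: the generic Dear ImGui pattern for this kind of culling is ImGuiListClipper, which only submits the rows currently in view. A sketch under that assumption, not necessarily how ngscopeclient's analyzer view is implemented.]
```cpp
// Sketch only: cull a long list with ImGuiListClipper so only the visible
// rows are submitted each frame. Not necessarily how ngscopeclient's
// protocol analyzer view does it.
#include "imgui.h"
#include <string>
#include <vector>

void DrawPacketList(const std::vector<std::string>& packetSummaries)
{
    ImGuiListClipper clipper;
    clipper.Begin(static_cast<int>(packetSummaries.size()));
    while (clipper.Step())
    {
        // Only [DisplayStart, DisplayEnd) gets laid out, so a ~75K packet
        // capture costs about the same per frame as a few dozen rows
        for (int i = clipper.DisplayStart; i < clipper.DisplayEnd; i++)
            ImGui::TextUnformatted(packetSummaries[i].c_str());
    }
    clipper.End();
}
```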
<azonenberg>
just in time too, monday starts in 15 minutes and it's a work day
<azonenberg>
and i really didnt want to have to put up with this lag for another work day :p
<d1b2>
<246tnt> For some reason I still can't build master : CMake Error at lib/scopehal/CMakeLists.txt:263 (target_link_libraries): Target "scopehal" links to: Vulkan::glslang but the target was not found. Possible reasons include: * There is a typo in the target name. * A find_package call is missing for an IMPORTED target. * An ALIAS target is missing.
<d1b2>
<azonenberg> @david.rysk ^
<d1b2>
<246tnt> (This is without david's rework of build. With his cmake-cleanup branch it works fine)
<d1b2>
<johnsel> I wouldn't worry about it then, that'll get integrated soon
<azonenberg>
ah ok
<azonenberg>
also hmm it seems my culling rework broke the "select closest packet when you move the cursor" workflow
<azonenberg>
unless the cursor happens to be over a packet in the visible (non culled) part of the list
<d1b2>
<johnsel> azonenberg can you ping 10.2.14.10 ?
<azonenberg>
no
<d1b2>
<johnsel> I'm pretty sure I used to have RDP access to that stupid box
<d1b2>
<johnsel> it thinks it has an ip but is also not connected
<d1b2>
<johnsel> and console has 0.05fps
<d1b2>
<johnsel> it might be the e1000 adapter but I can't template the RTL one
<d1b2>
<johnsel> grrrr
<d1b2>
<johnsel> sometimes I wish we went with VSphere instead
<d1b2>
<johnsel> would've been done by now
<d1b2>
<johnsel> great it crashed
<d1b2>
<johnsel> azonenberg can you check the logs?
<d1b2>
<johnsel> this new vm won't start
<d1b2>
<johnsel> it says I should check the logs
<d1b2>
<johnsel> oh now it did start
<d1b2>
<johnsel> so strange this, it's also extremely slow to stop the vm or start it
<d1b2>
<johnsel> almost as if things are going wrong on the xen level
<d1b2>
<johnsel> ok good
<d1b2>
<johnsel> @azonenberg can you pull a new template?
<d1b2>
<246tnt> Oh wow, just got a fail notif for the workflow run on github from my PR ... it took 15h to fail ...
<d1b2>
<johnsel> Hopefully the issue is resolved now. For some reason when I create a vm from that template it'll be super slow and BSOD. When I create it through the GUI everything is fine. So annoying. I remember running into this before, so I may need to re-build that template as well perhaps
<d1b2>
<johnsel> anyway config/setup wise I think we're mostly good
<d1b2>
<johnsel> just got to figure out the x server situation on the linux ci box
<d1b2>
<johnsel> but please make sure there is a GPU assigned to it as well. I didn't get one in the last session so it might have lost the association
<azonenberg>
Also reminder everyone, dev call is in just over two hours. I'll drop a zoom link in here shortly
<d1b2>
<johnsel> The GPU objects can be listed with the standard object listing commands: xe pgpu-list, xe gpu-group-list, and xe vgpu-list. The parameters can be manipulated with the standard parameter commands. For more information, see Low-level parameter commands.
<_whitenotifier-3>
[scopehal] azonenberg e849e30 - Fixed two filters that included scopeprotocols.h forcing them to be recompiled needlessly when any other filter header was changed
<d1b2>
<david.rysk> Or have it in if-blocks in the workflow
<d1b2>
<david.rysk> But if you do it on the VM ahead of time, that speeds up build
<d1b2>
<johnsel> yes but try to make it possible to copy from the script
<d1b2>
<johnsel> I always run builds by copying from the github actions if I can
<d1b2>
<johnsel> this way you know everyone gets the same build environment as well
<d1b2>
<johnsel> it's fine splitting out stuff to separate scripts as well
<d1b2>
<johnsel> I wouldn't even put 2 versions of the same OS in one file
<d1b2>
<johnsel> ideally, if you want to keep github runners, we want to just add self-hosted to the existing scripts for those we have an OS image for
<d1b2>
<johnsel> there shouldn't be much management overhead
<d1b2>
<david.rysk> I can add them in once the VMs are working
<d1b2>
<david.rysk> you'll end up with mostly duplicated scripts
<d1b2>
<johnsel> that doesn't matter if it simplifies administrating
<d1b2>
<johnsel> and I can get you a working vm if you want to run scripts against it
<d1b2>
<johnsel> and if andrew blesses it also ssh access
<d1b2>
<david.rysk> I'd like Debian, Fedora, Alpine, if possible
<d1b2>
<johnsel> and access to scopehal-ci with the cloud-config per os
<d1b2>
<johnsel> if you want to change what is pre-installed
<d1b2>
<johnsel> though I'm happy to do that too
<d1b2>
<johnsel> @azonenberg ^
<d1b2>
<johnsel> it's templates time again
<azonenberg>
yeah i can set up ssh access for you if needed. as far as templates go i'll do that in a min
<azonenberg>
Typing one handed with a sandwich in the other
<d1b2>
<david.rysk> don't preinstall anything, I'll put it in the scripts for now
<d1b2>
<david.rysk> then we can preinstall it and it can be a no-op in the scripts
<d1b2>
<johnsel> sure, let me make those changes and remove the ephemeral flag
<azonenberg>
ok template coming right up
<d1b2>
<azonenberg> I dont have the time to learn the ins and outs of how the CI works... just play well together kids 😛
<d1b2>
<david.rysk> can you also make it so that I can use the self-hosted runners from my fork?
<d1b2>
<johnsel> I can allow a PR
<d1b2>
<azonenberg> I'm not sure how it would work with a fork on your own repo. if it was a branch on the main repo, yes
<d1b2>
<johnsel> then all the changes pushed are run
<d1b2>
<azonenberg> or yes a PR
<d1b2>
<david.rysk> I can PR the changes and put a comment to not merge yet 😛
<d1b2>
<johnsel> PR is easiest
<d1b2>
<david.rysk> but that will spam the crap out of this channel
<d1b2>
<azonenberg> yeah just mark as draft
<d1b2>
<johnsel> that's fine
<d1b2>
<johnsel> we like spam
<d1b2>
<azonenberg> I might want to change the notifier settings for that kind of stuff eventually
<d1b2>
<azonenberg> we are only recently getting enough dev activity that its getting annoying :p
<d1b2>
<johnsel> Yep ideally we get a summary
<d1b2>
<johnsel> but that is probably too much to ask
<d1b2>
<azonenberg> @johnsel which vm did you want templated
<d1b2>
<david.rysk> Any other OSes needed? For versions — currently supported? Do we include bleeding-edge Alpine and Fedora and Debian testing?
<d1b2>
<azonenberg> i dont see anything stopped
<d1b2>
<johnsel> No, david wants to add Alpine and Fedora
<d1b2>
<johnsel> I'm sure you -just- deleted them didn't you?
<d1b2>
<azonenberg> oh
<d1b2>
<azonenberg> lol
<d1b2>
<azonenberg> i thought i only removed old versions of stuff
<d1b2>
<azonenberg> but yeah i am not seeing either
<d1b2>
<johnsel> bhaha
<d1b2>
<johnsel> let me find that script
<d1b2>
<azonenberg> i think something went wrong during the removal
<d1b2>
<azonenberg> because debian jessie is still on there
<d1b2>
<azonenberg> and i dont see alpine
<d1b2>
<azonenberg> lol
<d1b2>
<johnsel> "create-guest-templates"
<d1b2>
<azonenberg> great
<d1b2>
<azonenberg> now we have two of everything
<d1b2>
<johnsel> yay
<azonenberg>
ok this might take some effort to clean up lol
<d1b2>
<johnsel> I'm doing a clean upgrade for debian to 12 while we're at it
<d1b2>
<johnsel> @david.rysk mind if I leave a few packages preinstalled on it?
<d1b2>
<johnsel> I can more easily revert to that template w/ gpu
<d1b2>
<johnsel> the others don't have the GPU attached
<d1b2>
<david.rysk> @johnsel sure go ahead, the only reason not to have too much preinstalled would be to ensure we're testing the required package list in the docs
<d1b2>
<johnsel> and as we've established that is a pita
<d1b2>
<johnsel> I appreciate that you're willing to work with me to focus on the self-hosted stuff more
<d1b2>
<david.rysk> well at the moment GH-hosted is working
<d1b2>
<johnsel> I just don't want to end up in a situation where we have two admin-heavy CI systems
<d1b2>
<david.rysk> I need to start working on include-what-you-use
<d1b2>
<david.rysk> and windows
<d1b2>
<david.rysk> I haven't touched windows yet
<d1b2>
<david.rysk> other than doing an early test of the CMake changes a while ago
<d1b2>
<johnsel> Yep that will be fun
<d1b2>
<johnsel> pm me your ssh key
<d1b2>
<johnsel> I'll set up ssh via ngrok so you can access it
<d1b2>
<azonenberg> (I'm getting him on the VPN soonish)
<d1b2>
<johnsel> yep
<d1b2>
<johnsel> if you want that is
<d1b2>
<david.rysk> I can wait for the VPN setup
<azonenberg>
(restarting xoa for a sec to fix something)
<azonenberg>
So i dont think we ever had alpine or fedora templates
<azonenberg>
we had *alma* linux
<azonenberg>
whatever that is
<d1b2>
<david.rysk> AlmaLinux was one of the two CentOS replacements
<d1b2>
<johnsel> oh then you need to fetch them somewhere
<d1b2>
<david.rysk> the other being Rocky Linux
<d1b2>
<johnsel> there is another something in xoa where you can download the templates
<azonenberg>
We also now have 86 templates installed including a ton of duplicates
<d1b2>
<johnsel> sorry that was probably the right thing to do from the start
<azonenberg>
and i cant seem to delete them
<azonenberg>
lol
<d1b2>
<johnsel> great
<d1b2>
<johnsel> I love xoa more and more every day
<azonenberg>
the hub shows alpine 3.10 and rocky 8.5
<azonenberg>
do you want one or both?
<d1b2>
<johnsel> alpine and fedora and ubuntu I think
<d1b2>
<david.rysk> Sooo rocky: how do we want to treat RHEL?
<azonenberg>
There is no fedora prebuilt template any where i can find
<d1b2>
<david.rysk> Rocky 8.5 == RHEL 8
<azonenberg>
we may have to install ourselves from an iso and template that
<azonenberg>
I dont have a specific list of targets. I know really ancient RHEL is probably not worth the effort to support
<d1b2>
<david.rysk> would it be less complex to just have one VM OS and use docker containers?
<azonenberg>
generally debian stable has been as old as i've targeted
<d1b2>
<david.rysk> I haven't looked at how GH selfhosted works under the hood
<d1b2>
<johnsel> if you want to help write the dockerfiles, that's definitely an option
<d1b2>
<david.rysk> the Dockerfiles would only have to bring it to the state needed for the GH runner, right?
<d1b2>
<david.rysk> right now I'm chasing bugs in glslang
<d1b2>
<johnsel> correct
<d1b2>
<johnsel> actually
<d1b2>
<johnsel> there is existing software for docker + self-hosted gh
<d1b2>
<david.rysk> That sounds like a relatively simple Dockerfile tbh 🙂
<d1b2>
<johnsel> (which are just vms)
<d1b2>
<johnsel> yes there's some difficulty with the gpu usage
<d1b2>
<johnsel> same as with the normal vm
<d1b2>
<johnsel> it's no problem getting the pcie device in
<d1b2>
<david.rysk> so about that
<d1b2>
<david.rysk> I fixed the tests to not require an X server
<d1b2>
<johnsel> but getting a window session set up and attached
<d1b2>
<johnsel> (and eventually driven)
<d1b2>
<david.rysk> I think the PR for that got merged
<d1b2>
<johnsel> the unknown pci device isn't a graphics card
<d1b2>
<johnsel> none of the VMs are grabbing GPUs
<d1b2>
<azonenberg> wait what
<d1b2>
<johnsel> that's so strange
<d1b2>
<johnsel> I hate xoa
<d1b2>
<johnsel> this is such a headache
<d1b2>
<johnsel> we had this working in december
<d1b2>
<johnsel> both vm types had a GPU attached
<d1b2>
<david.rysk> what's the OS here?
<d1b2>
<david.rysk> windows?
<d1b2>
<johnsel> this is win11
<d1b2>
<johnsel> but it seems not to matter
<d1b2>
<david.rysk> what ven/dev are you seeing in device manager and what driver package are you installing?
<d1b2>
<johnsel> none
<d1b2>
<david.rysk> ahh right
<d1b2>
<david.rysk> I see above
<d1b2>
<johnsel> you should know that the ACL doesn't let you give access to the GPU
<d1b2>
<johnsel> so by way of the templates having a GPU the derivatives got one as well
<d1b2>
<azonenberg> ok so let me try something
<d1b2>
<azonenberg> i'm going to try command line hotplugging a gpu to that vm and see what happens
<d1b2>
<johnsel> the whole "self managed resources set" concept and GPUs are basically incompatible
<d1b2>
<johnsel> it's not implemented in XOA
<d1b2>
<johnsel> and then second there is only a passthrough GPU
<d1b2>
<johnsel> although it should be possible to create a GPU group with a pci ven/dev (which also sucks because those are equal for the 2) and assign that to the vm manually
<_whitenotifier-e>
[scopehal-apps] azonenberg f1c607c - Fixed logic checking for all density functions vs eye patterns being swapped, leading to spectrograms and other non-eye density plots not being zoomable in the Y axis
<_whitenotifier-e>
[scopehal-apps] azonenberg 9ad7ddf - Merge branch 'master' of github.com:ngscopeclient/scopehal-apps
<_whitenotifier-3>
[scopehal-apps] azonenberg 9c777d1 - Fix scrolling for culled packets
<d1b2>
<johnsel> though might mess up more state in the xen backend
<d1b2>
<azonenberg> I can try a host reboot after work
<d1b2>
<johnsel> xe vgpu-list is the closest I've come so far
<d1b2>
<johnsel> pci-assignable-list will list which devices are available to be assigned to guests. This will list all devices currently assigned to pciback, whether this was done by pci-assignable-add, or by the two methods mentioned in the previous section (linux command-line or manual sysfs commands).
<d1b2>
<david.rysk> I just squashed a lot of stuff
<d1b2>
<david.rysk> @johnsel once the xoa-side stuff is figured out I definitely can work on fixing up workflows for the selfhosted 🙂
<d1b2>
<david.rysk> right now I'll continue working on what works
<d1b2>
<johnsel> yep we'll see if the GPUs come back after a reboot
<d1b2>
<david.rysk> @246tnt does llvmpipe/lavapipe work when you add that patch?
<d1b2>
<johnsel> otherwise I just need to set up the runners
<d1b2>
<johnsel> but I am too tired now
<d1b2>
<johnsel> I'll do it soon, maybe tomorrow or the day after
<d1b2>
<david.rysk> that's fine
<d1b2>
<246tnt> @david.rysk What patch ?
<d1b2>
<david.rysk> The one that adds barriers
<d1b2>
<246tnt> No, it's merged already and it doesn't fix the issue for llvmpipe.
<d1b2>
<azonenberg> 😦
<d1b2>
<246tnt> But I forced the break out of the loop (manually forcing done to true) and that made it render. It's not a "correct" render (since it only goes through one iteration), but that points to me that the issue is related to done again.
<d1b2>
<david.rysk> can you tell where it gets stuck?
<d1b2>
<246tnt> No I'm not sure why it's not stopping. I don't think it's a shared mem issue because even doing the done signal distribution in the most conservative way possible, it's not triggering, so something else might be wrong.
<d1b2>
<david.rysk> are you allowed to have a barrier() inside that if-statement?
<d1b2>
<david.rysk> or wait, it's not inside, just alongside