azonenberg changed the topic of #scopehal to: ngscopeclient, libscopehal, and libscopeprotocols development and testing | https://github.com/ngscopeclient/scopehal-apps | Logs: https://libera.irclog.whitequark.org/scopehal
coralreef has quit [Quit: Do not go gentle into that goodnight.]
Degi_ has joined #scopehal
Degi has quit [Ping timeout: 255 seconds]
Degi_ is now known as Degi
<_whitenotifier-3> [scopehal-apps] dizzystem opened pull request #670: more icons + tweaks to step and area under curve - https://github.com/ngscopeclient/scopehal-apps/pull/670
<_whitenotifier-3> [scopehal-apps] azonenberg closed pull request #670: more icons + tweaks to step and area under curve - https://github.com/ngscopeclient/scopehal-apps/pull/670
<_whitenotifier> [scopehal-apps] dizzystem a36b13d - more icons + tweaks to step and area under curve
<_whitenotifier-3> [scopehal-apps] azonenberg pushed 2 commits to master [+48/-0/±10] https://github.com/ngscopeclient/scopehal-apps/compare/70a56ae75aea...6c94d67e2621
<_whitenotifier-3> [scopehal-apps] azonenberg 6c94d67 - Merge pull request #670 from dizzystem/icons more icons + tweaks to step and area under curve
josuah has joined #scopehal
josuah has quit [Client Quit]
josuah has joined #scopehal
<d1b2> <johnsel> @azonenberg do you have a template for a ngscopeclient driver plugin?
<d1b2> <azonenberg> No
<d1b2> <azonenberg> You should be able to figure it out: basically there's an extern "C" void PluginInit() that gets called when the plugin is loaded, where you register your driver just as if it were in scopehal proper
<d1b2> <azonenberg> then you write the driver class just like normal
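A minimal sketch of what such a plugin might look like (hypothetical names throughout; assumes the AddDriverClass macro that in-tree drivers use in scopehal.cpp is visible to plugin code, and that MyDriver is an ordinary driver class with the usual boilerplate):

    // myplugin.cpp -- hypothetical out-of-tree driver plugin skeleton, not an official template
    #include "scopehal.h"
    #include "MyDriver.h"

    // Called once when the plugin is loaded; register the driver here,
    // just as in-tree drivers are registered in scopehal proper
    extern "C" void PluginInit()
    {
        AddDriverClass(MyDriver);
    }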
<d1b2> <johnsel> alright
<d1b2> <johnsel> it looks like the CI is still failing
<d1b2> <johnsel> weird
<d1b2> <johnsel> ╷
<d1b2> │ Error: timeout while waiting for state to become 'Running' (last state: 'Halted', timeout: 5m0s)
<d1b2> │
<d1b2> │   with xenorchestra_vm.builder_gpu_39,
<d1b2> │   on linux.tf line 1, in resource "xenorchestra_vm" "builder_gpu_39":
<d1b2> │    1: resource "xenorchestra_vm" "builder_gpu_39" {
<d1b2> │
<d1b2> ╵
<d1b2> Error: Process completed with exit code 1.
<d1b2> <johnsel> it created the VM but for some reason it did not start
<d1b2> <johnsel> could it be asking for too much resources?
<d1b2> <azonenberg> Don't know. it's been doing that for a week or two, i've been too busy to remember to bug you
<d1b2> <johnsel> Yeah, weird
<d1b2> <azonenberg> scopehal-ci-set right now is using 90/128 vCPUs, 68/128 GB RAM, 694 GB / 2TB storage allocated
<d1b2> <johnsel> I've reduced the requested resources but it may be a bug somewhere
<d1b2> <azonenberg> ok well please investigate and let me know if any action is needed on my side
<d1b2> <johnsel> yep
<d1b2> <johnsel> do you have something to push?
<d1b2> <johnsel> ah no need we still have some tasks pending
<d1b2> <johnsel> or I could restart one of those I guess
<d1b2> <johnsel> it's weird because the script works fine if I trigger it from an ssh session
<d1b2> <johnsel> but somehow when the CI triggers it, it doesn't work properly
<d1b2> <johnsel> hmm it may be as simple as the resources not being available because it does both delete + create actions at the same time
<d1b2> <johnsel> xenorchestra_vm.builder_gpu_38: Destroying... [id=c4b61988-51a3-89ed-0710-47f0e746a216]
<d1b2> xenorchestra_vm.builder_gpu_39: Creating...
<d1b2> xenorchestra_vm.builder_gpu_38: Still destroying... [id=c4b61988-51a3-89ed-0710-47f0e746a216, 10s elapsed]
<d1b2> xenorchestra_vm.builder_gpu_39: Still creating... [10s elapsed]
<d1b2> xenorchestra_vm.builder_gpu_38: Still destroying... [id=c4b61988-51a3-89ed-0710-47f0e746a216, 20s elapsed]
<d1b2> xenorchestra_vm.builder_gpu_39: Still creating... [20s elapsed]
<d1b2> xenorchestra_vm.builder_gpu_38: Destruction complete after 24s
<d1b2> xenorchestra_vm.builder_gpu_39: Still creating... [30s elapsed]
<d1b2> <johnsel> so the creation might start before the destruction has completed, which means there aren't enough resources available
<d1b2> <azonenberg> oh interesting
<d1b2> <azonenberg> yeah so you just need to wait for the destruction to complete before creating?
<d1b2> <johnsel> it would seem so
<d1b2> <johnsel> I will have to see how I can do this
<d1b2> <johnsel> I hope it doesn't require individual scripts because that would suck
<d1b2> <johnsel> well, it wouldn't be horrible but would be extra work
<d1b2> <johnsel> or you could increase the resources of course
<d1b2> <azonenberg> How far?
<d1b2> <azonenberg> more vCPUs is easy, we're oversubscribed plenty
<d1b2> <azonenberg> RAM is less excessive
<d1b2> <johnsel> I think it's the RAM that is causing the issue
<d1b2> <johnsel> I've lowered the RAM requested to 30GB now so it should have headroom for the 2 vms to run at the same time
<d1b2> <johnsel> hmm I see another bug
<d1b2> <johnsel> we run other people's code
<d1b2> <azonenberg> ok yes you're oversubscribed. gimme a minute i can give you a bit more ram
<d1b2> <johnsel> gotta disable that so it doesn't do that
<d1b2> <azonenberg> what do you mean?
<d1b2> <azonenberg> for PRs?
<d1b2> <johnsel> yep
<d1b2> <azonenberg> Just bumped your RAM cap to 140G and you should no longer be oversubscribed
<d1b2> <azonenberg> And we want to be running it for PRs, but only once i've approved that user to do so
<d1b2> <johnsel> We can't
<d1b2> <azonenberg> what do you mean? there's a button to approve workflows
<d1b2> <azonenberg> it won't run any of our CI scripts, even the ones on the github runners, until i approve
<d1b2> <johnsel> It doesn't allow the PR job to get secrets so it doesn't have credentials
<d1b2> <johnsel> so it can run the job itself but cycling the VM will fail
<d1b2> <johnsel> unless we put the credentials inside the vm instead of the job
<d1b2> <johnsel> but this might cause problems with them showing up in the log output
<d1b2> <azonenberg> Hmm
<d1b2> <johnsel> I'd have to look into that later
<d1b2> <azonenberg> well i really would like the ability to have trusted PRs run on the CI environment
<d1b2> <azonenberg> but for now let's fix it so at least stuff pushed to the repo directly CI's right
<d1b2> <johnsel> I mean if you trust the PR not to cat the .sh file it should be possible to put the credentials there
<d1b2> <johnsel> your pick @azonenberg
<d1b2> <azonenberg> So I thought the idea was that there would be an unauthenticated cycle-vm endpoint on a long-lived instance that has creds to the xen backend
<d1b2> <azonenberg> and the runner vm could request that cycle request whenever it wanted without having other xen access
<d1b2> <azonenberg> is that not how you ended up building it?
<d1b2> <johnsel> It ended up becoming a long-lived runner instance that runs a script and gets the credentials through GitHub secrets
<d1b2> <johnsel> This was easier, and until this problem it seemed to suffice for our needs
<d1b2> <azonenberg> (also how does this relate to the s3 stuff? when i was sick i got halfway through provisioning that, and never finished debugging it)
<d1b2> <azonenberg> iirc rgw is installed on the nodes but it's not actually running and serving anything and i haven't figured out why
<d1b2> <johnsel> s3 will hold the terraform state; this means the long-lived runner can be ephemeral too, as it will no longer keep the state database in a file on the local filesystem
<d1b2> <azonenberg> ok so that's an unrelated issue
<d1b2> <johnsel> it's not directly related to the authentication
<d1b2> <johnsel> yep
<d1b2> <johnsel> it's possible to build a web server and trigger that, though it would be some work to set it up
<d1b2> <azonenberg> i think that's the best option because I want to be able to CI approved PRs
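A rough sketch of that cycle-vm endpoint idea, as a plain POSIX socket listener on the long-lived orchestrator (the port, the path, and the terraform invocation are all illustrative assumptions): the Xen Orchestra credentials stay in the orchestrator's environment, and the runner VM only ever sends an unauthenticated HTTP request.

    // cycle_endpoint.cpp -- illustrative sketch of the design, not a deployed implementation
    #include <cstdlib>
    #include <cstring>
    #include <string>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    int main()
    {
        // Listen on an assumed port reachable only from the CI network
        int listenfd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);
        bind(listenfd, (sockaddr*)&addr, sizeof(addr));
        listen(listenfd, 4);

        while(true)
        {
            int fd = accept(listenfd, nullptr, nullptr);
            if(fd < 0)
                continue;

            char buf[1024] = {};
            read(fd, buf, sizeof(buf) - 1);

            std::string reply;
            if(std::strstr(buf, "GET /cycle"))
            {
                // Credentials live only in this process's environment;
                // the runner never holds them
                int rc = std::system("terraform apply -auto-approve");
                reply = (rc == 0)
                    ? "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
                    : "HTTP/1.1 500 Internal Server Error\r\nContent-Length: 0\r\n\r\n";
            }
            else
                reply = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n";

            write(fd, reply.data(), reply.size());
            close(fd);
        }
    }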
<d1b2> <johnsel> we can also put the credential in the shell script and make sure nobody changes the ci script to cat that file
<d1b2> <johnsel> that's a 1 minute solution
<d1b2> <azonenberg> if you think we can do that securely, go for it. I expect the design will evolve over time
<d1b2> <azonenberg> i mean, I review PRs before approving them to run
<d1b2> <azonenberg> and if somebody is messing with the yaml that's gonna catch my eye
<d1b2> <johnsel> yeah that would be the vector
<d1b2> <johnsel> and worst case it would mean someone gets access to the xoa instance
<d1b2> <johnsel> it's not like there is much to get there
<d1b2> <azonenberg> well more to the point it would mean they'd get access to the xoa with credentials that have CI-level access
<d1b2> <azonenberg> i.e. they can't touch anything outside that resource set
<d1b2> <azonenberg> right?
<d1b2> <johnsel> correct, right now it's my user
<d1b2> <johnsel> we should introduce a separate one
<d1b2> <azonenberg> Yeah
<d1b2> <johnsel> but yes it's just the ci resource set
<d1b2> <azonenberg> But the point is, it's a limited user that can't touch other instances
<d1b2> <johnsel> can't delete templates
<d1b2> <azonenberg> and the worst they can do is things like spawn CI builder VMs or run arbitrary code in the VM they are already running arbitrary code in
<d1b2> <johnsel> can't even access the vpn network
<d1b2> <johnsel> right
<d1b2> <johnsel> they could mine bitcoins
<d1b2> <azonenberg> Lol. That would get noticed fast when other instances start slowing down
<d1b2> <johnsel> and we both monitor the xoa/ci anyway so anything weird would likely be noticed quickly
<d1b2> <azonenberg> So yeah i'm not super concerned about that. What i mostly am concerned about is the potential for malicious unit test binaries running in the VM finding a way to break out and escape the CI network. So we'll definitely want to carefully verify the segmentation
<d1b2> <azonenberg> which I think is solid but i'd like to get a few more eyes on soon
<d1b2> <azonenberg> the main line of defense there is just not enabling workflows until i've vetted the PR and the contributor is at least somewhat trusted
<d1b2> <johnsel> we can test that if you want
<d1b2> <johnsel> but the basic ping tests we did showed it just had access to the internet
<d1b2> <azonenberg> Yeah once we get things debugged and operational i'd like you to spend some time trying to break out of the CI
<d1b2> <johnsel> and the xoa instance
<d1b2> <azonenberg> i'll probably get lain to hack on it as well if she has time
<d1b2> <azonenberg> We're both security professionals so if neither of us can find a way to get out it's probably good enough to guard against the unlikely scenario of someone who's already submitted good patches turning rogue
<d1b2> <johnsel> yep sure we can plan a round of red-team testing once we have things working properly
<d1b2> <azonenberg> exactly
<d1b2> <azonenberg> anyway, first things first you now have more RAM so the resource exhaustion problem should no longer be an issue
<d1b2> <johnsel> yes I'll set the requested RAM back to 64GB in a bit
<d1b2> <azonenberg> (that said, we probably don't want to run both builders at once if we can avoid it)
<d1b2> <azonenberg> at least not with 64G each
<d1b2> <johnsel> if things work properly it should only overlap seconds to minutes
<d1b2> <johnsel> it cycled correctly so it looks like that was the issue
<d1b2> <johnsel> I'll still investigate whether we can ensure the creation happens after the destruction
<d1b2> <azonenberg> my concern is ooming during those seconds
<d1b2> <azonenberg> i don't want to have 64GB of ram permanently unusable because i need it for those few seconds to avoid errors
<d1b2> <azonenberg> (i.e. i can never schedule other vms to use that memory)
<d1b2> <azonenberg> here's what i have host wide right now
<d1b2> <azonenberg> the three right hand blobs on the memory bar are your instances
<d1b2> <johnsel> can you not overprovision the RAM?
<d1b2> <azonenberg> Not to my knowledge, with the current setup
<d1b2> <azonenberg> i can oversubscribe CPU capacity but not memory
<d1b2> <azonenberg> anyway, if i have to throw another 128 or 256 gigs in there it's not the end of the world but it'll take time and money
<d1b2> <johnsel> I think we can use dynamic memory management
<d1b2> <azonenberg> so for now let's make it work with what we have
<d1b2> <johnsel> yeah no extra hardware shouldn't be necessary
<d1b2> <johnsel> I'm sure we're not the first to run into this issue
<d1b2> <azonenberg> anyway for the short term i don't have any other memory-hungry instances planned to run there
<d1b2> <azonenberg> so let's just make it functional asap
<d1b2> <johnsel> weird
<d1b2> <johnsel> it says I'm using 140GB
<d1b2> <johnsel> well, 133GB of 140
<d1b2> <johnsel> but I have 28GB + 4x 8GB when I look at the VM level
<d1b2> <azonenberg> so that usage count seems to count vms that you aren't running
<d1b2> <azonenberg> maybe you have some vm created but not running that is counting against your set cap
<d1b2> <johnsel> I don't
<d1b2> <azonenberg> huuuh
<d1b2> <johnsel> I counted including those that aren't running
<d1b2> <johnsel> that's why I'm confused
<d1b2> <johnsel> or maybe it is confused
<d1b2> <johnsel> either way something is off lol
<d1b2> <azonenberg> *digs*
<d1b2> <azonenberg> ok so i see ci_windows11 with 8GB, not running
<d1b2> <azonenberg> ci_windows11-cloudconfigtest_clone, 8gb, not running
<d1b2> <azonenberg> windows 11 cloudinit gpu, 8gb, running
<d1b2> <azonenberg> ci_linux_builder_gpu_58, 28gb, running
<d1b2> <azonenberg> ci_orchestrator, 8gb, running
<d1b2> <johnsel> yep
<d1b2> <azonenberg> so yeah i don't know why it's saying you're using 125 gigs lol
<d1b2> <johnsel> it's a problem though
<d1b2> <johnsel> I can't create anything new now
<d1b2> <johnsel> (this may even have been the issue previously)
<d1b2> <azonenberg> Can you shut down all of your instances for a min so i can debug?
<d1b2> <azonenberg> everything even the orchestrator
<d1b2> <johnsel> done
<d1b2> <azonenberg> So it's still saying 125G RAM used in this resource set
<d1b2> <johnsel> yeah weird right
<d1b2> <johnsel> can you change how much is assigned?
<d1b2> <johnsel> maybe that'll trigger a refresh or something
<d1b2> <johnsel> hmmm
<d1b2> <johnsel> that also shows 140GB used
<d1b2> <johnsel> so it is somehow counting the host usage?
<d1b2> <azonenberg> No
<d1b2> <azonenberg> It's also showing 94 vCPUs used
<d1b2> <azonenberg> it seems like it counts all of your templates and snapshots against the cap
<d1b2> <johnsel> that's lame
<d1b2> <johnsel> I can't see them on my end properly either
<d1b2> <azonenberg> currently scheduled for xen orchestra v6
<d1b2> <azonenberg> anyway, near-term solution seems to be i allocate you way more ram than you can actually use
<d1b2> <johnsel> or cleaning up
<d1b2> <azonenberg> if it counts all of your templates and snapshots
<d1b2> <azonenberg> do you have unused templates we can get rid of?
<d1b2> <johnsel> can you list all my snapshots?
<d1b2> <azonenberg> i'm not sure about snapshots let's start with templates
<d1b2> <johnsel> and templates for that matter
<d1b2> <azonenberg> i see windows 10 64 bit, ci_windows11, centos 7, debian 11 cloudinit hub, debian 11 cloudinit gpu, ci_builder_gpu, windows 11 cloudinit gpu
<d1b2> <johnsel> sec, checking which ones we use
<d1b2> <johnsel> can you pull a download of those?
<d1b2> <azonenberg> i wouldn't delete them
<d1b2> <azonenberg> just remove them from the self-service set so they don't count against your usage
<d1b2> <johnsel> ooh right
<d1b2> <johnsel> that's a good idea
<d1b2> <azonenberg> can always put them back
<d1b2> <azonenberg> i have plenty of disk space, ram is in shorter supply
<d1b2> <johnsel> in that case I think we actually use Debian 11 Cloud-Init (GPU) and Debian 11 Cloud-Init (Hub) right now
<d1b2> <azonenberg> and what on the windows side?
<d1b2> <johnsel> and then for Windows 11 let me see
<d1b2> <azonenberg> so ci_builder_gpu, windows10, and ci_windows11 can be removed?
<d1b2> <johnsel> let's remove them all
<d1b2> <azonenberg> and centos7?
<d1b2> <johnsel> I have a VM that we will convert to a template at some point
<d1b2> <johnsel> for win11
<d1b2> <johnsel> yes centos7 too
<d1b2> <azonenberg> aaand that didn't change usage at all
<d1b2> <azonenberg> good to do the cleanup but that wasn't a contributing factor
<d1b2> <johnsel> what the f
<d1b2> <johnsel> I deleted 2 vms
<d1b2> <johnsel> still using 109GB
<d1b2> <azonenberg> Now it says 109 GB
<d1b2> <azonenberg> thats down from 125
<d1b2> <azonenberg> so that did something
<d1b2> <johnsel> I also removed the runner
<d1b2> <azonenberg> now down to 81.19
<d1b2> <johnsel> Looks like we have 64 GB ghost usage
<d1b2> <azonenberg> Welp
<d1b2> <johnsel> or 48
<d1b2> <johnsel> no, 64
<d1b2> <johnsel> it must be something
<d1b2> <johnsel> perhaps a snapshot
<d1b2> <azonenberg> I bumped your cap up to 192GB which is enough for 128G of actual usage plus the ghost
<d1b2> <azonenberg> so that should fix it for now
<d1b2> <azonenberg> we can dig more into this later
<d1b2> <azonenberg> i have to get back to work
<d1b2> <azonenberg> in the meantime, restart all the nodes and test that it's at least working now?
<d1b2> <johnsel> yup yup
<d1b2> <johnsel> alright done, testing now
<d1b2> <azonenberg> All good?
<d1b2> <azonenberg> also are we only running debian in the test CI setup or do we have windows running on our hardware yet too?
<d1b2> <johnsel> 1. yes, 2. not yet
<d1b2> <azonenberg> Ok
<d1b2> <azonenberg> So how far are we from being ready to remove the github hosted ubuntu test runners?
<d1b2> <azonenberg> What if anything has to be done?
<d1b2> <johnsel> I think we're very near
<d1b2> <johnsel> Assuming the cycling keeps working now it should be done
<d1b2> <azonenberg> yeah
<d1b2> <azonenberg> And what are the blockers, if any, to getting windows running as well?
<d1b2> <johnsel> We need to do some cleanup on the template but that's not a lot of work
<d1b2> <azonenberg> Also, we still need to actually make the CI run the tests
<d1b2> <johnsel> Ah yes vulkan support may or may not work at this point
<d1b2> <azonenberg> It does not, i can tell you that now
<d1b2> <johnsel> We'll have to double check that
<d1b2> <azonenberg> because i see "glfw init failed" in the log when it tries to initialize vulkan while it's enumerating the list of available tests
<d1b2> <azonenberg> (this is itself a bug, we shouldn't init vulkan until we are ready to run a test)
<d1b2> <johnsel> yeah then we'll have to look into that
<d1b2> <azonenberg> the vulkan issue may be, and hopefully is, as simple as not having an x server running or $DISPLAY set right
<d1b2> <johnsel> we may need to set up a display
<d1b2> <johnsel> right
<d1b2> <johnsel> or set it to offscreen rendering
<d1b2> <johnsel> this should be possible
<d1b2> <azonenberg> setting up x is the simplest route imo, and will be needed down the road if we start adding unit tests for the ngscopeclient gui
<d1b2> <johnsel> I think that requires some code changes though
<d1b2> <azonenberg> e.g. simulated mouse clicks, testing rendering against golden outputs, etc
<d1b2> <johnsel> true
<d1b2> <azonenberg> That's where i want to end up long term
<d1b2> <azonenberg> so we should just do it
<d1b2> <azonenberg> wrt the glfw init failed bug, i intend to fix this soonish because it wastes time during the build
<d1b2> <azonenberg> But i'm holding off for now as it's a useful indicator the CI doesn't have working vulkan :p
<d1b2> <azonenberg> once we get to the point that we have working vulkan and are ready to actually run the tests in the CI environment, then i'll fix it
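The deferred-init shape being described might look something like this (function names are illustrative assumptions, not the actual scopehal API): test enumeration never touches Vulkan, and the first test that needs it pays the startup cost exactly once.

    #include <mutex>
    #include <stdexcept>

    bool VulkanInit();  // assumed: the existing one-shot initializer

    // Call from test setup rather than from main(), so that merely
    // enumerating the test list never brings up GLFW/Vulkan
    void EnsureVulkanInitialized()
    {
        static std::once_flag s_once;
        static bool s_ok = false;
        std::call_once(s_once, [] { s_ok = VulkanInit(); });
        if(!s_ok)
            throw std::runtime_error("Vulkan initialization failed");
    }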
<d1b2> <johnsel> Yeah so setting Vulkan/X up on Debian and setting msys up on Windows remain as major tasks
<d1b2> <azonenberg> Yep
<d1b2> <azonenberg> That's not needed to be able to remove the existing ubuntu builder though
<d1b2> <azonenberg> as that doesn't have working vulkan either
<d1b2> <johnsel> True
<d1b2> <johnsel> It should just run now in theory
<d1b2> <azonenberg> so once we're at parity with that we can transition that to be the official linux CI (which i think we're at, we'll give it a little time to make sure the cycling is stable)
<d1b2> <azonenberg> then delete the ubuntu runner
<d1b2> <johnsel> Yep
<d1b2> <azonenberg> then work on getting vulkan working in the debian runner and getting the windows runner moved in house as well
<d1b2> <azonenberg> then we still have the x86 macos runner on github infrastructure as a stand-in for the eventual macos arm64 runner we need to figure out a plan for
<d1b2> <johnsel> yeah I think those are starting to be available on github too
<d1b2> <johnsel> I'd like to work on a ARM build too at some point
<d1b2> <johnsel> and I'd like to change the ci script so it doesn't use the template function to fill in the version number of vulkan
<d1b2> <david.rysk> ARM macOS? I can do some testing here if we're closer to working
<d1b2> <johnsel> now I can't copy paste the ci script to run it manually and get the same 'golden' result on my machine
<d1b2> <david.rysk> do you have a runner? need a mac mini or something
<d1b2> <johnsel> for me ARM linux, but
<d1b2> <johnsel> we also want mac, sure
<d1b2> <azonenberg> we do not currently have any arm ci
<d1b2> <azonenberg> this actually allowed a bug to escape not long ago
<d1b2> <azonenberg> where the build was broken on apple silicon because i accidentally un-ifdef'd an x86-ism
<d1b2> <azonenberg> and none of our CI caught it so i didn't know anything was broken until a mac user complained
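For context, the usual guard shape for that kind of x86-ism (purely illustrative, not the code that actually broke):

    #ifdef __x86_64__
        #include <immintrin.h>
        // x86-specific fast path using AVX intrinsics goes here
    #else
        // portable fallback so Apple Silicon and other arches still build
    #endif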
<d1b2> <azonenberg> long term plan is mac mini or somebody's m1 macbook when they upgrade
<d1b2> <azonenberg> hosted at my place so it will be on the same sandbox network as the other CI stuff
<d1b2> <azonenberg> because one of the reasons we're moving CI in house is so that we can eventually do hardware-in-loop testing
<d1b2> <johnsel> yep we have a nice little setup now
<d1b2> <johnsel> I'm quite happy with how we integrated things
<d1b2> <johnsel> alright since we're looking at it anyway I'm taking a stab at fixing Vulkan as well
<d1b2> <johnsel> that did not go easy
<d1b2> <johnsel> @azonenberg let's pull a template from this vm
<d1b2> <johnsel> it looks like snapshots do indeed count against the resources
<d1b2> <johnsel> I think we have an orphaned snapshot somewhere