<_whitenotifier-3>
[scopehal-apps] azonenberg 40ac34d - Allow protocol overlays and spectrograms to be stacked regardless of which one is being dragged
bvernoux has joined #scopehal
_whitelogger has joined #scopehal
bvernoux has quit [Quit: Leaving]
<d1b2>
<johnsel> @azonenberg did you end up rebooting the server?
<d1b2>
<azonenberg> since we had the pcie passthru issues? no i was busy on something else last night
<d1b2>
<azonenberg> remind me later today and i can give it a try
<d1b2>
<david.rysk> A reboot will probably fix it; if you want more in-depth troubleshooting set me up with access and let me know 😛
<d1b2>
<azonenberg> Yes david's vpn access is also still pending
<d1b2>
<azonenberg> i have the certificate open on my CA system and have been too busy with other stuff to sign it and send back :p
<d1b2>
<david.rysk> also I have Windows CI working, and the CMake changes work on Windows pretty much verbatim. Have to fix up packaging and do some more cleanup
<d1b2>
<johnsel> you'd get only permissions on the XOA system, not the xen. Unless azonenberg wants to open up the whole system for you
<d1b2>
<david.rysk> I mean I can work with azonenberg to troubleshoot the PCIe stuff if he's not busy
<d1b2>
<david.rysk> but he's always busy :p
<d1b2>
<johnsel> but we agreed that we wouldn't do that between me & azonenberg
<d1b2>
<johnsel> sure, me too
<d1b2>
<johnsel> it's his lack of time that is the problem
<d1b2>
<johnsel> that said, whoever fixes it is fine for me
<d1b2>
<david.rysk> The hardware that I have wouldn't give us a real benefit over GH-hosted CI for the size of the project at the moment
<d1b2>
<johnsel> if you're more familiar with the xen backend
<d1b2>
<johnsel> I am not very; I have run a local instance of XCP-ng long enough to know what is where, but otherwise I am not
<d1b2>
<johnsel> Sure, and we want to set up actual hw in the near future
<d1b2>
<johnsel> I said from the beginning he should have bought a separate system that he can just give me KVM access to but that was not feasible
<d1b2>
<johnsel> it would have made things a whole lot easier
<d1b2>
<johnsel> we can also go the Docker route if you want to collaborate on that @david.rysk
<d1b2>
<david.rysk> We can, we just need a solid, working VM
<d1b2>
<johnsel> I initially wanted a single system (terraform) to handle both windows and linux
<d1b2>
<johnsel> and can eventually do the same for osx
<d1b2>
<david.rysk> Docker should be pretty straightforward, but I'm not sure how that interacts with GH Actions, is there a guide?
<d1b2>
<johnsel> there is an app
<d1b2>
<johnsel> let me link it
<d1b2>
<david.rysk> for macOS you'll need a Mac
<d1b2>
<david.rysk> Terraform from Hashicorp? Note potential licensing concerns
<d1b2>
<johnsel> and we don't publish it so whatever license they use is fine
<d1b2>
<david.rysk> no, it's shared-source under BSL
<d1b2>
<david.rysk> but I guess we'd be ok, still
<d1b2>
<johnsel> it's fine, nobody gets to interact with it
<d1b2>
<david.rysk> @azonenberg does xoa support nested virtualization in your environment?
<d1b2>
<johnsel> anyway yes their recommended autoscaler for docker would need a k8s backend
<d1b2>
<johnsel> it does
<d1b2>
<johnsel> but not with pcie passthrough
<d1b2>
<johnsel> and azonenberg barely knows anything about the ci haha
<d1b2>
<johnsel> I can give you access to the repo with the docs and scripts though
<d1b2>
<johnsel> I do insist on an infra as code setup, i.e. we should be able to recreate everything running from scripts without any manual provisioning
<d1b2>
<johnsel> but that's all dealt with in principle for the vms
<d1b2>
<johnsel> just need to add the k8s deployment
<d1b2>
<david.rysk> yeah but we need passthrough
<d1b2>
<johnsel> yes so we can't use nested virtualization
<d1b2>
<johnsel> but we don't need it either for docker
<d1b2>
<david.rysk> true
<d1b2>
<johnsel> I had wished we could do it with hyper-v
<d1b2>
<david.rysk> that's probably the way I'd go then
<d1b2>
<johnsel> that would have been great
<d1b2>
<david.rysk> I'm more used to kvm (e.g. Proxmox)
<d1b2>
<johnsel> yeah in theory xen can do kvm as well but we're fairly limited by the xoa interface
<d1b2>
<johnsel> it's an abstraction that implements the auth layer so andrew can have his private and work vms separated properly from our ci stuff
<d1b2>
<johnsel> that was a hard requirement
<d1b2>
<david.rysk> yeah and Andrew already uses xen
<d1b2>
<david.rysk> so using something else isn't an option
<d1b2>
<johnsel> which is why we went xcp-ng+xoa
<d1b2>
<johnsel> it was the closest to what he already had
<d1b2>
<johnsel> we considered other platforms but the GPU passthrough benefits weren't clear and it would have been a lot of work to port
<d1b2>
<johnsel> in retrospect now we're dealing with these esoteric issues it might have been the right choice but hindsight is 20/20
<d1b2>
<david.rysk> I'd probably only consider Proxmox as an alternative here
<d1b2>
<johnsel> yes esxi would have been ideal but I think we had some license limitations that made it not feasible
<d1b2>
<johnsel> in my experience its vGPU support is fairly rock solid
<d1b2>
<johnsel> but we had both linux and windows w/ GPUs working previously so hopefully with a reboot we can just deploy them once and keep them running
<d1b2>
<johnsel> and accept that for now we can't have clean systems for every build
<d1b2>
<johnsel> at least for windows
<d1b2>
<johnsel> for linux things are a little simpler
<d1b2>
<johnsel> I used to manage weird embedded docker stuff back when I was still working full time
<d1b2>
<johnsel> basically a small datacenter on a NUC w/ full remote access and firmware update ability
<d1b2>
<johnsel> a whole fleet of them
<d1b2>
<johnsel> I was at some point working on a system that would work with a fixed pxe endpoint to boot from and then fully auto-provision itself on the fly
<d1b2>
<david.rysk> the vulkan SDK being set up might mess it up, and this might need my CMake work, I'll do more testing in the near future
<d1b2>
<david.rysk> but basically we can and should avoid the Vulkan SDK entirely when building with MinGW
<d1b2>
<johnsel> I think we had a lot of issues with glslang(?) missing on windows
<d1b2>
<david.rysk> since (1) MinGW includes all the needed packages and (2) C++ ABI is incompatible between MinGW and MSVC (which is what the Vulkan SDK is built with)
<d1b2>
<david.rysk> I ran into C++ linkage problems when I tried using my CMake changes, which make it actually try to link to the SDK if the SDK is installed
<d1b2>
<david.rysk> which is expected as MinGW does not use VC++ C++ libs at all
<d1b2>
<david.rysk> the C++ ABI is just completely different
<d1b2>
<johnsel> Yeah it's clunky, we ideally wanted to drop msys2 entirely and build w/ msvc instead
<d1b2>
<david.rysk> for that I'd want to dump the requirements for gtkmm/cairomm
<d1b2>
<johnsel> that gives a much more expected environment for windows devs as well
<d1b2>
<david.rysk> since they complicate matters
<d1b2>
<johnsel> yep that's on the list
<d1b2>
<david.rysk> ffts too but ffts isn't a real problem (heh!)
<d1b2>
<johnsel> yes those 3 are annoying, the first 2 being real blockers
<d1b2>
<johnsel> we (that is, azonenberg mostly) have been working those dependencies out of the codebase
<d1b2>
<johnsel> with the goal to lose msys2 entirely eventually
<d1b2>
<david.rysk> I started looking at using fftw; fftw has some annoying bugs in its packaging (and hasn't had a release in too long) but I should be able to work around that
<d1b2>
<johnsel> I actually think we went from fftw to ffts
<d1b2>
<johnsel> or it was only considered at some time
<d1b2>
<david.rysk> azonenberg wants to use fftw for the tests
<d1b2>
<david.rysk> fftw is GPL which makes it undesirable for the entire project
<d1b2>
<johnsel> yeah I'm not sure why we want to keep that
<d1b2>
<johnsel> I guess to double check the output
<d1b2>
<david.rysk> otherwise I'd look at integrating RustFFT but that means writing some interfacing code (and needing a supported Rust compiler, but at this point all distros seem to have one)
<d1b2>
<johnsel> I mean the obvious choice is to just use vkFFT imo
<d1b2>
<david.rysk> yeah that's the idea, use vkFFT in the project, use something else for the tests
<d1b2>
<johnsel> personally I'd use vkfft everywhere
<d1b2>
<johnsel> they have enough tests on their end
<d1b2>
<johnsel> but that's just what I'd do
<d1b2>
<johnsel> I definitely wouldn't employ rust for it
<d1b2>
<johnsel> We're trying to lean down, not bulk up the build process
<d1b2>
<david.rysk> interestingly the build system stuff there is pretty mature/robust
<d1b2>
<johnsel> sure but it's also not developer friendly
<d1b2>
<johnsel> e.g. Mark would love to just have a project that he can open in visual studio to work on Windows support
<d1b2>
<johnsel> having lots of dependencies to set up the build process makes the barrier for people to contribute much higher
<d1b2>
<johnsel> e.g. I regularly want to do some work on a driver but then I have to go through the whole set up with msys2 etc and then I can't copy the commands from the docs and neither from the CI at this point
<d1b2>
<johnsel> so then I need to find the sources of the docs, but the latex syntax makes those annoying to copy paste as well
<d1b2>
<johnsel> by the time I have everything set up I've gone through so much effort I don't even want to code anymore
<d1b2>
<david.rysk> yeah I have a WIP docs improvement that will come along with the CMake PR
<d1b2>
<johnsel> it should be as simple as git pull and replicate the ci commands
<d1b2>
<david.rysk> that's kinda the whole point of the CMake work I'm doing
<d1b2>
<david.rysk> fix all this
<d1b2>
<johnsel> I understand and I think it's great that you do. I just wanted to give my perspective on what I think is not ideal for the project
<d1b2>
<david.rysk> see what it should be is: go here and install the package (for CMake, etc)
<d1b2>
<david.rysk> then when you run CMake, it automatically finds everything that has been installed
<d1b2>
<david.rysk> but that will have to come with the MSVC overhaul
<d1b2>
<david.rysk> you will still need to run CMake, but at that point you can even run it from the GUI
<d1b2>
<johnsel> yeah but cmake I think is expected and fine
<d1b2>
<david.rysk> I'm also going to look at using vcpkg to handle pulling dependencies
<d1b2>
<david.rysk> (for MSVC)
<d1b2>
<johnsel> I think that's a good idea, I wanted to do the same
<d1b2>
<johnsel> I made a start on it
<d1b2>
<johnsel> I don't know if I have the files still
<d1b2>
<johnsel> I'll check
<d1b2>
<david.rysk> It doesn't make sense until we have the code working on MSVC with manual deps
<d1b2>
<johnsel> but yes that's a good choice I think then Windows people can use their expected Windows tools
<d1b2>
<david.rysk> anyway
<d1b2>
<david.rysk> did you look at my MSYS CI config?
<d1b2>
<johnsel> No I went through that to figure out where the build fails
<d1b2>
<david.rysk> the simplified MSYS instructions are literally: install MSYS2; install list of packages; cmake ..; make
<d1b2>
<johnsel> I like that, needs vulkan still
<d1b2>
<david.rysk> it's pulling vulkan from the MSYS2 repos
<d1b2>
<johnsel> Please make it copy-pasteable though
<d1b2>
<johnsel> oh great
<d1b2>
<johnsel> then I love it
<d1b2>
<david.rysk> I'm not sure I follow how these are not copy-pasteable?
<d1b2>
<david.rysk> oh I'm not even using env.VULKAN_SDK_VERSION in this one
<d1b2>
<johnsel> I thought it needed Vulkan still
<d1b2>
<johnsel> which I thought you'd implement with that annoying template var
<d1b2>
<johnsel> I really hate those with passion.
<d1b2>
<david.rysk> I'm not sure I follow why
<d1b2>
<johnsel> But this is great yes
<d1b2>
<johnsel> Well the CI system produces essentially the golden copy of the application on a certain platform
<d1b2>
<johnsel> Ideally that means that a developer on that platform can just copy the commands from the CI script locally
<d1b2>
<johnsel> I think it's a mentality difference between software people and EEs who are used to looking at documentation for what to do
<d1b2>
<david.rysk> imo the CI scripts should not be the documentation
<d1b2>
<johnsel> For cloud stuff it's expected that you can just run the CI commands locally and get the same result
<d1b2>
<david.rysk> and it's our job to update the documentation to match the CI scripts
<d1b2>
<johnsel> Like I said, I think that's a mentality difference. Where I come from the CI builds are the golden copy and you want to be able to replicate them. Without needing to refer to documentation
<d1b2>
<david.rysk> there are too many places where you need to use these env vars 😦
<d1b2>
<johnsel> So in essence they are documentation for the build process
<d1b2>
<johnsel> I'll see if you can export them once or so and at least minimize the need to modify a script
<d1b2>
<david.rysk> I guess you can do that in some places
<d1b2>
<johnsel> Dinner is ready though so it will have to wait until after
<d1b2>
<david.rysk> but like, for msys to properly do matrix build for all the configs, I need to make the list of packages have
<d1b2>
<david.rysk> e.g. mingw-w64-${{ matrix.env }}-x86_64-toolchain
<d1b2>
<david.rysk> or I can use pacboy
<d1b2>
<johnsel> I mean you don't have to
<d1b2>
<david.rysk> but then the user will need to know to install pacboy
<d1b2>
<johnsel> Well I'd love to discuss further after my dinner. But imo verbosity in these scripts is only desirable
<d1b2>
<david.rysk> sure
<d1b2>
<david.rysk> we probably want to support windows-on-ARM too...
<d1b2>
<david.rysk> which means we'd need a device for that 😛
<d1b2>
<david.rysk> also regarding Rust, there's this CMake tool called Corrosion that handles the entire "build lib into .dll/.so/.dylib" part for you
<d1b2>
<david.rysk> but yeah that's lower priority
<d1b2>
<azonenberg> We have had bugs in the past in which incorrect usage of vkFFT (e.g. padding settings) caused garbage output
<d1b2>
<azonenberg> I want a golden FFT implementation in the unit tests to catch such bugs
<d1b2>
<david.rysk> seems the go-to device is the $699 Windows Dev Kit 2023, but it's SLOW
<d1b2>
<azonenberg> and that implementation being GPL is a non-issue
<d1b2>
<azonenberg> Hence looking at FFTW
<d1b2>
<david.rysk> a VM on a Mac will be significantly better performing
<d1b2>
<azonenberg> vkFFT for all actual library/application code is a given
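[editor's note] The "golden FFT in the unit tests" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: it does not call vkFFT or FFTW, the radix-2 routine merely stands in for the fast implementation under test, the naive DFT stands in for the golden reference, and all function names are hypothetical. The "bad" case simulates one kind of padding mistake (transforming a zero-padded buffer and keeping the first N bins) to show that a golden comparison would flag it.

```python
import cmath
import math


def naive_dft(samples):
    """O(n^2) reference DFT: slow, but simple enough to trust by inspection."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_radix2(samples):
    """Recursive radix-2 FFT; hypothetical stand-in for the fast code under test."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(samples[0::2])
    odd = fft_radix2(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def max_abs_error(actual, golden):
    """Largest per-bin deviation between two spectra of equal length."""
    if len(actual) != len(golden):
        raise ValueError("spectra must have the same length")
    return max(abs(a - g) for a, g in zip(actual, golden))


# Test signal: a real cosine that lands exactly on bin 3 of a 16-point transform.
N = 16
signal = [math.cos(2 * math.pi * 3 * t / N) for t in range(N)]
golden = naive_dft(signal)

# Correct usage: same length in, same length out.
good_error = max_abs_error(fft_radix2(signal), golden)

# Simulated misuse: zero-pad to 2N, transform, then keep only the first N bins.
padded = signal + [0.0] * N
bad_error = max_abs_error(fft_radix2(padded)[:N], golden)

print(f"correct usage error: {good_error:.3e}")
print(f"padding misuse error: {bad_error:.3e}")
```

In a real test suite the tolerance would have to be chosen for the precision actually in use (e.g. single-precision GPU output); the point here is only that a slow, independent reference catches configuration mistakes that the fast library's own tests cannot see.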
<d1b2>
<azonenberg> @johnsel just got a day-old debian ci build failure
<d1b2>
<david.rysk> yeah I've been getting the same, I have abandoned the selfhosted CI for now, until we can get it more stable
<d1b2>
<david.rysk> IMO we should just have one VM and use lightweight containers (docker) inside
<d1b2>
<david.rysk> though then we have to see about GPU passthru
<d1b2>
<david.rysk> docker claims to have some sort of GPU passthru support